Since you mentioned music, can we draw any parallel with the recent Ed Sheeran case? You can't copyright a chord progression, or a rhythm. So can you claim license violation for code which is similar in how it arranges standard design patterns and well-known language idioms? Unless you can find a line-for-line match in a substantial portion of the code, I don't see how you can make a claim for license violation, whether a person or an AI wrote the code in question.
The problem is not "training using GPL", it's stripping the license and providing the resulting code to anyone, for any purpose.
Copilot can provide very specific implementations of problems with "in the style of $NAME" prompts. Consider a case where you publish your life's work as GPL-licensed code, and someone can reproduce and adapt your highly optimized matrix multiplication code with a simple prompt, without the license. Even if it's not a "textbook license violation", you're lifting someone's GPL-licensed function and landing it in your codebase without its license. If your codebase is not licensed under the same GPL version (or later, if the repo allows), it's both unethical and a license breach at the same time. Adaptation of the code doesn't matter.
The same is true for stricter, source-available licenses. Their source is open to read, but not open to be reused. What will happen if you put a function derived from a codebase under one of these strict licenses into your own code, with or without knowing it? You're again in a dangerous grey area from both a licensing and a legal perspective.
The issue we discuss is neither straightforward nor simple to navigate. I left GitHub because of this, and may tag my repositories with this badge.
Open source means nothing if copyleft is taken out of the picture, and licenses are simply ignored.
"The problem is not 'training using GPL', it's stripping the license and providing the resulting code to anyone, for any purpose."
This.
Since Copilot arrived, I've thought that most open source developers would be fine with it if GitHub simply tried to acknowledge the original licences in any way.
I personally would be OK with even a very indirect, aggregate, group-based thing that should be no burden at all for GitHub. They make a big list with everyone's name in it, call it the Copilot contributors, and provide some kind of page for it. Then, when Copilot spits out code, it includes a link to that page, and/or a user includes that in the credits/authors for their project like any other credited source.
No excuses about how impractical it would be to cite 3 other authors for every line of output.
But they don't even do that tiny bit. They don't try and fall short, they don't even try. But they still take the goods. The goods are already free, and yet they still manage to steal them.