So something similar to what happened to the OP happened to me a couple of days ago. I'm on friendly terms with the developer of a competing codebase and have confirmed the following with them. Both my codebase and theirs are closed source and hosted on GitHub.
Halfway through building something I was given a block of code by Copilot, which contained a copyright line with my competitor's name, company number and email address.
Those details have never, ever been published in a public repository.
IMO that doesn’t absolve Microsoft at all. If someone uploads ripped MP3s to the internet somewhere, it doesn’t mean you could aggregate them, burn CDs and sell them.
I think that's very unlikely. They have said, repeatedly, that they are not using private code. Being caught lying about this would be very bad for GitHub.
Proof: "They said they don't use private code. Either the private code appearing is published somewhere else, or they are using private code. Lying would be bad. Therefore the code is published somewhere else, and they don't use private code".
Proposition: "They either do not use private code or they did something very very stupid."
Proof: "Not using private code is very easy (for example, Google does not train its models on Workspace users' data, which is why they get inferior features), and they promised multiple times not to use private code, so doing it would be hard to justify."
They would definitely notice such a bug. This would at least double or triple the amount of data they use. This is not something you can do by mistake.
How did that happen?