And one could go a step further: run the software itself and show that it discriminates. One doesn't just have to look at past performance of the software; it can be fed inputs tailored to bring out discriminatory performance. In this way software is more dangerous to the defendant than manual hiring practices; you can't do the same thing to an employee making hiring decisions.
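A minimal sketch of what "inputs tailored to bring out discriminatory performance" could look like: feed the system pairs of applicants that are identical except for one protected attribute, and count how often the decision flips. Everything named here is an assumption for illustration: `flip_rate`, the field names, and `toy_decide`, which is a deliberately biased stand-in for whatever system is actually under test.

```python
from typing import Any, Callable, Dict, List

Applicant = Dict[str, Any]


def flip_rate(decide: Callable[[Applicant], bool],
              applicants: List[Applicant],
              attribute: str,
              value_a: Any,
              value_b: Any) -> float:
    """Fraction of applicants whose outcome changes when only `attribute` changes."""
    if not applicants:
        return 0.0
    changed = 0
    for applicant in applicants:
        # Two copies that differ in exactly one field.
        as_a = {**applicant, attribute: value_a}
        as_b = {**applicant, attribute: value_b}
        if decide(as_a) != decide(as_b):
            changed += 1
    return changed / len(applicants)


def toy_decide(applicant: Applicant) -> bool:
    """Hypothetical stand-in for the hiring system; biased on purpose."""
    score = applicant["years_experience"]
    if applicant["gender"] == "f":
        score -= 2  # the hidden penalty the probe should expose
    return score >= 5


def fair_decide(applicant: Applicant) -> bool:
    """Hypothetical control that ignores the protected attribute."""
    return applicant["years_experience"] >= 5


applicants = [{"years_experience": y, "gender": "m"} for y in range(10)]
biased_rate = flip_rate(toy_decide, applicants, "gender", "m", "f")
fair_rate = flip_rate(fair_decide, applicants, "gender", "m", "f")
print(biased_rate, fair_rate)
```

A nonzero flip rate on otherwise identical inputs is the kind of result you cannot extract from a human recruiter. A real audit would also have to probe proxies for the attribute (names, postcodes and so on), which this sketch does not attempt.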
How would you make sure that the supplied version has the same weights as the production version? And wouldn't the weights and architecture be refined over time anyway?
Perjury laws. Once a judge has commanded you to give the same AI, you either give the same AI, or truthfully explain that you can't. Any deviation from that and everyone complicit is risking jail time, not just money.
"This is the June 2020 version, this is the current version, we have no backups in between" is acceptable if true. Destroying or omitting an existing version is not.
Note that not having backups is something an investor can sue the company for. If you say you have the June 2020 version but not the July one that was asked for, you are fine (it is reasonable to keep daily backups for a month, monthly backups for a year, and yearly backups after that). Though even then I might be able to sue you for not having version control of the code.
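For what it's worth, a daily/monthly/yearly retention rule like that is only a few lines. This is a rough sketch under my own assumptions, not anyone's actual policy: "a month" is taken as 30 days, "a year" as 365 days, the monthly backup is the one taken on the 1st, the yearly one is January 1st, and `should_keep` is a hypothetical name.

```python
from datetime import date


def should_keep(backup_day: date, today: date) -> bool:
    """Decide whether a backup taken on `backup_day` is still retained on `today`."""
    age_days = (today - backup_day).days
    if age_days <= 30:
        return True  # daily backups are kept for a month
    if backup_day.day == 1 and age_days <= 365:
        return True  # first-of-month backups are kept for a year
    # Beyond that, only the January 1st backup of each year survives.
    return backup_day.month == 1 and backup_day.day == 1


today = date(2021, 3, 15)
for day in (date(2021, 3, 1), date(2021, 1, 20), date(2020, 7, 1), date(2019, 6, 1)):
    print(day, should_keep(day, today))
```

Under a rule like this, being unable to produce one specific mid-month version from a year ago is the expected outcome, not evidence of destruction.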
If a non-hired employee brings a criminal action, this may matter.
For a civil action, the burden of proof is "preponderance of the evidence," which is a much lower standard than "beyond a reasonable doubt." "Maybe the weights are different now" is a reasonable doubt, but in a civil case the plaintiff could respond "Can the defendant prove the weights are different? For that matter, can the defendant even explain to this court how this machine works? How can the defendant know this machine doesn't just dress up discrimination with numbers?" And then it's a bad day for the defendant to the tune of a pile of money if they don't understand the machine they use.
> How would you make sure that the supplied version has the same weights as the production version?
You just run the same software (with the same state database, if applicable).
Oh wait, I forgot, nobody knows or cares what software they're running. As long as the website is pretty and we can outsource the sysop burden, well then, who needs representative testing or the ability to audit?
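For the narrower question of whether a supplied weights file is byte-for-byte the one that was deployed, one common approach is to compare cryptographic digests. A minimal sketch, assuming the operator recorded a SHA-256 digest of the artifact at deployment time (that record, and the function names `sha256_of_file` and `matches_recorded_digest`, are assumptions of this example):

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path, recorded_hex):
    """True if the file's digest equals the digest recorded at deployment time."""
    return hmac.compare_digest(sha256_of_file(path), recorded_hex.lower())
```

This only shows that two files are identical. It says nothing about whether the recorded digest really corresponds to what served production traffic, which is where the audit trail (and the perjury risk) comes back in.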