> An (IMO) Interesting question is how to reduce the risks of things like this happening.
I look forward to finding out if this was a “fraud system gone wrong” or a more basic ledger system failing to do sums correctly.
Partially addressing your question, though: if you insert the words “AI” and “bias” into that sentence, we as an industry are starting to figure this out. The certification and testing processes you mentioned exist where a team is mature enough to have both a data lifecycle and a model lifecycle worked out. You see terms like MLOps trying to describe how to do that effectively in production.
For example, my work has a design approach (in both the touchy/feely product design sense and the software architecture sense) that includes questions and practices to help reason through what data is needed to address a problem, what can go wrong with it, and how things look when it does go wrong. The last bit is the most interesting one to me. In terms of practical engineering, inference results should generally carry some sense of lineage - of the data, model, and training services - that explains how you got to a given answer, including which inputs were considered or ignored.
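To make the lineage idea concrete, here is a minimal sketch of an inference result that carries its own lineage. Everything in it is hypothetical and for illustration only: the `Lineage` and `InferenceResult` types, the `score_transaction` function, the field names, and the toy threshold rule standing in for a real model. It is not how any particular MLOps tool does it.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Lineage:
    """Hypothetical record of how an answer was produced."""
    model_version: str
    training_run_id: str
    dataset_hash: str
    inputs_used: tuple
    inputs_ignored: tuple


@dataclass(frozen=True)
class InferenceResult:
    label: str
    score: float
    lineage: Lineage


# Assumption: the toy "model" below only looks at these two fields.
USED_FIELDS = ("amount", "country")


def fingerprint(rows):
    """Stable SHA-256 hex digest over JSON-serializable rows."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def score_transaction(txn, model_version, training_run_id, training_rows):
    """Score a transaction and attach the lineage needed to explain it."""
    # Toy rule standing in for a real model (an assumption, not a real fraud check).
    score = 0.9 if txn["amount"] > 10_000 and txn["country"] != "US" else 0.1
    label = "suspected_fraud" if score >= 0.5 else "ok"
    lineage = Lineage(
        model_version=model_version,
        training_run_id=training_run_id,
        dataset_hash=fingerprint(training_rows),
        inputs_used=USED_FIELDS,
        # Record what was present but not considered, so it can be explained later.
        inputs_ignored=tuple(sorted(k for k in txn if k not in USED_FIELDS)),
    )
    return InferenceResult(label, score, lineage)
```

The point of the design is that a downstream consumer never receives a bare label: the answer and the explanation of how it was reached travel together.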
An interesting side topic is that poor implementations can result in inexcusable differences that affect downstream systems. For example, if a particular model has predicted something like “this transaction is suspected to be fraud”, it had better be consistent from run to run, and the input data had better be consistent over time. If either of those changes, explaining that to the consumers of the data is essential so they understand whether the model changed, the data changed, or both.
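As a rough sketch of what "explaining the change" could look like, the snippet below compares two runs over the same transaction and reports whether the model, the data, or both changed. The `Run` tuple, `hash_inputs`, and `explain_difference` are made-up names for illustration, and it assumes each run records a model version and a hash of its inputs.

```python
import hashlib
import json
from typing import NamedTuple


class Run(NamedTuple):
    """Hypothetical summary of one scoring run for one transaction."""
    model_version: str
    input_hash: str
    label: str


def hash_inputs(inputs: dict) -> str:
    """Hash inputs with sorted keys so field order does not matter."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def explain_difference(before: Run, after: Run) -> str:
    """Tell a downstream consumer why two runs may disagree."""
    model_changed = before.model_version != after.model_version
    data_changed = before.input_hash != after.input_hash
    if model_changed and data_changed:
        return "model and data changed"
    if model_changed:
        return "model changed"
    if data_changed:
        return "data changed"
    if before.label != after.label:
        # Same model, same data, different answer: the inexcusable case.
        return "nondeterministic: same model and data, different label"
    return "no change"
```

For example, comparing `Run("v1", h, "fraud")` with `Run("v2", h, "ok")` for the same input hash `h` yields "model changed", which is exactly the sentence a consumer of the data needs to hear.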