>any sort of ML applied to credit data will run afoul of the equal credit opportunity act.
That can't be right, because the existing and very widely used FICO score is already a rudimentary form of "machine learning" applied to credit data.[1] (FICO's secret formula correlates payment histories, credit-card balances, income, etc., to calculate probabilities of loan default.) Clearly, automated machine analysis of _credit data_ is legal even though minorities have lower FICO credit scores than whites and consequently get fewer loan approvals.
The paper is talking about something else: the application of ML to non-credit data. Examples of such datapoints:
MacOS vs Windows
iOS vs Android
GMail vs Hotmail
lowercase vs Proper Case when writing
etc.
Those non-credit datapoints are collectively referred to by the paper as the "digital footprint". For example, the authors conclude that a "credit risk" score calculated from data revealed by web-browser user-agent strings can predict default about as well as the traditional FICO score.
The issue you're talking about for Equal Credit Opportunity arises if those non-financial variables are surreptitiously used to determine "whiteness" or "blackness" (i.e., as proxies for race) -- or -- if the data was innocently analyzed for debt-default patterns but nevertheless inadvertently correlates with race and therefore penalizes minority groups (the disparate-impact problem).
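That second failure mode is easy to see with a toy sketch. The data below is entirely made up for illustration: a single digital-footprint feature (say, iOS vs Android) can be strongly predictive of default while simultaneously correlating with a protected attribute, which is exactly the inadvertent-proxy scenario.

```python
def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical toy data, NOT from the paper:
device    = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]  # 1 = iOS, 0 = Android
default   = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]  # 1 = defaulted on loan
protected = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 1 = protected-group member

# The feature predicts default well...
print(correlation(device, default))    # -1.0 in this contrived sample
# ...but it also tracks the protected attribute, so a lender using it
# "innocently" for default prediction still produces disparate impact.
print(correlation(device, protected))
```

A lender never feeds race into the model here; the risk comes purely from the feature's correlation with it, which is why regulators look at outcomes, not just inputs.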
Downvoters: please point out what is inaccurate about my comment. If I made a mistake in reading the paper, I'd like to learn what I misinterpreted.
[1] https://en.wikipedia.org/wiki/Credit_score_in_the_United_Sta...