You wrote a program that rigs the input data in favor of one class in such a way that the classifier results are more or less uncorrelated with membership in that class. Am I understanding the code correctly?
I don't see how you eliminated "human biases present in the data annotations" here. It seems like your program merely rigs the input in order to get the kind of output you like to see.
Obviously you can tamper with the data to get any kind of result you want. You can force equal outcomes for "Never-worked" and "Private" but that doesn't mean you're removing some inherent bias in the original data.
Also, this sentence:
> All of this excludes the fact that companies will probably maximise to profit, and will use "algorithm" as an excuse to turn down people disregarding ethics.
is kind of weird in the context of your comment and program. If your fears about biased data are correct, then accounting for that bias in the data maximizes profit for the companies. Ethics shouldn't need to come into it if the problem is just bias in the input data.
Yes, in the last example you can see it as rigging the input data. I replicated the results of the experiment in the paper, and I made some remarks along these lines in my own report and discussion.
The idea is that a sensitive attribute should not contribute to the decision, so, for example, the probability of getting a loan should be the same for group A as for group B:
P(Loan | A) = P(Loan | B)
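As a rough sketch of how that criterion can be measured (names like `y_pred` and `group` are placeholders, not taken from the actual program), the discrimination score is just the gap between the two groups' positive-decision rates:

```python
import numpy as np

def discrimination(y_pred, group):
    """Gap between the positive-decision rates of group B and group A.

    y_pred: array of 0/1 classifier decisions (1 = loan granted)
    group:  array of "A"/"B" labels for the sensitive attribute
    """
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    p_loan_a = y_pred[group == "A"].mean()  # empirical P(Loan | A)
    p_loan_b = y_pred[group == "B"].mean()  # empirical P(Loan | B)
    return p_loan_b - p_loan_a

# A classifier that is "fair" under this criterion drives the gap towards 0.
```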
This is dangerous, because the discrimination metric is sensitive to biases present in the data. It can effectively make it easier for someone in the discriminated group A to get a loan than for a person in group B in the same situation. This happens when the bias is not in the annotations but in the demographics of the dataset.
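To make that concrete, here is a toy illustration (the score distributions are invented for this example): if group A's scores are lower only because of who ends up in the sample, then forcing equal acceptance rates amounts to using a lower threshold for A than for B, so two applicants with the same score can receive different decisions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented scores: group A's distribution sits lower than group B's purely
# because of the demographics of the sample, not because of label noise.
scores_a = rng.normal(0.4, 0.1, 1000)
scores_b = rng.normal(0.6, 0.1, 1000)

acceptance_rate = 0.5  # force P(Loan | A) = P(Loan | B)

# Per-group thresholds that yield the same acceptance rate in each group.
thr_a = np.quantile(scores_a, 1 - acceptance_rate)  # roughly 0.4
thr_b = np.quantile(scores_b, 1 - acceptance_rate)  # roughly 0.6

print(f"threshold for A: {thr_a:.2f}, threshold for B: {thr_b:.2f}")
# An applicant scoring 0.5 is accepted if they belong to A (0.5 > thr_a)
# but rejected if they belong to B (0.5 < thr_b), despite identical scores.
```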
This is a really interesting problem, and I don't have answers for it.