Can an algorithm be racist? Can AI be sexist? Can machine learning breed inequality?
The short answer is yes.
As computing technology advances, humans are increasingly handing over decision-making to algorithms across many aspects of our lives: from identifying suspects, approving insurance and bank loans, and finding our way around a city, to deciding what we should watch, eat or buy.
In this scenario, the question to ask is: are these decisions made by algorithms fair and free of human biases?
The short answer is no.
The human mind has numerous cognitive biases that lead to racial, gender, ethnic, and class prejudices. Big tech companies have framed this as a human shortcoming that can be overcome by handing decision-making to computers. In this context, algorithms are seen as objective facts.
However, Cathy O’Neil, mathematician and data scientist, is among the growing band of technologists and researchers who vehemently disagree with this assumption. In the Netflix docudrama The Social Dilemma, O’Neil defines algorithms as “opinions embedded in code.”
Why though?
ALGORITHMS IMBIBE HUMAN BIASES
AI algorithms are trained to understand, recommend or make predictions based on massive quantities of historical data or ‘big data’. Therefore, AI and machine-learning systems are only as good as the data they are trained on.
However, data itself is often riddled with biases. For example, data used to train AI models for predictive policing has a disproportionate representation of African-American and Latino people.
In other words, algorithms imbibe the social, political, racial, and gender biases that exist within humans and in society.
ALGORITHMS NEED 2 THINGS: HISTORICAL DATA & A DEFINITION OF SUCCESS
To build an algorithm, one essentially needs two things: historical data and a definition of success.
The definition of success depends on the organisation building the algorithm. Every time we build an algorithm, we curate the data, we define success and we embed our values in it.
Consider an AI model trained on historical employment data from the engineering field: it would conclude that the most qualified candidates for an engineering job are men. While the algorithm may be “successful” at identifying candidates who resemble past hires, it ignores the fact that the data overrepresents men.
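A minimal, illustrative sketch of how this happens (with invented numbers, not any real company’s data): when “success” is defined as resembling the candidates who were hired before, the imbalance in the history becomes the model’s preference.

```python
from collections import Counter

# Hypothetical historical hiring records: (gender, was_hired).
# The numbers are made up purely to illustrate the mechanism.
history = [("M", True)] * 80 + [("M", False)] * 20 \
        + [("F", True)] * 5 + [("F", False)] * 15

# "Definition of success": the hire rate observed for each group.
hired = Counter(g for g, h in history if h)
total = Counter(g for g, h in history)
hire_rate = {g: hired[g] / total[g] for g in total}

def score(candidate_gender: str) -> float:
    """Score a new candidate by how often similar candidates were hired."""
    return hire_rate[candidate_gender]

print(hire_rate)                # {'M': 0.8, 'F': 0.25}
print(score("M"), score("F"))   # the "objective" model now prefers men
```

Nothing in the code mentions prejudice; the bias arrives entirely through the historical data and the chosen definition of success.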
BIAS IN DATA
Microsoft’s AI chatbot Tay is a great example of how biases in data get absorbed by AI systems. In 2016, Microsoft launched an experimental chatbot on Twitter aimed at “conversational understanding.”
The more you chat with Tay, said Microsoft, the smarter it gets, learning to engage people through "casual and playful conversation." It took Twitter users less than 24 hours to turn Tay into a racist, misogynistic, Holocaust-denying AI.
AI algorithms, whether for machine learning or deep learning, depend on lots and lots of data to be “trained” and to get better at decision-making.
The more data an algorithm gets, the “better” it becomes at performing specific tasks.
For example, just as crime data over-represents African-American and Latino people (often a result of deep-seated racial prejudice), the résumé data behind an Amazon experiment to screen job applications with AI was skewed – so skewed that the project had to be scrapped. Why?
Because the algorithm was built using résumés the company had collected over a decade, and those résumés came mostly from men. That meant the system eventually learned to discriminate against women: it taught itself that male candidates were preferable, based on the data it was fed.
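A minimal sketch of the kind of audit that can surface this problem before a model is trained, checking how each group is represented both in the data and among the “successful” examples (all numbers here are invented):

```python
def representation_report(records, group_key, label_key):
    """Print each group's share of the data and of the positive labels."""
    positives = [r for r in records if r[label_key]]
    groups = {r[group_key] for r in records}
    for g in sorted(groups):
        in_data = sum(1 for r in records if r[group_key] == g)
        in_pos = sum(1 for r in positives if r[group_key] == g)
        print(f"{g}: {in_data / len(records):.0%} of data, "
              f"{in_pos / len(positives):.0%} of 'successful' examples")

# Hypothetical résumé records, skewed the way a decade of
# male-dominated hiring would skew them.
resumes = [{"gender": "M", "hired": True}] * 70 \
        + [{"gender": "M", "hired": False}] * 20 \
        + [{"gender": "F", "hired": True}] * 4 \
        + [{"gender": "F", "hired": False}] * 6

representation_report(resumes, "gender", "hired")
# F: 10% of data, 5% of 'successful' examples
# M: 90% of data, 95% of 'successful' examples
```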
ACCURACY VS FAIRNESS
Another crucial factor to observe is who is building the algorithms and who decides how they are deployed.
This leads us to the question of what the exact purpose of an algorithm is.
“Fairness” isn’t a valid metric that an algorithm can meaningfully measure on its own.
For the builder of an algorithm meant to identify the best engineering candidates from piles of résumés and employment data, historical gender inequality may not be a problem the AI model is trained to solve.
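One way to see the gap between the two yardsticks, in a minimal sketch with made-up labels and predictions: a model can look quite accurate overall while shortlisting one group far more often than another.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the recorded outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def selection_rate_gap(groups, y_pred):
    """Difference between the highest and lowest shortlisting rate per group."""
    rates = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rates[g] = sum(y_pred[i] for i in idx) / len(idx)
    return max(rates.values()) - min(rates.values()), rates

# 1 = "shortlist", 0 = "reject"; invented labels and predictions.
groups = ["M"] * 5 + ["F"] * 5
y_true = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))    # 0.8 -> looks fine
gap, rates = selection_rate_gap(groups, y_pred)
print("selection rates:", rates, "gap:", gap)   # 0.8 vs 0.2 -> not fair
```

Which fairness measure to use, and whether any single number captures it, is itself a value judgement – precisely the kind of opinion that gets embedded in code.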
FACIAL RECOGNITION
Facial recognition has provided some of the starkest demonstrations of algorithmic bias.
A 2019 federal study in the US, conducted by the National Institute of Standards and Technology (NIST), found many of the world’s top facial recognition systems to be biased along the lines of age, gender, and ethnicity.
In an analysis of 189 algorithms currently sold on the market, NIST said it found “empirical evidence” that they misidentified Asians and African Americans up to 100 times more frequently than white males.
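What “misidentifying up to 100 times more frequently” means in practice is a gap in per-group error rates. A minimal sketch of how an auditor might compute a false-match rate per demographic group (the numbers below are entirely invented):

```python
def false_match_rate(results):
    """results: list of (is_true_match, system_said_match) pairs."""
    non_matches = [r for r in results if not r[0]]
    false_matches = [r for r in non_matches if r[1]]
    return len(false_matches) / len(non_matches)

# Hypothetical audit results, grouped by demographic.
audit = {
    "group_A": [(False, True)] * 1 + [(False, False)] * 999,
    "group_B": [(False, True)] * 50 + [(False, False)] * 950,
}

for group, results in audit.items():
    print(group, f"false match rate: {false_match_rate(results):.3%}")
# group_A false match rate: 0.100%
# group_B false match rate: 5.000%  -> misidentified 50x more often
```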
Even Amazon got it spectacularly wrong. In a major embarrassment for the company in 2018, a test of its ‘Rekognition’ software incorrectly identified 28 members of the US Congress as other people who had been arrested for crimes.
IBM has announced that it is exiting the facial recognition business while Amazon said it would not sell its software to law enforcement for a year.
A VICIOUS SPIRAL
The most troubling fact is that algorithmic biases reinforce discrimination.
If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his zip code), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues.
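A minimal sketch of the proxy problem behind that spiral (with invented figures): a lending model that scores risk purely from zip codes never “sees” race or income, yet every applicant inherits the average outcome of their neighbourhood.

```python
from collections import defaultdict

# Hypothetical past loans: (zip_code, repaid). Numbers are invented.
past_loans = [("11111", True)] * 90 + [("11111", False)] * 10 \
           + [("22222", True)] * 60 + [("22222", False)] * 40

repaid = defaultdict(int)
total = defaultdict(int)
for zip_code, ok in past_loans:
    total[zip_code] += 1
    repaid[zip_code] += ok

def risk_score(zip_code: str) -> float:
    """Estimated default risk based only on the applicant's zip code."""
    return (total[zip_code] - repaid[zip_code]) / total[zip_code]

# An applicant from 22222 starts with 4x the estimated risk of one from
# 11111, regardless of their individual circumstances.
print(risk_score("11111"), risk_score("22222"))   # 0.1 0.4
```

If the higher-risk zip code is itself the product of past discrimination, the model denies loans there, outcomes worsen, and the next round of training data confirms the score – the vicious spiral O’Neil describes.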
Cathy O’Neil, in her seminal book Weapons of Math Destruction explains how increasingly the decisions that affect our lives – where we go to school, whether we get a car loan, how much we pay for health insurance – are being made not by humans, but by mathematical models.
As a data scientist, she explains how the biases hidden within the big datasets that feed mathematical models disproportionately impact the most vulnerable groups in society.
“It’s a silent war that hits the poor hardest but also hammers the middle class,” writes O’Neil.
BLACK BOX: ALGORITHMS AND ACCOUNTABILITY
Most proprietary algorithms are black boxes. While sophisticated AI algorithms process copious amounts of granular data on different aspects of our lives, we know nearly nothing about how they work.
However, these models being used to make decisions about our lives are opaque, unregulated, and uncontestable, even when they’re wrong.
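Because outsiders cannot read the code, audits often have to probe such systems from the outside. A minimal sketch of one such probe, varying a single input and watching how an opaque scoring function responds (`opaque_score` here is a hypothetical stand-in, not any vendor’s real model):

```python
def opaque_score(applicant: dict) -> float:
    # Stand-in for a proprietary model whose internals we cannot inspect.
    return 0.9 - 0.5 * (applicant["zip"] == "22222") \
               - 0.1 * applicant["late_payments"]

def probe(base: dict, feature: str, values) -> dict:
    """Hold everything else fixed, change one feature, record the score."""
    return {v: opaque_score({**base, feature: v}) for v in values}

base_applicant = {"zip": "11111", "late_payments": 0}
print(probe(base_applicant, "zip", ["11111", "22222"]))
# {'11111': 0.9, '22222': 0.4}  -> zip code alone swings the decision
```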
Big tech companies are framing the problem as one that they are equipped to solve (with the use of more AI).