by Prabir Purkayastha
Machine algorithms are taking over decisions that were once made by governments, businesses and even ourselves.
Today, algorithms decide who should get a job, which part of a city needs to be developed, who should get into a college, and in the case of a crime, what should be the sentence. It is not the super intelligence of robots that is the threat to life as we know it, but machines taking over thousands of decisions that are critical to people’s lives and deciding social outcomes.
What decides whether you get a loan is ultimately a machine score – not who you are, what you have achieved, or how important your work is to the country (or society); to the machine, you are just the sum of your transactions, to be processed and reduced to a single number. The worst part is that some of these algorithms are not even understandable to those who have written them; even the creators of such algorithms cannot explain how a particular algorithm arrived at a specific score!
Mathematician and data scientist Cathy O’Neil, in a recent book, “Weapons of Math Destruction”, tells us that the apparent objectivity of processing huge amounts of data through algorithms is false. The algorithms themselves are nothing but our biases and subjective judgments being coded – “They are just opinions coded into maths.”
What happens when we transform the huge amount of data that we create through our everyday digital footprints into machine ‘opinions’ or ‘decisions’? Google served ads for high-paying jobs disproportionately to men; African Americans got longer sentences because they were flagged as high risk for repeat offences by a judicial risk assessment algorithm. It did not explicitly use the race of the offender, but used where they lived, information about other family members, education and income to work out the risk – all of which, taken together, acted as a proxy for race.
The problem is not just the subjective biases of the people who code the algorithms, or the goal of the algorithm; it runs much deeper. It lies in the data and the so-called predictive models we build using this data. Such data and models simply reflect the objective reality of the high degree of inequality that exists within society, and replicate it in the future through their predictions.
What are predictive models? Simply put, we use the past to predict the future. We use the vast amount of data that is available to create models that correlate the ‘desired’ output with a series of input data. The output could be a credit score, the chance of doing well in a university, a job and so on. The past data of people who have been ‘successful’ – by some specific output variables selected as indicators of success – are correlated with various social and economic data of the candidate. This correlation is then used to rank any new candidate’s chances of success based on her or his profile. To use an analogy, predictive models are like driving a car while looking only through the rear-view mirror.
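The mechanics described above can be sketched in a few lines of code. The following is a deliberately simplified, hypothetical scoring model (the feature names, data and scoring rule are all invented for illustration, not taken from any real system): it "learns" which attribute values correlated with past ‘success’ and scores a new candidate purely by resemblance to those past successes.

```python
# A toy predictive scoring model: past records pair candidate attributes
# with a 'success' label, and a new candidate is scored purely by how
# similar they look to past successes. All data here is hypothetical.

past_records = [
    # (zip_code, years_of_education, succeeded)
    ("10001", 16, True),
    ("10001", 15, True),
    ("20002", 12, False),
    ("20002", 11, False),
    ("10001", 12, True),
]

def success_rate_by(feature_index):
    """Fraction of past candidates with each feature value who 'succeeded'."""
    totals, wins = {}, {}
    for record in past_records:
        value, succeeded = record[feature_index], record[-1]
        totals[value] = totals.get(value, 0) + 1
        wins[value] = wins.get(value, 0) + int(succeeded)
    return {v: wins[v] / totals[v] for v in totals}

def score(zip_code, years_of_education):
    """Average the historical success rates of the candidate's attributes."""
    by_zip = success_rate_by(0)
    by_edu = success_rate_by(1)
    # Unseen values default to 0: the model cannot predict what it does not see.
    return (by_zip.get(zip_code, 0.0) + by_edu.get(years_of_education, 0.0)) / 2

# Two candidates with identical education, differing only in where they live:
print(score("10001", 12))  # candidate from the historically 'successful' area
print(score("20002", 12))  # identical candidate from the other area
```

Note what happens: two candidates with the same education receive very different scores because the zip code of past successes dominates – the postcode has silently become a proxy for whatever social divide shaped the historical data, exactly the rear-view-mirror problem described above.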
A score for success, be it for a job, admission to a university, or a prison sentence, reflects the existing inequality of society in some form. An African American in the USA, or a dalit or a Muslim in India, does not have to be identified by race, caste or religion. The data of her or his social transactions are already prejudiced and biased. Any scoring algorithm will end up predicting their future success based on which groups are successful today. The danger of these models is that race or caste or creed may not exist explicitly as data, but a whole host of other data act as proxies for these ‘variables’.
Such predictive models are biased not only by the opinions of those who create them, but also by the inherent nature of all predictive models: they cannot predict what they do not see. They end up trying to replicate what they have seen succeed in the past. They are inherently a conservative force, replicating the existing inequalities of society.
The Artificial Intelligence community is waking up to the dangers of such models taking over the world. Some of these models even violate constitutional guarantees against discrimination. There are now discussions about creating a US Algorithm Safety Board, so that algorithms can be made transparent and accountable. We should know what is being coded and, if required, find out why the algorithm came out with a certain decision: the algorithms should be auditable. It is no longer enough to say “the computer did it”.