Racist or sexist algorithms are not a good look, but even global tech giants get it wrong.
Today’s businesses are increasingly reliant on technologies like big data, artificial intelligence (AI) and machine learning (ML) for information and decision-making.
This brings new challenges, and one that periodically hits the headlines is discrimination arising from biased data, such as Amazon’s sexist CV screening, or Google’s image recognition labelling Black people as primates.
Even the smallest businesses now have access to advanced data technologies, routinely using AI-powered cloud apps to automate decisions in HR, advertising, finance and other core functions.
What’s less common is checking if the way that technology is used has allowed unintentional bias or discrimination to creep into organisations.
We’re not talking about intentional racism or overt misogyny, but about the way wider diversity and inclusion challenges can find their way into a business through its technology. This generally takes one of two forms: bias in business data stored and used by technology; or bias in the logic designed into business technology.
Bias in Business Data
Biased data has become a problem with the rise of AI and ML, because these technologies “learn” how to do things by analysing relevant historic data, called “training data”. Data scientists design them to use statistics and maths to analyse patterns in training data, and replicate similar results with new data.
For example, an AI fraud prevention system could learn how to spot suspicious card use by analysing past transactions, identifying characteristics that distinguish fraudulent from legitimate activity. The system isn’t told what to look for in historic transactions.
Instead, it examines all possible details, including related data from other sources, to determine how to identify fraud in future transactions.
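For a flavour of what this looks like in practice, here is a minimal sketch of how such a system might be trained. The file name and column names are hypothetical stand-ins, and real fraud systems are considerably more sophisticated.

```python
# Minimal sketch of training a fraud-detection model on historic transactions.
# The file name and column names (amount, merchant_category, hour_of_day,
# country_code, is_fraud) are hypothetical stand-ins.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

transactions = pd.read_csv("historic_transactions.csv")

# Encode the chosen transaction details as model inputs; the label records
# which past transactions turned out to be fraudulent.
features = pd.get_dummies(
    transactions[["amount", "merchant_category", "hour_of_day", "country_code"]]
)
labels = transactions["is_fraud"]

# The model is never told what fraud "looks like". It infers the patterns
# that statistically separate fraudulent rows from legitimate ones.
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

# New transactions are then scored against the learned patterns.
print("Held-out accuracy:", model.score(X_test, y_test))
```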
The problem with historic data is that it may contain biases reflecting past behaviours and practices, which could find their way into the AI being trained. For example, an organisation’s workforce may be predominantly male because of past industry patterns.
An AI recruitment system trained on historic hiring data might mistakenly infer that male employees are more suitable, leading to automated CV screening that discriminates against female candidates.
An obvious-sounding solution is to simply remove references to sources of bias, like gender and race, from training data. But that isn’t always possible, or even advisable. For example, diversity monitoring needs such data, and scrubbing it removes the possibility of investigating bias later.
However, there are other techniques data scientists can use to spot and remove even subtle bias from training data. Examples include choice of learning model, data engineering procedures and combining different algorithms.
The details of these techniques aren’t important, especially if you’re not a data scientist. What matters is ensuring technology teams explicitly review for potential bias any data sets used to develop and train technology.
It’s also important to ensure any steps taken to remove bias can be reviewed in future.
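As a simple illustration of where such a review might start, the sketch below assumes a hypothetical hiring data set with made-up column names, and checks how each group is represented in the data and how it fared historically.

```python
# Hedged sketch of a first-pass review of training data for historical bias.
# The file name and column names (gender, hired) are hypothetical.
import pandas as pd

history = pd.read_csv("historic_hiring_data.csv")

# How well is each group represented in the training data?
print(history["gender"].value_counts(normalize=True))

# What proportion of each group received a positive outcome in the past?
hire_rates = history.groupby("gender")["hired"].mean()
print(hire_rates)

# A large imbalance in either figure is a prompt to apply bias-mitigation
# techniques before training, and to record what was done for later review.
```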
Bias in Systems Logic
There are two main ways discrimination could arise from the design of a system. Both have the same result - technology that behaves in a biased way because it’s been inadvertently designed to do so.
The first is unintended bias in requirements, when a requested feature unintentionally favours one group over another. For example, it may be possible to investigate gaps in a candidate’s career history in an interview (if handled appropriately). But including this as an assessment factor for automated CV screening will create a bias against several groups, such as female applicants.
This kind of problem isn’t necessarily easy to spot, and needs a good understanding of both business details and data science. One approach to removing it is to ensure procedures for testing IT systems include tests for discrimination.
The IT industry is still working out what that means in practice, so there aren’t yet standard ways of doing this. Nevertheless, business and IT teams should be able to develop use cases and test scripts to detect bias, in the same way they check any other requirement is met.
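To give a sense of what such a test might look like, the sketch below assumes a hypothetical screen_cv() function standing in for the real screening system, along with hypothetical test data, and fails if pass rates differ sharply between groups.

```python
# Sketch of a discrimination test that could sit alongside ordinary
# functional tests. screen_cv() and sample_applicants are hypothetical
# stand-ins for the real screening system and its test fixtures.
import pandas as pd

def screen_cv(applicant: dict) -> bool:
    """Placeholder for a call to the automated CV screening system."""
    raise NotImplementedError

def test_screening_shows_no_gender_disparity(sample_applicants: pd.DataFrame):
    results = sample_applicants.copy()
    results["passed"] = results.apply(lambda row: screen_cv(row.to_dict()), axis=1)

    pass_rates = results.groupby("gender")["passed"].mean()

    # Flag a failure if any group's pass rate falls below 80% of the highest,
    # a widely used rule of thumb for spotting disparate impact.
    assert pass_rates.min() / pass_rates.max() >= 0.8
```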
The other source of system design bias only applies to AI, and is called “algorithm bias”. This is when the logic in an AI system - known as its algorithm - contains inadvertent discrimination.
For example, there may be algorithmic bias in an AI loan approval system that includes postcode in its calculations, because of the complex links between ethnicity and where people live.
Again, there are data science techniques to detect and prevent algorithmic bias. Which ones your AI teams choose will vary with circumstances. What should be constant is a requirement for AI teams to explain and demonstrate to business stakeholders how algorithmic bias has been addressed in any AI used in an organisation.
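As one simple illustration of the kind of evidence that could be presented, the sketch below (with a hypothetical file and column names) compares a loan model’s approval rates across a protected characteristic that is deliberately kept out of its inputs; a large gap points to a proxy such as postcode being at work.

```python
# Hedged sketch: checking whether a loan model's decisions differ across a
# protected characteristic that is deliberately NOT a model input. A large
# gap would suggest another feature, such as postcode, is acting as a proxy.
# The file name and column names (ethnicity, approved) are hypothetical.
import pandas as pd

decisions = pd.read_csv("loan_decisions_with_monitoring_data.csv")

approval_rates = decisions.groupby("ethnicity")["approved"].mean()
print(approval_rates)

gap = approval_rates.max() - approval_rates.min()
print(f"Approval-rate gap between groups: {gap:.2%}")
```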
If technology introduces discrimination into an organisation through its systems or data, it should of course be removed, and hopefully the tips in this article will help. But it’s important to remember that a degree of bias is inevitable in human behaviour, and that form of bias is addressed by mechanisms such as policies and education.
So if you observe discrimination that seems to come from a system, before exploring systems and data solutions, make sure you’ve first established that it’s not a result of biased human behaviour elsewhere.
Was Rahman is an expert in the ethics of artificial intelligence, the CEO of AI Prescience and the author of AI and Machine Learning. See more at www.wasrahman.com