On Overcoming AI Bias

Krish
Published in AI Sutra
4 min read · Mar 22, 2018


I have been thinking about AI bias for a long time, and today’s IBM Research Science Slam session at the IBM Think 2018 conference gave me a chance to think about it again. Francesca Rossi gave a talk about AI ethics from the IBM Research point of view (see the video above). I also talked about it in a podcast today with Val Berovici of Pencil Data. In this post, I am going to set the context for future discussions by touching upon the factors that influence bias in AI and how, in certain cases, we have to deliberately introduce bias.

AI bias is a tricky problem. Left unchallenged, it could make society more tribalistic; at the same time, letting Silicon Valley define our values could be equally problematic. What is needed is an open conversation on how we can ensure that an AI-driven society reflects our value system. Our value system also evolves continuously, and we need to make sure that evolution is accounted for through data or algorithmic tweaks. This is not an easy problem to solve, but it is easy to start talking about it now so that we are in a good position to handle it when it is time to act.

Sources of AI Bias

Even though data is a critical source of AI bias, it is just one of many that we need to consider. I am listing four sources of bias from my point of view and would love to hear your thoughts on them.

  • Biased Data: This is the biggest source of AI bias. Let us face the basic facts: the world’s value systems are still evolving, and discriminatory behavior remains part and parcel of every society. The datasets produced by such a world will encompass all kinds of biases, past and present. Such data will skew AI, and since most AI systems are black boxes, noticing the cause of bias and fixing it is not easy
  • Algorithmic Bias: Even though computers are getting better at writing code, humans are still responsible for developing the algorithms. Human biases get pushed into the code, knowingly or unknowingly. Even algorithms generated by machines come from “intelligent systems” whose training data originates with humans
  • Geography Bias: I have long been arguing on social channels that both AI and genomics are going to suffer significantly from the lack of data from diverse geographical locations. Yes, the world wide web and social media have flattened the world a bit, but there is not enough data from across the world to make training sets representative of the world’s population. Data from many Asian, African, and South American countries is still limited compared to North America and Western Europe
  • Language Bias: Most of the data used in training sets is in English. China is taking the lead on the Chinese language, but many other languages are simply not represented in AI datasets

Biases related to geography and language will result in serious cultural biases, which will have a dramatic impact as AI becomes the underlying framework of daily life.
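To make the representation problem concrete, here is a minimal sketch of how one might audit a training set for geographic skew. All region names, counts, and the 5% threshold are hypothetical, chosen only for illustration: when one group dominates the data, a simple share-per-group check exposes which populations fall below a chosen representation floor.

```python
# Toy illustration of geography/representation bias: when one region
# dominates the training data, aggregate metrics hide poor coverage of
# underrepresented regions. All names and numbers are hypothetical.
from collections import Counter

# Hypothetical training examples tagged with their region of origin
training_regions = (
    ["north_america"] * 700
    + ["western_europe"] * 250
    + ["south_america"] * 30
    + ["africa"] * 20
)

counts = Counter(training_regions)
total = sum(counts.values())

# Share of the dataset contributed by each region
shares = {region: count / total for region, count in counts.items()}

for region, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{region}: {share:.1%}")

# Flag any region falling below an (arbitrary) representation threshold
UNDERREPRESENTED_THRESHOLD = 0.05
flagged = sorted(r for r, s in shares.items() if s < UNDERREPRESENTED_THRESHOLD)
print("Underrepresented regions:", flagged)
```

A check like this only reveals *who* is missing from the data, not how the model behaves on them, but it is a cheap first step before training anything.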

Inducing Bias

Even if we magically remove all bias from data and algorithms, we will face problems from the other side. Bias-free data and algorithms will not represent our world. There are certain biases we need to have in our algorithms for AI to mimic our value systems. One good example is the role of gender in determining punishment for violent crimes. Data clearly shows that men commit more violent crimes than women. In a society without AI, we give men harsher punishments for violent crimes than women as a deterrent. If we remove all bias from AI, it will give the same punishment to both men and women for violent crimes. This is not ideal for society (setting aside advances in neuroscience and medicine that could address some of these issues), so we need to induce a bias against men involved in violent crimes. Many people find this acceptable, and a quick (and very unscientific) Twitter poll I conducted confirms support for biasing AI against men who commit violent crimes.

But let us now consider another scenario, where data shows that a certain group (based on race, religion, or other human-made segregation) commits more violent crimes than other groups (this will vary by country). If I ran a poll on this question, the result would not support inducing bias against certain groups of people. Personally, I would fight against inducing such biases because they go against my value system. But for many others, inducing discriminatory biases may be acceptable, or a necessary evil. Dealing with such situations is not easy. If a wrong bias is induced into AI systems (remember, they are black boxes), the consequences can be unpredictable and, often, devastating.

Inducing bias is important for the very functioning of AI-based societies, but the margin of error is almost zero. We need to figure out how to induce the necessary biases while retaining a way to evolve them out when they are no longer needed. This is not an easy social problem to solve, and it is definitely not an easy technology problem either.

It is time we, as an industry, started thinking about these socio-political issues and figured out how we are going to let AI systems handle them. This situation is not far away: AI systems are already being used to sentence people in judicial systems, and AI is creeping into our lives along many other dimensions. We have to start debating these issues NOW!


Future Asteroid Farmer, Analyst, Modern Enterprise, Startup Dude, Ex-Red Hatter, Rishidot Research, Modern Enterprise Podcast, and a random walker