Mitigating AI Bias With Open Source?

Krish
Published in AI Sutra
3 min read · Oct 2, 2019

As AI enters society at full speed, we face the problem of AI bias. Human society is full of biases, and marginalized groups bear the brunt of them. As we move towards an AI-driven society, where the impact grows by orders of magnitude, it is critical to ensure that these systems carry little or no bias. AI bias can creep in through multiple factors, but the most important ones are:

  • Algorithms: The unfortunate legacy of past social ills is that there is very little diversity among the people building these AI algorithms (and this goes beyond just the coders). This lack of diversity is a definite source of AI bias, and it is not limited to the well-known social biases we are already pushing back against. The potential for bias extends beyond the top checklist items, rooted in the lack of diversity among algorithm builders
  • Data: The data we feed to train AI models comes from the real world, and it carries a representation of the biases that exist there. As this data trains the models, the bias creeps in with it, as some publicly available examples have shown
  • Access: The third important contributor to bias is access to AI or, rather, the lack thereof. There is a clear inequality in access to AI, driven by the uneven availability of talent and data. This is another major source of AI bias, because it leaves many social and geographical groups unrepresented
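
The data point above can be illustrated with a minimal, entirely hypothetical sketch: a naive "model" that simply memorizes approval rates from skewed historical decisions will reproduce that skew in its predictions. The groups, records, and threshold here are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical historical loan decisions, skewed against group "B".
history = [
    ("A", 1), ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0), ("B", 0), ("B", 1),
]

def train(records):
    """'Train' by memorizing the historical approval rate per group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [approvals, count]
    for group, label in records:
        totals[group][0] += label
        totals[group][1] += 1
    return {g: approvals / count for g, (approvals, count) in totals.items()}

def predict(model, group):
    """Approve when the learned group rate exceeds 0.5."""
    return 1 if model[group] > 0.5 else 0

model = train(history)
print(predict(model, "A"))  # the historical skew carries straight through
print(predict(model, "B"))
```

Real models are far more sophisticated, but the mechanism is the same: if the training data encodes a disparity, nothing in the training loop will remove it by itself.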

Many people suggest open source as an antidote to AI bias. Most of their worries stem from the fact that, with proprietary algorithms, we have very little clue how the systems work. While I agree that open source can have some impact on reducing AI bias, that impact is quite limited. Let me explain why.

The problem with deep neural networks is not the logic embedded in the code per se; it is how data is used to create the models. There is limited visibility into how the application code generates these models: even with the source code available, we cannot predict how the models will be generated or how the trained models will evolve. I worry that focusing on the license might give a false sense of confidence and create blind spots that let bias through. We need to think beyond source code to ensure the transparency that will limit bias entering through either the algorithm or the data.
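
A toy sketch makes the point concrete: the training code below is fully open and trivially auditable, yet its behavior is determined entirely by the data it is fed. The datasets and threshold are hypothetical, chosen only to show the contrast.

```python
def train_threshold(samples):
    """A deliberately trivial 'model': the mean of the training data,
    later used as a decision value. The code is fully transparent."""
    return sum(samples) / len(samples)

# Two hypothetical datasets fed to the exact same open code.
data_a = [1.0, 2.0, 3.0]
data_b = [10.0, 20.0, 30.0]

model_a = train_threshold(data_a)  # 2.0
model_b = train_threshold(data_b)  # 20.0

# Same audited code, very different behavior against the same cutoff:
print(model_a < 5.0, model_b < 5.0)  # the data, not the code, decides
```

Reading `train_threshold` tells an auditor nothing about which of these two models they will actually get; only the data does.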

However, open source can play a significant role in reducing the bias that comes from lack of access. Platforms like TensorFlow and other open source machine learning / deep learning / AI frameworks make the technology available to people all over the world. This ease of availability can help mitigate some of the bias, because diverse groups can take part in creating AI models. Open source licenses level the playing field, making AI available to more people in different parts of the world. But there is no evidence that this alone will reduce bias, since we have little clue about the net impact of the various social biases from across the globe on AI.

In short, I want to make the case that combating AI bias is a complex problem, and we should not approach it with the tunnel vision of open source as a cure. We need to think outside the box and borrow solutions from other fields of study. The war against AI bias must account for the fact that the underlying landscape is complex, and simple solutions that worked in the previous IT era may not be enough.


Future Asteroid Farmer, Analyst, Modern Enterprise, Startup Dude, Ex-Red Hatter, Rishidot Research, Modern Enterprise Podcast, and a random walker