There is general agreement that artificial intelligence (AI), data science and analytics have the potential to change the world. AI is already transforming our society, and although I see tremendous potential in AI, there are pitfalls and important ethical implications that need to be considered.
Science and ethics: Shades of grey
Just because something is possible does not mean that it is right or acceptable. There are many shades of grey involved. Acceptability will vary with the precise use of the technology, and also with the time and place: what is acceptable now might not have been so ten years ago. Different people also have very different views. Ethics is not a science, and there are no ‘right’ answers, only opinions.
It is important to remember that even though many AI systems are, by definition, self-learning, they still cannot genuinely think. They have no notion of right or wrong or of good behavior, and they know nothing about ethics and values. It is up to us, as humans, to teach them.
But how do you teach an algorithm ethics and values? A good start is to consider diversity in the team training and developing these algorithms. Diversity in terms of, for example, gender, ethnicity, age, background and skills increases the chance that different aspects, perspectives and values are reflected in the data, and thereby in the systems learning from that data.
Bias is the real AI danger
A diverse team is also key to avoiding and detecting bias, discriminatory views and prejudiced opinions reflected in the data the systems learn from. Google's head of AI, John Giannandrea, recently said: “forget killer robots — bias is the real AI danger”. I certainly agree with him. AI systems learn from data; it all starts with data, and everything they know comes from data. If we feed them biased data, they will be biased. Considering that these systems can be used to make millions of decisions every minute, it is easy to see what damage a biased decision could do in a very short time, for instance by amplifying discrimination. Awareness of this is crucial, and data scientists have a very important task here in ensuring data quality.
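To make the point concrete, here is a minimal, purely illustrative Python sketch. The "historical hiring" records and the naive per-group model are invented for this example; the point is simply that a model trained on biased data reproduces the bias.

```python
from collections import Counter

# Hypothetical historical hiring records: (group, qualified, hired).
# Group "B" candidates were hired less often than equally qualified
# group "A" candidates -- this is the bias the model will learn.
history = [
    ("A", True, True), ("A", True, True), ("A", True, True),
    ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, False),
    ("B", False, False),
]

def train(records):
    """Learn, per group, the observed hiring rate for qualified candidates."""
    hired = Counter()
    seen = Counter()
    for group, qualified, was_hired in records:
        if qualified:
            seen[group] += 1
            hired[group] += was_hired
    return {g: hired[g] / seen[g] for g in seen}

model = train(history)
print(model)  # group "A": 1.0, group "B": ~0.33

# Applied to millions of new, equally qualified candidates, the model
# simply replays the historical discrimination -- now at scale.
```

Nothing in the training step is malicious; the discrimination lives entirely in the data, which is why data quality and awareness of bias matter so much.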
Interpretation is a challenge
The machine learning models behind AI systems are, by definition, very complex and not good at revealing their secrets. They are often a black box, where the output is extremely difficult to explain because of the nature of the algorithm. When using these types of models, you always need to consider what matters most to you: being able to understand and explain the output, or taking advantage of the incredible precision they offer. Some situations are simply not suitable for black-box models.
The EU's upcoming General Data Protection Regulation (GDPR), for instance, will require companies and organizations to be able to explain their algorithmic decisions to the customer. This limits the way black-box models can be used in certain situations. By ensuring that people have a right to understand the reasoning behind important decisions affecting them, the GDPR is staking out a clear position in ethical territory. EU legislators clearly wanted to be certain that future decisions would be both transparent and ‘fair’ to individuals.
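To illustrate what such an explainable decision could look like, here is a small Python sketch of a transparent linear scoring model. The features, weights and threshold are entirely invented; the point is that an interpretable model can state *why* it declined an application, which a black-box model generally cannot.

```python
# Illustrative only: features, weights and threshold are made up.
WEIGHTS = {"income": 2.0, "existing_debt": -3.0, "years_employed": 1.0}
THRESHOLD = 4.0

def decide(applicant):
    """Return an approve/decline decision plus per-feature reasoning."""
    contributions = {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}
    score = sum(contributions.values())
    decision = "approved" if score >= THRESHOLD else "declined"
    # Each feature's contribution is the explanation we can hand the customer.
    return decision, score, contributions

decision, score, why = decide(
    {"income": 3.0, "existing_debt": 1.5, "years_employed": 2.0}
)
print(decision, score)  # declined 3.5
print(why)  # income: 6.0, existing_debt: -4.5, years_employed: 2.0
```

Here the customer can be told that high existing debt pulled the score below the threshold. The price of this transparency is, as discussed above, that such simple models often cannot match the precision of black-box ones.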
Will there be more regulation in the future? I think so. I suspect that the GDPR has set the tone for the future by emphasizing individual control of personal data. The pendulum has swung away from organizations and towards individuals, and it will not swing back for some time, if at all. Much is possible in data science and analytics, but what is ‘right’ remains untested and, in many cases, undiscussed. Organizations like the Alan Turing Institute are beginning to talk about detailed issues around ethics, to enable data scientists to set out some broad principles. But individual data scientists and organizations using analytics techniques are often left to pick their way through the minefield alone, making decisions case by case.
Educating future data scientists
To my mind, this suggests that it is important to include ethics elements in data science courses. Life science researchers and medical students need to understand the ethics of working with humans or animals. Data science students need to understand the implications of what they do for individuals and organizations. The earlier that ethical questions are raised, the more time is available for discussion and thought.
I am not alone in thinking that data science students should be formally taught about ethics. A poll by the website KDnuggets back in 2015 found that over three quarters (76%) of the 324 respondents felt that data science courses should include ethics training. Of course, any website poll has issues, not least that only those who feel strongly are likely to respond, but I think the numbers are telling.
It was interesting, though, that responses varied considerably by location. In Asia, 90% thought including ethics was important, and only 5% said no. In Europe, by contrast, only 64% were positive about including ethical education, with 26% opposed. In Africa, opinion was equally divided between yes and no, with 45% for each, and 10% saying ‘don’t know’.
Do European data scientists place less value on ethics, or do they simply think that working ethically is a matter of common sense that does not need to be taught? I suspect this probably simply shows once again that time and place are an essential element of any ethical discussion, and that opinions will vary. It also shows, however, that it is important to debate these questions, and where better than on data science courses?
With raised awareness of ethical AI, I believe that man (read: data scientist) and machine will make an unbeatable team, able to transform the world and fulfil the promise of AI.