IIT-H researchers develop method to understand what goes on inside AI programs

The DL algorithms are trained on a limited amount of data that are most often different from real-world data
The IIT Hyderabad team (Pic: IIT-H)

Researchers at the Indian Institute of Technology (IIT) Hyderabad have developed a method by which the inner workings of Artificial Intelligence (AI) models can be understood in terms of causal attributes.

'Artificial Neural Networks' (ANN) are AI models and programs that mimic the working of the human brain so that machines can learn to make decisions in a more human-like manner.

Modern ANNs, often also called Deep Learning (DL), have increased tremendously in complexity such that machines can train themselves to process and learn from data that has been supplied to them as input, and almost match human performance in many tasks.

However, how they arrive at their decisions is not known, which makes them less useful in situations where the reasoning behind a decision matters.

The work was carried out by Dr Vineeth N Balasubramanian, Associate Professor, Department of Computer Science and Engineering, IIT Hyderabad, and his students Aditya Chattopadhyay, Piyushi Manupriya, and Anirban Sarkar.

Their work has recently been published in the Proceedings of the 36th International Conference on Machine Learning, considered worldwide to be one of the highest-rated conferences in the area of Artificial Intelligence and Machine Learning.

Speaking about this research, Dr Balasubramanian said, "The simplest applications that we know of Deep Learning (DL) is in machine translation, speech recognition or face detection. It enables voice-based control in consumer devices such as phones, tablets, television sets and hands-free speakers. New algorithms are being used in a variety of disciplines including engineering, finance, artificial perception and control and simulation. Much as the achievements have wowed everyone, there are challenges to be met."

A key bottleneck in accepting such Deep Learning models in real-life applications, especially risk-sensitive ones, is the 'interpretability problem'.

"The DL models, because of their complexity and multiple layers, become virtual black boxes that cannot be deciphered easily. Thus, when a problem arises in the running of the DL algorithm, troubleshooting becomes difficult, if not impossible," said the professor.

The DL algorithms are trained on a limited amount of data that are most often different from real-world data. Furthermore, human error during training and unnecessary correlations in the data can result in errors that must be corrected, which is hard to do. "If treated as black boxes, there is no way of knowing whether the model actually learned a concept or a high accuracy was just fortuitous," added Dr Balasubramanian.
