Unveiling the "Why": Empowering AI to Explain Its Predictions in Clear Language

Human-understandable explanations for machine learning models: A new approach

Researchers are using large language models (LLMs) to build a new kind of explanation method for machine learning models, one that describes a model's behavior in plain language. These explanations aim to help users understand when, and how much, to trust a model's predictions.

The challenge of complex explanations

The explanations produced for machine learning models are often complex and hard to interpret, especially for users without expertise in the field. That complexity is a barrier to building trust in these models.

A new approach: Transforming explanations into plain language

To address this challenge, researchers at MIT have developed a two-part system called EXPLINGO. This system uses LLMs to transform complex explanations into plain language narratives.

How EXPLINGO works

NARRATOR: The first component of EXPLINGO uses an LLM to create narrative descriptions of explanations. Users can customize the style of these narratives by providing examples of the type of explanation they want to see.
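
The article does not reproduce EXPLINGO's actual prompts, but the general idea can be sketched briefly. The Python snippet below is a minimal, hypothetical illustration of how a NARRATOR-style prompt might be assembled from a feature-attribution explanation (such as SHAP values) plus a few user-written example narratives; the function name, field names, and wording are illustrative rather than taken from the research, and the resulting prompt string would be sent to whatever LLM the user prefers.

    from typing import List, Tuple

    def build_narrator_prompt(
        feature_contributions: List[Tuple[str, float]],
        prediction: str,
        example_narratives: List[str],
    ) -> str:
        """Assemble a prompt asking an LLM to narrate a feature-attribution
        explanation in plain language, imitating the style of the
        user-supplied example narratives. (Illustrative only.)"""
        explanation_lines = "\n".join(
            f"- {name}: {value:+.2f}" for name, value in feature_contributions
        )
        examples = "\n\n".join(example_narratives)
        return (
            "You will be given a model prediction and the contribution of each "
            "feature to that prediction. Rewrite this explanation as a short, "
            "plain-language narrative in the same style as the examples.\n\n"
            f"Example narratives:\n{examples}\n\n"
            f"Prediction: {prediction}\n"
            f"Feature contributions:\n{explanation_lines}\n\n"
            "Narrative:"
        )

    # Hypothetical usage: the example narrative below sets the style the
    # LLM is asked to imitate.
    prompt = build_narrator_prompt(
        feature_contributions=[("house size (sq ft)", 0.42), ("year built", -0.15)],
        prediction="estimated price: $310,000",
        example_narratives=[
            "The predicted price is higher mainly because the house is large, "
            "although its older construction pulls the estimate down slightly."
        ],
    )

Providing example narratives rather than style rules is what lets users customize the tone and level of detail without any prompt engineering of their own.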

GRADER: The second component uses an LLM to evaluate the quality of the narrative explanation on four metrics: conciseness, accuracy, completeness, and fluency. Users can customize the weights of these metrics depending on their needs.
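
The article states that GRADER scores a narrative on conciseness, accuracy, completeness, and fluency, with user-adjustable weights, but it does not say how those scores are combined. Below is a minimal sketch of one plausible scheme, a weighted average, in which all names, scores, and weights are purely illustrative.

    from typing import Dict

    # The per-metric scores would come from an LLM judging the narrative
    # against the original explanation (accuracy, completeness, fluency) and
    # from simpler checks such as length (conciseness). Values here are made up.
    def weighted_grade(scores: Dict[str, float], weights: Dict[str, float]) -> float:
        """Combine per-metric scores into one overall grade, letting users
        weight each metric according to their needs."""
        total_weight = sum(weights.values())
        return sum(scores[m] * weights[m] for m in scores) / total_weight

    scores = {"conciseness": 0.9, "accuracy": 1.0, "completeness": 0.7, "fluency": 0.95}
    # A user who cares most about factual correctness might upweight accuracy.
    weights = {"conciseness": 1.0, "accuracy": 3.0, "completeness": 1.0, "fluency": 1.0}
    print(f"overall grade: {weighted_grade(scores, weights):.2f}")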

Benefits of this approach

  • Makes explanations more understandable and accessible to users who lack expertise in machine learning.
  • Allows users to customize the explanations to meet their specific needs.
  • Helps users build trust in machine learning models.

Future work

The researchers hope to expand EXPLINGO by adding rationalization to the explanations and enabling users to ask follow-up questions about the model’s predictions. This could further improve the understanding and usability of machine learning models.