AI Explainability Explained: When the Black Box Matters and When It Doesn’t

22 July 2025
Key Takeaways:
  • In this Debevoise Data Blog Post, we identify several categories of GenAI decision explainability and attempt to provide a specific name for each (e.g., model explainability vs. process explainability vs. data explainability).
  • Next, we explore which kinds of explainability are knowable for certain common GenAI decisions and which are not, and compare that to human decision-making, which, in some ways, is also largely unknowable.
  • We then argue that, for most GenAI decisions, a level of explainability on par with what is expected with human decision-making is currently achievable.

No one really knows how the large language models (“LLMs”) that power generative AI (“GenAI”) tools like ChatGPT actually come up with their answers to our queries. This is referred to as the “black box” or the “explainability” problem, and it is often given as a reason why GenAI should not be used for making certain kinds of decisions, like who should get a job interview, a mortgage, a loan, insurance, or admission to a college. For those important decisions, the thinking goes, impacted individuals are entitled to know precisely how the decisions were made so they can correct errors, appeal adverse outcomes, or improve their application for future consideration, and that level of explanation is something GenAI models cannot provide. But that thinking rests on two assumptions that merit further discussion: (1) that we don’t know how GenAI decisions are made and (2) that we do know how human decisions are made.

In this Debevoise Data Blog Post, we identify several categories of GenAI decision explainability and attempt to provide a specific name for each (e.g., model explainability vs. process explainability vs. data explainability). Next, we explore which kinds of explainability are knowable for certain common GenAI decisions and which are not, and compare that to human decision-making, which, in some ways, is also largely unknowable. We then argue that, for most GenAI decisions, a level of explainability on par with what is expected with human decision-making is currently achievable.

Why LLM Decision-Making Is Hard to Understand

Decisions made by non-AI algorithms, even very complex ones, can be understood because there is a deterministic “if X, then Y” nature to the way those models work. They follow a predetermined path, and each branch in the decision tree is inspectable, so even though the decision process may be quite complex, with enough time, a reviewer can trace exactly how each specific input yielded a specific output. For these non-AI models, the same input should always lead to the same output.
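To illustrate that traceability, here is a minimal sketch in Python, using invented factors and thresholds purely for illustration, of a rule-based pre-screening check in which every branch can be logged and the same input always produces the same output and the same trace:

```python
def prescreen_application(credit_score, debt_to_income, collateral_value, loan_amount):
    """Deterministic rule-based pre-screen: every branch is inspectable and loggable."""
    trace = []  # records exactly which rule fired, in order

    if credit_score < 620:
        trace.append("credit_score below 620 -> decline")
        return "decline", trace
    trace.append("credit_score acceptable")

    if debt_to_income > 0.43:
        trace.append("debt_to_income above 43% -> decline")
        return "decline", trace
    trace.append("debt_to_income acceptable")

    if collateral_value < loan_amount:
        trace.append("collateral below loan amount -> refer to underwriter")
        return "refer", trace
    trace.append("collateral sufficient")

    return "approve", trace

# The same inputs always yield the same decision and the same trace.
decision, trace = prescreen_application(700, 0.35, 400_000, 300_000)
print(decision)  # approve
print(trace)
```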

LLM decisions, by contrast, are made by deep neural networks with billions of parameters that convert text into high-dimensional vectors and transform those vectors across dozens of hidden layers. The resulting outputs are not predetermined but instead are probabilistic and therefore defy step-by-step tracing. For these neural network systems, the same input will most often result in similar, but not identical, outputs.
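To see why that is, here is a minimal sketch, with made-up token probabilities, of the basic mechanism at work: the model assigns probabilities to candidate next tokens and then samples from them, so the same input can produce different outputs on different runs:

```python
import random

# Hypothetical probabilities an LLM might assign to candidate next tokens
# after a given prompt (values are invented for illustration).
next_token_probs = {
    "approved": 0.40,
    "denied": 0.35,
    "incomplete": 0.15,
    "withdrawn": 0.10,
}

def sample_next_token(probs):
    """Pick the next token at random, weighted by the model's probabilities."""
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# The same input (the same distribution) can yield a different output on each run.
for _ in range(5):
    print(sample_next_token(next_token_probs))
```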

It is helpful to compare our understanding of LLM decision-making to our understanding of how our immune systems function. At a high level, the question “how did you get sick?” can be answered accurately with something like “I was run down, wasn’t sleeping well, and I went to that house party where I must have caught a virus from someone.”

However, at a more granular level, the true answer involves an almost infinitely complex interplay of genes, immune cells, bacteria, viruses, bodily fluids, and environmental factors, the interactions of which are virtually impossible to trace or explain. But a true granular answer isn’t required for you to seek medical attention or for a doctor to make an accurate diagnosis and prescribe an effective treatment. Knowing the true answer might, in some circumstances, make these steps easier, but it is certainly not required.

Categories of Explainability

Before we get into the various kinds of explainability, we should first distinguish explainability from two related but separate concepts: transparency and interpretability.

Transparency vs. Interpretability vs. Explainability

When discussing the risks associated with AI use cases, the term explainability is often conflated with transparency and interpretability. Although there are no widely accepted definitions for these terms, and the official definitions that do exist are often inconsistent with one another, we think it is helpful to separate these three concepts as follows:

  • Transparency: Disclosure to an individual that they are interacting with an AI tool or with content that was generated in whole or in part by AI, so they are not misled into thinking that they are interacting with a human or with content that was generated entirely by a human.
  • Interpretability: For any particular decision made using AI or any AI output, how understandable, meaningful, and helpful is the decision or the output for the intended user?
  • Explainability: For any particular AI decision or AI output, how easy is it for someone to understand the process by which the decision or the output was generated?

For example, we used ChatGPT-4o to generate the cover art for this blog post. To be transparent, we disclose at the end of this blog post that the cover art was generated by AI. To make sure that the final cover art is interpretable, Diane went through several drafts before selecting a version that conveys the desired message to the reader and connects properly to the content of the blog post. But whether generating cover art using GPT-4o is explainable depends on what kind of explainability we are referring to and why it matters.

Five Different Kinds of Explainability

Staying with the cover art example, we can achieve what is referred to as process explainability by disclosing our entire image-generating methodology, including the fact that Diane worked with ChatGPT-4o over the course of 20 minutes on July 16, 2025, to generate the image. We can show the initial prompt that Diane used, the initial results she received, all the subsequent iterations, as well as any changes resulting from Avi’s final review and approval.

We can also achieve what is referred to as design explainability by providing information about the AI model we used. The GPT-4o model is currently considered the best ChatGPT model for image generation because of its native image generation capabilities, its ability to accurately render text within images and follow complex prompts, and the user’s ability to create detailed and realistic images directly within the chat interface.

We would struggle, however, to provide data explainability or model explainability. For data explainability, we know Diane’s inputs, and we have some general sense of the data used to train GPT-4o, which the model draws on when it creates a new image in response to a prompt, but we don’t know what particular data it relied on in creating the cover art.

For model explainability, we know almost nothing about why exactly the model chose that particular font or color scheme for the cover art, or how it settled on the size and positioning of the circles containing the five explainability categories. That would require visibility into the model’s prompt-to-token embeddings, use of transformers, vectorization, attention weights, and token-to-pixel diffusion.
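For a sense of what that visibility would entail, here is a heavily simplified sketch, using invented three-dimensional embeddings and omitting the learned query, key, and value projections of a real transformer, of the attention weights computed inside a single attention step:

```python
import numpy as np

# Toy token embeddings for a three-token prompt (values invented for illustration;
# real models use thousands of dimensions, learned projections, and dozens of layers).
embeddings = np.array([
    [0.2, 0.7, 0.1],   # "generate"
    [0.9, 0.1, 0.3],   # "cover"
    [0.4, 0.5, 0.8],   # "art"
])

def attention_weights(x):
    """Scaled dot-product attention weights: how much each token attends to the others."""
    scores = x @ x.T / np.sqrt(x.shape[1])                 # similarity between token vectors
    exp = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)            # softmax over each row

print(attention_weights(embeddings).round(2))
# Each row sums to 1.0. These internal weights, repeated across every attention head
# and layer, are the kind of quantity model explainability would require tracing.
```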

But does our lack of explainability matter here? For this particular use case, almost certainly not. Rather, what matters is interpretability and transparency. The lack of data explainability or model explainability could, however, matter for a different kind of decision.

When Explainability Matters

Suppose you apply for a mortgage. You might receive a letter in a week stating that your application was denied because your credit score was low, your debt-to-income ratio was high, and the value of your collateral was insufficient. It might also state that your credit score was low because of two delinquent accounts and high credit utilization. In practice, your application was likely processed by an algorithmic (not GenAI) model that gave your application a score and a recommendation. A human underwriter likely reviewed the file and made a decision, either agreeing or disagreeing with the model’s recommendation. The creditor is required by regulation to make a reasonable and good-faith determination that the borrower will be able to repay the mortgage, which is a judgment call.

So, in terms of explainability, the denial letter provides both process explainability and data explainability. You are also entitled to know the most important factors that impacted the decision, which you need so that you can confirm that the key information underlying the decision is accurate and decide whether to appeal it. Knowing those factors also tells you exactly what you can do to improve your application for next time. This is what we call rationale explainability: the identification of the most substantial factors that impacted the outcome or the primary drivers of the decision. That is what is required by the U.S. Federal Equal Credit Opportunity Act (“ECOA”); you are entitled to a statement of the specific reasons why you were denied the mortgage.

What is not required is the equivalent of model explainability: an accounting of all of the factors that were considered in making the decision, how each of those factors was weighed, and how they came together to produce the output. That level of explanation is not really possible in many instances, and it is not necessary.

Model Explainability and the Limits of Human Decision-Making Explainability

As we said at the outset, there are certain aspects of LLM decision-making that are not yet knowable. If you ask the LLM why it gave you a particular output, it will provide an answer, but that answer almost certainly won’t be accurate because the LLM doesn’t actually know how it makes its decisions. The LLM is not conscious of, or self-aware about, its decision-making process. How it makes decisions at a granular level is not in its training materials because that is not something that we fully understand. But is that so different from human decision-making?

When the loan officer made her final decision to deny you credit, did she really know exactly all the factors that went into her decision and how important they all were relative to one another? Did her overall view of the economy and where she thinks it is headed factor into the decision? Does she know exactly why she was skeptical that your side hustle would maintain its current level of income?

Similarly, if asked why a law firm decided to make an offer to a particular law student to join its summer program, each lawyer who interviewed the candidate may give a different explanation, and the hiring committee may not all fully agree as to which factors were determinative. So, the firm cannot provide model explainability for those decisions. For that, they would need to show exactly how much weight, if any, was given to the candidate’s GPA, their particular law school, how many other offers the firm has made to students from that school, their proficiency in Arabic, their time spent at Teach for America, their body language during the interview, and so forth. That is something they cannot do, but, again, they don’t need to do it.

What they can do is provide rationale explainability: every interviewer was impressed by the candidate’s intellectual capability and communication skills, agreed the candidate would be a good cultural fit for the firm, and felt that the candidate’s impressive academic record and prior experience clearly exceeded the standard the firm has for extending an offer.

They can also provide data explainability and process explainability (i.e., what data was considered in making the decision, who considered it, and how the decision was made). That, combined with the rationale explainability, is likely more than sufficient explainability for their summer associate offer process.

None of this is to say that companies could or should use GenAI alone to make hiring decisions. But suppose a well-trained AI resume review tool is used to provide a preliminary score of 1 to 10 for a selection of resumes based on a blend of different factors like grades, previous work experience, and public service. And suppose the machine could tell you the two factors that weighed most heavily in its decision, but it could not tell you exactly how each of the factors was weighed against the others. Is that really any different from what one could reasonably expect from a human who had reviewed and scored the resumes?
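To make that concrete, here is a minimal sketch, with invented factor weights and a simplified scoring scale, of a scorer that produces a preliminary score and reports the two factors that contributed most to it (rationale explainability) without exposing the full internal weighting:

```python
# Hypothetical factor weights (invented for illustration only;
# real resume-screening tools are far more complex than this).
WEIGHTS = {
    "grades": 0.4,
    "work_experience": 0.35,
    "public_service": 0.25,
}

def score_resume(factors):
    """Return a preliminary 1-10 score plus the two factors that contributed most.

    `factors` maps each factor name to a 0-10 rating for this candidate.
    """
    contributions = {name: WEIGHTS[name] * rating for name, rating in factors.items()}
    overall = round(sum(contributions.values()))              # preliminary score
    top_two = sorted(contributions, key=contributions.get, reverse=True)[:2]
    return overall, top_two

candidate = {"grades": 9, "work_experience": 7, "public_service": 4}
score, top_factors = score_resume(candidate)
print(score)        # 7
print(top_factors)  # ['grades', 'work_experience'] -- the rationale, not the full weighting
```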

Appendix – The Five Categories of Explainability


  • Process Explainability: helps users understand the role of AI and the role of humans in creating the content by explaining the various steps taken in the process.
  • Design Explainability: helps users understand the purpose, training, key features, and functionality of the AI model or models used to generate a specific output.
  • Data Explainability: helps users understand the data used to train and operate the model, including where it came from and why it is suitable for a particular task.
  • Model Explainability: requires an understanding of the inner workings of the model, including the specific mechanisms or factors that caused the model to produce a specific output.
  • Rationale Explainability: requires an understanding of the factors that the model weighed most heavily or the primary drivers of a particular AI decision or output.

This publication is for general information purposes only. It is not intended to provide, nor is it to be used as, a substitute for legal advice. In some jurisdictions it may be considered attorney advertising.