🔆 AI Decisions Exposed: Trading Mystery for Trust
Generative search tools don't give the correct answers, big funding for biomedical breakthroughs, and explaining trained AI models.
🗞️ Issue 64 // ⏱️ Read Time: 7 min
Hello 👋
When an AI system denies your loan application (say adieu to your new apartment), approves a medical treatment (is that second round of antibiotics even good for your body?), or flags content for removal (shadow-ban, anyone?), should you be entitled to know why?
As AI increasingly influences high-stakes decisions in our lives, the divide between explainable and "black box" AI models raises profound questions about transparency, trust, and accountability.
In this week's newsletter
What we’re talking about: The fundamental distinction between explainable AI models with transparent decision-making processes and black box models whose internal workings remain largely opaque, even to their creators. Recent interpretability research is beginning to reveal how modern AI systems actually process information.
How it’s relevant: As AI systems make more consequential decisions in healthcare, finance, hiring, and beyond, the ability (or inability) to understand how these systems reach their conclusions directly impacts user trust, regulatory compliance, and ethical implementation. New interpretability tools are helping researchers detect when models might be fabricating explanations or susceptible to manipulation.
Why it matters: The tension between model performance and explainability presents one of the central challenges in responsible AI deployment. Organizations that effectively balance these competing priorities will be better positioned to build sustainable, trusted AI systems that deliver value while mitigating risks. As AI capabilities advance, understanding these systems becomes even more critical for ensuring safety and alignment with human values.
Big tech news of the week…
🔡 AI search engines are generally bad at declining to answer questions they can’t answer accurately, and generative search tools fabricate links and cite syndicated and copied versions of articles, according to a report from the Tow Center/Columbia Journalism Review. In other words, consider carefully whether you really trust the information you get from search engines like Perplexity.
🩺 Apple is developing an AI-powered health coach and virtual "doctor." This initiative is focused on the Health app with personalized recommendations and insights derived from users' health data. The model is currently being trained on data from staff physicians and specialists across various medical fields.
👋 Joelle Pineau, Meta's head of AI research, has announced her departure from the company. Pineau was instrumental in advancing Meta's AI ethics and transparency efforts and championed the importance of sharing research findings and tools with the broader scientific community. Time will tell how her departure will impact Meta’s AI narrative.
🫠 OpenAI removed the free-tier access to its GPT-4o image generation tool just 24 hours after launch due to a combination of legal, ethical, and technical concerns. The decision followed a viral surge of AI-generated images mimicking the distinctive style of Studio Ghibli. CEO Sam Altman stated that demand was so high it caused their GPUs to "melt."
💰️ Isomorphic Labs, an AI-first drug design and development company, raised $600 million in its first external funding round. They aim to apply their pioneering AI drug design engine to deliver biomedical breakthroughs. Their breakthrough model, AlphaFold 3, was developed and released in May 2024 together with Google DeepMind, with the ability to accurately predict the structure and interactions of all of life’s molecules.
The Main Tension: Performance vs. Explainability
Exciting developments in AI interpretability research are offering unprecedented glimpses into how these systems actually "think," meaning we can begin to understand why one outcome is prioritized over another. This represents a significant leap forward in our ability to understand AI systems that have historically been opaque black boxes.
This week, we're exploring the critical balance between model performance and interpretability that organizations must navigate.
Here's the central tension: The most powerful AI models today, those capable of remarkable feats in language understanding, image recognition, and pattern detection, are often the least explainable.
This creates a fundamental dilemma for organizations: Do you choose a more transparent, explainable model that might offer less impressive performance? Or do you opt for a state-of-the-art black box model that delivers superior results but can't explain its reasoning? Let’s start from scratch and look at the main differences between explainable models and non-explainable models.
Explainable vs. Non-Explainable AI: Back to Basics
Not all AI is created equal when it comes to transparency. To understand the landscape, we need to distinguish between two broad categories: Explainable AI Models and Black Box Models. The distinction isn't binary. There's a spectrum of interpretability, with trade-offs at each point. Generally, as model complexity and performance increase, explainability tends to decrease.
Explainable AI Models
These are systems whose decision-making processes can be understood, interpreted, and explained in human terms. See the table below for some examples of explainable models.
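To make "explainable" concrete, here is a minimal sketch using scikit-learn: a shallow decision tree on a toy loan-approval dataset (the features and numbers are invented for illustration, not drawn from any real lender). The point is that the model's entire decision logic can be printed as readable rules.

```python
# A minimal sketch of an explainable model: a shallow decision tree
# whose full decision logic can be printed as human-readable rules.
# The "loan approval" data below is invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training examples: [income_k, debt_ratio, years_employed]
X = [
    [45, 0.40, 1],
    [85, 0.20, 6],
    [30, 0.55, 0],
    [95, 0.15, 10],
    [60, 0.35, 3],
    [25, 0.60, 1],
]
y = [0, 1, 0, 1, 1, 0]  # 1 = approved, 0 = denied

# A shallow tree keeps the rule set short enough for a human to audit.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X, y)

# Print the learned rules as plain if/else statements.
print(export_text(model, feature_names=["income_k", "debt_ratio", "years_employed"]))
```

Every prediction such a model makes can be traced back to a specific branch of those printed rules, which is exactly the property black box models lack.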

How do you balance the trade-off between model complexity and explainability in your AI applications?
Black Box AI Models
Black box models are complex trained systems, most often deep neural networks, whose internal workings remain largely opaque, even to their creators.
When we talk about training a machine learning model, we're describing a process where the model develops its own ability to make decisions by examining many examples, rather than following explicitly coded rules.
Here's what happens during training (a minimal code sketch follows these steps):
Starting point: A machine learning model begins with randomly initialized parameters (weights and biases) that connect its inputs to outputs. These parameters are essentially the "knobs" that determine how the model processes information.
Learning from examples: The model is shown training data - pairs of inputs and their correct outputs. For example, images of handwritten digits and their corresponding labels (0-9).
Making predictions: Using its current parameters, the model makes predictions on these inputs.
Measuring error: The model compares its predictions to the correct answers and calculates how far off it was (the loss or error).
Adjusting parameters: Through optimization algorithms like gradient descent, the model automatically adjusts its internal parameters to reduce this error. It shifts weights up or down to improve future predictions.
Iteration: This process repeats thousands or millions of times, with the model gradually improving its predictions by fine-tuning its parameters.
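The whole loop above fits in a few lines of code. Here is a minimal sketch in plain NumPy, using a toy linear model and made-up data (not any particular production system), that walks through the same steps: random initialization, prediction, measuring error, adjusting parameters, and iterating.

```python
# A toy training loop mirroring the steps above: random start, predict,
# measure error, adjust parameters with gradient descent, repeat.
import numpy as np

rng = np.random.default_rng(0)

# Training examples: inputs X and their "correct answers" y (made up for illustration).
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# 1. Starting point: randomly initialized parameters (the "knobs").
w = rng.normal(size=3)
learning_rate = 0.1

for step in range(500):
    # 2-3. Making predictions with the current parameters.
    predictions = X @ w

    # 4. Measuring error: mean squared error between predictions and correct answers.
    error = predictions - y
    loss = np.mean(error ** 2)

    # 5. Adjusting parameters: nudge each weight against the gradient of the loss.
    gradient = 2 * X.T @ error / len(y)
    w -= learning_rate * gradient

# 6. Iteration: after many repeats, the learned weights approach the target ones.
print("learned weights:", w.round(2), "| final loss:", round(float(loss), 4))
```

With three weights you can read the result directly; a modern language model runs the same basic loop over billions of parameters, which is where the opacity comes from.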

The Transparency Dilemma
Looking at these two tables above, you might have made the connection already: The most powerful AI models are often the least explainable. So when should you choose which model, given the trade-off between understandability and performance?
The answer depends on context:
High-risk applications in healthcare, criminal justice, or financial services may legally require explainability
Consumer applications might prioritize performance over transparency when the stakes are lower
Regulated industries face external constraints that may limit their options regardless of performance considerations
Recent breakthroughs in AI interpretability research are beginning to challenge this paradigm. For example, Anthropic researchers have made progress in peering inside large language models like Claude to understand their internal mechanisms. They've developed techniques that reveal surprising insights, like the fact that multilingual models operate in a shared conceptual space across languages, that models plan rhymes in poetry many words ahead, and that models sometimes make up plausible-sounding but false reasoning.
Check out our previous newsletter where we explore practical techniques for understanding AI's decision-making processes.
Why Explainability Matters
Explainability isn't just a technical consideration: It's fundamental to responsible AI implementation for several critical reasons. We’re listing seven of these reasons here:
Trust and adoption: Users are more likely to trust and accept AI systems when they understand how decisions are made. In a recent industry survey, 65% of potential AI users cited "inability to understand how the system works" as their primary concern.
Detecting and mitigating bias: Transparent models make it easier to identify when AI systems are reproducing or amplifying existing social biases, for example by inspecting which features a model actually relies on (see the sketch after this list). Explainable AI helps ensure fairness across different demographic groups.
Regulatory compliance: Emerging regulations like the EU AI Act and existing frameworks like GDPR increasingly require that high-risk AI systems provide explanations for their decisions.
Debugging and improvement: When models make mistakes, understanding why is essential for fixing them. Opaque models make troubleshooting nearly impossible.
Scientific advancement: Interpretable models can reveal new insights about the domains they're applied to, contributing to knowledge discovery rather than just making predictions.
Alignment with human values: As models become more powerful, ensuring they act in ways aligned with human intentions becomes crucial. Understanding model behavior helps confirm it's pursuing the objectives we actually want.
Detecting fabricated reasoning: New research shows that AI models sometimes provide plausible-sounding but made-up rationales for their conclusions. Interpretability tools can help distinguish when an AI is genuinely reasoning versus "bullshitting," as one research paper colorfully puts it.
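Two of the reasons above, detecting bias and debugging, are the easiest to show in code. The sketch below (scikit-learn on made-up data, with an invented zip_code_group proxy feature) uses permutation importance, one common inspection technique, to reveal when a model is leaning on a feature it probably shouldn't.

```python
# A sketch of bias detection / debugging via inspection: measure which
# features a trained model actually relies on. The data and the
# zip_code_group proxy feature are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500

income = rng.normal(60, 15, n)
zip_code_group = rng.integers(0, 2, n)  # a proxy attribute, not a merit signal
# The outcome leaks the proxy: approvals depend on zip_code_group, not just income.
approved = ((income > 55) | (zip_code_group == 1)).astype(int)

X = np.column_stack([income, zip_code_group])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, approved)

# Permutation importance: how much does performance drop when each feature is shuffled?
result = permutation_importance(model, X, approved, n_repeats=10, random_state=0)
for name, score in zip(["income", "zip_code_group"], result.importances_mean):
    print(f"{name}: {score:.3f}")
# A large score for zip_code_group is a red flag worth investigating.
```

The same kind of inspection helps with debugging: if a model's errors trace back to one surprising feature, you know where to look first.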
Real-World Applications: Where Explainability Counts
Different domains have different explainability requirements. Here's how various sectors are navigating the transparency challenge:
Financial Services 💹
Banks using AI for credit decisions must generally provide reasons for denials. This has led to a two-tier approach: using complex models for initial analysis, then converting insights into more explainable models for final decisions and customer explanations.
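One common way to implement this two-tier idea is a global surrogate: fit an interpretable model to the black box's own predictions so its behaviour can be summarised in readable rules. The sketch below (scikit-learn, with toy data and feature names invented for illustration) shows the general pattern; it is not any particular bank's pipeline.

```python
# A sketch of the two-tier pattern: a complex model for predictive power,
# plus an interpretable surrogate trained to mimic it for explanations.
# Data and feature names are invented for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 3))  # stand-ins for income, debt ratio, credit history length
y = (X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=0.3, size=n) > 0).astype(int)

# Tier 1: a high-performing but opaque model used for the initial analysis.
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Tier 2: a shallow tree trained to imitate the black box's decisions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Report how faithfully the surrogate mimics the black box.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate agrees with the black box on {fidelity:.0%} of cases")

# The surrogate's rules are short enough to share with customers and regulators.
print(export_text(surrogate, feature_names=["income", "debt_ratio", "history_len"]))
```

Because the surrogate is only an approximation, its fidelity to the black box should be reported alongside any explanation it produces.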
Healthcare 🏥
Medical AI exists on a spectrum. Diagnostic support tools generally require high explainability so doctors can verify the reasoning, while back-office applications like scheduling optimization might accept less transparency in exchange for efficiency.
Content Moderation 👮🏾
Social media platforms face unique challenges with scale. While users demand transparency about why content was removed, explaining every automated moderation decision at scale remains technically challenging. Do you want to know more about content moderation? Read this!
Autonomous Systems 🚕
Self-driving vehicles use a mix of explainable and black-box approaches, with critical safety functions typically requiring more transparent algorithms that can be verified.
Breakthroughs in AI Interpretability
Recent research from Anthropic (the company behind the LLM Claude) reveals fascinating insights into how large language models actually "think." By developing techniques to visualize the internal circuits of these models, researchers discovered:
Models plan ahead: When writing poetry, Claude plans rhymes many words in advance and constructs lines to reach those planned words.
Models use multiple languages simultaneously: When answering questions in different languages, Claude activates the same core conceptual features regardless of the language, suggesting a universal "language of thought."
Models sometimes fabricate explanations: When asked to compute difficult mathematical problems, Claude can produce plausible-sounding reasoning that doesn't match its actual internal calculations.
Models use unexpected strategies: For simple addition problems, Claude employs parallel computational paths that work simultaneously - one for rough approximation and another for precise calculation of the final digit.
The Path Forward
As AI becomes increasingly woven into the fabric of society, the tension between performance and explainability will only grow more pronounced. Organizations that proactively address this challenge, selecting the right level of transparency for each context and investing in techniques to enhance explainability, will be better positioned to build sustainable, trusted AI systems.
The fundamental question isn't whether explainability matters, but rather how we balance it against other priorities in different contexts. By being thoughtful about these trade-offs, we can harness AI's potential while maintaining the transparency and accountability essential to responsible deployment.
Until next time.
On behalf of Team Lumiera
Lumiera has gathered the brightest people from the technology and policy sectors to give you top-quality advice so you can navigate the new AI Era.
Follow the carefully curated Lumiera podcast playlist to stay informed and challenged on all things AI.
What did you think of today's newsletter?

Disclaimer: Lumiera is not a registered investment, legal, or tax advisor, or a broker/dealer. All investment/financial opinions expressed by Lumiera and its authors are for informational purposes only, and do not constitute or imply an endorsement of any third party's products or services. Information was obtained from third-party sources, which we believe to be reliable but not guaranteed for accuracy or completeness.