
Smart Isn’t Enough: Why AI Needs to Understand Like a Domain Expert

18 June 2025
A blog post by
Jan Vanalphen

How Faktion bridges the gap between cutting-edge AI and domain expertise to turn prototypes into trustworthy, production-ready systems.


In every organisation we work with, we see the same tension playing out. On one side: business teams wrestling with complex challenges, where critical knowledge is fragmented, hard to access, and difficult to transfer. On the other side: AI engineers building powerful models and systems, often without enough visibility into the messy, nuanced context where these systems are supposed to create value. This gap is what drives us at Faktion, where we live at the intersection of domain expertise and cutting-edge AI. Our mission is to turn complex knowledge problems into scalable, intelligent, and intuitive solutions that even business users with no technical AI knowledge can use and benefit from.

We build these not as flashy proofs of concept, but as real systems that solve real problems in the real world.

How LLMs Can Help… and Why They Often Don’t (Yet)

Let me illustrate this with two cases: one at Ferranti, the other at Sweco. Ferranti develops ERP systems for the energy sector. Sweco works on large-scale infrastructure and environmental projects. Both depend heavily on deep domain expertise. In these environments, delivering high-quality service depends entirely on people having timely, accurate knowledge at their fingertips.

And therein lies the problem. Knowledge is scattered across documentation, legacy systems, and most critically, in the heads of a few seasoned experts. Onboarding new staff is slow and expensive. Scaling operations is a challenge. Knowledge management repeatedly shows up as a bottleneck for growth.

But recent progress in AI, and more specifically in large language models (LLMs), offers a way forward. Today's LLMs are remarkably good at parsing large information sources, understanding question context, and surfacing relevant answers. At Faktion, we've taken that a step further by building research agents: autonomous AI systems that break complex queries into subquestions, explore different knowledge sources in parallel, and synthesise their findings into structured, source-backed answers. This isn't keyword search. It's reasoning, like a team of experts collaborating on a shared problem.
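The decompose–explore–synthesise loop can be sketched in a few lines. This is a toy illustration, not Faktion's implementation: the sources, the `decompose` heuristic, and the synthesis step are all hypothetical stand-ins for what would be LLM calls and real retrieval backends.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins for real retrieval backends; a production agent would
# query documentation, ticket history, source code, and so on.
SOURCES = {
    "docs": {"billing": "Invoices are generated nightly per contract."},
    "tickets": {"billing": "Known issue: duplicate invoices on leap days."},
}

def decompose(query: str) -> list[str]:
    # In a real agent an LLM splits the query; here we fake two subquestions.
    return [f"{query} (policy)", f"{query} (known issues)"]

def explore(subquestion: str) -> list[str]:
    # Search every source for topics mentioned in the subquestion.
    hits = []
    for name, index in SOURCES.items():
        for topic, answer in index.items():
            if topic in subquestion.lower():
                hits.append(f"[{name}] {answer}")
    return hits

def research(query: str) -> str:
    subs = decompose(query)
    with ThreadPoolExecutor() as pool:   # explore subquestions in parallel
        findings = pool.map(explore, subs)
    # Deduplicate and merge evidence; the synthesis step is kept trivial here.
    evidence = sorted({hit for hits in findings for hit in hits})
    return "\n".join(evidence)

print(research("How does billing work?"))
```

The essential point is that each subquestion carries its own retrieval context, and every line of the final answer traces back to a named source.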

And yet… despite how promising it sounds, many organisations get stuck in the prototype phase. The technology works, but it never reaches production. Why? Because too many teams rely on raw model capabilities, without aligning them deeply with domain expertise.

LLMs only create real value when they operate at the level of the domain expert. And to get there, we need more than just computation; we need a strategy. Ours is based on three pillars.

The Three Keys to Expert-Level AI Systems

1. Deep understanding of the knowledge base.
We give our models a structured, semantic understanding of documentation—not just chunks and embeddings. We train tools to detect taxonomies, interpret metadata, and segment content by topic, audience, or document type. All of it is reviewed and validated by human experts. The result? A context-aware foundation that LLMs can reason with.

2. Understanding user intent.
What are users actually trying to find out? How do they phrase questions? By clustering and analyzing user intents, we can map real-world queries to the most relevant context, enabling more accurate and consistent answers—tailored to true business needs.
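A real pipeline would embed queries and cluster them (for example, k-means over sentence embeddings); the toy sketch below substitutes a hand-picked keyword map, purely to show how raw query logs collapse into a small set of intents that each map onto a shared answer context:

```python
from collections import defaultdict

# Hypothetical keyword-to-intent map; in practice these clusters are
# learned from embeddings, not hand-written.
INTENT_KEYWORDS = {"invoice": "billing", "refund": "billing",
                   "login": "access", "password": "access"}

def cluster_queries(queries):
    clusters = defaultdict(list)
    for q in queries:
        words = (w.strip("?.,!") for w in q.lower().split())
        intent = next((INTENT_KEYWORDS[w] for w in words
                       if w in INTENT_KEYWORDS), "other")
        clusters[intent].append(q)
    return dict(clusters)

logs = ["Why is my invoice wrong?", "How do I reset my password?",
        "Where is my refund?"]
print(cluster_queries(logs))
```

Once queries are grouped this way, each cluster can be answered against the same validated context, which is what makes the answers consistent across differently phrased questions.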

3. Understanding the system itself.
Our agents don’t just read the knowledge—they learn from the systems behind it. That means interpreting source code, functional logic, and known use cases. Why is this important? Because it allows us to evaluate output not just on relevance, but on its actual correctness within a real operational environment.

Building Trust Through Iteration and Evaluation

Armed with this context, we move into what we call evaluation-driven development. Domain experts continuously validate system outputs. We mine every trace of feedback (user queries, logs, and reviews) and structure those signals using analytics tools. Specialised agents then derive insights that help us refine our test cases and models. This lets us move beyond vague metrics like "answer relevance" and toward domain-specific evaluation standards: criteria defined by what actual experts care about.
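What "domain-specific evaluation standards" might look like in code: a hypothetical rubric of expert-defined checks, each a concrete pass/fail criterion rather than a single fuzzy relevance score. The check names and rules here are invented for illustration.

```python
# Each criterion encodes something a domain expert actually insists on.
CHECKS = {
    "cites_source": lambda ans: "[source:" in ans,
    "names_module": lambda ans: any(m in ans for m in ("billing", "metering")),
    "no_hedging":   lambda ans: "might" not in ans.lower(),
}

def evaluate(answer: str) -> dict[str, bool]:
    # Returns an explicit pass/fail per expert-defined criterion.
    return {name: check(answer) for name, check in CHECKS.items()}

report = evaluate(
    "The billing module reruns failed invoices at 02:00. [source: ops-manual §4]")
print(report)
```

Because every criterion is explicit, a failed evaluation tells you *what* to fix, and the same rubric can be re-run automatically on every model iteration.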

Once our models hit those benchmarks consistently, we use them to set guardrails in production. That means we don't just hope the system works; we know it does. This is the difference between an impressive prototype and a dependable, production-grade AI system.
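As a sketch of what such a guardrail could look like, assume an upstream evaluation step that returns per-criterion booleans: an answer that misses any expert-defined criterion is never shown to the user directly but escalated instead. The function and message below are illustrative, not a real API.

```python
def guarded_reply(answer: str, checks: dict[str, bool]) -> str:
    # Gate the answer on every expert-defined criterion passing.
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        return f"Escalated to a human expert (failed: {', '.join(failed)})"
    return answer

print(guarded_reply("All good. [source: manual]", {"cites_source": True}))
print(guarded_reply("Unverified claim.", {"cites_source": False}))
```

The design choice is deliberate: in production, a blocked answer with a named reason is worth more than a fluent answer that fails a criterion silently.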

Evaluation-Driven Development Loop: Continuous collaboration between developers, domain experts, and AI agents to iteratively refine knowledge, autonomy, and system performance.

The Final Ingredient: User Experience

Even then, one piece remains critical: the experience itself.

Technology only becomes transformative when it's packaged in a way that users trust and understand. That means transparent reasoning, source traceability, intuitive interfaces, and clear feedback loops.

That’s why we embed UX designers, AI engineers, and domain experts together from day one. It’s only through that collaboration that we can build systems that are not only intelligent but also usable.

Because at the end of the day, the real power of AI lies not just in algorithms, but in the symbiosis between human and machine, between business context and technical capability, between scale and quality.

And that’s what we’re building at Faktion.

Jan Vanalphen
Head of Strategy