The Agent Lab.
We build AI agents. In the safest sandboxes for regulated industries.
Vertical agents that run in real operations. Auditable, evaluated before go-live, built for real bottlenecks at health insurers, other insurers, and banks.
Where AI actually delivers and where the effort isn't worth it.
In regulated industries (insurance, banking, public health funds) complexity grows faster than your team can. More cases, data, and compliance pressure. The first reflex is to plug in AI somewhere. The better path is to figure out together where AI sustainably delivers results and where a script, a process change, or no automation at all is the right answer. Not every customer knows up front where the real problem sits. That's where we start.
Frontier Research
Custom development at the state of the art.
The Agent Lab - Vertical agents as a finished product.
What fits a packaged product comes from The Agent Lab. Production-ready vertical agents from claims processing to dispute hearings to product data maintenance.
What changes when AI agents are deployed in the real world?
Only 14% of agent pilots reach production (Gartner 2026). The gap is rarely the model; it is data quality, backlog pressure, workflow complexity, and audit readiness. Here are the four bottlenecks where Agent Lab delivers.
Heterogeneous input formats
Problem
Operations staff translate between formats and systems.
Entry errors.
Time lost before the actual work begins.
Solution
Emails, PDFs, scans, and voice messages are normalized.
Fed directly into the case workflow.
Growing compliance complexity
Problem
The EU sets binding rules for how AI systems must behave.
And hard deadlines by which compliance must be proven.
Additional documentation requirements.
Solution
Discover our compliance assistant
Complex case volumes no team can oversee
Problem
Long processing times.
Partial decisions on incomplete context.
Knowledge loss at handovers.
Solution
Full case context (records, policies, history) is evaluated.
The agent proposes a decision for the caseworker to approve.
Poor data quality & inconsistency
Problem
Wrong decisions and weak customer experience.
Constant manual corrections.
Growing compliance risk.
Solution
Data is normalized, validated, and classified in the workflow.
Before it flows into core operations.
When does an agent make sense for your unit?
Four questions we clarify at the start:
More than 1 month of processing backlog?
Stable process (no major overhaul planned in 12 months)?
Can you define today what "correctly processed" means?
Is there an owner in Operations (not IT) who will drive the rollout?
3 of 4 with "Yes" → the agent pays off. We clarify the exact fit in the ellaverse workshop.
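The four-question check above can be sketched as a small scoring function. This is only an illustration: the field names and the `FitAnswers` type are hypothetical, and reading "3 of 4" as "at least three" is an assumption, not something the checklist states.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class FitAnswers:
    """Answers to the four kickoff questions (hypothetical field names)."""
    backlog_over_one_month: bool
    process_stable_12_months: bool
    correctness_definable_today: bool
    operations_owner_named: bool


def yes_count(answers: FitAnswers) -> int:
    """Count how many of the four questions were answered with "Yes"."""
    return sum(astuple(answers))


def agent_pays_off(answers: FitAnswers, threshold: int = 3) -> bool:
    """Assumption: "3 of 4" is read as "at least 3 of 4"."""
    return yes_count(answers) >= threshold
```

A unit with a backlog, a stable process, and a clear definition of "correctly processed" but no owner in Operations would still pass under this reading, which is where the workshop would look more closely.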
Start with a workshop
Where Agent Lab works today.
GOÄ-based claim review
End-to-end review of medical claims against the German GOÄ fee schedule, no human in the loop.
- Visual verification against the original PDF scan instead of blind trust in OCR
- Complex GOÄ exclusion and capping rules (e.g. code 70 excludes codes 1 and 2 on the same day)
- Full reasoning traces for every individual decision
Case correctness more than doubled, from 40% to 93%, on a 30-claim evaluation. Same model, now with structured domain instructions.
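The same-day exclusion rule from the list above (code 70 excludes codes 1 and 2) can be sketched as a deterministic check. This is a minimal illustration, not the agent's actual implementation: `LineItem`, `EXCLUSIONS`, and `find_exclusion_conflicts` are hypothetical names, and the rule table contains only the single example given in the text. The sketch only reports conflicts; which line is then rejected or capped is a domain decision it does not model.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LineItem:
    """One billed position on a claim (hypothetical structure)."""
    code: str
    service_date: date


# Only the example from the text: code 70 excludes codes 1 and 2 on the same day.
EXCLUSIONS = {"70": frozenset({"1", "2"})}


def find_exclusion_conflicts(items, exclusions=EXCLUSIONS):
    """Return (day, excluding_code, excluded_code) for every same-day conflict."""
    codes_by_day = defaultdict(set)
    for item in items:
        codes_by_day[item.service_date].add(item.code)

    conflicts = []
    for day in sorted(codes_by_day):
        codes = codes_by_day[day]
        for excluding, excluded in sorted(exclusions.items()):
            if excluding in codes:
                # Every excluded code billed on the same day is a conflict.
                for code in sorted(codes & excluded):
                    conflicts.append((day, excluding, code))
    return conflicts
```

Grouping by service date first keeps the rule "same day" explicit: code 1 billed on a different day than code 70 is not flagged.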
Three vertical agents with an anchor customer
Three vertical agents in joint development with an anchor customer from GKV (German statutory health insurance). Prototypes are running; production go-live is planned for the coming months.
- Hearing procedures: Document analysis and automated responses for disputes with hospitals
- Self-payer claims: Contribution calculation and notice generation
- WUV automation: Efficiency and inefficiency review procedures
Concrete output numbers will be published together with the customer once production goes live.
Multimodal product-listing remediation
Key account managers face supplier catalogs with missing attributes: material, dimensions, GPSR mandatory disclosures. The information exists, but it is buried in manufacturer texts and product images.
- Several hundred SKUs blocked from sale per mid-sized vendor portfolio
- Several hundred more products held back by visibility defects
- Dozens fail GPSR mandatory disclosures (new EU rule since Dec 2024)
Manufacturer texts and product images are evaluated multimodally; missing attributes are extracted and normalized before the KAM ever steps in.
What our agents are built on.
We're not a sales funnel with an AI wrapper. We do the research that EU and federal programs fund and put the same engineering depth into every agent we ship.
Every agent ships through ellarun.
AI Act Art. 12 audit trail, credential brokering, policy enforcement. The video shows how an Agent Lab agent runs securely inside an ellarun sandbox.
Where does your organization stand on production AI? Find out in 2 minutes.
Take the AI readiness assessment
EU AI Act ready
Full audit trails, compliance documentation, and evidence-based reporting. Ready for August 2026.
Open-source foundation
Built on NVIDIA OpenShell and open standards. No vendor lock-in, no black boxes.
Made in Germany
German company, German data centers. Your data never leaves the EU. Built and operated under European data protection standards.
Model agnostic
Works with Claude, GPT, Mistral, Llama, or your own models. Switch providers anytime without retooling your workflows or losing evaluation history.
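One common way to get this kind of provider independence is a thin interface between the workflow and the model. The sketch below is a generic illustration under that assumption, not a description of how ellamind implements it: `ChatModel`, `EchoModel`, and `Workflow` are hypothetical names, and `EchoModel` stands in for a real provider adapter.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface every provider adapter would satisfy (hypothetical)."""
    name: str

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in provider for this sketch; a real adapter would call a vendor SDK."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Workflow:
    """Workflow logic depends only on ChatModel, never on a concrete provider."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model
        # Evaluation history lives in the workflow, so it survives a provider switch.
        self.history: list[tuple[str, str, str]] = []

    def run(self, prompt: str) -> str:
        output = self.model.complete(prompt)
        self.history.append((self.model.name, prompt, output))
        return output

    def switch_model(self, model: ChatModel) -> None:
        self.model = model
```

Because the history is keyed by provider name and stored outside the adapter, runs made before and after a switch stay comparable.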
Careers at ellamind
We're hiring people who want to build AI that stands up to reality: regulation, scale, and responsibility. Open positions available across engineering, AI, product, and sales.
Most asked questions
Find answers to frequently asked questions. If you can't find your question here, feel free to contact us.
What is The Agent Lab?
What kinds of agents has ellamind already built?
How does an engagement with ellamind start?
Can your agents help with EU AI Act and other compliance?
Do I need technical expertise to work with ellamind agents?
Unlock the power of AI
See how our products can help you evaluate, deploy, and monitor AI agents with confidence.