What Is It
On February 24, 2026, Oura announced its first proprietary large language model, designed for women’s health. It lives inside Oura Advisor, the existing AI chatbot in the Oura app, and supports questions spanning the full reproductive health spectrum, from early menstrual cycles through menopause.
The key distinction: with this proprietary model, Oura no longer relies on LLMs from Anthropic or OpenAI for this use case; it draws on its own foundation model instead.
When a user asks Oura Advisor a women’s health question, the model references its curated research and knowledge sources while also analyzing the user’s relevant biometric signals across sleep, activity, cycle, pregnancy data, and stress. The model has been intentionally designed to be non-dismissive, reassuring, and emotionally supportive — a deliberate tone choice for the sensitivity of the domain.
Why Build It
The general-purpose LLM gap in women’s health is well-documented. A dedicated women’s health study by Lumos and academic partners tested 13 large language models across 96 women’s health scenarios and found a 60% failure rate. General-purpose models routinely miss the nuance of hormonal health, cycle-phase context, and reproductive complexity. As Oura’s clinical AI lead Dr. Jayaraman put it: the model interprets questions through the lens of what’s happening in that person’s body, not just a generic symptom search.
Their fastest-growing segment demands it. Oura’s fastest-growing user segment isn’t fitness enthusiasts; it’s women in their early twenties. Building a domain-specific model is a retention and differentiation play for this demographic.
Privacy as a competitive moat. Reproductive health data carries higher privacy stakes than general wellness data. Most consumer health apps fall outside HIPAA, and 78% of FemTech apps failed GDPR consent audits. In the post-Dobbs landscape (following the U.S. Supreme Court’s 2022 decision overturning federal abortion protections), reproductive health data carries legal risk beyond privacy compliance. Because Oura owns the model end-to-end and hosts it on its own infrastructure, conversations are never sold, shared, or used to train public or third-party AI systems. This removes the third-party LLM provider entirely as a data-handling risk vector.
How It’s Built
Partnership with webAI
Oura’s clinical and technical staff developed the women’s health model, leveraging knowledge-graph technology from webAI, its key technical partner.
webAI’s Knowledge Graph RAG combines retrieval-augmented generation with a structured knowledge graph, blending vision and language models directly within a proprietary graph. Rather than treating documents as linear text or separate image snippets, this graph-driven approach creates rich, high-dimensional relationships between textual and visual entities. In head-to-head benchmarks on complex manufacturing documentation, webAI’s Knowledge Graph RAG achieved 95% accuracy against ChatGPT o3’s 60%, demonstrating why structured graph retrieval beats vanilla RAG for domain-specific accuracy.
Possible Architecture Stack
Note: Oura has not published detailed architecture documentation. The following is an informed analysis based on publicly available information about Oura’s announcements, webAI’s technology, and common patterns in domain-specific LLM deployments.
Layer 1 — Proprietary LLM (likely fine-tuned, not trained from scratch). They call it a “proprietary LLM,” but given the timeline and team size, it’s almost certainly a fine-tuned open-weight base model (think Llama-class or Mistral-class) rather than a pre-trained-from-scratch foundation model. The “proprietary” label likely refers to the fine-tuning data, RLHF/DPO alignment for tone (the non-dismissive, emotionally supportive behavior), and the fact that it runs on their own infrastructure. webAI’s platform supports building, training, and deploying custom AI models on private infrastructure with full transparency and control, and their runtime delivers optimized inference and training with adaptive quantization and intelligent batching.
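To make the alignment step concrete, here is a minimal sketch of what DPO-style preference pairs for tone alignment could look like. Everything here is an illustrative assumption (the prompts, responses, and field names), not Oura’s actual training data; it simply shows the "chosen vs. rejected" data shape that RLHF/DPO pipelines consume.

```python
# Hypothetical sketch of DPO preference data for tone alignment.
# 'chosen' is the non-dismissive, emotionally supportive response;
# 'rejected' is the clinically adequate but curt one.

def make_preference_pair(prompt: str, supportive: str, dismissive: str) -> dict:
    """Package one preference pair in the standard DPO data shape."""
    return {"prompt": prompt, "chosen": supportive, "rejected": dismissive}

pairs = [
    make_preference_pair(
        "My HRV dropped this week and my period is late. Should I worry?",
        "That combination is really common and usually nothing to worry "
        "about. A late luteal phase often lowers HRV. Let's look at what "
        "your data shows.",
        "HRV fluctuations are normal. Consult a doctor if symptoms persist.",
    ),
]

print(len(pairs), sorted(pairs[0].keys()))
```

A real pipeline would feed thousands of such pairs to a DPO trainer against the open-weight base; the point is that the "emotionally supportive" behavior is learned from curated comparisons, not hard-coded.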
Layer 2 — Knowledge Graph RAG for clinical grounding. This is the differentiator. Rather than a simple vector-search RAG pipeline, they use webAI’s graph-structured retrieval to encode relationships between clinical concepts (e.g., “luteal phase” → “progesterone rise” → “HRV drop” → “normal”). The model draws from a broad foundation of established medical standards, research, and knowledge sources reviewed by Oura’s in-house team of board-certified clinicians. These are ingested into the knowledge graph as structured nodes and edges, not just chunked text. This is why the model can reason about relationships between symptoms, cycle phases, and biometric patterns rather than just doing keyword matching.
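The relationship-chaining described above can be sketched with a toy graph traversal. This assumes the knowledge graph is stored as (subject, relation, object) triples; the concepts and relation names are illustrative, not Oura’s or webAI’s actual schema.

```python
# Minimal sketch of graph-structured clinical retrieval over
# (subject, relation, object) triples. Illustrative concepts only.

TRIPLES = [
    ("luteal phase", "causes", "progesterone rise"),
    ("progesterone rise", "causes", "HRV drop"),
    ("HRV drop", "classified_as", "normal in luteal phase"),
    ("cycle irregularity", "associated_with", "stress"),
]

def expand(seeds: set, hops: int = 3) -> list:
    """Follow edges outward from the seed concepts for a fixed number
    of hops, collecting every triple touched. This multi-hop chain
    (phase -> hormone -> biometric -> interpretation) is exactly what
    flat vector search over chunked text cannot reconstruct."""
    frontier, found = set(seeds), []
    for _ in range(hops):
        new = set()
        for s, r, o in TRIPLES:
            if s in frontier and (s, r, o) not in found:
                found.append((s, r, o))
                new.add(o)
        frontier |= new
    return found

facts = expand({"luteal phase"})
print(len(facts))
```

Three hops from “luteal phase” recover the full chain ending at “normal in luteal phase,” which is the kind of reassuring, clinically grounded conclusion the model can then surface.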
Layer 3 — Biometric context injection. When a query comes in, the system retrieves the user’s personal time-series data (HRV, temperature, sleep stages, etc.) and serializes it as context alongside the KG-RAG retrieval results. This allows the model to analyze relevant biometric signals and longitudinal trends across sleep, activity, cycle and pregnancy data, stress, and more. The “Memories” feature stores information based on Oura data and conversations to provide personalized tips — acting as a persistent user-profile context layer.
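A plausible shape for the serialization step, under stated assumptions: signal names, windows, and the summary format below are guesses for illustration, not Oura’s actual context format. The idea is to compress raw time-series into a short text block the LLM can read alongside the KG-RAG results.

```python
# Sketch of biometric context injection: summarize each signal as a
# 7-day mean plus trend direction, since raw samples would waste
# context-window tokens. Field names are illustrative assumptions.

from statistics import mean

def serialize_biometrics(readings: dict) -> str:
    """Render recent time-series readings as a compact prompt block."""
    lines = []
    for signal, values in readings.items():
        trend = "rising" if values[-1] > values[0] else "falling"
        lines.append(f"- {signal}: 7-day mean {mean(values):.1f}, {trend}")
    return "User biometrics:\n" + "\n".join(lines)

ctx = serialize_biometrics({
    "HRV (ms)": [52, 50, 49, 47, 46, 44, 43],
    "skin temp deviation (C)": [0.0, 0.1, 0.1, 0.2, 0.3, 0.3, 0.4],
})
print(ctx)
```

The resulting block would be prepended to the merged prompt, letting the model reason over longitudinal trends ("HRV falling while temperature rises") rather than isolated readings.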
Layer 4 — Tone and safety alignment. The model was specifically tuned for emotional supportiveness — this is likely a combination of RLHF/DPO on women’s health conversation data and a system-prompt-level instruction set curated by their clinical team (led by Dr. Chris Curry, board-certified OB/GYN, and Dr. Tanvi Jayaraman, clinical lead of health AI).
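The system-prompt half of this layer might look like the sketch below. The rule wording is entirely hypothetical; the design point it illustrates is that tone rules and clinical-scope rules live in a clinician-curated template that can be revised without retraining the model.

```python
# Hypothetical system-prompt instruction set for tone and safety.
# The rules are illustrative assumptions, not Oura's actual prompt.

TONE_RULES = [
    "Acknowledge the user's concern before explaining the data.",
    "Never dismiss a symptom as 'just' anything.",
]
SAFETY_RULES = [
    "Do not diagnose; frame findings as patterns worth discussing with a clinician.",
]

def build_system_prompt() -> str:
    """Assemble the clinician-curated instruction sections into one prompt."""
    sections = [("Tone", TONE_RULES), ("Safety", SAFETY_RULES)]
    return "\n".join(
        f"{name}:\n" + "\n".join(f"  - {rule}" for rule in rules)
        for name, rules in sections
    )

print(build_system_prompt())
```

Keeping these rules in a prompt template (alongside, not instead of, fine-tuning) gives the clinical team a fast iteration loop on tone without a training run.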
Putting It All Together
User question ("Why is my cycle irregular?")
↓
Intent classification → routes to women's health model
↓
Parallel retrieval:
├── Knowledge Graph RAG → pulls clinically relevant nodes
│ (cycle irregularity causes, age-related patterns, etc.)
└── Biometric context → pulls user's cycle history, HRV trends,
temperature shifts, sleep patterns, stress data
↓
Context assembly → merged prompt with clinical knowledge + personal data
↓
Fine-tuned LLM generates response
(non-dismissive tone, evidence-based, personalized)
↓
Safety/clinical guardrails → ensures no diagnostic claims
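The final guardrail step in the pipeline above could be as simple as a post-generation check for diagnostic-sounding language. The phrase list below is an illustrative assumption, not Oura’s actual filter; production systems would likely pair a pattern pass like this with a classifier.

```python
# Sketch of a post-generation clinical guardrail: flag responses that
# make diagnostic claims so they can be rephrased or escalated.
# The pattern list is an illustrative assumption.

import re

DIAGNOSTIC_PATTERNS = [
    r"\byou have\b",
    r"\bdiagnos(is|ed|e)\b",
    r"\byou (are|'re) (pregnant|infertile)\b",
]

def violates_guardrail(response: str) -> bool:
    """Return True if the response appears to make a diagnostic claim."""
    return any(re.search(p, response, re.IGNORECASE) for p in DIAGNOSTIC_PATTERNS)

print(violates_guardrail("You have PCOS."))
print(violates_guardrail("This pattern can be worth discussing with your OB/GYN."))
```

A flagged response would be regenerated or rewritten into the "pattern worth discussing with a clinician" framing before it reaches the user.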
Key Takeaways
Knowledge Graph RAG outperforms vanilla RAG for health domains. The graph structure encodes relationships like “this symptom in this cycle phase with this biometric pattern means X” — something flat vector search fundamentally can’t do. For anyone building domain-specific health AI, this architecture pattern is worth studying closely.
The “proprietary model” framing is mostly a privacy and differentiation play. The core intelligence likely comes from the KG-RAG grounding and the clinical curation, not from the base model itself. This suggests that comparable results don’t require training a foundation model from scratch — a well-curated knowledge graph combined with domain-specific fine-tuning on a strong open-weight base can get there.
Self-hosting is becoming table stakes for reproductive health data. The regulatory and trust landscape following the Dobbs decision makes third-party API calls for cycle and fertility data a genuine liability risk. Oura’s move to self-hosted infrastructure with webAI is forward-looking here.
References
- Introducing Our First Proprietary AI Model to Deliver Personalized, Clinically Grounded Women’s Health Guidance — Oura’s official announcement (February 24, 2026)
- Oura launches a proprietary AI model focused on women’s health — TechCrunch coverage
- A Women’s Health Benchmark for Large Language Models — Lumos AI Labs benchmark study: 13 LLMs tested across 96 women’s health scenarios with ~60% failure rate
- Femtech Apps and Wearables and GDPR Compliance: An Empirical Study — SSRN study on FemTech GDPR consent audit failures
- webAI’s Knowledge Graph RAG beats ChatGPT o3 in Real World Manufacturing Test — webAI benchmark: 95% vs 60% accuracy on complex documentation