ShifaMind · A Roshan AI product

Concept-grounded ICD-10 coding for clinicians and coders.

ShifaMind reads a clinical note and returns ranked ICD-10 codes with the concept evidence behind each one. Interpretability is enforced architecturally, not bolted on after training.

0.71 Macro-F1 · 50-code MIMIC-IV
160 clinical concepts · Grounded vocabulary
113K linked admissions · MIMIC-IV labeled corpus
Try it

See ShifaMind code a clinical note.

Pick a scenario, run the coder, and inspect the concepts behind every code. Results are prerendered for this demo; the platform runs against your own notes.

Scenario: Acute on chronic systolic heart failure
Clinical note: 5 sections

Chief Complaint

Progressive dyspnea on exertion and orthopnea over 1 week.

History of Present Illness

72M with known HFrEF (LVEF 30%) on lisinopril, carvedilol, and spironolactone presents with one week of progressive shortness of breath, bilateral lower-extremity edema, and a 10-lb weight gain. Sleeping in a recliner due to orthopnea. No chest pain or syncope.

Exam

BP 152/88, HR 96, RR 22, SpO2 91% on room air. JVD 12 cm. Bibasilar crackles. 3+ pitting edema in both lower extremities.

Labs & Imaging

BNP 1850 (markedly elevated). Troponin negative. CXR: bilateral pleural effusions and pulmonary vascular congestion.

Assessment & Plan

Acute on chronic HFrEF exacerbation. Started IV furosemide 80 mg, continue home GDMT, daily weights, fluid restriction 1.5 L.

Let's code this note with ShifaMind, concept-grounded ICD-10 coding.

Accuracy comparison

Macro-F1 on MIMIC-IV top-50 (higher is better)
  • Vanilla CBM: 0.164
  • Claude 4.6: 0.343
  • GPT-5.4: 0.417
  • Gemini 2.5 Pro: 0.435
  • GKI-ICD: 0.649
  • ShifaMind: 0.712
Benchmarks

Ranked #1 on automated medical coding.

Highest Macro-F1 across frontier general-purpose LLMs and the latest published clinical-coding work.

0.712
Macro-F1, highest in class.

Ahead of Claude 4.6 from Anthropic (0.343), GPT-5.4 from OpenAI (0.417), and Gemini 2.5 Pro from Google (0.435) on the same MIMIC-IV top-50 ICD-10 evaluation.

>60%
Relative improvement over the best general-purpose LLM.

Frontier models stretch into healthcare; ShifaMind is built for it. On the same evaluation, 0.712 vs. 0.435 is a ~0.28 absolute Macro-F1 lead.
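For readers who want to check the arithmetic, here is a minimal sketch of how Macro-F1 is conventionally computed (the unweighted mean of per-code F1) and how the lead follows from the figures quoted on this page. The toy label matrices and function names are illustrative assumptions, not ShifaMind's evaluation code.

```python
from typing import List


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 for a single code; 0.0 when the code is never true and never predicted."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Unweighted mean of per-code F1. Rows are notes, columns are codes."""
    n_codes = len(y_true[0])
    per_code = []
    for c in range(n_codes):
        pairs = [(t[c], p[c]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        per_code.append(f1(tp, fp, fn))
    return sum(per_code) / n_codes


# Toy data (assumption): 4 notes, 3 codes.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
toy_score = macro_f1(y_true, y_pred)  # per-code F1: 1.0, 0.5, 0.667

# Figures quoted on this page.
shifamind, best_llm = 0.712, 0.435
absolute_lead = shifamind - best_llm      # about 0.277
relative_lead = absolute_lead / best_llm  # about 0.637, i.e. >60%
```

Because every code counts equally in the mean, rare codes weigh as much as common ones, which is why Macro-F1 is a demanding metric for a 50-code label set.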

#1
Concept-grounded by construction.

The only architecture in the comparison that produces verifiable clinical-concept evidence alongside every prediction.

Interpretability

Metrics general LLMs can't produce.

ShifaMind exposes the concept layer to evaluation, not just the final code. Three metrics quantify how honest the explanations are.

CSTPR · 0.704

Concept-Supported True Positive Rate

Of all truly positive diagnoses, the fraction the model both predicted correctly AND grounded in at least one correctly activated relevant concept. Tests that correct predictions come with correct evidence.
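One way to read this definition is sketched below. The code-to-concept relevance map, the per-note record layout, and the assumption that concept activations have already been thresholded into a set are all illustrative; the page does not specify them.

```python
from typing import Dict, List, Set

# Hypothetical map from each code to the concepts considered relevant to it.
RELEVANT: Dict[str, Set[str]] = {
    "I50.23": {"orthopnea", "bnp_elevation"},
    "J18.9": {"fever"},
}


def cstpr(notes: List[dict], relevant: Dict[str, Set[str]]) -> float:
    """Share of true (note, code) pairs that are predicted AND backed by at
    least one relevant concept that is both truly present and activated."""
    positives = 0
    supported = 0
    for note in notes:
        for code in note["true_codes"]:
            positives += 1
            if code not in note["pred_codes"]:
                continue  # missed diagnosis: counts in the denominator only
            grounded = (
                relevant.get(code, set())
                & note["true_concepts"]
                & note["active_concepts"]
            )
            if grounded:
                supported += 1
    return supported / positives if positives else 0.0


# Toy records (assumption): 4 true diagnoses, 2 of them correctly grounded.
notes = [
    {"true_codes": {"I50.23"}, "pred_codes": {"I50.23"},
     "true_concepts": {"orthopnea", "bnp_elevation"},
     "active_concepts": {"orthopnea"}},
    {"true_codes": {"I50.23", "J18.9"}, "pred_codes": {"I50.23", "J18.9"},
     "true_concepts": {"bnp_elevation", "fever"},
     "active_concepts": {"fever", "orthopnea"}},
    {"true_codes": {"J18.9"}, "pred_codes": set(),
     "true_concepts": {"fever"}, "active_concepts": set()},
]
score = cstpr(notes, RELEVANT)
```

In the second note, I50.23 is predicted correctly but the only activated heart-failure concept (orthopnea) is not actually present, so that prediction does not count as concept-supported.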

CIM · 1.314

Concept Influence Magnitude

Gradient-norm sensitivity of the diagnosis logits with respect to the representation feeding the diagnosis head. A larger value means the concept-grounded representation carries more signal at the prediction boundary.
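A rough sketch of a gradient-norm sensitivity of this kind follows. The aggregation choices here (summing the logits, taking an L2 norm, averaging over samples) and the toy linear head are assumptions made for illustration; the page does not state the exact formula, and a real implementation would use automatic differentiation rather than finite differences.

```python
import math
from typing import Callable, List

Vector = List[float]


def linear_head(W: List[Vector], b: Vector) -> Callable[[Vector], Vector]:
    """Toy diagnosis head (assumption): logits = W h + b."""
    def head(h: Vector) -> Vector:
        return [sum(w * x for w, x in zip(row, h)) + bi for row, bi in zip(W, b)]
    return head


def grad_norm(head: Callable[[Vector], Vector], h: Vector, eps: float = 1e-5) -> float:
    """L2 norm of d(sum of logits)/dh, estimated with central differences."""
    grads = []
    for i in range(len(h)):
        up, down = list(h), list(h)
        up[i] += eps
        down[i] -= eps
        grads.append((sum(head(up)) - sum(head(down))) / (2 * eps))
    return math.sqrt(sum(g * g for g in grads))


def concept_influence(head: Callable[[Vector], Vector], reps: List[Vector]) -> float:
    """Mean gradient norm over the representations fed to the head."""
    return sum(grad_norm(head, h) for h in reps) / len(reps)


# Toy head: sum of logits = 3*h0 + 4*h1, so the gradient is (3, 4) with norm 5.
head = linear_head(W=[[3.0, 0.0], [0.0, 4.0]], b=[0.0, 0.0])
cim = concept_influence(head, reps=[[0.2, -0.1], [1.0, 0.5]])
```

For a purely linear head the value does not depend on the input; with a nonlinear head it varies per sample, which is why the sketch averages over representations.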

CCR · 0.836

Concept-Conditioned Recall

Diagnosis recall restricted to samples where the relevant concept is actually present. Tests whether the bottleneck recovers the diagnosis when the right concept evidence is there.
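A minimal sketch of this restriction, assuming "the relevant concept is present" means at least one concept from a code-to-concept relevance map appears in the note's ground-truth concepts. The map and record layout are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical map from each code to the concepts considered relevant to it.
RELEVANT: Dict[str, Set[str]] = {
    "I50.23": {"orthopnea", "bnp_elevation"},
    "J18.9": {"fever"},
}


def ccr(notes: List[dict], relevant: Dict[str, Set[str]]) -> float:
    """Recall over true (note, code) pairs whose relevant concept is present."""
    eligible = 0
    recovered = 0
    for note in notes:
        for code in note["true_codes"]:
            if not (relevant.get(code, set()) & note["true_concepts"]):
                continue  # concept evidence absent: excluded from the metric
            eligible += 1
            if code in note["pred_codes"]:
                recovered += 1
    return recovered / eligible if eligible else 0.0


# Toy records (assumption): 5 true diagnoses, 4 with concept evidence, 3 recovered.
notes = [
    {"true_codes": {"I50.23"}, "pred_codes": {"I50.23"},
     "true_concepts": {"orthopnea", "bnp_elevation"}},
    {"true_codes": {"I50.23", "J18.9"}, "pred_codes": {"I50.23", "J18.9"},
     "true_concepts": {"bnp_elevation", "fever"}},
    {"true_codes": {"J18.9"}, "pred_codes": set(),
     "true_concepts": {"fever"}},
    {"true_codes": {"I50.23"}, "pred_codes": set(),
     "true_concepts": set()},
]
score = ccr(notes, RELEVANT)
```

The last note has a true diagnosis but no supporting concept in the text, so it is left out of the denominator rather than counted as a miss.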

What ShifaMind does

Three primitives, designed to be defended.

Predict

Ranked ICD-10 codes from free-text discharge summaries. Each code carries a confidence and a list of alternatives the model considered.

Explain

Concept activation shows why each code was assigned: the same clinical concepts a coder would reach for, surfaced as verifiable evidence.

Discuss

Grounded chat lets clinicians and coders interrogate any prediction in context of the note. No off-topic generation, no hallucinated concepts.

Breadth

Coverage across code systems, note types, and specialties.

Code systems
  • ICD-10-CM (Live)
  • ICD-10-PCS (Roadmap)
  • CPT (Roadmap)
  • SNOMED CT (Roadmap)
Note types
  • Discharge summaries (Live)
  • Inpatient progress notes (Live)
  • ED notes (Live)
  • Outpatient encounters (Roadmap)
Specialties
  • Cardiology (Live)
  • Pulmonology (Live)
  • Emergency medicine (Live)
  • Endocrinology (Live)
  • Oncology (Roadmap)
  • Psychiatry (Roadmap)
For developers

Integrate ShifaMind via API.

One endpoint. Concept-grounded predictions in the response, with evidence and alternatives. No bespoke fine-tuning required.

Developer overview
Request (bash)
curl -X POST https://api.roshan-ai.com/v1/shifamind/predict \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"note": "72M with HFrEF presents with dyspnea, edema, BNP 1850..."}'
Response (JSON)
{
  "codes": [
    {
      "code": "I50.23",
      "description": "Acute on chronic systolic heart failure",
      "confidence": 0.94,
      "concepts": [
        { "label": "orthopnea",            "activation": 0.93 },
        { "label": "lower_extremity_edema", "activation": 0.91 },
        { "label": "bnp_elevation",         "activation": 0.90 }
      ],
      "evidence": ["BNP 1850", "bilateral lower-extremity edema"]
    }
  ]
}
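The same request and response can be handled from Python. The sketch below builds the POST with the standard library (without sending it) and pulls the strongly activated concepts out of the sample response above. The URL and field names are taken from the sample; the helper names, the activation threshold, and the assumption that real responses have exactly this shape are illustrative.

```python
import json
import urllib.request
from typing import List, Tuple

API_URL = "https://api.roshan-ai.com/v1/shifamind/predict"  # from the curl example


def build_request(note: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST shown in the curl example."""
    body = json.dumps({"note": note}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def strong_concepts(response: dict, min_activation: float) -> List[Tuple[str, List[str]]]:
    """For each code, list concept labels at or above min_activation."""
    rows = []
    for item in response["codes"]:
        labels = [c["label"] for c in item["concepts"]
                  if c["activation"] >= min_activation]
        rows.append((item["code"], labels))
    return rows


# The sample response from this page, used here instead of a live call.
SAMPLE_RESPONSE = json.loads("""
{
  "codes": [
    {
      "code": "I50.23",
      "description": "Acute on chronic systolic heart failure",
      "confidence": 0.94,
      "concepts": [
        {"label": "orthopnea", "activation": 0.93},
        {"label": "lower_extremity_edema", "activation": 0.91},
        {"label": "bnp_elevation", "activation": 0.90}
      ],
      "evidence": ["BNP 1850", "bilateral lower-extremity edema"]
    }
  ]
}
""")

request = build_request("72M with HFrEF presents with dyspnea...", token="TOKEN")
rows = strong_concepts(SAMPLE_RESPONSE, min_activation=0.91)
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is omitted here because API access is gated.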
Security & compliance

Clinical AI you can deploy without holding your breath.

HIPAA-ready

BAA-eligible deployments. PHI never leaves the customer perimeter without explicit consent.

Encryption everywhere

TLS 1.3 in transit. AES-256 at rest. Customer-managed keys available on enterprise plans.

Auditable by design

Every prediction logs its activated concepts and evidence. No opaque inference.
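As an illustration of what such a log entry could hold, the sketch below serializes one prediction from the sample API response into a JSON line. The record schema, the choice to store a hash of the note instead of its text, and the function name are assumptions, not a description of ShifaMind's actual audit format.

```python
import hashlib
import json


def audit_record(note_text: str, prediction: dict, timestamp: str) -> str:
    """Serialize one prediction as a JSON line (hypothetical schema)."""
    record = {
        "timestamp": timestamp,
        # Assumption: reference the note by hash rather than storing its text.
        "note_sha256": hashlib.sha256(note_text.encode("utf-8")).hexdigest(),
        "code": prediction["code"],
        "confidence": prediction["confidence"],
        "concepts": prediction["concepts"],
        "evidence": prediction["evidence"],
    }
    return json.dumps(record, sort_keys=True)


prediction = {
    "code": "I50.23",
    "confidence": 0.94,
    "concepts": [{"label": "orthopnea", "activation": 0.93}],
    "evidence": ["BNP 1850"],
}
line = audit_record("72M with HFrEF...", prediction, timestamp="2025-01-01T00:00:00Z")
```

Keeping the activated concepts and evidence spans next to each code is what lets a reviewer later reconstruct why a code was assigned.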

No training on customer data

Customer notes are not used to train base models unless the customer opts in under contract.

FAQ

Questions we get a lot.

ShifaMind · ready to evaluate

Bring your notes. See the concepts behind every code.

The platform accepts pasted notes today. API access for production integration is gated. Talk to us about your workflow and we'll find the right integration path.