A newsletter on the latest in AI for healthcare.

Welcome back,

In this issue’s top paper, we highlight the Breast cancer Intelligent Non-invasive Diagnosis System (BINDS), a multimodal AI model for pre-biopsy breast cancer assessment published in Nature Biomedical Engineering. In top health AI news, MHRA leadership wants regulation to help proven AI tools reach the UK’s National Health Service faster. And χ-Bench, a benchmark for healthcare agents, shows today’s systems still struggle with long, policy-heavy workflows.

SUMMARY

Top Research Paper

BINDS (Breast cancer Intelligent Non-invasive Diagnosis System) is a multimodal breast imaging model, published in Nature Biomedical Engineering, that reached 0.973 AUC and could cut benign biopsies by up to 32.4%.

Top AI News

MHRA CEO Lawrence Tallon says AI regulation should become a route into care, not a barrier to innovation, especially as healthcare moves toward large medical language models and adaptive AI systems.

Top Healthcare AI Benchmark

χ-Bench is a test for healthcare AI agents. It checks whether they can handle real admin work, such as approvals, care review and patient follow-up, and finds that today’s agents still struggle.

Bedside Bets

Healthcare AI rounds, partnerships, and market moves.

Aspira and Cleveland Clinic expand their AI women’s health diagnostics collaboration. Aspira builds non-invasive, AI-powered tests for gynecologic disease, and the partnership will support biomarker discovery, validation and multiomic diagnostic models. Deal value not disclosed.
Knit Health launches from stealth with $11.6M in seed funding. Knit builds clinical intelligence AI that learns from real care decisions, using Truveta electronic medical record data from 130M+ patients across 30 U.S. health systems.
Swoop acquires Nimble to add prescription fulfilment to AI healthcare engagement. Swoop uses AI and privacy-safe health data for life sciences engagement, while Nimble connects independent pharmacies in all 50 U.S. states and supports 16M patients. Deal value not disclosed.
Grundium acquires Visiopharm to build an integrated AI pathology platform. Grundium makes compact digital pathology scanners, and Visiopharm adds AI precision pathology software with 750+ customer accounts in 40+ countries. Deal value not disclosed.

Pulse Check

Quick reads across health AI.

Clinical AI needs independent auditors and post-deployment monitoring. A Nature Medicine correspondence uses Utah’s 12-month Doctronic prescribing sandbox, covering about 190 medications, to call for access to safety data and living oversight.
Caris says its AI helped correct a cancer misdiagnosis from stage 4 breast cancer to Hodgkin lymphoma. Caris analyses cancer samples to help doctors choose treatment, and its GPSai tool predicts where a cancer started.
NEJM AI says language models will change how medical knowledge is used. The editorial looks at what this means for clinicians, health systems and the rules around AI-enabled care.

TOP PAPER

🧬 BINDS helped radiologists cut benign breast biopsies by up to 32.4%

Source: Nature Biomedical Engineering · 19 May 2026

BINDS, or Breast cancer Intelligent Non-invasive Diagnosis System, is a deep learning system for breast cancer imaging. It combines ultrasound, mammography and magnetic resonance imaging (MRI) to estimate cancer risk and classify cancer subtype.

The key idea is workflow fit. BINDS starts with the scans most clinics already use first: ultrasound and/or mammography. It then adds MRI only when the early result is uncertain. That matters because MRI is more sensitive, but also more expensive and harder to use as a default test.
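
A minimal sketch of that escalation logic, for readers who think in code. Everything here is illustrative: the function and class names, the 0.2/0.8 uncertainty band, and the rule of deferring to the MRI-informed score are assumptions for the example, not details taken from the paper or its released code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds: first-line scores strictly between them count as "uncertain".
LOW_RISK = 0.2
HIGH_RISK = 0.8


@dataclass
class Triage:
    risk: float      # final malignancy risk estimate in [0, 1]
    used_mri: bool   # whether the second (MRI) stage was used


def two_stage_risk(first_line_risk: float, mri_risk: Optional[float] = None) -> Triage:
    """Escalate to MRI only when the ultrasound/mammography estimate is uncertain."""
    confident = first_line_risk <= LOW_RISK or first_line_risk >= HIGH_RISK
    if confident or mri_risk is None:
        # Stage 1 is decisive, or no MRI is available: keep the first-line score.
        return Triage(risk=first_line_risk, used_mri=False)
    # Stage 2: illustrative stand-in that simply defers to the MRI-informed score.
    return Triage(risk=mri_risk, used_mri=True)


print(two_stage_risk(0.05, mri_risk=0.9))  # confident first-line result, MRI not used
print(two_stage_risk(0.50, mri_risk=0.9))  # uncertain first-line result, MRI used
```

The point of the design is that the expensive scan is only consulted for the uncertain middle band, and the function still returns an answer when no MRI exists.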

Research question

Can a breast imaging AI model improve diagnosis and help avoid unnecessary needle biopsies, while still working when hospitals have different combinations of scans available?

Source: Li et al.

Approach

Built a two-stage model that follows routine breast imaging workflow.
Used ultrasound, mammography and MRI, with flexible inputs when one scan type is missing.
Trained and validated the system on 27,048 participants from 8 centres and 7 public datasets.
Used pathology images during training to help the model learn features linked to real tissue diagnosis.
Compared BINDS with junior and senior radiologists in a reader study of 208 BI-RADS 4 lesions.
Released PyTorch code, preprocessing scripts and model weights on GitHub.

Results

The two-stage BINDS workflow reached 0.973 AUC for cancer risk assessment on the internal test cohort.
It reached 0.941 AUC on an external cancer risk assessment cohort.
BINDS outperformed junior radiologists in trimodal diagnosis, with 0.933 accuracy versus 0.894.
With BINDS support, senior radiologists cut benign biopsies by 32.4%, from 37 to 25.
Junior radiologists cut benign biopsies by 22.5%, from 40 to 31.
The reduction was concentrated in benign lesions, while biopsy rates for malignant lesions were maintained.
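
The biopsy percentages follow directly from the counts quoted above. A quick check, using only those four numbers (the helper name is ours):

```python
def relative_reduction(before: int, after: int) -> float:
    """Percentage drop from `before` to `after`."""
    return 100 * (before - after) / before


senior = relative_reduction(37, 25)  # senior radiologists, benign biopsies
junior = relative_reduction(40, 31)  # junior radiologists, benign biopsies
print(f"senior: {senior:.1f}%, junior: {junior:.1f}%")
# senior: 32.4%, junior: 22.5%
```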

Caveats

The work was retrospective, so prospective clinical testing is still needed.
The in-house data came from medical centres in China, which may limit generalisability.
Paired radiology and pathology data came from one centre, which may affect the alignment method.
The system used B-mode ultrasound only, not Doppler or elastography.

Potential impact: If validated prospectively, models like BINDS could help clinicians reach a confident diagnosis through multimodal imaging before moving to invasive biopsy. That could reduce unnecessary procedures, lower costs and patient burden, and reserve biopsies for cases where tissue confirmation is still needed.


TOP NEWS

Smarter MHRA regulation could give healthcare AI companies a faster route into the NHS

Source: Politics UK

MHRA Chief Executive Lawrence Tallon argued that AI regulation should become a catalyst for safe adoption, not only a barrier to entry. His case has three parts:

The current pathway is too narrow. It was built mainly around image-recognition tools, especially in radiology, while the next wave includes large language models, large medical language models and adaptive systems.
Regulation should count the risk of inaction. Tallon argued that proven tools should not be kept from clinicians and patients when they are demonstrably better than current practice.
Approval should become an ongoing process. Tallon wants less “high jump” and more “hurdles race,” with proportionate checks, real-world evidence, post-market monitoring and repeated assessment in NHS settings.

This shift matters because public trust, clinical safety and commercial adoption are now tied together.

Why it matters: Good regulation gives useful, well-tested AI models the best chance of being implemented in healthcare. For companies, evidence can translate into adoption, not just another pilot. For the NHS, the best tools get a clearer route to changing care while still being monitored after deployment.


TOP HEALTHCARE AI BENCHMARK

χ-Bench shows healthcare AI agents still fail most long, policy-heavy workflows

Source: arXiv · Hugging Face dataset

χ-Bench, or Clinical Healthcare In-Situ Benchmark, is a healthcare agent benchmark. It tests whether frontier AI agents can complete realistic, end-to-end healthcare operations.

The benchmark covers provider prior authorisation, payer utilisation management and care management. These are exactly the kinds of policy-heavy workflows where hospitals, payers and vendors want automation, but where errors can create delays, denials or unsafe handoffs.

What stands out

Tests agents in high-fidelity healthcare software environments.
Covers long workflows with policy retrieval, multi-role handoffs and multi-turn interactions.
Uses simulated healthcare apps exposed through Model Context Protocol (MCP) tools.
Includes a managed-care operations handbook of 1,279 documents.
Evaluates 30 agent harness/model configurations.
Best performance reached only 28.0% task resolution when agents got one attempt at each task.
No agent cleared 20% when asked to complete the same task successfully three times in a row.
Success on long sequences of connected tasks dropped to 3.8%, showing that agents become much less reliable across extended workflows.
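
To see why the "three times in a row" number sits below the single-attempt number, here is a small sketch of the two scoring styles. The task names and outcomes are invented, and the function names and exact definitions are our assumptions; the benchmark's own scoring code may differ.

```python
from typing import Dict, List


def single_attempt_rate(results: Dict[str, List[bool]]) -> float:
    """Share of tasks resolved on the first attempt."""
    return sum(trials[0] for trials in results.values()) / len(results)


def all_attempts_rate(results: Dict[str, List[bool]], k: int = 3) -> float:
    """Share of tasks resolved on every one of the first k attempts."""
    return sum(all(trials[:k]) for trials in results.values()) / len(results)


def chain_rate(step_rate: float, n_steps: int) -> float:
    """Chance a chain of n steps succeeds, assuming independent steps (a simplification)."""
    return step_rate ** n_steps


# Made-up outcomes for four hypothetical tasks, three attempts each.
results = {
    "prior_auth": [True, True, True],
    "util_review": [True, False, True],
    "care_followup": [False, False, False],
    "appeal": [True, True, False],
}

print(single_attempt_rate(results))   # 0.75
print(all_attempts_rate(results))     # 0.25
```

An agent that looks acceptable on one try can fall sharply once consistency is required, and the same compounding applies when tasks are chained: under the independence assumption above, even a 90% per-step success rate leaves roughly a one-in-three chance of finishing a ten-step chain.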

Developer value: For healthcare AI builders, χ-Bench is a stress test for product readiness. It shows where to harden agents before deployment: policy lookup, tool use, handoffs, consent checks and recovery from mistakes.