4th Edition | Open Source Wars Bring New Models

Campbell Arnold
Oct 22, 2024
6 min read

Open source wars: Microsoft releases foundational models

Microsoft is the latest tech giant to release open source foundational models. Earlier this month Microsoft Research announced the release of healthcare AI models, a new collection of multimodal medical imaging foundation models. The models were released as open source software and are available in the Microsoft Azure AI model catalog. Currently, the collection comprises 3 models: an image embedding model, a generalized segmentation model, and a chest x-ray report generation model. The idea is that institutions and researchers, who may not have the resources to train foundation models, can leverage these pretrained models and fine-tune them for specific medical imaging tasks. Each of the models released by Microsoft has an accompanying arXiv article, which we’ll review here:

MedImageInsight is a medical imaging embedding model trained across various modalities (e.g. x-ray, CT, MRI, US) that can be used for downstream tasks (e.g. image classification, image analysis, similarity search, anomaly detection). Using MedImageInsight embeddings, the authors achieved state-of-the-art or human expert-level performance for several tasks, including bone age estimation, image similarity search, and disease classification on chest X-rays, dermatology, and OCT imaging. Additionally, the authors touted regulatory-friendly features, such as adjustable sensitivity and specificity as well as improved fairness across age and gender compared to other publicly available models.
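
For readers curious what "similarity search" over embeddings looks like in practice, here is a minimal Python sketch. It assumes the embeddings have already been computed and stored as plain lists of floats. The image names and three-element vectors are made up for illustration and do not come from MedImageInsight, and the helper functions are hypothetical rather than part of any Microsoft API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, library, top_k=2):
    """Return the top_k (name, score) pairs ranked by cosine similarity."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumption: real embeddings would be much longer vectors
# produced by an embedding model, not hand-written numbers).
library = {
    "chest_xray_a": [0.9, 0.1, 0.0],
    "chest_xray_b": [0.8, 0.2, 0.1],
    "knee_mri": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]

results = most_similar(query, library)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The same ranking idea extends to classification: a downstream model, or even a nearest-neighbor vote, can operate on the embeddings instead of the raw pixels.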

MedImageParse is a broadly trained segmentation model designed to generalize across numerous imaging modalities (e.g. MRI, CT, US, x-ray, dermatology photos, & pathology slides). The MedImageParse model (called BiomedParse in the arXiv article) is based on the SEEM (segment everything, everywhere, all at once) architecture and is designed to be further fine-tuned for specific applications. This approach is similar to Meta’s SAM 2 (Segment Anything Model), though MedImageParse was trained specifically on medical imaging data from 9 modalities with 82 target structures. Additionally, by leveraging joint learning and harmonizing textual data with biomedical ontologies, the authors were able to further improve individual task accuracy and perform text-prompted object segmentation. The model outperformed other state-of-the-art methods in segmentation, detection, and recognition.

CXRReportGen is a multimodal AI model that generates detailed, structured chest x-ray reports by incorporating both current and prior images along with patient information. CXRReportGen (called MAIRA-2 in the arXiv article) uses a vision transformer to encode current and prior images into a format a large language model can parse. The authors used a task they termed "grounded report generation," whereby report accuracy was improved by incorporating localized image findings. They also used an LLM-based evaluation framework called RadFact to quantify report correctness and completeness. The authors achieved state-of-the-art performance on report generation benchmarks.

These three models from Microsoft are the latest in a string of recent open source foundational models from large tech companies. Hopefully, while companies like Microsoft, Google, OpenAI, and Meta continue to battle each other for model dominance, the medical imaging community can continue to reap the benefits of improved model access. With these companies releasing foundation models as open source, researchers now have access to far more advanced pretrained models than were available to them a few years ago.

Squeezing SNR out of low-field scanners

In a not-to-be-missed NMR in Biomedicine review, several titans of low-field MR and image enhancement collaborated to cover recent advances in these overlapping fields. For all MRI sequences, there is a trade-off between SNR, resolution, and scan time. However, lower magnetic field strengths already result in inherently low SNR, which makes finding a clinically acceptable trade-off all the more difficult on low-field systems. In this review, the authors cover advances in four topic areas: k-space sampling, reconstruction strategies, image processing, and electromagnetic interference cancellation. Additionally, the authors provide valuable insights on the relative importance of each for the future of low-field systems. In the rest of this section, we’ll cover additional related articles that were released in the last few weeks.
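
The SNR, resolution, and scan time trade-off can be put in rough numbers. The Python sketch below uses the common textbook approximation that, with everything else held fixed, SNR scales with voxel volume and with the square root of the number of signal averages. It is an illustration under that assumption, not a model taken from the review, and the function names are hypothetical.

```python
import math


def relative_snr(voxel_scale=1.0, averages=1):
    """Relative SNR versus a baseline scan.

    Assumes SNR is proportional to voxel volume times sqrt(averages).
    voxel_scale is the factor applied to each side of an isotropic voxel.
    """
    return (voxel_scale ** 3) * math.sqrt(averages)


def averages_to_recover(snr_deficit):
    """Averages needed to make up an SNR deficit using averaging alone."""
    return math.ceil(snr_deficit ** 2)


# Four averages double SNR but take roughly four times as long to acquire.
print(relative_snr(averages=4))

# Doubling each voxel side gives 8x the SNR at the cost of resolution.
print(relative_snr(voxel_scale=2.0))

# A 3x SNR shortfall would need about 9 averages if nothing else changes.
print(averages_to_recover(3))
```

This is why software approaches such as denoising are attractive at low field: buying SNR back with averaging alone gets expensive in scan time very quickly.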

Expanding sequence offerings at 0.55T

In two recent Magnetic Resonance in Medicine studies, researchers expanded sequence availability for Siemens 0.55T low-field systems. In both cases, researchers leveraged denoising methods to boost image SNR, thus allowing clinically acceptable sequences to be collected in a reasonable time frame.

In the first Magnetic Resonance in Medicine study, the researchers developed and evaluated a free-breathing, non-contrast 3D Cardiac Magnetic Resonance Angiography (CMRA) sequence for a Siemens 0.55T MAGNETOM Free.Max scanner. CMRA methods used on high-field scanners could not be applied directly at low field due to differences in tissue properties and the decreased SNR of the system. To address this, the authors optimized pulse sequences and acquisition parameters using Bloch simulations. They incorporated image navigators for respiratory motion correction and a patch-based low-rank denoising method to improve SNR. Their method was tested on 11 healthy subjects and demonstrated excellent image quality and vessel sharpness, with results comparable to 1.5T studies. Additionally, the sequence was condensed into a clinically reasonable 6-minute scan.

In a second Magnetic Resonance in Medicine study, the authors aimed to improve liver proton density fat fraction (PDFF) and R2* quantification at 0.55T by both optimizing an acquisition protocol and evaluating the capabilities of two denoising methods: robust locally low-rank and random matrix theory. In phantom data and scans of 11 subjects collected on a Siemens 0.55T MAGNETOM Aera, both denoising techniques significantly improved accuracy and precision. PDFF and R2* measurement variability was reduced by over 67% compared to conventional methods.
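
For readers less familiar with these two biomarkers, the Python sketch below shows their basic definitions. PDFF is the fat signal as a fraction of the combined fat and water signal, and R2* is the rate of signal decay across echo times. The two-point R2* estimate is a deliberate simplification that assumes mono-exponential decay and ignores fat-water interference; it is not the fitting approach used in the study, and the function names and numbers are hypothetical.

```python
import math


def pdff_percent(water, fat):
    """Proton density fat fraction in percent: fat / (water + fat)."""
    return 100.0 * fat / (water + fat)


def two_point_r2star(s1, s2, te1, te2):
    """R2* in 1/s from two echoes, assuming S(TE) = S0 * exp(-R2* * TE).

    te1 and te2 are echo times in seconds, with te2 > te1.
    """
    return math.log(s1 / s2) / (te2 - te1)


# Toy values (assumption: separated water and fat signals are already known).
print(pdff_percent(water=80.0, fat=20.0))

# Simulate two echoes from a known decay rate, then recover that rate.
true_r2star = 50.0  # 1/s, made-up value
te1, te2 = 0.002, 0.007
s1 = 100.0 * math.exp(-true_r2star * te1)
s2 = 100.0 * math.exp(-true_r2star * te2)
print(two_point_r2star(s1, s2, te1, te2))
```

Because both quantities are computed from ratios of measured signals, noise in the underlying images feeds directly into the estimates, which helps explain why denoising matters for accuracy and precision at 0.55T.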

Assessing neurodevelopment with super-resolution low-field MRI

Two recent studies, with some overlapping authors, highlight the potential of portable ultra-low-field MRI systems to expand access to critical imaging in resource-limited settings, particularly for pediatric and neonatal patients. However, while these scanners may be more accessible, they are still affected by the same SNR trade-off. Can image enhancement techniques be leveraged to provide both accessibility and high image quality?

The first study, out in Developmental Cognitive Neuroscience, introduces the UNITY project. UNITY is a Gates Foundation-funded initiative that aims to leverage low-field MRI for assessing neurodevelopment in children from sub-Saharan Africa and South Asia, where traditional neuroimaging tools are scarce. The paper also had notable industry collaborations, including Hyperfine for portable scanners, CaliberMRI for phantom development, and Flywheel.io for data storage and analysis. The work outlines plans to collect a massive neurodevelopmental cohort across more than 50 scanners. This dataset is likely the largest collected to date on the 64mT system, which also makes it extremely valuable for image enhancement efforts.

The second study, published in the MICCAI proceedings, introduced a novel AI-based technique called Super-Field Network (SFNet), developed by some of the same authors as the previous article. SFNet aims to enhance the image quality of ultra-low-field scans to match that of scans acquired on higher-field systems. The authors compared SFNet to other state-of-the-art methods and found their network provided better segmentation of key neurodevelopmental biomarkers. These efforts underscore both the potential of portable, low-field MRI to improve healthcare equity and the need to continue improving the SNR efficiency of low-field systems.

Resource Highlight

Are you interested in increasing radiology access? Then you’ve got to attend RAD-AID 2024. It’s an excellent one-day conference that focuses on global radiology outreach and bringing medical imaging to medically underserved populations. In the spirit of access, they keep registration costs low ($45-115). This year, RAD-AID is in Washington, D.C. on November 2nd, so register soon! It’s an eye-opening experience for anyone interested in radiology access. You’ll gain a whole new perspective on the complexities of expanding medical imaging to everyone, everywhere.

Feedback

As a new resource, we're all ears and eager to know your thoughts on how to improve our newsletter. Don't see an article you thought should be included? Send us an email, or reach out to us via X.

References

Matthew Lungren. “Unlocking next-generation AI capabilities with healthcare AI models.” Microsoft Blog (October 2024). https://www.microsoft.com/en-us/industry/blog/healthcare/2024/10/10/unlocking-next-generation-ai-capabilities-with-healthcare-ai-models/
Codella, Noel CF, et al. "MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging." arXiv preprint arXiv:2410.06542 (2024).
Zhao, Theodore, et al. "BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once." arXiv preprint arXiv:2405.12971 (2024).
https://microsoft.github.io/BiomedParse/
Bannur, Shruthi, et al. "MAIRA-2: Grounded Radiology Report Generation." arXiv preprint arXiv:2406.04449 (2024).
Ayde, Reina, et al. "MRI at low field: A review of software solutions for improving SNR." NMR in Biomedicine (2024): e5268.
Castillo‐Passi, Carlos, et al. "Highly efficient image navigator based 3D whole‐heart cardiac MRA at 0.55 T." Magnetic Resonance in Medicine (2024).
Shih, Shu‐Fu, et al. "Improved liver fat and R2* quantification at 0.55 T using locally low‐rank denoising." Magnetic Resonance in Medicine (2024).
Abate, F., et al. "UNITY: A low-field magnetic resonance neuroimaging initiative to characterize neurodevelopment in low and middle-income settings." Developmental Cognitive Neuroscience 69 (2024): 101397.
Tapp, Austin, et al. "Super-Field MRI Synthesis for Infant Brains Enhanced by Dual Channel Latent Diffusion." International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2024.

Disclaimer: There are no paid sponsors of this content. The opinions expressed are solely those of the newsletter authors, and do not necessarily reflect those of referenced works or companies.