
Characterizing the risks of medical software: what should we learn from IMDRF guide N88 (2025)?
Medical AI outpaces guidelines
Medical devices incorporating artificial intelligence (AI) are progressing at an impressive rate. Assisted diagnosis, anomaly detection, automated triage, personalized treatment: the use cases are multiplying. But regulation is struggling to keep pace.
With this in mind, IMDRF Guide N88, published in January 2025, proposes a set of guiding principles to promote the development of safe, effective and high-quality AI-based medical devices. This document is aimed at both manufacturers and regulatory authorities, with a clear ambition: to lay the foundations for "Good Machine Learning Practice" (GMLP).
It is neither a standard nor a regulatory obligation. But it is a strong signal, and a valuable working basis, particularly for manufacturers who want to structure their technical file or anticipate the expectations of an auditor or notified body.
What the IMDRF N88 guide says: 10 principles you need to know
The core of the guide is based on 10 fundamental principles to guide the lifecycle of a medical device incorporating AI, from design to post-market follow-up.
The 10 principles of GMLP (Good Machine Learning Practice):
- 1. Clear understanding of the device's intended purpose and mobilization of multidisciplinary expertise.
- 2. Implementation of best practices in software engineering, security and quality management.
- 3. Use of data representative of the target population.
- 4. Rigorous separation of training and test datasets.
- 5. Use of appropriate reference standards.
- 6. Choice of model aligned with the available data and the intended use.
- 7. Evaluation of the performance of human-AI interaction, not just of the model alone.
- 8. Clinically relevant tests carried out under realistic conditions.
- 9. Clear communication to users (professionals or patients).
- 10. Post-deployment monitoring and management of re-training risks.
What stands out across the board:
- The importance of the intended purpose at every stage,
- The role of the human factor (interactions, interface, foreseeable errors),
- The need to anticipate risks linked to model drift,
- The need to document, trace and explain each technical choice.
Case study 1 - Software for detecting dermatological lesions
A manufacturer develops a mobile application that automatically identifies suspicious lesions based on photos taken by the patient. The model is trained on thousands of images annotated by dermatologists.
Key principles to be applied:
- Principle 3: the training base must reflect the diversity of phototypes, ages, genders, geographies... to avoid bias.
- Principle 4: the images used for training must never be included in the test dataset.
- Principle 6: the model must be robust to variations in image quality (blur, brightness, etc.) typical of consumer use.
- Principle 9: the user must be clearly informed of the model's limitations (e.g. "does not replace medical advice").
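Principle 4's separation of training and test data can be checked programmatically before each evaluation. A minimal sketch, assuming images are compared by content hash so that renamed duplicates are still caught (the function names and in-memory data are illustrative, not from the guide):

```python
import hashlib

def content_digest(data: bytes) -> str:
    """SHA-256 of the raw image bytes, so duplicates are caught even if renamed."""
    return hashlib.sha256(data).hexdigest()

def find_leakage(train_hashes: set[str], test_hashes: set[str]) -> set[str]:
    """Return digests present in both splits; an empty set means a clean split."""
    return train_hashes & test_hashes

# Illustrative usage with in-memory "images" instead of real files:
train = {content_digest(b) for b in (b"img_a", b"img_b", b"img_c")}
test = {content_digest(b) for b in (b"img_c", b"img_d")}  # img_c leaked into test
leaked = find_leakage(train, test)
print(f"{len(leaked)} leaked item(s)")
```

A non-empty result should block the evaluation run, and the check itself is the kind of documented, traceable control an auditor would expect to see.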
Risks identified:
- False negatives on certain underrepresented skin tones,
- User overconfidence in the application ("magic AI effect").
Control measures:
- Creation of sub-populations for performance analysis,
- Integration of an educational module in the user interface.
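The first control measure, per-subpopulation performance analysis, can be sketched as follows. This is a minimal illustration, not the manufacturer's actual method: the phototype grouping, field layout and sample data are all assumptions made for the example.

```python
from collections import defaultdict

def sensitivity_by_subgroup(records):
    """Per-subgroup sensitivity (true-positive rate) from
    (subgroup, ground_truth, prediction) triples."""
    tp = defaultdict(int)  # correctly flagged suspicious lesions
    fn = defaultdict(int)  # suspicious lesions the model missed
    for group, truth, pred in records:
        if truth:  # only lesions that are actually suspicious count here
            if pred:
                tp[group] += 1
            else:
                fn[group] += 1
    return {g: tp[g] / (tp[g] + fn[g]) for g in set(tp) | set(fn)}

# Illustrative data: (phototype group, ground truth, model prediction)
data = [
    ("I-II", True, True), ("I-II", True, True), ("I-II", True, False),
    ("V-VI", True, True), ("V-VI", True, False), ("V-VI", True, False),
]
print(sensitivity_by_subgroup(data))
```

A large gap between subgroups quantifies exactly the false-negative risk identified above, and gives the manufacturer a concrete metric to track across dataset revisions.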
Case study 2 - An AI solution embedded in a connected device
Another context: a start-up designs a sensor embedded in an infusion device. It analyzes physiological signals in real time to automatically adapt the flow rate.
Challenges faced:
- Principle 7: the algorithm's performance depends closely on the responsiveness of the human user, particularly in the event of an alert.
- Principle 10: after launch, the model must be monitored. A change in patient type (e.g., pediatric population) can lead to drift.
What the manufacturer has put in place:
- Simulation scenarios involving the care team to test human-AI interaction.
- A clear model versioning policy, with regulatory validation before each significant update.
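The drift scenario in Principle 10 (e.g., a shift toward a pediatric population) can be caught with a simple statistical monitor on incoming signals. A minimal sketch under illustrative assumptions: the heart-rate values, window sizes and the 3-standard-error threshold are examples, not recommendations from the guide.

```python
import statistics

def drift_flag(baseline, recent, z_threshold=3.0):
    """Flag drift when the recent window's mean deviates from the baseline mean
    by more than z_threshold standard errors of the baseline."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    se = sigma / len(baseline) ** 0.5  # standard error of the baseline mean
    z = abs(statistics.mean(recent) - mu) / se
    return z > z_threshold

adult_hr = [72, 75, 70, 74, 73, 71, 76, 74, 72, 73]      # baseline population
pediatric_hr = [110, 115, 108, 112, 109, 114, 111, 113]  # shifted population
print(drift_flag(adult_hr, pediatric_hr))
```

A triggered flag would not re-train the model automatically; consistent with the versioning policy above, it would open a review before any regulatory-validated update.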
Examples of GMLP risks vs. control measures
| Identified risk | GMLP principle | Example of control implemented |
|---|---|---|
| Non-representative dataset | #3 | Multicentric inclusion, subgroup follow-up |
| Overfitting after re-training | #10 | External cross-validation, freezing of the validated model |
| User interpretation error | #9 | Simplified interface, integrated FAQ, warning message |
| Wrong model selection | #6 | Documented justification of the algorithmic choice |
Mini FAQ - What manufacturers often ask themselves
Is this IMDRF guide mandatory for CE marking or FDA clearance?
No, it is not a binding text. But it is highly recommended by the authorities (including the FDA), and is aligned with expected auditing practices.
Do I have to comply with these principles to obtain market approval?
Not as such, but incorporating them into your QMS or technical documentation considerably enhances your credibility.
What's the difference with ISO 13485 requirements?
ISO 13485 provides a general framework for quality management. N88 specifically targets the challenges of AI and machine learning.
What if I use a generative model (LLM type)?
This is one of the issues raised. The guide highlights the difficulty of validating the performance of models derived from a "foundational model" not controlled by the manufacturer.
Does this guide replace N81?
No, it complements it. N81 helps to characterize software as a device, while N88 focuses on good AI development practices.
Anticipating, structuring and documenting: the keys to mastering AI
The IMDRF charts a course. Its guide is not yet a standard, but it will most likely shape tomorrow's regulatory expectations.
If you're developing a medical device incorporating AI, every algorithmic decision, every dataset, every user interaction can have a clinical, ethical and regulatory impact. The N88 guide gives you a framework to secure your choices.
And if you don't know where to start? Then contact us. CSDmed will work with you, from design to deployment, to structure your AI approach, secure your regulatory strategy, and transform your models into market-ready devices.
Related resources
- ISO 14971 and start-ups: how to get started on risk analysis without getting lost
- FDA vs MDR: 5 differences that matter for a medical device manufacturer
- ISO 13485 for start-ups: 3 realistic approaches
- Characterizing the risks of medical software: what to learn from IMDRF guide N81 (2025)?