Analysis and Enhancement of a Medical E-Learning Database

Oct 2, 2023 · Maëlle Moranges · 3 min read

Collaborators:

  • This research was conducted in partnership with UNESS (Université Numérique en Santé et Sport), which developed and maintains a French national platform for medical e-learning.
  • Olivier Palombi – Physician and Vice-President of UNESS
  • Noha Ibrahim – Researcher in Computer Science at LIG
  • Sihem Amer-Yahia – Researcher in Computer Science at LIG
  • 3 interns: Cyprien Michel-Deletie, Ayman Lmimouni, and Guilherme Piffer Christo

Scientific and Technological Objectives

UNESS provides medical students with access to educational materials and assessments to prepare for national exams such as ECN (Épreuves Classantes Nationales) and EDN (Épreuves Dématérialisées Nationales).
This project aimed to enhance the UNESS database using AI to improve medical education, with two main research areas:

  1. Large Language Models (LLMs) for E-Learning: Leveraging LLMs to automate the generation, classification, and refinement of multiple-choice questions (MCQs), reducing instructor workload and improving content relevance.
  2. Learning Pattern Analysis: Using data science techniques to analyze how student learning behaviors vary across universities, gender, and performance levels.

1. LLMs for E-Learning

Role: Supervisor of 3 interns

Challenge: No standardized methods exist for assessing LLM-generated MCQs, so validation techniques had to be developed to ensure question quality and pedagogical relevance.

i. Automatic MCQ Generation from Course Materials

🧑‍🎓 Intern: Cyprien Michel-Deletie (Fourth-year student, ENS de Lyon – equivalent to M2)

Project: “Large Language Models (LLM) for Automated Question Generation and Evaluation in Medical Education”

Objective: Develop an AI-based pipeline to generate isolated MCQs from LISA educational sheets.

Realization:

  • Conducted a literature review on LLMs for question generation and evaluation.
  • Developed a question generation pipeline integrating a self-refinement strategy (a sketch of such a loop follows this list).
  • Developed evaluation criteria that closely reflect human judgment.
  • Compared LLaMA 3, GPT-3.5 Turbo, and GPT-4o, assessing both question quality and self-evaluation accuracy.
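
The generate/critique/refine loop below is a minimal sketch of such a self-refinement strategy, assuming the OpenAI chat completions API; the prompts, the `ask_llm` helper, and the number of refinement rounds are illustrative placeholders rather than the exact pipeline built during the internship.

```python
# Minimal sketch of a generate / self-critique / refine loop for MCQ creation.
# Prompts, model choice and refinement depth are illustrative, not the intern's exact setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask_llm(prompt: str, model: str = "gpt-4o") -> str:
    """Single-turn helper around the chat completions endpoint."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def generate_mcq(course_sheet: str, rounds: int = 2) -> str:
    """Draft one MCQ from a course sheet, then let the model critique and refine it."""
    mcq = ask_llm(
        "Write one multiple-choice question (4 options, 1 correct, with answer key) "
        f"based strictly on this course sheet:\n\n{course_sheet}"
    )
    for _ in range(rounds):
        critique = ask_llm(
            "Critique this MCQ for factual accuracy, clarity and pedagogical value:\n\n" + mcq
        )
        mcq = ask_llm(
            f"Rewrite the MCQ to address this critique.\n\nMCQ:\n{mcq}\n\nCritique:\n{critique}"
        )
    return mcq
```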

ii. MCQ Classification and Alignment with Course Content

🧑‍🎓 Intern: Ayman Lmimouni (Second-year student, ENSIMAG, Grenoble INP – equivalent to M1)

Project: “Classification of UNESS Questions Using LLMs”

Objective: Automatically classify e-learning questions by linking them to the correct LISA educational sheets (among 4,800) to enhance medical exam preparation.

Realization:

  • Developed an automatic classification pipeline while handling token limitations in LLMs.
  • Created an ensemble technique that aggregates several predictions for the same instance to improve accuracy (a voting sketch follows this list).
  • Evaluated and compared LLaMA 3-8B, LLaMA 3.1-70B, Phi-3-mini, and TinyLlama; LLaMA 3-8B reached 93% accuracy.
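
As a rough illustration of the ensemble idea, the sketch below queries a classifier several times on the same question and keeps the majority label, with a crude character-based cut standing in for token-limit handling; the `classify_once` stand-in (a word-overlap heuristic here) and the truncation limit are assumptions, since the real pipeline prompts LLaMA 3 / Phi-3 models with sampling enabled so that repeated calls can differ.

```python
# Illustrative majority-vote ensemble over repeated classifications of the same question,
# with a crude character-based truncation standing in for token-limit handling.
from collections import Counter

MAX_CHARS = 8000  # rough stand-in for the model's context window

def classify_once(question: str, candidate_sheets: list[str]) -> str:
    """Stand-in for one LLM call: pick the sheet title with the largest word overlap.
    In the real setting this is a sampled LLM prediction, so repeated calls can differ."""
    q_words = set(question.lower().split())
    return max(candidate_sheets, key=lambda s: len(q_words & set(s.lower().split())))

def classify_with_ensemble(question: str, candidate_sheets: list[str], n_votes: int = 5) -> str:
    """Query the classifier several times on the same instance and keep the majority label."""
    question = question[:MAX_CHARS]  # keep the prompt within the context window
    votes = [classify_once(question, candidate_sheets) for _ in range(n_votes)]
    return Counter(votes).most_common(1)[0][0]

print(classify_with_ensemble(
    "Which antibiotic is first-line for community-acquired pneumonia?",
    ["Community-acquired pneumonia", "Chronic kidney disease", "Type 2 diabetes management"],
))
```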

iii. Generation of Critical Article Review Questions

🧑‍🎓 Intern: Guilherme Piffer Christo (Second-year student, Ense3, Grenoble INP – equivalent to M1)

Project: “Generation of Critical Article Reading Questions Using LLMs”

Objective: Automatically generate Critical Article Reading (LCA) questions from scientific publications to aid medical students in exam preparation.

Realization:

  • Analyzed existing LCA structures to understand their design principles.
  • Implemented a question generation pipeline, adapting Cyprien’s approach by adding document summarization, key-point extraction, and multi-step question synthesis to improve the quality and relevance of the generated questions.
  • Reduced redundancy among generated questions by filtering them with a cosine similarity measure (see the sketch after this list).
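
The snippet below sketches such a redundancy filter, assuming TF-IDF vectors and a similarity threshold of 0.8; the actual internship may have used different embeddings and thresholds.

```python
# Illustrative redundancy filter: drop a generated question if it is too similar
# (cosine similarity above a threshold) to a question already kept.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def filter_redundant(questions: list[str], threshold: float = 0.8) -> list[str]:
    """Greedily keep a question only if it stays below the threshold against every kept one."""
    if not questions:
        return []
    vectors = TfidfVectorizer().fit_transform(questions)
    sims = cosine_similarity(vectors)
    kept: list[int] = []
    for i in range(len(questions)):
        if all(sims[i, j] < threshold for j in kept):
            kept.append(i)
    return [questions[i] for i in kept]

# The second phrasing is nearly identical to the first and gets filtered out.
print(filter_redundant([
    "Which bias threatens the internal validity of this randomized trial?",
    "Which bias threatens the internal validity of the randomized trial?",
    "What was the primary endpoint of the study?",
]))
```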

2. Data Analysis on Gender, University and Learning Behaviors

Role: Postdoctoral researcher in Computer Science at LIG (Laboratoire Informatique de Grenoble)

Objective: To explore disparities in online medical learning through data-driven analysis of UNESS platform usage.

Challenges:

  • Managing the complexity of the data while keeping models representative across diverse student cohorts.
  • Accounting for temporal and geographical differences that may influence gender-related learning behaviors.
  • Addressing individual variability within genders to avoid overgeneralized conclusions.

Methodology: Subgroup Discovery and Shapley Value analysis to identify learning behavior patterns.
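
As a toy illustration of the Shapley side of this methodology, the sketch below trains a model on hypothetical platform-usage features and computes per-feature SHAP contributions; the column names, the synthetic target, and the random-forest model are assumptions, and the subgroup discovery step is not shown.

```python
# Toy illustration of Shapley-value analysis on hypothetical platform-usage features.
# Column names and the synthetic "exam score" target are stand-ins for the UNESS data.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "mcq_attempts": rng.poisson(30, n),
    "hours_online": rng.gamma(2.0, 5.0, n),
    "sessions_per_week": rng.integers(1, 10, n),
})
# Synthetic score loosely tied to engagement, for demonstration only.
score = data["mcq_attempts"] + 2 * data["hours_online"] + rng.normal(0, 10, n)

model = RandomForestRegressor(random_state=0).fit(data, score)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data)  # one contribution per student and per feature
print(dict(zip(data.columns, np.abs(shap_values).mean(axis=0))))  # mean |Shapley| per feature
```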

Results: Gender disparities were observed in engagement and performance, confirming prior studies suggesting MCQ formats tend to favor male students.

These insights contribute to rethinking medical assessment strategies to promote inclusive education.

Dissemination

Journal Publications

📄 Moranges, M., Ibrahim, N., Amer-Yahia, S., & Palombi, O. (2025). A Study of Gender-Specific Learning Patterns in Online Medical Education. Under review