Targeted discovery of pharmacological targets using language models

Mar 1, 2025
Maëlle Moranges

Role: Supervisor of William Peoc’h’s research internship, co-supervised with Hugues Berry

Collaborators:

  • William Peoc’h — Research Intern
  • Hugues Berry — Co‑supervisor
  • Jan-Michael Rye — Research Engineer
  • Theranexus members

Scientific and Technological Objectives

Theranexus develops technology that modulates the expression of specific genes by targeting their RNA with antisense oligonucleotides. To accelerate drug discovery, this project aimed to harness language models to explore biomedical literature and identify candidate gene targets associated with beneficial or deleterious effects.

Specifically, the project sought to:

  • Use language models to predict relevant gene targets from the titles and abstracts of PubMed articles.
  • Distinguish genes with beneficial, deleterious, or irrelevant (“off‑topic”) associations for therapeutic development.
  • Facilitate the discovery of novel targets for neurological diseases to support pharmacological research.
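
The three-way labeling task described above can be pictured as a small data structure. This is only an illustrative sketch: the class names, field names, and the plain-text input layout are assumptions made here, not the project's actual data schema, and the gene and PMID values are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class GeneEffect(Enum):
    """The three categories named in the objectives (integer ids are arbitrary)."""
    BENEFICIAL = 0
    DELETERIOUS = 1
    OFF_TOPIC = 2


@dataclass(frozen=True)
class LabeledExample:
    """One (gene, article) pair with its label. Hypothetical schema."""
    pmid: str
    gene: str
    title: str
    abstract: str
    label: GeneEffect

    def model_input(self) -> str:
        # Assumed layout: the gene of interest is stated first so that a
        # text classifier knows which gene the label refers to.
        return f"Gene: {self.gene}\nTitle: {self.title}\nAbstract: {self.abstract}"


# Placeholder values, not real PubMed data.
example = LabeledExample(
    pmid="PLACEHOLDER_PMID",
    gene="GENE_A",
    title="Placeholder title",
    abstract="Placeholder abstract text.",
    label=GeneEffect.OFF_TOPIC,
)
text = example.model_input()
```

Framing each example as a (gene, article) pair rather than an article alone lets a single abstract carry different labels for different genes it mentions.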

Challenges

  • Biomedical text complexity: PubMed abstracts contain domain‑specific terminology and abbreviations that require robust language understanding.
  • Label ambiguity: Articles may not clearly categorize a gene’s effect; models must infer subtle contextual signals.
  • Class imbalance: Beneficial and deleterious cases may be unevenly represented, complicating classifier training.
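
One common mitigation for class imbalance is to weight each class inversely to its frequency when computing the training loss. The sketch below is not necessarily what this project used; it applies the "balanced" heuristic (n_samples / (n_classes × class_count)), and the label counts are invented purely for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Invented label distribution, for illustration only.
labels = ["beneficial"] * 2 + ["deleterious"] * 6 + ["off-topic"] * 12
weights = inverse_frequency_weights(labels)
```

The resulting weights could then be passed to a weighted loss function so that errors on the rarer class count for more during training.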

Contributions

  • Developed fine‑tuned classifiers based on BERT architectures and large language models (LLMs) to differentiate beneficial vs. deleterious gene associations from PubMed texts.
  • Built named‑entity recognition (NER) pipelines to extract gene mentions and contextual information from biomedical literature.
  • Scraped and curated biomedical databases to assemble a labeled dataset of PubMed articles linked to gene outcomes.
  • Created Gene Explorer PubMed, an interactive web application enabling:
    • Interactive visualization of gene–article associations.
    • Gene‑centric exploration with direct links to PubMed records.
    • Dynamic charts showing beneficial vs. deleterious ratios.
    • Custom filtering by disease type and article count.
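
The beneficial vs. deleterious ratios and the filters listed above could be computed from per-article predictions along the following lines. This is a minimal sketch under stated assumptions: the record fields (`gene`, `disease`, `label`), the function name, and the sample records are hypothetical and do not describe the application's actual implementation.

```python
from collections import defaultdict


def summarize_genes(records, disease=None, min_articles=1):
    """Aggregate article-level labels into per-gene counts and ratios.

    records: iterable of dicts with keys "gene", "disease", "label".
    disease: if given, keep only records for that disease.
    min_articles: drop genes with fewer on-topic articles than this.
    """
    counts = defaultdict(lambda: {"beneficial": 0, "deleterious": 0})
    for rec in records:
        if disease is not None and rec["disease"] != disease:
            continue
        # Off-topic articles do not contribute to the ratio.
        if rec["label"] in ("beneficial", "deleterious"):
            counts[rec["gene"]][rec["label"]] += 1

    summary = {}
    for gene, c in counts.items():
        total = c["beneficial"] + c["deleterious"]
        if total >= min_articles:
            summary[gene] = {
                "articles": total,
                "beneficial_ratio": c["beneficial"] / total,
            }
    return summary


# Placeholder records, not real data.
records = [
    {"gene": "GENE_A", "disease": "epilepsy", "label": "beneficial"},
    {"gene": "GENE_A", "disease": "epilepsy", "label": "beneficial"},
    {"gene": "GENE_A", "disease": "epilepsy", "label": "deleterious"},
    {"gene": "GENE_A", "disease": "epilepsy", "label": "off-topic"},
    {"gene": "GENE_B", "disease": "epilepsy", "label": "deleterious"},
    {"gene": "GENE_B", "disease": "parkinson", "label": "beneficial"},
]
summary = summarize_genes(records, disease="epilepsy", min_articles=2)
```

With these placeholder records, only GENE_A passes the article-count filter for the chosen disease, since GENE_B has a single on-topic article there.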

Dissemination and Publications

Software and Tools

🛠 Gene Explorer PubMed — Interactive web application for browsing PubMed genetic data and exploring candidate gene targets.