Unsupervised alignment for bio-medical data

Master 1 research project
From November 2021 to July 2022
Supervised by Félix Gaschi, Parisa Rastin and Yannick Toussaint (École des Mines de Nancy & LORIA).

Multilingual information retrieval in a bio-medical context, leading to the writing of a survey on the properties of multilingual language models. In this project I became familiar with supervised and unsupervised word alignment methods in a multilingual context, and with Transformer-based multilingual models such as mBERT and XLM-R. The aim of this work was to describe and discuss both the language-agnostic and the language-specific properties of multilingual word representations.