Talks

	Language beyond the Standard: NLP for Low-Resource Varieties
	⏱ January, 2026 📍 ZurichAI, ETH AI Center
	📝 English
	This talk focuses on NLP challenges beyond standardized language varieties, highlighting issues in modeling low-resource and non-standard forms. It presents recent advances and discusses future directions for more inclusive and robust language technologies.
	Slides
	Lexical Borrowing in Modern NLP: How Do Models Handle Loanwords?
	⏱ December, 2025 📍 Archimedes, Athens, Greece
	📝 English (partially ελληνικά!)
	This talk explores how modern NLP models process lexical borrowing and loanwords across languages. It discusses challenges in multilingual settings and presents insights from recent work on evaluating and improving model behavior in handling borrowed vocabulary.
	Slides
	The Silent Cipher: Navigating the Hidden Complexities of Writing Low-Resourced Languages in NLP
	⏱ 2 July, 2024 📍 Hamburg University
	📝 English
	This talk explores the hidden challenges in processing low-resourced languages in NLP, focusing on languages that lack standardized writing systems and are primarily spoken in bilingual communities. The presentation highlights the complexities of script normalization, especially when non-conventional scripts are used to represent these languages. I discuss innovative approaches to these challenges and how addressing them can enhance various NLP tasks.

	Interview by Voice of America
	⏱ 15 June, 2023 📍 Voice of America (Kurdish)
	📝 Sorani Kurdish
	In an interview with Voice of America in Kurdish, we discussed the latest progress in AI and its impact on language. Several interesting issues were raised during the interview about the future of less-resourced languages like Kurdish and the ethical concerns of AI.
	Video
	Technology for Minoritized Language Communities: An Overview of Language and Speech Technology for Kurdish
	⏱ 25 May, 2023 📍 University of Toronto, Canada (online)
	📝 English
	I had the pleasure of being an invited lecturer in Prof. Sheyholislami's course "Kurdish Studies: a critical introduction" at the University of Toronto. I talked about language technology for minoritized language communities and some of its challenges, with a special focus on Kurdish.
	Slides | Slides (handout)
	The Pandora’s Box of Low-Resource Language Technology: Script and Orthographic Normalization
	⏱ 12 May, 2023 📍 ContribuLing - Inalco, Paris, France
	📝 French
	I was an invited speaker at the ContribuLing conference, which focused on language technology for minority languages. I talked about the importance of normalizing unconventional scripts and how widespread and unsolved this phenomenon is in multilingual communities.
	Slides | Slides (handout) | Video
	Aim for the Stars! The Importance of Semantic Technologies in Electronic Lexicography and Language Technology
	⏱ 21 April, 2023 📍 University of Cologne, Cologne, Germany
	📝 English
	I was an invited speaker at a workshop at the University of Cologne, where I talked about practices in electronic lexicography that make data more accessible and interoperable. I broadly discussed why adding a zest of linked data to linguistic data can go a long way!
	Slides (handout) | Slides
	Language Technology and Kurdish Languages: What has been done? What should be done?
	⏱ 6 July, 2022 📍 Department of Information Technology - Erbil, Kurdistan
	📝 Kurdish (Sorani)
	I gave a talk on the current status of language technology for the Kurdish language, focusing on what has been done in the field and what should be done. The talk was held both online and in person.
	Slides (handout) | Slides | Video
	Ontolex-Lemon and Conversion of TLFi
	⏱ 13 December, 2021 📍 ATILF (CNRS - Université de Lorraine) - Nancy, France
	📝 French
	The Trésor de la Langue Française is one of the most important lexicographic resources for French. It contains 100,000 entries, 270,000 definitions, and 430,000 examples from the 14th to the 20th century. The computerized version of this dictionary, called the TLF informatisé (TLFi), is available in XML with an associated DTD. Current linked data standards make it possible to increase the interoperability and accessibility of language data. To make the TLFi easier to use, I therefore converted it to the Ontolex-Lemon model. As a result, an Ontolex-Lemon version of the TLFi could allow better integration into NLP applications. I will also present the SPARQL endpoint https://tlfi-sparql.atilf.fr, which allows SPARQL queries to be run on the TLFi data.
	Slides
	Automatic Alignment of Lexicographical Data
	⏱ 13 July, 2021 📍 LORIA - Nancy, France
	📝 English
	Dictionaries are fundamental resources for people to learn and document languages as well as for computers to process natural languages. A dictionary provides a fine-grained structure and description of the vocabulary of a language. In this talk, I address the problem of data linking in the context of electronic lexicography and present the results of some of my experiments.
	Slides
	LaTeX pour les linguistes (workshop)
	⏱ 14 September, 2021 📍 ATILF (CNRS - Université de Lorraine) - Nancy, France
	📝 French
	For interested users, a workshop aimed at linguists was held. The workshop covered the following topics: basic formatting; inserting tables, figures, and formulas; cross-references and glossaries; bibliography management with BibTeX; importing packages for tasks such as dependency syntax trees, numbered examples, and grammatical glosses; and an introduction to TikZ for creating graphic elements.
	Recurrent Neural Networks for Spelling and Grammatical Error Correction
	⏱ 21 December, 2017 📍 LIACS, Leiden University, Netherlands
	📝 English
	Automatic spelling and grammatical error correction systems are among the most widely used tools in natural language applications. In this talk, the task of error correction is described as a type of monolingual machine translation, where the source sentence is potentially erroneous and the target sentence should be the corrected form of the input.
	Slides

Sina Ahmadi

Researcher
