The Silent Cipher: Navigating the Hidden Complexities of Writing Low-Resourced Languages in NLP | |
⏱ 2 July, 2024 📍 Hamburg University | |
📝 English | |
This talk explores the hidden challenges of processing low-resourced languages in NLP, focusing on languages that lack standardized writing systems and are primarily spoken in bilingual communities. It highlights the complexities of script normalization, especially when non-conventional scripts are used to write these languages. I discuss approaches to these challenges and how addressing them can improve various NLP tasks. | |
Interview by Voice of America | |
⏱ 15 June, 2023 📍 Voice of America (Kurdish) | |
📝 Sorani Kurdish | |
In an interview with Voice of America in Kurdish, we address the latest progress in AI and its impact on language. A few interesting issues were raised during the interview about the future of less-resourced languages like Kurdish and the ethical concerns of AI. | |
Video | |
Technology for Minoritized Language Communities: An Overview of Language and Speech Technology for Kurdish | |
⏱ 25 May, 2023 📍 University of Toronto, Canada (online) | |
📝 English | |
I had the pleasure of being an invited lecturer in Prof. Sheyholislami's course "Kurdish Studies: a critical introduction" at the University of Toronto. I talked about language technology for minoritized language communities and some of its challenges, with a special focus on Kurdish. | |
Slides | Slides (handout) | |
The Pandora’s Box of Low-Resource Language Technology: Script and Orthographic Normalization | |
⏱ 12 May, 2023 📍 ContribuLing - Inalco, Paris, France | |
📝 French | |
I was an invited speaker at the ContribuLing conference, which focused on language technology for minority languages. I talked about the importance of normalizing unconventional scripts and how widespread and unsolved this problem is in multilingual communities. | |
Slides | Slides (handout) | Video | |
Aim for the Stars! The Importance of Semantic Technologies in Electronic Lexicography and Language Technology | |
⏱ 21 April, 2023 📍 University of Cologne, Cologne, Germany | |
📝 English | |
I was an invited speaker at a workshop at the University of Cologne, where I talked about practices in electronic lexicography that make data more accessible and interoperable. I broadly discussed why adding a zest of linked data to linguistic data can go a long way! | |
Slides (handout) | Slides | |
Language Technology and Kurdish Languages: What has been done? What should be done? | |
⏱ 6 July, 2022 📍 Department of Information Technology - Erbil, Kurdistan | |
📝 Kurdish (Sorani) | |
I gave a talk on the current status of language technology for the Kurdish language. I focused on what has been done in the field and what should be done. The talk was held both in person and online. | |
Slides (handout) | Slides | Video | |
Ontolex-Lemon and Conversion of TLFi | |
⏱ 13 December, 2021 📍 ATILF (CNRS - Université de Lorraine) - Nancy, France | |
📝 French | |
The Trésor de la Langue Française is one of the most important lexicographic resources for French. It contains 100,000 entries, 270,000 definitions and 430,000 examples from the 14th to the 20th century. The computerized version of this dictionary, called the TLF informatisé (TLFi), is available in XML with an associated DTD. Current linked data standards make it possible to increase the interoperability and accessibility of language data. To make the TLFi easier to use, I therefore converted it to the Ontolex-Lemon model. As a result, an Ontolex-Lemon version of the TLFi could allow it to be better integrated into NLP applications. I will also present the SPARQL endpoint https://tlfi-sparql.atilf.fr, which allows SPARQL queries to be run over the TLFi data. | |
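As a rough illustration of what querying such an endpoint might look like, the sketch below builds a SPARQL query for lexical entries with a given written form and encodes it as a GET request URL, as the SPARQL 1.1 Protocol allows. Only the endpoint URL comes from the abstract. The helper names (`build_query`, `build_request_url`), the exact query path, the `@fr` language tag, and the assumption that the converted data uses `ontolex:canonicalForm` and `ontolex:writtenRep` in this way are all hypothetical and would need to be checked against the actual dataset.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the abstract; the exact query path is an assumption.
ENDPOINT = "https://tlfi-sparql.atilf.fr"


def build_query(lemma: str, limit: int = 10) -> str:
    """Build a SPARQL query for entries whose canonical form matches `lemma`.

    Assumes (not confirmed by the source) that the data follows the usual
    Ontolex-Lemon pattern and tags written forms with the language tag @fr.
    """
    # Escape backslashes and double quotes so the lemma is a valid SPARQL literal.
    escaped = lemma.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>\n"
        "SELECT ?entry WHERE {\n"
        "  ?entry a ontolex:LexicalEntry ;\n"
        "         ontolex:canonicalForm ?form .\n"
        f'  ?form ontolex:writtenRep "{escaped}"@fr .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(lemma: str, limit: int = 10) -> str:
    """Encode the query as a GET request URL using the `query` parameter."""
    return ENDPOINT + "?" + urlencode({"query": build_query(lemma, limit)})


if __name__ == "__main__":
    print(build_query("chat", limit=5))
    print(build_request_url("chat", limit=5))
```

The sketch stops at building the URL so that it runs without network access; sending the request and parsing the results would depend on how the endpoint is actually deployed.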
Slides | |
Automatic Alignment of Lexicographical Data | |
⏱ 13 July, 2021 📍 LORIA - Nancy, France | |
📝 English | |
Dictionaries are fundamental resources for people to learn and document languages as well as for computers to process natural languages. A dictionary provides a fine-grained structure and description of the vocabulary of a language. In this talk, I address the problem of data linking in the context of electronic lexicography and present the results of some of my experiments. | |
Slides | |
LaTeX for Linguists (workshop) | |
⏱ 14 September, 2021 📍 ATILF (CNRS - Université de Lorraine) - Nancy, France | |
📝 French | |
A workshop aimed at linguists was held for interested users. It covered the following topics: basic formatting; inserting tables, figures and formulas; cross-references and glossaries; bibliography management with BibTeX; importing packages for tasks such as dependency syntax trees, numbered examples and grammatical glosses; and an introduction to TikZ for creating graphical elements. | |
Recurrent Neural Networks for Spelling and Grammatical Error Correction | |
⏱ 21 December, 2017 📍 LIACS, Leiden University, Netherlands | |
📝 English | |
Automatic spelling and grammatical error correction systems are among the most widely used tools in natural language applications. In this talk, error correction is framed as a type of monolingual machine translation, where the source sentence is potentially erroneous and the target sentence is the corrected form of the input. | |
Slides |