Publications

Publications are listed in reverse chronological order in two categories:

  1. Main
  2. Kurdish language processing

* indicates equal contribution.


Main

2024
  1. Language and Speech Technology for Central Kurdish Varieties
    Sina Ahmadi, Daban Q. Jaff, Md Mahfuz Ibn Alam, Antonios Anastasopoulos
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
    [Paper] [Slides] [Resource] [bib]

  2. CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation
    Md Mahfuz Ibn Alam, Sina Ahmadi and Antonios Anastasopoulos
    17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)
    [Paper] [Poster] [Resource] [bib]

  3. A Morphologically-Aware Dictionary-based Data Augmentation Technique for Machine Translation of Under-Represented Languages
    Under review
    [Preprint] [bib]

2023
  1. Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities
    Sina Ahmadi and Antonios Anastasopoulos
    The 61st Annual Meeting of the Association for Computational Linguistics (ACL)
    [Paper] [Slides] [Presentation] [Code] [⚙️ Demo] [bib]

  2. When Ontolex Meets Wikibase: Remodeling Use Cases
    David Lindemann, Sina Ahmadi, Fahad Khan and Francesco Mambrini
    Under review at the 4th Conference on Language, Data and Knowledge (LDK 2023)
    [Paper] [Poster] [Service] [bib]

  3. PALI: A Language Identification Benchmark for Perso-Arabic Scripts
    Sina Ahmadi, Milind Agarwal and Antonios Anastasopoulos
    Proceedings of the 10th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
    [Paper] [Slides] [Presentation] [Poster] [Code] [bib]

  4. Approaches to Corpus Creation for Low-Resource Language Technology: the Case of Southern Kurdish and Laki
    Sina Ahmadi, Zahra Azin, Sara Belelli and Antonios Anastasopoulos
    Proceedings of the Second Workshop on NLP Applications to Field Linguistics
    [Paper] [Slides] [Presentation] [Poster] [Code] [bib]

2022
  1. Monolingual Alignment of Word Senses and Definitions in Lexicographical Resources
    (Thesis) Sina Ahmadi
    National University of Ireland Galway
    [Thesis] [bib]

  2. Cross-Lingual Link Discovery for Under-Resourced Languages
    Michael Rosner, Sina Ahmadi, Elena-Simona Apostol, Julia Bosque-Gil, Christian Chiarcos, Milan Dojchinovski, Katerina Gkirtzou, Jorge Gracia, Dagmar Gromann, Chaya Liebeskind, Giedrė Valūnaitė Oleškevičienė, Gilles Sérasset and Ciprian-Octavian Truică
    The 13th International Conference on Language Resources and Evaluation (LREC 2022)
    [Paper] [bib]

  3. CoFiF Plus: A French Financial Narrative Summarisation Corpus
    Nadhem Zmandar, Tobias Daudert, Sina Ahmadi, Mahmoud El-Haj and Paul Rayson
    The 13th International Conference on Language Resources and Evaluation (LREC 2022)
    [Paper] [Resource] [bib]

  4. Towards an Integrative Approach for Making Sense Distinctions
    John P. McCrae, Theodorus Fransen, Sina Ahmadi, Paul Buitelaar and Koustava Goswami
    Frontiers in Artificial Intelligence
    [Paper] [bib]

2021
  1. Convertir le Trésor de la Langue Française en Ontolex-Lemon : un zeste de données liées
    Sina Ahmadi, Mathieu Constant, Karën Fort, Bruno Guillaume and John P. McCrae
    (accepted) LIFT 2021 : Journées scientifiques “Linguistique informatique, formelle & de terrain”
    [Paper] [Resource] [Poster] [Code] [bib]

  2. NUIG at TIAD 2021: Cross-lingual Word Embeddings for Translation Inference
    Sina Ahmadi, Atul Kr. Ojha, Shubhanker Banerjee and John P. McCrae
    In Proceedings of the Translation Inference Across Dictionaries Workshop (TIAD 2021)
    [Paper] [Slides] [bib]

  3. The ELEXIS system for monolingual sense linking in dictionaries
    John P. McCrae, Sina Ahmadi, Seung-bin Yim and Lenka Bajčetić
    In Proceedings of the Seventh Biennial Conference on Electronic Lexicography (eLex 2021)
    [Paper] [Demo] [bib]

  4. An Evaluation of Definition Paradigms in Lexicography for Word Sense Alignment
    Sina Ahmadi, John P. McCrae
    In Proceedings of the Seventh Biennial Conference on Electronic Lexicography (eLex 2021)
    [Paper] [bib]

  5. Word Sense Alignment as a Classification Problem
    Sina Ahmadi and John P. McCrae
    The 11th International Global Wordnet Conference (GWC2021)
    [Paper] [Slides] [Presentation] [bib]

2020
  1. Globalex Workshop on Linked Lexicography
    Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi and Besim Kabashi
    European Language Resources Association (ELRA) - LREC 2020 Workshop Language Resources and Evaluation Conference
    [Paper] [bib]

  2. A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
    Sina Ahmadi, John P. McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S. Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, Thomas Troelsgård, Sussi Olsen, Simon Krek, Veronika Lipp, Tamás Váradi, László Simon, András Győrffy, Carole Tiberius, Tanneke Schoonheim, Yifat Ben Moshe, Maya Rudich, Raya Abu Ahmad, Dorielle Lonke, Kira Kovalenko, Margit Langemets, Jelena Kallas, Oksana Dereza, Theodorus Fransen, David Cillessen, David Lindemann, Mikel Alonso, Ana Salgado, José Luis Sancho, Rafael-J. Ureña-Ruiz, Kiril Simov, Petya Osenova, Zara Kancheva, Ivaylo Radev, Ranka Stanković, Andrej Perdih, and Dejan Gabrovšek
    The 12th International Conference on Language Resources and Evaluation (LREC)
    [Paper] [Resource] [bib]

  3. Towards Automatic Linking of Lexicographic Data: the case of a historical and a modern Danish dictionary
    Sina Ahmadi, Sanni Nimb, Thomas Troelsgård, John P. McCrae and Nicolai H. Sørensen
    The XIX EURALEX International Congress
    [Paper] [Slides] [bib]

  4. Challenges of Word Sense Alignment: Portuguese Language Resources
    Ana Salgado, Sina Ahmadi, Alberto Simões, John McCrae, Rute Costa
    The 7th Workshop on Linked Data in Linguistics: Building tools and infrastructure at the 12th International Conference on Language Resources and Evaluation (LREC)
    [Paper] [Slides] [bib]

  5. Defying Wikidata: Validation of Terminological Relations in the Web of Data
    Patricia Martín-Chozas, Sina Ahmadi and Elena Montiel-Ponsoda
    The 12th International Conference on Language Resources and Evaluation (LREC)
    [Paper] [Code] [bib]

2019
  1. Creating a Multilingual Terminological Resource using Linked Data: the case of archaeological domain in the Italian language
    Speranza Giulia, Carola Carlino and Sina Ahmadi
    The Sixth Italian Conference on Computational Linguistics - CLiC-it 2019
    [Paper] [Poster] [Code] [bib]

  2. The ELEXIS Interface for Interoperable Lexical Resources
    John P. McCrae, Carole Tiberius, Anas Fahad Khan, Ilan Kernerman, Thierry Declerck, Simon Krek, Monica Monachini and Sina Ahmadi
    In Proceedings of the Sixth Biennial Conference on Electronic Lexicography (eLex 2019)
    [Paper] [bib]

  3. CoFiF: A Corpus of Financial Reports in French Language
    Sina Ahmadi* and Tobias Daudert
    In Proceedings of the First Workshop on Financial Technology and Natural Language Processing
    [Paper] [Slides] [Resource] [bib]

  4. NUIG at the FinSBD Task: Sentence Boundary Detection for Noisy Financial PDFs in English and French
    Tobias Daudert and Sina Ahmadi
    In Proceedings of the First Workshop on Financial Technology and Natural Language Processing
    [Paper] [Poster] [bib]

  5. Inferring translation candidates for multilingual dictionary generation with multi-way neural machine translation
    Mihael Arcan, Daniel Torregrosa, Sina Ahmadi and John P. McCrae
    In Proceedings of the Translation Inference Across Dictionaries Workshop (TIAD 2019)
    [Paper] [Slides] [Poster] [bib]

  6. TIAD 2019 Shared Task: Leveraging knowledge graphs with neural machine translation for automatic multilingual dictionary generation
    Mihael Arcan, Daniel Torregrosa, Sina Ahmadi and John P. McCrae
    Shared Task on Translation Inference Across Dictionaries
    [Paper] [Code] [bib]

  7. Lexical sense alignment using weighted bipartite b-matching
    Sina Ahmadi, Mihael Arcan and John P. McCrae
    In Proceedings of the LDK 2019 Workshops
    [Paper] [Poster] [Code] [bib]

2018
  1. On lexicographical networks
    Sina Ahmadi, Mihael Arcan and John McCrae
    In Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence
    [Paper] [Poster] [bib]

  2. Learning Noun Cases Using Sequential Neural Networks
    Sina Ahmadi
    arXiv preprint arXiv:1810.03996
    [Paper] [bib]

2017
  1. Attention-based encoder-decoder networks for spelling and grammatical error correction
    Sina Ahmadi
    arXiv preprint arXiv:1810.00660
    [Paper] [Slides] [Poster] [Code] [bib]

Kurdish language processing

2024
  1. Part-of-Speech Tagging for Northern Kurdish
    Peshmerge Morad, Sina Ahmadi and Lorenzo Gatti
    Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024) @LREC-COLING-2024
    [Paper] [Poster] [Resource] [bib]

2023
  1. Revisiting and Amending Central Kurdish Data on UniMorph 4.0
    Sina Ahmadi and Aso Mahmudi
    The 20th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology at ACL2023
    [Paper] [Slides] [Resource] [bib]

  2. A Corpus-based Study of Endoclitic =îş in Kurdish
    Sina Ahmadi, Antonios Anastasopoulos and Géraldine Walther
    Book of abstracts of the 56th Annual Meeting of the Societas Linguistica Europaea
    [Paper] [Poster] [bib]

  3. Transfer Learning for Low-Resource Sentiment Analysis
    Razhan Hameed, Sina Ahmadi and Fatemeh Daneshfar
    ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
    [Paper] [preprint] [Code] [bib]

2022
  1. Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus
    Sina Ahmadi, Hossein Hassani and Daban Q. Jaff
    ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
    [Paper] [preprint] [Resource] [bib]

2021
  1. Hunspell for Sorani Kurdish Spell Checking and Morphological Analysis
    Sina Ahmadi
    arXiv preprint arXiv:2109.06374
    [Paper] [Code] [bib]

  2. A Formal Description of Sorani Kurdish Morphology
    Sina Ahmadi
    arXiv preprint arXiv:2109.03942
    [Paper] [bib]

  3. Creating an Electronic Lexicon for the Under-resourced Southern Varieties of Kurdish Language
    Zahra Azin and Sina Ahmadi
    In Proceedings of the Seventh Biennial Conference on Electronic Lexicography (eLex 2021)
    [Paper] [Poster] [Resource] [bib]

  4. On the Current State of Kurdish Language Processing
    Sina Ahmadi
    Proceedings of the 5th International Conference on Kurdish Linguistics (ICKL-5)
    [Paper] [Slides] [bib]

2020
  1. KLPT - Kurdish Language Processing Toolkit
    Sina Ahmadi
    Proceedings of the Second Workshop for NLP Open Source Software (NLP-OSS) - EMNLP 2020
    [Paper] [Poster] [Slides] [Presentation] [Code] [bib]

  2. Building a Corpus for the Zaza–Gorani Language Family
    Sina Ahmadi
    Proceedings of the Seventh Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020)
    [Paper] [Poster] [Resource] [bib]

  3. A Tokenization System for the Kurdish Language
    Sina Ahmadi
    Proceedings of the Seventh Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020)
    [Paper] [Poster] [Code] [bib]

  4. Towards Machine Translation for the Kurdish Language
    Sina Ahmadi and Mariam Masoud
    Proceedings of the 3rd Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) at AACL-IJCNLP
    [Paper] [Slides] [Presentation] [Code] [bib]

  5. A Corpus of the Sorani Kurdish Folkloric Lyrics
    Sina Ahmadi, Hossein Hassani and Kamaladdin Abedi
    In Proceedings of the 1st Joint Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL) Workshop at the 12th International Conference on Language Resources and Evaluation (LREC)
    [Paper] [Resource] [bib]

  6. Towards Finite-State Morphology of Kurdish
    Sina Ahmadi and Hossein Hassani
    arXiv preprint arXiv:2005.10652
    [preprint] [bib]

2019
  1. Towards Electronic Lexicography for the Kurdish Language
    Sina Ahmadi, Hossein Hassani and John P. McCrae
    In Proceedings of the Sixth Biennial Conference on Electronic Lexicography (eLex 2019)
    [Paper] [Poster] [Resource] [bib]

  2. Developing a Fine-grained Corpus for a Less-resourced Language: the case of Kurdish
    Roshna Abdulrahman, Hossein Hassani and Sina Ahmadi
    WiNLP ACL 2019
    [Paper] [Poster] [Resource] [bib]

  3. A Rule-Based Kurdish Text Transliteration System
    Sina Ahmadi
    ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
    [Paper] [Code] [bib]

2017
  1. Building a Lemmatizer and a Spell-checker for Sorani Kurdish
    Shahin Salavati and Sina Ahmadi
    In Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics
    [Paper] [bib]

2014
  1. Towards building Kurdnet, the Kurdish Wordnet
    Purya Aliabadi, Sina Ahmadi, Shahin Salavati and Kyumars Sheykh Esmaili
    In Proceedings of the Seventh Global Wordnet Conference
    [Paper] [Resource] [bib]