Implementation of the paper "Improving Arabic Diacritization with Regularized Decoding and Adversarial Training" (ACL 2021).
@inproceedings{qin-etal-2021-improving,
title = "Improving Arabic Diacritization with Regularized Decoding and Adversarial Training",
author = "Qin, Han and Chen, Guimin and Tian, Yuanhe and Song, Yan",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
month = aug,
year = "2021",
address = "Online",
pages = "534--542",
}
Our code works with Python 3.8 and requires the following packages: scikit-learn (imported as sklearn) and PyTorch.
It also requires the PyTorch versions of two pre-trained language models: multilingual BERT and AraBERT.
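The requirements above can be checked before training with a short script. This is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and treating Python 3.8 as a minimum version is an assumption (the code is only stated to work with 3.8). It checks that the packages are importable under the names sklearn and torch; it does not verify that the pre-trained models have been downloaded.

```python
import importlib.util
import sys


def check_environment(required=("sklearn", "torch"), min_python=(3, 8)):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper, not part of the repository. `min_python` is an
    assumption: the README only states that the code works with Python 3.8.
    """
    problems = []

    # Compare the running interpreter against the assumed minimum version.
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ expected, found {found}")

    # find_spec returns None when a top-level module cannot be located,
    # so nothing is actually imported here.
    for name in required:
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")

    return problems
```

Calling check_environment() with no arguments returns an empty list when both packages are importable; otherwise each entry names one missing requirement.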
See the commands in run.sh to train a model on the small sample data.