Deps-SAN: Neural Machine Translation with Dependency-Scaled Self-Attention Network [Paper]
- Build the running environment (either of the two ways works):
  1. pip install --editable .
  2. python setup.py build_ext --inplace
- Requirements: Python 3.7.6, pytorch==1.7.0, torchvision==0.8.0, cudatoolkit=10.1 (installing via pip also works):
  conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.1 -c pytorch
The vanilla dataset used in this work is the IWSLT'14 German-to-English dataset, which can be found here. You can preprocess the raw text with the following script:
bash expriments/prepare-iwslt14-sdsa.sh
You can run this code via the scripts in the expriments directory:
- Preprocess the dataset into torch binaries:
  bash pre_sdsa.sh
- Train the model:
  bash train_sdsa.sh
- Generate target sentences:
  bash gen_sdsa.sh
- The RS-Sparsing and Wink-Sparsing variants can be run by appending the suffix _rs or _wink to the script names:
  bash train_sdsa_rs.sh
  bash train_sdsa_wink.sh
  bash gen_sdsa_rs.sh
  bash gen_sdsa_wink.sh
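Taken together, a hypothetical end-to-end run might look as follows. This is a sketch, not part of the repository: it assumes the scripts above exist at the paths shown and that a CUDA-capable GPU is available for training.

```shell
# End-to-end sketch: prepare, binarize, train, decode (assumed script paths).
set -e                                    # stop at the first failing step
bash expriments/prepare-iwslt14-sdsa.sh   # download and tokenize IWSLT'14 De-En
bash pre_sdsa.sh                          # binarize the dataset into torch format
bash train_sdsa.sh                        # train the Deps-SAN model
bash gen_sdsa.sh                          # generate target sentences from the test set
```

Each step reads the outputs of the previous one, so they must be run in this order from the repository root.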
If you use the code in your research, please cite:
@inproceedings{peng2022deps,
  title={Deps-SAN: Neural Machine Translation with Dependency-Scaled Self-Attention Network},
  author={Peng, Ru and Lin, Nankai and Fang, Yi and Jiang, Shengyi and Hao, Tianyong and Chen, Boyu and Zhao, Junbo},
  booktitle={International Conference on Neural Information Processing},
  pages={26--37},
  year={2022},
  organization={Springer}
}