Why do you need alignfile for our implementation? Please document better. #38

Open
sabetAI opened this issue Aug 9, 2020 · 1 comment

sabetAI commented Aug 9, 2020

An alignfile is needed to train the model on new data. However, in the alignfile I see:

source ./config.sh
mkdir data_align

trainpref='data/train_merge'
trainpref='data/valid'

python scripts/build_sym_alignment.py --fast_align_dir ~/software/fast_align/build/ --mosesdecoder_dir fakkk --source_file $trainpref.src --target_file $trainpref.tgt --output_dir data_align 

cp data_align/align.forward $trainpref.forward
cp data_align/align.backward $trainpref.backward

rm -rf data_align

I'm assuming setting trainpref twice is a bug, so should I remove it? Do I need a validation alignfile as well? Why is an alignfile even necessary? When I actually do run it I get:

sh: 1: /h/user/software/fast_align/build/fast_align: not found
Traceback (most recent call last):
  File "scripts/build_sym_alignment.py", line 101, in <module>
    main()
  File "scripts/build_sym_alignment.py", line 75, in main
    assert os.system(fwd_fast_align_cmd) == 0
AssertionError
cp: cannot stat 'data_align/align.backward': No such file or directory
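
For what it's worth, the bare `AssertionError` above hides which command failed and why. A minimal sketch of a more informative wrapper, assuming the script keeps building its commands as shell strings the way `fwd_fast_align_cmd` appears to be (`run_checked` is a hypothetical name, not something in the repo):

```python
import subprocess


def run_checked(cmd: str) -> None:
    """Run a shell command and raise a descriptive error if it fails.

    Hypothetical replacement for `assert os.system(cmd) == 0`.
    """
    result = subprocess.run(cmd, shell=True)
    if result.returncode != 0:
        # Include the exit code and the full command so the user can see
        # which external tool is missing or misconfigured.
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {cmd}"
        )
```

With something like this, a missing `fast_align` binary would surface as an error that names the exact command line instead of an empty assertion.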

Why aren't your paths relative instead of absolute? Which commit of fast_align are you using? I'm assuming just master of https://github.com/clab/fast_align.
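
One way to avoid a hard-coded home-directory path would be to read the tool location from the environment and fall back to a default. This is only a sketch: `resolve_tool_dir` and the `FAST_ALIGN_DIR` variable are names I made up for illustration, and nothing in the repo defines them.

```python
import os


def resolve_tool_dir(env_var: str, default: str) -> str:
    """Return an absolute tool directory.

    Uses the environment variable `env_var` if it is set, otherwise `default`.
    A leading `~` is expanded and the result is made absolute.
    """
    raw = os.environ.get(env_var, default)
    return os.path.abspath(os.path.expanduser(raw))


if __name__ == "__main__":
    # FAST_ALIGN_DIR is a hypothetical variable name chosen for this example.
    print(resolve_tool_dir("FAST_ALIGN_DIR", "~/software/fast_align/build/"))
```

The resolved value could then be passed to `--fast_align_dir`, so each user points at their own build without editing the script.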

Please actually test your code for the use cases you're advertising before releasing. For reference, https://github.com/kanekomasahiro/bert-gec is also based on the fairseq library and gets up and running painlessly in minutes. Please improve your documentation if you actually want people to adopt your work!

sabetAI commented Aug 9, 2020

Once I download, install, and run fast_align, I get this:

sh: 1: fakkk/scripts/ems/support/symmetrize-fast-align.perl: not found
Traceback (most recent call last):
  File "scripts/build_sym_alignment.py", line 101, in <module>
    main()
  File "scripts/build_sym_alignment.py", line 97, in main
    assert os.system(sym_cmd) == 0
AssertionError

with symmetrize-fast-align.perl being an additional missing dependency.
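
A pre-flight check at the top of the script would report both missing pieces at once, before any alignment work starts. The sketch below assumes the two file locations implied by the error messages in this thread (`<fast_align_dir>/fast_align` and `<mosesdecoder_dir>/scripts/ems/support/symmetrize-fast-align.perl`); `missing_dependencies` is a hypothetical helper, and the script may need other tools that are not covered here.

```python
import os
from typing import List


def missing_dependencies(fast_align_dir: str, mosesdecoder_dir: str) -> List[str]:
    """Return the expected tool paths that do not exist as files.

    The two paths are inferred from the error messages in this thread,
    not taken from the script's documentation.
    """
    fast_align_dir = os.path.expanduser(fast_align_dir)
    mosesdecoder_dir = os.path.expanduser(mosesdecoder_dir)
    required = [
        os.path.join(fast_align_dir, "fast_align"),
        os.path.join(
            mosesdecoder_dir,
            "scripts", "ems", "support", "symmetrize-fast-align.perl",
        ),
    ]
    return [path for path in required if not os.path.isfile(path)]


if __name__ == "__main__":
    # Same arguments as the command quoted in the first comment.
    for path in missing_dependencies("~/software/fast_align/build/", "fakkk"):
        print(f"missing dependency: {path}")
```

An empty list means both files were found; anything else can be printed as a single, readable setup error.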
