
Can't reproduce SRL result with allennlp==1.0.0 #4392

Closed
edchengg opened this issue Jun 22, 2020 · 5 comments · Fixed by allenai/allennlp-models#124


edchengg commented Jun 22, 2020

System (please complete the following information):

OS: Ubuntu 18.04.3 LTS
Python version: 3.7
AllenNLP version: v1.0.0
PyTorch version: 1.5
Allennlp-models: v1.0.0

Question
Hi @DeNeutoy, I am trying to reproduce the results on the OntoNotes dataset (CoNLL-2012) from the Shi et al. (2019) paper used in the SRL demo. However, I can only get an F1 of around 0.79.

Command I used:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz /data/conll-formatted-ontonotes-5.0/conll-formatted-ontonotes-5.0-12/conll-formatted-ontonotes-5.0/data/conll-2012-test/data/english

I also tried to train the model with the config file in allennlp-models, but it also only reaches an F1 of 79.

I found a related issue, #4220, and was able to reproduce the result (86.5) with allennlp==0.9 and an old checkpoint, https://s3-us-west-2.amazonaws.com/allennlp/models/bert-base-srl-2019.06.17.tar.gz. But I think it is worth reporting the issue since 1.0 is a stable release now.

Any help would be appreciated!

epwalsh (Member) commented Jun 26, 2020

Possibly related to #4216

edchengg (Author) commented Jun 27, 2020

> Possibly related to #4216

I guess there is some incompatibility between the pytorch_transformers and transformers libraries.


Riccorl commented Jul 22, 2020

I saw in #4457 that the BERT SRL model has been retrained. I evaluated it with allennlp==1.1.0rc2.dev20200721 and it still produces an F1 below 80. Is this performance expected because 1.1 is not stable yet, or should it work by now?

Command

allennlp evaluate "https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.07.14.tar.gz" [data]


Riccorl commented Jul 23, 2020

By changing the indexer in the dataset reader from SingleIdTokenIndexer to PretrainedTransformerIndexer, the model seems to work as intended. I cannot complete a full training run at the moment, but after a few epochs the scores are already higher than those of the previous full run.
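A minimal sketch of that swap, assuming the allennlp 1.x API; the "tokens" key and the bert-base-uncased model name are illustrative and should match whatever the SRL config actually uses:

    # Sketch only: swap the token indexer the dataset reader uses.
    # "bert-base-uncased" is an assumed model name, not necessarily the one
    # used by the released SRL config.
    from allennlp.data.token_indexers import (
        PretrainedTransformerIndexer,
        SingleIdTokenIndexer,
    )

    # Before: plain single-id indexing, whose vocabulary ids do not line up
    # with the wordpiece ids the pretrained BERT model expects.
    old_indexers = {"tokens": SingleIdTokenIndexer()}

    # After: index tokens with the pretrained transformer's own tokenizer,
    # so the ids match the model's embeddings at evaluation time.
    new_indexers = {"tokens": PretrainedTransformerIndexer(model_name="bert-base-uncased")}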

edchengg (Author) commented

> By changing the indexer in the dataset reader from SingleIdTokenIndexer to PretrainedTransformerIndexer, the model seems to work as intended. I cannot complete a full training run at the moment, but after a few epochs the scores are already higher than those of the previous full run.

Thanks! I will test it ASAP.
