The paper mentions that an ensemble model achieved state-of-the-art results, but there is no mention of how the separate models were trained #33

Chhokra · 2020-03-26T12:36:57Z

If I'm not wrong, the README only describes training a single model. How should I go about training the 4 models mentioned in the results table of your paper?

kevin-parnow · 2020-04-01T06:30:36Z

I was also looking into this and came to the conclusion that they likely just used 4 different random initializations, as was done in (Chollampatt and Ng, 2018), a paper they reference.

You can look at Table 1 and its footnote in that paper to see the gains they get from ensembling, and that the ensemble members differ only in their random initialization.
https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137

Note that I am not affiliated with the authors of either paper, so this is just speculation.

zhawe01 · 2020-12-08T02:45:23Z

Thanks @kevbp5. We did use 4 different random initializations for the models without DA.
For the models with DA, we also used pre-trained checkpoints from different pre-training stages.
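
For readers who want to reproduce the setup, here is a minimal sketch of the idea described above: several runs that differ only in their random seed, combined at inference time. It is an illustration under assumptions, not the authors' code. The config keys, the `checkpoints/seedN` paths, and the toy stand-in model are all hypothetical, and averaging the models' probability distributions is one common way to ensemble; the thread does not say how the paper combined its models.

```python
import random


def make_run_configs(num_models=4, base_seed=1):
    """Build one hypothetical training config per ensemble member.

    Only the seed (and the output directory) differs between runs.
    """
    return [
        {"seed": base_seed + i, "save_dir": f"checkpoints/seed{base_seed + i}"}
        for i in range(num_models)
    ]


def toy_model_distribution(seed, vocab_size=5):
    """Stand-in for a trained model (assumption for illustration only).

    Returns a next-token probability distribution that depends on the seed.
    """
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(vocab_size)]
    total = sum(weights)
    return [w / total for w in weights]


def average_distributions(distributions):
    """Ensemble by averaging the per-token probabilities of all models."""
    num_models = len(distributions)
    vocab_size = len(distributions[0])
    return [
        sum(dist[j] for dist in distributions) / num_models
        for j in range(vocab_size)
    ]


configs = make_run_configs()
dists = [toy_model_distribution(cfg["seed"]) for cfg in configs]
ensemble = average_distributions(dists)

# The ensemble's prediction is the token with the highest averaged probability.
best_token = max(range(len(ensemble)), key=ensemble.__getitem__)
```

In a real run, each config would be passed to the repository's training script, and `toy_model_distribution` would be replaced by the output of the corresponding trained checkpoint. For the models with DA, the description above suggests the runs would also start from different pre-trained checkpoints rather than differ in seed alone.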
