
Digital-Peter solution


Overview

This is our team’s 2nd place solution for the Artificial Intelligence Journey Junior 2020 Competition (Digital Peter track). The contest was about line-by-line recognition of Peter the Great’s manuscripts. The task touches several AI technologies (Computer Vision, NLP, and knowledge graphs). The competition data was prepared by Sberbank of Russia, the Saint Petersburg Institute of History (N.P. Lihachov mansion) of the Russian Academy of Sciences, the Federal Archival Agency of Russia, and the Russian State Archive of Ancient Acts.

Files

  1. custom_functions.py - custom functions, classes, and augmentations we used for training and prediction.
  2. dict.json - a dictionary of 9k words for post-processing (see the sketch after this list).
  3. large_dict.json - a dictionary of 160k words for post-processing; we had to give it up due to the submission runtime limit.
  4. hparams.py - module containing hyperparameters and some helper functions.
  5. metadata.json - file for the docker-format submission.
  6. model1train.py - code for training the first model (DenseNet161 with Smart Resize).
  7. model2train.py - code for training the second model (ResNeXt101 with Smart Resize).
  8. model3train.py - code for training the third model (ResNeXt101 with Default Resize).
  9. ocr.py - ensemble inference.
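
The dictionaries are used for word-level correction of the recognized lines. The exact post-processing lives in ocr.py; the snippet below is only a minimal sketch of the idea, assuming dict.json stores a flat JSON list of words (the file format, the helper names, and the matching threshold are assumptions, not the actual code):

# Minimal sketch of dictionary-based post-processing (hypothetical helper,
# not the exact logic from ocr.py).
import json
import difflib

def load_vocab(path="dict.json"):
    # Assumes the file is a flat JSON list of words.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def correct_line(line, vocab):
    corrected = []
    for word in line.split():
        # Replace the word with its closest dictionary match, if any.
        match = difflib.get_close_matches(word, vocab, n=1, cutoff=0.8)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)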

Set up

You can get started by cloning the repository:

$ git clone https://github.com/t0efL/Digital-Peter.git

All the files contain the code and hyperparameters from the original training. The one thing you might want to change is the working directory - the folder where the logs and the weights will be saved. You can do that in hparams.py.
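
For illustration, the change in hparams.py might look like this (the actual variable name may differ; this is just a sketch):

# hparams.py (illustrative snippet; the real variable name may differ)
# Folder where logs and model weights will be saved during training.
WORKING_DIR = "path/to/your/working/dir"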

Training

We used three different models for the final ensemble, so there are three separate training runs. To run each of them, use the following commands:

$ python model1train.py  # DenseNet161 with Smart Resize

$ python model2train.py  # ResNeXt101 with Smart Resize

$ python model3train.py  # ResNeXt101 with Default Resize

By default, all the logs and weights are saved in the "log/" folder. If you want to change this working directory, you can do it in hparams.py. We recommend training each model for approximately 100 epochs; the training loop contains early stopping, so if the loss stops decreasing, training will stop on its own. Also, by default the logs from each training run are saved in the same folder, so we recommend cleaning the log folder after each run or changing the working directory in hparams.py. Finally, we recommend picking the weights from each run according to the CER (indicated in the name of the weights file), not the number of epochs.
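
Since the CER is part of each checkpoint's filename, picking the best weights can be scripted. A minimal sketch, assuming the CER is the only floating-point number in the filename (the naming pattern is an assumption; adapt the regex to the actual filenames produced by the training scripts):

# Hypothetical helper: pick the checkpoint with the lowest CER encoded in its
# filename. The filename pattern is an assumption; adjust the regex if needed.
import re
from pathlib import Path

def best_checkpoint(log_dir="log/"):
    best_path, best_cer = None, float("inf")
    for path in Path(log_dir).glob("*.pt"):
        match = re.search(r"\d+\.\d+", path.name)
        if match and float(match.group()) < best_cer:
            best_cer, best_path = float(match.group()), path
    return best_path, best_cer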

Approximate time for each training session - 10 hours (Google Colab Pro).

Inference

Put your weights in the folder as "weights1.pt", "weights2.pt", and "weights3.pt", or just download ours (link below), and run the following command:

$ python ocr.py

You'll find the predictions both in stdout and in the "/output" folder.
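
Before running ocr.py, the folder is expected to look roughly like this (the layout is inferred from the description above):

Digital-Peter/
├── ocr.py
├── weights1.pt    (DenseNet161, Smart Resize)
├── weights2.pt    (ResNeXt101, Smart Resize)
└── weights3.pt    (ResNeXt101, Default Resize)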

Quick start

Check out the quickstart.ipynb notebook for a quick start.

Results

Finally, we built an ensemble of the three backbones. We didn’t use our best single model (CER 5.011, with Smart Resize) in this ensemble, as its submission failed due to the time limit (for the same reason we kept the 9k dictionary instead of the 160k one).

Final Ensemble:

  1. DenseNet161 (with Smart Resize, CER 5.025, Val CER 4.553)
  2. ResNeXt101 (with Smart Resize, CER 5.047, Val CER 4.750)
  3. ResNeXt101 (with Default Resize, CER 5.286, Val CER 4.711)

Public scores: CER - 4.861, WER - 23.954, String Accuracy - 46.27

Private scores: CER - 4.814, WER - 24.72, String Accuracy - 45.942

More information

  • Our article on Medium (a 5-minute read with the solution explained): Link
  • Our full competition report (information about all of our submissions and approaches): Link
  • We used the OCR-transformer pipeline by Vladislav Kramarenko as the baseline.
  • The dataset: Link.
  • Our weights for each model: Link

Team

Vadim Timakin
Maksim Zhdanov

