# TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning

Liang Zhang\*, Anwen Hu\*, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin†, Ji Zhang, Fei Huang

\* Equal Contribution † Corresponding Author



## Spotlights

- Support chart question answering with both simple direct answers and step-by-step Python programs.
- Support chart-to-table extraction, chart summary generation, and chart redrawing.
- Open source:
  - ✅ Model: TinyChart
  - ✅ Inference code
  - ✅ Code for launching a local demo
  - ✅ Online demo on HuggingFace
  - ✅ Evaluation code
  - ✅ Training data and code
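With Program-of-Thoughts answering, the model emits a short Python program whose execution produces the answer, rather than predicting the answer text directly. A minimal sketch of the execution step, where the program string is a hypothetical stand-in for model output:

```python
# Hypothetical stand-in for a model-generated Program-of-Thoughts:
# the model writes Python that computes the answer from values it
# read off the chart.
generated_program = """
values = {"2020": 12, "2021": 18, "2022": 27}
answer = max(values, key=values.get)
"""

scope = {}
exec(generated_program, scope)  # run the generated program
print(scope["answer"])          # prints: 2022
```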

## Examples

*(example figure)*

## Online Demo

🤗 Huggingface Space

## Models

| Model Card | Download Link |
| --- | --- |
| TinyChart@768 | 🤗 mPLUG/TinyChart-3B-768 <br> 🤖 iic/TinyChart-3B-768 |
| TinyChart@768-SigLIP | 🤗 mPLUG/TinyChart-3B-768-siglip <br> 🤖 iic/TinyChart-3B-768-siglip |

Note that to use TinyChart@768, you should load the vision transformer with token merging from TinyChart@768-SigLIP. If you download the models to a local directory, you should change `mm_vision_tower` in the `config.json` of TinyChart-3B-768 so that it can find TinyChart-3B-768-siglip.
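Concretely, that is a one-key edit of `config.json`. A minimal sketch; it rewrites a throwaway demo config here, but in practice `cfg_path` would point at the `config.json` inside your downloaded TinyChart-3B-768 directory:

```python
import json
import tempfile
from pathlib import Path

# Demo fixture standing in for the downloaded TinyChart-3B-768/config.json.
root = Path(tempfile.mkdtemp())
cfg_path = root / "config.json"
cfg_path.write_text(json.dumps({"mm_vision_tower": "mPLUG/TinyChart-3B-768-siglip"}))

# Point mm_vision_tower at the local TinyChart-3B-768-siglip directory.
local_vit = root / "TinyChart-3B-768-siglip"
cfg = json.loads(cfg_path.read_text())
cfg["mm_vision_tower"] = str(local_vit)
cfg_path.write_text(json.dumps(cfg, indent=2))
```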

## Quick Start

You can load the model with the following code:

```python
from tinychart.model.builder import load_pretrained_model
from tinychart.mm_utils import get_model_name_from_path

model_path = "mPLUG/TinyChart-3B-768"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path),
    device="cuda"
)
```

## Model Inference

We provide an example script to perform inference in `inference.ipynb`.

## Model Training & Evaluation

### Data Preparation

The training and evaluation data of TinyChart are released at 🤗 mPLUG/TinyChartData. Samples whose ids contain `templatepot` or `gptpot` form the two subsets of the proposed ChartQA-PoT dataset. To perform training and evaluation, download and organize the data directory as follows:

```
data
├── tinychart_images
├── train.json
└── test.json
```
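If you want to inspect the two ChartQA-PoT subsets separately, they can be split out of the training annotations by id substring. A minimal sketch over toy records (the real entries come from `train.json`):

```python
# Toy records standing in for entries of train.json; the two ChartQA-PoT
# subsets are marked by the substrings "templatepot" and "gptpot" in the id.
samples = [
    {"id": "templatepot_0001"},
    {"id": "gptpot_0001"},
    {"id": "chartqa_0001"},
]

template_pot = [s for s in samples if "templatepot" in s["id"]]
gpt_pot = [s for s in samples if "gptpot" in s["id"]]
```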

Then download bczhou/TinyLLaVA-3.1B-SigLIP into `pretrained_models`, and run the following script to add the token-merging arguments. Note that this script changes the model's `config.json` in place, so back it up in advance.

```shell
python scripts/vit_add_tome.py --path pretrained_models/TinyLLaVA-3.1B-SigLIP
```

After that, run the following script to start training. It will automatically load the last checkpoint to perform evaluation.

```shell
bash scripts/train.sh
```

## Local Demo

You can run a local demo with the following script:

```shell
python app.py --model-path <your_model_path>
```

## Citation

If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:

```bibtex
@misc{zhang2024tinychart,
    title={TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning},
    author={Liang Zhang and Anwen Hu and Haiyang Xu and Ming Yan and Yichen Xu and Qin Jin and Ji Zhang and Fei Huang},
    year={2024},
    eprint={2404.16635},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

## Acknowledgement

The code is based on TinyLLaVA, LLaVA, and ToMe. Thanks for these great works and for open-sourcing them!