Chart Summarization Models
AutoChart dataset: Zhu, J., Ran, J., Lee, R. K. W., Choo, K., & Li, Z. (2021). AutoChart: A Dataset for Chart-to-Text Generation Task. arXiv preprint arXiv:2108.06897.
GitLab link for the data: https://gitlab.com/bottle_shop/snlg/chart/autochart
Data split used for this model: 8,000 train, 1,297 validation, 1,296 test.
Prepend `C2T: ` to every input to the model.
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained('saadob12/t5_C2T_autochart')
model = AutoModelForSeq2SeqLM.from_pretrained('saadob12/t5_C2T_autochart')

# Linearized chart: title, chart type, axis labels, series names, then the values.
data = 'Trade statistics of Qatar with developing economies in North Africa bar_chart Year-Trade with economies of Middle East & North Africa(%)(Merchandise exports,Merchandise imports) x-y1-y2 values 2000 0.591869968616745 3.59339030672154 , 2001 0.53415012207203 3.25371165779341 , 2002 3.07769793440318 1.672796364224 , 2003 0.6932513078579471 1.62522475477827 , 2004 1.17635914189321 1.80540331396412'
prefix = 'C2T: '  # task prefix the model expects before every input

tokens = tokenizer.encode(prefix + data, truncation=True, padding='max_length', return_tensors='pt')
generated = model.generate(tokens, num_beams=4, max_length=256)
tgt_text = tokenizer.decode(generated[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
summary = str(tgt_text).strip('[]""')
# Summary: This barchart shows the number of trade statistics of qatar with developing economies in north africa from 2000 through 2004. The unit of measurement in this graph is Trade with economies of Middle East & North Africa(%) as shown on the y-axis. The first group data denotes the change of Merchandise exports. There is a go up and down trend of the number. The peak of the number is found in 2002 and the lowest number is found in 2001. The changes in the number may be related to the conuntry's national policies. The second group data denotes the change of Merchandise imports. There is a go up and down trend of the number. The number in 2000 being the peak, and the lowest number is found in 2003. The changes in the number may be related to the conuntry's national policies.
```
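To summarize several charts at once, the same steps can be wrapped in a small function. The sketch below is not part of the original model card: `summarize_charts` is a hypothetical helper name, and it assumes the tokenizer and model are loaded as in the example above and passed in as arguments. It adds the `C2T: ` prefix only when it is missing, pads the batch to its longest input, and reuses the beam-search settings from the example.

```python
def summarize_charts(chart_inputs, tokenizer, model, prefix="C2T: ",
                     num_beams=4, max_length=256):
    """Generate one summary per linearized chart string (hypothetical helper)."""
    # Add the task prefix only when the caller has not already done so.
    prompts = [text if text.startswith(prefix) else prefix + text
               for text in chart_inputs]
    # Tokenize the whole batch, padding to the longest prompt in the batch.
    batch = tokenizer(prompts, truncation=True, padding=True, return_tensors="pt")
    # Passing the full batch also forwards the attention mask to generate().
    generated = model.generate(**batch, num_beams=num_beams, max_length=max_length)
    summaries = tokenizer.batch_decode(generated, skip_special_tokens=True)
    return [summary.strip() for summary in summaries]


# Usage, assuming `tokenizer`, `model`, and `data` exist as in the example above:
# summaries = summarize_charts([data], tokenizer, model)
# print(summaries[0])
```

Taking the tokenizer and model as parameters keeps the helper free of loading logic, so the checkpoint is loaded once and reused across calls.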
You can use the model to generate summaries of tabular data files. It works well for general statistics such as the following:
Year | Children born per woman |
---|---|
2018 | 1.14 |
2017 | 1.45 |
2016 | 1.49 |
2015 | 1.54 |
2014 | 1.6 |
2013 | 1.65 |
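
A table like this has to be flattened into a single string before it is passed to the model. The sketch below infers the layout from the one example input shown in this card (title, chart type, `x label-y label(series names)`, a column header such as `x-y1-y2`, then `values` and the rows separated by ` , `). `linearize_chart` is a hypothetical helper, and the title, the `bar_chart` type, and the single-series header `x-y1` used for the table above are assumptions rather than a documented format.

```python
def linearize_chart(title, chart_type, x_label, y_label, series_names, rows):
    """Flatten a table into one input string (hypothetical helper).

    rows holds one (x, y1, ..., yn) tuple per table row, with one y value
    for each name in series_names.
    """
    rows = [tuple(row) for row in rows]
    n = len(series_names)
    for row in rows:
        if len(row) != n + 1:
            raise ValueError(f"expected {n + 1} values per row, got {len(row)}")
    legend = ",".join(series_names)
    # Column header, e.g. "x-y1-y2" for two series.
    columns = "-".join(["x"] + [f"y{i}" for i in range(1, n + 1)])
    values = " , ".join(" ".join(str(v) for v in row) for row in rows)
    return f"{title} {chart_type} {x_label}-{y_label}({legend}) {columns} values {values}"


# The table above; the title and chart type are assumed for illustration.
fertility_rows = [(2018, 1.14), (2017, 1.45), (2016, 1.49),
                  (2015, 1.54), (2014, 1.6), (2013, 1.65)]
fertility_input = linearize_chart(
    title="Children born per woman by year",
    chart_type="bar_chart",
    x_label="Year",
    y_label="Children born per woman",
    series_names=["Children born per woman"],
    rows=fertility_rows,
)
print(fertility_input)
```

The resulting string still needs the `C2T: ` prefix before it is tokenized.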
It may not generate a good summary for data like the following; expect a passable result at best:
Model | BLEU score | BLEURT |
---|---|---|
t5-small | 25.4 | -0.11 |
t5-base | 28.2 | 0.12 |
t5-large | 35.4 | 0.34 |
Kindly cite my work. Thank you.
@misc{obaid_ul_islam_2022,
title={saadob12/t5_C2T_autochart Hugging Face},
url={https://huggingface.co/saadob12/t5_C2T_autochart},
journal={Huggingface.co},
author={Obaid ul Islam, Saad},
year={2022}
}