---
language:
- en
license: apache-2.0
tags:
- image-to-text
---
# ViTSTR small v1.0
ViTSTR model pre-trained on various real [STR datasets](https://github.com/baudm/parseq/blob/main/Datasets.md) at image size 128x32 with a patch size of 8x4.
Disclaimer: this model card was not written by the original author.
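As a rough illustration of what the image and patch sizes above imply, the sketch below computes how many patches the image is split into. It assumes that "128x32" and "8x4" list the two dimensions in the same order and that patches do not overlap; the `patch_grid` helper is hypothetical and is not part of the model's code.

```python
def patch_grid(image_size, patch_size):
    """Return (patches along dim 0, patches along dim 1, total patches).

    Hypothetical helper for illustration only. Assumes non-overlapping
    patches and that both sizes list their dimensions in the same order.
    """
    dim0, dim1 = image_size
    p0, p1 = patch_size
    if dim0 % p0 != 0 or dim1 % p1 != 0:
        raise ValueError("image size must be divisible by patch size")
    n0 = dim0 // p0
    n1 = dim1 // p1
    return n0, n1, n0 * n1


# Sizes taken from this model card: 128x32 image, 8x4 patch.
n0, n1, total = patch_grid((128, 32), (8, 4))
print(f"{n0} x {n1} grid, {total} patches")
```

Under these assumptions each image yields a 16 x 8 grid, that is, 128 patch tokens, before any special tokens the model may add.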
## Model description
*TODO*
## Intended uses & limitations
You can use the model for scene text recognition (STR) on images containing Latin characters (62 case-sensitive alphanumeric characters plus 32 punctuation marks).
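To make the character count concrete, the sketch below builds a 94-character set from Python's standard `string` constants. It assumes the 32 punctuation marks are the ASCII punctuation characters; the exact characters and their ordering in the model's vocabulary are not stated in this card, so treat `CHARSET` and `is_supported` as hypothetical names for illustration.

```python
import string

# Assumption: the 32 punctuation marks are the ASCII punctuation characters.
# The ordering here is arbitrary and may differ from the model's vocabulary.
ALPHANUMERIC = string.digits + string.ascii_lowercase + string.ascii_uppercase
PUNCTUATION = string.punctuation
CHARSET = ALPHANUMERIC + PUNCTUATION


def is_supported(text):
    """Return True if every character of `text` is in the assumed charset."""
    return all(ch in CHARSET for ch in text)


print(len(ALPHANUMERIC), len(PUNCTUATION), len(CHARSET))
print(is_supported("Exit-42!"))   # Latin letters, digits, ASCII punctuation
print(is_supported("café"))       # contains a non-ASCII character
```

A check like this can be used to filter ground-truth labels before evaluation, since text with characters outside the set (spaces and accented letters included, under this assumption) cannot be predicted exactly.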
### How to use
*TODO*
### BibTeX entry and citation info
```bibtex
@InProceedings{atienza2021vision,
title={Vision transformer for fast and efficient scene text recognition},
author={Atienza, Rowel},
booktitle={International Conference on Document Analysis and Recognition},
pages={319--334},
year={2021},
organization={Springer}
}
```