---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: bert-finetuned-AAVE-PoS
  results: []
language:
- en
pipeline_tag: token-classification
---

# bert-finetuned-AAVE-PoS

This model is a version of [bert-base-cased](https://huggingface.co/bert-base-cased) fine-tuned on a [dataset](https://bitbucket.org/soegaard/aave-pos16/src/master/data) of African American Vernacular English (AAVE) published alongside [Jørgensen et al. 2016](https://aclanthology.org/N16-1130.pdf).

It achieves the following results on the evaluation set:
- Loss: 0.2582
- Precision: 0.8632
- Recall: 0.8730
- F1: 0.8681
- Accuracy: 0.9356

## Model description

More information needed

## Intended uses & limitations

This model is intended to help close the gap in part-of-speech tagging performance between Standard American English (SAE) and African American Vernacular English (AAVE), which differ linguistically in many [well-documented](http://www.johnrickford.com/portals/45/documents/papers/Rickford-1999e-Phonological-and-Grammatical-Features-of-AAVE.pdf) ways. Because it was fine-tuned on data gathered from Twitter, it carries what linguists call 'register bias': its performance may not transfer to AAVE in other registers, such as formal writing or spoken conversation.

## Training and evaluation data

Code is hosted on [GitHub](https://github.com/DrewGalbraith/AAE-PoS/tree/main).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3 (with this amount of data, the model overfits beyond 3 epochs)

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 223  | 0.2982          | 0.8196    | 0.8350 | 0.8272 | 0.9216   |
| No log        | 2.0   | 446  | 0.2625          | 0.8599    | 0.8680 | 0.8640 | 0.9326   |
| 0.4647        | 3.0   | 669  | 0.2582          | 0.8632    | 0.8730 | 0.8681 | 0.9356   |

### Framework versions

- Transformers 4.29.2
- Pytorch 1.13.1+cpu
- Datasets 2.12.0
- Tokenizers 0.13.3
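
## How to use

A minimal usage sketch with the 🤗 Transformers `pipeline` API, matching the `token-classification` pipeline tag above. The Hub repo id shown is a placeholder assumption; replace it with this model's actual path on the Hub.

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual Hub path of this model.
tagger = pipeline("token-classification", model="your-username/bert-finetuned-AAVE-PoS")

# Tag a short AAVE-register example sentence; each returned dict holds
# the token, its predicted PoS label, and the model's confidence score.
for token in tagger("He be workin on that all day"):
    print(token["word"], token["entity"], round(token["score"], 3))
```

The pipeline loads the fine-tuned weights and tokenizer together, so no extra configuration is needed beyond the repo id.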