metadata

language: tr
datasets:
  - SUNLP-NER-Twitter

berturk-sunlp-ner-turkish

Introduction

berturk-sunlp-ner-turkish is a named entity recognition (NER) model fine-tuned from the BERTurk-cased model on the SUNLP-NER-Twitter dataset.

Training data

The model was trained on the SUNLP-NER-Twitter dataset (5000 tweets), which is available at https://github.com/SU-NLP/SUNLP-Twitter-NER-Dataset and is annotated with the following named entity types: Person, Location, Organization, Time, Money, Product, TV-Show.

How to use berturk-sunlp-ner-turkish with HuggingFace

from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("busecarik/berturk-sunlp-ner-turkish")
model = AutoModelForTokenClassification.from_pretrained("busecarik/berturk-sunlp-ner-turkish")
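
Once the tokenizer and model load, the usual next step is tagging a sentence. The sketch below is one possible way to do that with the generic transformers token-classification pipeline; it is not taken from the original model card. The helper names load_ner_pipeline and summarize_entities and the example sentence are illustrative assumptions, and the exact label strings returned depend on the model's configuration.

```python
MODEL_ID = "busecarik/berturk-sunlp-ner-turkish"


def load_ner_pipeline(model_id=MODEL_ID):
    """Build a token-classification pipeline (hypothetical helper).

    The first call downloads the model weights from the Hugging Face Hub.
    """
    # Imported lazily so summarize_entities works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(entities):
    """Reduce pipeline output dicts to (text, entity type, rounded score) tuples."""
    return [
        (e["word"], e["entity_group"], round(float(e["score"]), 2))
        for e in entities
    ]


# Example call (needs network access the first time it runs):
#   ner = load_ner_pipeline()
#   print(summarize_entities(ner("Mustafa Kemal Atatürk 1919'da Samsun'a çıktı.")))
```

With aggregation enabled, each returned dict carries the keys entity_group, score, word, start and end, so the start and end offsets can be used to map entities back to the original tweet text.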

Model performance on the SUNLP-NER-Twitter test set (metric: seqeval)

Precision (%)	Recall (%)	F1 (%)
85.08	84.46	84.77
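
As a quick sanity check on the figures above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported values; the function name f1_score is an illustrative choice, and the assumption that these overall scores are entity-level micro averages (seqeval's default) is not stated in the original card.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (returns 0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported for the test set, in percent.
overall_f1 = f1_score(85.08, 84.46)
print(round(overall_f1, 2))  # 84.77
```

The per-entity rows may differ from this formula by 0.01 because their precision and recall are already rounded to two decimals.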

Classification Report

Entity	Precision	Recall	F1
LOCATION	0.75	0.80	0.78
MONEY	0.74	0.59	0.65
ORGANIZATION	0.82	0.86	0.84
PERSON	0.94	0.91	0.92
PRODUCT	0.52	0.44	0.48
TIME	0.88	0.87	0.87
TVSHOW	0.65	0.58	0.61