en_ner_job_postings / README.md
metadata
tags:
  - spacy
  - token-classification
  - ner
language:
  - en
license: mit
model-index:
  - name: en_ner_job_postings
    results:
      - task:
          name: NER
          type: token-classification
        metrics:
          - name: NER Precision
            type: precision
            value: 0.8516398746
          - name: NER Recall
            type: recall
            value: 0.8569711538
          - name: NER F Score
            type: f_score
            value: 0.8542971968
      - task:
          name: TAG
          type: token-classification
        metrics:
          - name: TAG (XPOS) Accuracy
            type: accuracy
            value: 0.9734810915
      - task:
          name: UNLABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Unlabeled Attachment Score (UAS)
            type: f_score
            value: 0.9208198801
      - task:
          name: LABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Labeled Attachment Score (LAS)
            type: f_score
            value: 0.9027174273
      - task:
          name: SENTS
          type: token-classification
        metrics:
          - name: Sentences F-Score
            type: f_score
            value: 0.907098331
library_name: spacy
pipeline_tag: token-classification

Custom spaCy NER Model for "Profession," "Facility," and "Experience" Entities

Overview

This spaCy Named Entity Recognition (NER) model is custom-trained to recognize and classify three entity types: "profession," "facility," and "experience." It identifies these entities in unstructured text such as job postings.

Key Features

  • Custom-trained to recognize "profession," "facility," and "experience" entities, with an overall NER F-score of 85.43 (see Accuracy).
  • Suitable for NLP tasks such as information extraction and content categorization.
  • Integrates into existing spaCy-based NLP pipelines.

Usage

Installation

You can install the custom spaCy NER model using pip:
pip install https://huggingface.co/DaFull/en_ner_job_postings/resolve/main/en_ner_job_postings-any-py3-none-any.whl
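
After installing, it can help to confirm that the package is importable before calling spacy.load. The following is a minimal sketch using only the standard library. It assumes the wheel installs a Python package named en_ner_job_postings (the same name passed to spacy.load in this README); is_installed is a hypothetical helper name, not part of spaCy.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the wheel exposes a package called "en_ner_job_postings".
if is_installed("en_ner_job_postings"):
    print("Model package found; spacy.load('en_ner_job_postings') should work.")
else:
    print("Model package not found; re-run the pip install command.")
```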

Example Usage

Here's how you can use the model for entity recognition in Python:


import spacy

# Load the custom spaCy NER model
nlp = spacy.load("en_ner_job_postings")

# Process your text
text = "HR Specialist needed at XYZ Corporation, Dallas, TX, with expertise in employee relations and a minimum of 4 years of HR experience."
doc = nlp(text)

# Extract named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
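
For downstream use it is often handier to collect the entities by type than to print them one by one. The sketch below shows one way to do that. The helper name group_by_label is hypothetical, and the sample entities are hand-written stand-ins (not actual model output) so that the snippet runs without the model installed.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_label(ents):
    """Map each entity label to the list of entity texts carrying it.

    `ents` is any iterable of objects with `.text` and `.label_`,
    which is the shape of the spans in spaCy's `doc.ents`.
    """
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Stand-in entities: made-up illustrations, not real predictions.
sample_ents = [
    SimpleNamespace(text="HR Specialist", label_="PROFESSION"),
    SimpleNamespace(text="4 years of HR experience", label_="EXPERIENCE"),
]

print(group_by_label(sample_ents))
```

With the model loaded, passing doc.ents to group_by_label should give the same structure for the model's own predictions.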

Entity Types

The model recognizes the following entity types:

  • PROFESSION: Represents professions or job titles.
  • FACILITY: Denotes facilities, buildings, or locations.
  • EXPERIENCE: Identifies mentions of work experience, durations, or qualifications.

| Feature | Description |
| --- | --- |
| Name | en_ner_job_postings |
| Version | 3.6.0 |
| spaCy | >=3.6.0,<3.7.0 |
| Default Pipeline | tok2vec, tagger, parser, attribute_ruler, lemmatizer, ner |
| Components | tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, ner |
| Vectors | 514157 keys, 514157 unique vectors (300 dimensions) |
| License | MIT |
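
The table above lists one more component than the default pipeline, and it pins a spaCy version range. The following sketch reads both facts off the table: which components ship with the package but are not in the default pipeline, and whether a given spaCy version string falls inside the stated range. The helper names are hypothetical, and the version check only handles plain numeric versions such as "3.6.1".

```python
# Values copied from the table above.
DEFAULT_PIPELINE = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]
COMPONENTS = ["tok2vec", "tagger", "parser", "senter", "attribute_ruler", "lemmatizer", "ner"]
SPACY_MIN = (3, 6, 0)  # inclusive lower bound (>=3.6.0)
SPACY_MAX = (3, 7, 0)  # exclusive upper bound (<3.7.0)


def disabled_by_default(components, default_pipeline):
    """Components that are packaged but absent from the default pipeline."""
    return [name for name in components if name not in default_pipeline]


def spacy_version_ok(version: str) -> bool:
    """Check a plain numeric version string against >=3.6.0,<3.7.0."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    parts = (parts + (0, 0, 0))[:3]  # pad "3.6" to (3, 6, 0)
    return SPACY_MIN <= parts < SPACY_MAX


print(disabled_by_default(COMPONENTS, DEFAULT_PIPELINE))
print(spacy_version_ok("3.6.1"))
```

In spaCy 3, a packaged component that is not active can be switched on with nlp.enable_pipe("senter"). Since the parser also sets sentence boundaries, whether to run both is a separate choice that this README does not address.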

Label Scheme

116 labels for 3 components

| Component | Labels |
| --- | --- |
| tagger | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, `_SP`, ```` `` ```` |
| parser | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
| ner | `CARDINAL`, `DATE`, `EVENT`, `EXPERIENCE`, `FAC`, `FACILITY`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `PROFESSION`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
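
The ner component emits 21 labels, of which only three are the custom types described under Entity Types. The sketch below separates the two groups so that downstream code can filter on the custom ones. The helper name split_labels is hypothetical. Note that FAC and FACILITY are distinct labels; the remaining 18 appear to match the label set of spaCy's standard English pipelines, though this README does not say so explicitly.

```python
# NER labels copied from the label scheme above.
NER_LABELS = [
    "CARDINAL", "DATE", "EVENT", "EXPERIENCE", "FAC", "FACILITY", "GPE",
    "LANGUAGE", "LAW", "LOC", "MONEY", "NORP", "ORDINAL", "ORG", "PERCENT",
    "PERSON", "PRODUCT", "PROFESSION", "QUANTITY", "TIME", "WORK_OF_ART",
]

# The three custom entity types this model was trained to add.
CUSTOM_LABELS = {"PROFESSION", "FACILITY", "EXPERIENCE"}


def split_labels(labels, custom):
    """Return (custom labels present, all other labels), each sorted."""
    custom_found = sorted(label for label in labels if label in custom)
    other = sorted(label for label in labels if label not in custom)
    return custom_found, other


custom_found, other = split_labels(NER_LABELS, CUSTOM_LABELS)
print(custom_found)
print(len(other), "other labels, including FAC:", "FAC" in other)
```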

Accuracy

| Type | Score |
| --- | --- |
| TOKEN_ACC | 99.86 |
| TOKEN_P | 99.57 |
| TOKEN_R | 99.58 |
| TOKEN_F | 99.57 |
| TAG_ACC | 97.35 |
| SENTS_P | 92.19 |
| SENTS_R | 89.27 |
| SENTS_F | 90.71 |
| DEP_UAS | 92.08 |
| DEP_LAS | 90.27 |
| ENTS_P | 85.16 |
| ENTS_R | 85.70 |
| ENTS_F | 85.43 |
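
The F-scores above are consistent with the usual definition of F1 as the harmonic mean of precision and recall. As a quick check, the sketch below recomputes two of them from the reported precision and recall values; f_score is a hypothetical helper name.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# NER precision and recall from the model-index metadata (fractions).
ner_f = f_score(0.8516398746, 0.8569711538)
print(round(ner_f * 100, 2))  # 85.43, matching ENTS_F

# Sentence precision and recall from the accuracy table (percent).
sents_f = f_score(92.19, 89.27)
print(round(sents_f, 2))  # 90.71, matching SENTS_F
```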