
📖 Introduction

Instruction-Tagger is a classification model that labels each instruction with one of 33 task tags. With these tags, users can measure and adjust the proportion of each task type in an instruction dataset.

Example Input

What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?

Example Output

Medicine
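Once every instruction carries a tag like the one above, the task mix of a dataset can be adjusted by resampling per tag. The following is a minimal sketch in plain Python, not part of the released code: the helper name `rebalance_by_tag`, its parameters, and the toy samples are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def rebalance_by_tag(samples, target_counts, seed=0):
    """Resample (instruction, tag) pairs so each tag has the requested count.

    Hypothetical helper: tags missing from target_counts keep all their
    samples; over-represented tags are down-sampled without replacement;
    under-represented tags are up-sampled with replacement.
    """
    rng = random.Random(seed)
    by_tag = defaultdict(list)
    for instruction, tag in samples:
        by_tag[tag].append(instruction)

    result = []
    for tag, items in by_tag.items():
        want = target_counts.get(tag, len(items))
        if want <= len(items):
            chosen = rng.sample(items, want)
        else:
            chosen = items + rng.choices(items, k=want - len(items))
        result.extend((instruction, tag) for instruction in chosen)
    return result

# Toy data invented for this example (three Math items, one Medicine item).
tagged = [
    ("Prove that 2 + 2 = 4.", "Math"),
    ("Solve x^2 = 9.", "Math"),
    ("Integrate x dx.", "Math"),
    ("What causes anemia?", "Medicine"),
]
balanced = rebalance_by_tag(tagged, {"Math": 2, "Medicine": 2})
```

Here `balanced` holds two Math and two Medicine samples. Up-sampling with replacement duplicates instructions, so in practice it may be preferable to cap large tags rather than inflate small ones.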

🚀 Quick Start

The following code snippet shows how to load the tokenizer and model and how to tag an instruction.

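import torch
from transformers import DebertaV2Tokenizer, DebertaV2ForSequenceClassification

# Load the 33-class classifier and move it to the GPU.
# 'deberta_cls' is a local checkpoint directory; if the weights are hosted in the
# Hub repo, the repo ID used for the tokenizer below should work here as well.
model = DebertaV2ForSequenceClassification.from_pretrained('deberta_cls', num_labels=33).cuda()
tokenizer = DebertaV2Tokenizer.from_pretrained('alibaba-pai/Instruction-Tagger')

# Mapping from class ID to task tag.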

labels={14: 'Writting',
 0: 'Common-Sense',
 28: 'Ecology',
 22: 'Medicine',
 17: 'Grammar',
 3: 'Code Generation',
 31: 'Others',
 20: 'Paraphrase',
 19: 'Economy',
 6: 'Code Debug',
 21: 'Reasoning',
 18: 'Computer Science',
 4: 'Technology',
 13: 'Math',
 32: 'Literature',
 26: 'Chemistry',
 15: 'Complex Format',
 25: 'Ethics',
 27: 'Multilingual',
 29: 'Roleplay',
 30: 'Entertainment',
 23: 'Biology',
 16: 'Art',
 10: 'Academic Writing',
 24: 'Health',
 11: 'Philosophy',
 5: 'Sport',
 1: 'History',
 12: 'Music',
 7: 'Toxicity',
 2: 'Law',
 9: 'Physics',
 8: 'Counterfactual'}

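def task_cls(pp):
    """Return the task tag predicted for a single instruction string."""
    # Tokenize the instruction and move the tensors to the GPU.
    inputs = tokenizer(pp, return_tensors="pt", padding=True).to("cuda")

    # Inference only, so gradients are not needed.
    with torch.no_grad():
        logits = model(**inputs).logits

    # logits has shape (1, 33); pick the highest-scoring class.
    predicted_class_id = logits.argmax(dim=-1).item()

    return labels[predicted_class_id]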

instruct = """
What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?
"""

tag = task_cls(instruct)
print(tag)  # e.g. Medicine
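task_cls tags one instruction per call. When several instructions are tokenized in one batch, the logits contain one row of 33 scores per instruction, and each tag is the argmax of its own row. The sketch below shows that step, plus the resulting tag distribution, on plain nested lists (for instance the result of `logits.tolist()`). The helper names and the 3-class demo scores are assumptions for illustration, not part of the released code.

```python
from collections import Counter

def tags_from_logits(logit_rows, id2label):
    """Map each row of class scores to the label with the highest score."""
    tags = []
    for row in logit_rows:
        best_id = max(range(len(row)), key=row.__getitem__)
        tags.append(id2label[best_id])
    return tags

def tag_distribution(tags):
    """Return the fraction of instructions assigned to each tag."""
    counts = Counter(tags)
    total = len(tags)
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical 3-class demo; the real model emits 33 scores per instruction.
demo_labels = {0: "Common-Sense", 1: "History", 2: "Law"}
demo_logits = [[0.1, 2.3, -1.0], [1.5, 0.2, 0.3], [0.0, 3.1, 0.4]]
demo_tags = tags_from_logits(demo_logits, demo_labels)
demo_dist = tag_distribution(demo_tags)
```

With these made-up scores, `demo_tags` is `["History", "Common-Sense", "History"]`, so History accounts for two thirds of the batch.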

🔍 Evaluation

To assess the accuracy of task classification, we manually evaluated a sample of 100 entries that are not in the training set. The classifier reached a precision of 92% on this sample.
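As a worked illustration of the figure above, the sketch below computes the share of predictions that match the manual labels. It assumes that "precision" here means the fraction of the 100 sampled entries tagged correctly; the function name and the placeholder labels are hypothetical and only reproduce the 92-out-of-100 arithmetic, not the actual evaluation data.

```python
def fraction_correct(predicted, reference):
    """Share of predicted tags that agree with the manually assigned tags."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)

# Placeholder labels only: 92 agreements and 8 disagreements out of 100.
reference = ["Medicine"] * 100
predicted = ["Medicine"] * 92 + ["Health"] * 8
score = fraction_correct(predicted, reference)
```

Here `score` comes out to 0.92, matching the reported 92%.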

📜 Citation

If you find our work helpful, please cite it!

@misc{TAPIR,
      title={Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning}, 
      author={Yuanhao Yue and Chengyu Wang and Jun Huang and Peng Wang},
      year={2024},
      eprint={2405.13448},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.13448}, 
}