Model Card for hiroki-rad/bert-base-classification-ft

Model Details

This model classifies tasks from the ELYZA-tasks-100 dataset: given a task's input text, it predicts which of the following task types the task belongs to.

  • Knowledge Explanation
  • Creative Generation
  • Analytical Reasoning
  • Task Solution
  • Information Extraction
  • Step-by-Step Calculation
  • Opinion-Perspective
  • Role-Play Response
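
The eight categories above correspond to the label identifiers used in the Direct Use code (for example, Task_Solution). As a quick reference, the lookup below pairs each identifier with its English and original Japanese category name. This is only a sketch: CATEGORY_NAMES and describe_label are hypothetical helper names, not part of the model repository, and it assumes the model returns these identifiers as its label strings.

```python
# Category names copied from the list above, keyed by the label identifiers
# that appear in the Direct Use code. CATEGORY_NAMES and describe_label are
# hypothetical helpers for illustration, not part of the model repository.
CATEGORY_NAMES: dict[str, tuple[str, str]] = {
    "Knowledge_Explanation": ("Knowledge Explanation", "知識説明型"),
    "Creative_Generation": ("Creative Generation", "創作型"),
    "Analytical_Reasoning": ("Analytical Reasoning", "分析推論型"),
    "Task_Solution": ("Task Solution", "課題解決型"),
    "Information_Extraction": ("Information Extraction", "情報抽出型"),
    "Step_by_Step_Calculation": ("Step-by-Step Calculation", "計算・手順型"),
    "Opinion_Perspective": ("Opinion-Perspective", "意見・視点型"),
    "Role_Play_Response": ("Role-Play Response", "ロールプレイ型"),
}


def describe_label(label: str) -> str:
    """Return 'English name (Japanese name)' for a label identifier."""
    english, japanese = CATEGORY_NAMES[label]
    return f"{english} ({japanese})"


print(describe_label("Task_Solution"))
```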

Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

  • Developed by: Hiroki Yanagisawa
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: BERT
  • Language(s) (NLP): Japanese
  • License: [More Information Needed]
  • Finetuned from model [optional]: cl-tohoku/bert-base-japanese-v3

Direct Use

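from tqdm import tqdm
from transformers import AutoTokenizer, BatchEncoding, pipeline

# Assumption: the tokenizer saved with the fine-tuned checkpoint is the one to use here.
tokenizer = AutoTokenizer.from_pretrained("hiroki-rad/bert-base-classification-ft")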

label2id = {
    'Task_Solution': 0,
    'Creative_Generation': 1,
    'Knowledge_Explanation': 2,
    'Analytical_Reasoning': 3,
    'Information_Extraction': 4,
    'Step_by_Step_Calculation': 5,
    'Role_Play_Response': 6,
    'Opinion_Perspective': 7
}

def preprocess_text_classification(examples: dict[str, list]) -> BatchEncoding:
    """Tokenize a batch of examples and convert their label names to IDs."""
    encoded_examples = tokenizer(
        examples["questions"],  # a list of strings, because examples arrive in batches
        max_length=512,
        padding=True,
        truncation=True,
        return_tensors=None  # keep plain Python lists when processing in batches
    )

    # Convert the batch of label names to integer IDs
    encoded_examples["labels"] = [label2id[label] for label in examples["labels"]]
    return encoded_examples

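# Evaluation data. `test_data` is assumed to be a datasets.Dataset with
# "questions" and "labels" columns that was loaded beforehand.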
test_data = test_data.to_pandas()
test_data["labels"] = test_data["labels"].apply(lambda x: label2id[x])
test_data

model_name = "hiroki-rad/bert-base-classification-ft"
classify_pipe = pipeline(model=model_name, device="cuda:0")

# Invert the label2id mapping defined above so that the IDs assigned to
# test_data["labels"] map back to the same label names.
id2label = {id_: label for label, id_ in label2id.items()}

results: list[dict[str, int | float | str]] = []
for i, example in enumerate(tqdm(test_data.itertuples(), total=len(test_data))):
    # Get the model's top prediction for this example
    model_prediction = classify_pipe(example.questions)[0]
    # Convert the gold label ID back to its label name
    true_label = id2label[example.labels]
    results.append(
        {
            "example_id": i,
            "pred_prob": model_prediction["score"],
            "pred_label": model_prediction["label"],
            "true_label": true_label,
        }
    )
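
Once the loop above has filled results, the records can be reduced to summary metrics. The following is a minimal sketch of one way to do that, not the author's evaluation code: summarize_results is a hypothetical helper, and it assumes the label strings returned by the pipeline use the same names as the keys of label2id (if the checkpoint reports generic names such as LABEL_0, they would need to be mapped first). The sample records are made up to show the expected shape and are not model outputs.

```python
from collections import Counter


def summarize_results(results: list[dict]) -> dict:
    """Compute overall accuracy and per-label accuracy from prediction records."""
    total = Counter()    # number of examples per true label
    correct = Counter()  # number of correct predictions per true label
    for row in results:
        total[row["true_label"]] += 1
        if row["pred_label"] == row["true_label"]:
            correct[row["true_label"]] += 1

    n_examples = sum(total.values())
    return {
        "accuracy": sum(correct.values()) / n_examples if n_examples else 0.0,
        "per_label_accuracy": {label: correct[label] / total[label] for label in total},
    }


# Made-up records in the same shape as `results`, for illustration only;
# these are not outputs of the model.
sample_results = [
    {"example_id": 0, "pred_prob": 0.9, "pred_label": "Task_Solution", "true_label": "Task_Solution"},
    {"example_id": 1, "pred_prob": 0.6, "pred_label": "Creative_Generation", "true_label": "Task_Solution"},
    {"example_id": 2, "pred_prob": 0.8, "pred_label": "Role_Play_Response", "true_label": "Role_Play_Response"},
]
summary = summarize_results(sample_results)
print(summary["accuracy"])
print(summary["per_label_accuracy"])
```

Per-label accuracy is grouped by the true label, so a category that never appears as a gold label in the evaluation data is absent from the breakdown.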

Model size: 111M params (Safetensors, tensor type F32)