Fix typo
- README.md +1 -1
- README_ja.md +1 -1
README.md
CHANGED
@@ -69,7 +69,7 @@ Both classifiers were trained using fastText with 20 epochs on the training data
 ### Wiki-based Classifier

-We built this classifier by treating Wikipedia articles as positive examples of educational documents. Since not all articles, such as those about individuals, are necessarily “educational”, we extracted 37,399 Japanese Wikipedia articles from [academic categories](https://huggingface.co/tokyotech-llm/edu-classifier/blob/main/utils/academic_categories_wiki_ja.
+We built this classifier by treating Wikipedia articles as positive examples of educational documents. Since not all articles, such as those about individuals, are necessarily “educational”, we extracted 37,399 Japanese Wikipedia articles from [academic categories](https://huggingface.co/tokyotech-llm/edu-classifier/blob/main/utils/academic_categories_wiki_ja.tsv) as positive examples for training data. We randomly sampled 37,399 documents from the Swallow Corpus Version 2 for the negative examples.

 ### LLM-based Classifier
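For readers of this hunk, the balanced setup described in the changed paragraph (equal numbers of Wikipedia positives and randomly sampled negatives) might be sketched roughly as follows. This is only an illustration, not the repository's actual preprocessing: the function name, the `__label__1` / `__label__0` label strings, and the fixed seed are assumptions, and Japanese word segmentation is left out.

```python
import random


def build_balanced_examples(positive_docs, candidate_docs, seed=0):
    """Return shuffled fastText-style lines with one sampled negative per positive.

    The label names below are hypothetical; the README does not state them.
    """
    rng = random.Random(seed)
    # Draw as many negatives as there are positives (37,399 each in the README).
    negative_docs = rng.sample(candidate_docs, k=len(positive_docs))

    lines = []
    for text in positive_docs:
        # fastText's supervised format expects one example per line,
        # so collapse newlines and repeated whitespace.
        lines.append("__label__1 " + " ".join(text.split()))
    for text in negative_docs:
        lines.append("__label__0 " + " ".join(text.split()))

    rng.shuffle(lines)
    return lines


if __name__ == "__main__":
    positives = ["article on algebra", "article on\nthermodynamics"]
    web_pool = ["shop page", "forum post", "news item", "blog entry"]
    for line in build_balanced_examples(positives, web_pool, seed=1):
        print(line)
```

Sampling without replacement keeps the two classes the same size, which matches the 37,399 / 37,399 split in the paragraph.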
README_ja.md
CHANGED
@@ -53,7 +53,7 @@ edu_score = sum([int(label[-1]) * prob for label, prob in zip(res[0], res[1])])
 ### Wiki-based classifier

-Wikipedia 記事を教育的な文書の正例と見なし、分類器を構築しました。人物に関する記事など、必ずしも教養的とは言えない記事もあるため、[学術分野のカテゴリ](https://huggingface.co/tokyotech-llm/edu-classifier/blob/main/utils/academic_categories_wiki_ja.
+Wikipedia 記事を教育的な文書の正例と見なし、分類器を構築しました。人物に関する記事など、必ずしも教養的とは言えない記事もあるため、[学術分野のカテゴリ](https://huggingface.co/tokyotech-llm/edu-classifier/blob/main/utils/academic_categories_wiki_ja.tsv)に属する日本語 Wikipedia 記事 37,399 件を抽出し、訓練データの正例としました。また、負例は Swallow コーパス v2 からランダムにサンプリングした文書 37,399 件としました。

 ### LLM-based classifier
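The hunk header for README_ja.md quotes the line that computes `edu_score` as a probability-weighted sum over label digits. A minimal sketch of that arithmetic is shown here, assuming `res` is a pair of (labels, probabilities) and that each label ends in a single digit score. The concrete label strings and probabilities are made up for illustration and do not come from the model.

```python
def expected_score(labels, probs):
    """Weight the trailing digit of each label by its predicted probability."""
    return sum(int(label[-1]) * prob for label, prob in zip(labels, probs))


# Hypothetical prediction: four labels whose last character is the score.
res = (
    ("__label__0", "__label__1", "__label__2", "__label__3"),
    (0.25, 0.25, 0.25, 0.25),
)

edu_score = expected_score(res[0], res[1])
print(edu_score)  # 0*0.25 + 1*0.25 + 2*0.25 + 3*0.25 = 1.5
```

Taking `label[-1]` only works while scores are single digits. A label scheme with two-digit scores would need a different parse.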