ryan-minato committed · Commit 1642eb7 · verified · 1 Parent(s): 55c230a

Create README.md

  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ language:
+ - ja
+ tags:
+ - vibrato
+ ---
+
+ # Vibrato Model Archive
+
+ This repository hosts all models from the Vibrato GitHub release.
+ Models that were distributed as zstd-compressed files have already been decompressed, so they can be downloaded and used directly.
+
+ > Important: This repository deliberately does not declare a single license, because it contains models under different licensing terms.
+ > Most models appear to use the BSD license, but some do not have a standard license.
+ > Check the license file in each folder before using a model.
+
+ If you only need BSD-licensed models, see [aha-org/vibrato-models-bsdonly](https://huggingface.co/aha-org/vibrato-models-bsdonly).
+
+ ## Available Models
+
+ - bccwj-suw+unidic-cwj-3_1_1+compact-dual
+ - bccwj-suw+unidic-cwj-3_1_1+compact
+ - bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual
+ - bccwj-suw+unidic-cwj-3_1_1-extracted+compact
+ - bccwj-suw+unidic-cwj-3_1_1
+ - ipadic-mecab-2_7_0-small
+ - ipadic-mecab-2_7_0
+ - jumandic-mecab-7_0
+ - naist-jdic-mecab-0_6_3b
+ - unidic-cwj-3_1_1+compact-dual
+ - unidic-cwj-3_1_1+compact
+ - unidic-cwj-3_1_1
+ - unidic-mecab-2_1_2
+
+ ## Usage
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ import vibrato
+
+ # Download the dictionary (stored in the local Hugging Face cache) and load it.
+ # Replace <<model_name>> with one of the folder names listed above.
+ model_path = hf_hub_download("ryan-minato/vibrato-models", "<<model_name>>/system.dic")
+ with open(model_path, "rb") as f:
+     tokenizer = vibrato.Vibrato(f.read())
+
+ text = """\
+ 「四十二だと!」ルーンクォールが叫んだ。
+ 「七百五十万年かけて、それだけか?」
+ 「何度も徹底的に検算しました」コンピュータが応じた。
+ 「まちがいなくそれが答えです。率直なところ、みなさんのほうで究極の疑問が何であるかわかっていなかったところに問題があるのです」
+ """
+
+ # Tokenize the sample text.
+ tokenizer.tokenize(text)
+ ```
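
The `<<model_name>>` placeholder in the usage snippet stands for one of the folder names under "Available Models". As a minimal sketch, the helper below turns a model name into the in-repo file path and rejects names that are not in the list. The function name `dictionary_filename` and the constant `AVAILABLE_MODELS` are hypothetical, introduced only for illustration. The sketch also assumes that every folder contains a `system.dic` file, which the usage snippet implies but the README does not state outright.

```python
# Folder names copied from the "Available Models" list in the README.
AVAILABLE_MODELS = (
    "bccwj-suw+unidic-cwj-3_1_1+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1+compact",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact-dual",
    "bccwj-suw+unidic-cwj-3_1_1-extracted+compact",
    "bccwj-suw+unidic-cwj-3_1_1",
    "ipadic-mecab-2_7_0-small",
    "ipadic-mecab-2_7_0",
    "jumandic-mecab-7_0",
    "naist-jdic-mecab-0_6_3b",
    "unidic-cwj-3_1_1+compact-dual",
    "unidic-cwj-3_1_1+compact",
    "unidic-cwj-3_1_1",
    "unidic-mecab-2_1_2",
)


def dictionary_filename(model_name: str) -> str:
    """Return the in-repo path of a model's dictionary (hypothetical helper)."""
    if model_name not in AVAILABLE_MODELS:
        raise ValueError(f"Unknown model name: {model_name!r}")
    # Assumption: each model folder holds a file named `system.dic`.
    return f"{model_name}/system.dic"
```

The returned string can be passed as the second argument of `hf_hub_download` in place of `"<<model_name>>/system.dic"`.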