---
language: ja
tags:
- speech
license: other
---

# distilhubert-ft-japanese-50k

A DistilHuBERT model fine-tuned (more precisely, trained further) for 50k steps on Japanese speech, using the [JVS corpus](https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus), the [Tsukuyomi-Chan corpus](https://tyc.rei-yumesaki.net/material/corpus/), [Amitaro's ITA corpus V2.1](https://amitaro.net/), and my own recordings of the [ITA corpus](https://github.com/mmorise/ita-corpus).

## Attention

This checkpoint was trained using the [JVS corpus](https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus). Please read and accept its [terms of use](https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus#h.p_OP_G8FT_Kuf4) before using it.  
(These terms of use also apply to this checkpoint. That means they still apply when you use this checkpoint in another project.)


# References
Original repositories (many thanks!):  
[S3PRL](https://github.com/s3prl/s3prl/tree/main/s3prl/pretrain)
  - Used for training (with minor modifications to train on my own datasets).  

[distilhubert (hf)](https://huggingface.co/ntu-spml/distilhubert)  


Note: As with the original model, this model does not have a tokenizer because it was pretrained on audio alone. To use it for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled text data. See [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for a more detailed explanation of how to fine-tune the model.
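As a rough illustration of the "create a tokenizer" step, here is a minimal sketch that builds a character-level vocabulary from transcripts, in the style the linked blog uses for its `vocab.json`. The function names, the sample transcripts, and the choice of `|`, `[UNK]`, and `[PAD]` as special tokens are assumptions for illustration, not part of this checkpoint. A character-level vocabulary is only one option for Japanese; kana or phoneme units may suit your data better.

```python
import json


def build_char_vocab(transcripts):
    """Build a character-level CTC vocabulary from an iterable of transcripts.

    Spaces are mapped to "|" (used as the word delimiter), and "[UNK]" and
    "[PAD]" are appended as the last two entries.
    """
    chars = set()
    for text in transcripts:
        # Replace spaces first so " " and "|" can never collide in the vocab.
        chars.update(text.replace(" ", "|"))
    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def save_vocab(vocab, path):
    """Write the vocabulary as UTF-8 JSON, keeping Japanese characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, ensure_ascii=False, indent=2)


# Hypothetical transcripts; replace with the text of your labeled dataset.
sample_transcripts = ["あい", "い う"]
sample_vocab = build_char_vocab(sample_transcripts)
```

A file written with `save_vocab(sample_vocab, "vocab.json")` can then be passed to a CTC tokenizer such as `Wav2Vec2CTCTokenizer`, as described in the blog.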

# Usage

See [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for more information on how to fine-tune the model. Note that the class `Wav2Vec2ForCTC` has to be replaced by `HubertForCTC`.
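The class swap can be sketched as follows. This is a minimal example, assuming `transformers` (with PyTorch) is installed; `load_hubert_for_ctc` is a hypothetical helper name, and `checkpoint` is whatever Hub ID or local path you use for this model. `vocab_size` and `pad_token_id` must come from the tokenizer you created yourself.

```python
def load_hubert_for_ctc(checkpoint, vocab_size, pad_token_id):
    """Load a HuBERT-style checkpoint with a freshly initialized CTC head.

    `checkpoint` is a Hub ID or local directory (supplied by the caller);
    `vocab_size` and `pad_token_id` must match your own tokenizer.
    """
    # Imported lazily so that defining this helper does not require the library.
    from transformers import HubertForCTC  # used in place of Wav2Vec2ForCTC

    model = HubertForCTC.from_pretrained(
        checkpoint,
        vocab_size=vocab_size,        # size of the CTC output layer
        pad_token_id=pad_token_id,    # also used as the CTC blank token
        ctc_loss_reduction="mean",
    )
    # Keep the convolutional feature encoder fixed during fine-tuning,
    # as the blog does for wav2vec 2.0.
    model.freeze_feature_encoder()
    return model
```

Because the checkpoint was trained without a CTC head, the output layer is newly initialized and loading will report missing weights for it; that is expected before fine-tuning.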

Note: This is probably not the best checkpoint; I think it would become more accurate with continued training. I'll try to continue when I have time.

## Credits
- [JVS corpus](https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus)
  
- [Tsukuyomi-Chan corpus](https://tyc.rei-yumesaki.net/material/corpus/)
```
  ■つくよみちゃんコーパス(CV.夢前黎)
```
[https://tyc.rei-yumesaki.net/material/corpus/](https://tyc.rei-yumesaki.net/material/corpus/)
  
- [Amitaro's ITA corpus](https://amitaro.net/)
```
あみたろの声素材工房
```
[https://amitaro.net/](https://amitaro.net/)
  
Thanks!