Update README.md
Browse files
README.md
CHANGED
@@ -14,32 +14,12 @@ Kokoro is a frontier TTS model for its size of 82 million parameters (text in/au
|
|
14 |
|
15 |
## Table of contents
|
16 |
|
17 |
-
- [Samples](#samples)
|
18 |
- [Usage](#usage)
|
19 |
- [JavaScript](#javascript)
|
20 |
- [Python](#python)
|
|
|
21 |
- [Quantizations](#quantizations)
|
22 |
|
23 |
-
## Samples
|
24 |
-
|
25 |
-
|
26 |
-
> Life is like a box of chocolates. You never know what you're gonna get.
|
27 |
-
|
28 |
-
|
29 |
-
| Voice | Nationality | Gender | Sample |
|
30 |
-
|--------------------------|-------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------|
|
31 |
-
| Default (`af`) | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/C0_ZUcNSAxvMwpS8QbnKv.wav"></audio> |
|
32 |
-
| Bella (`af_bella`) | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/B_q15Z_FXdgBP9-Hk9oKq.wav"></audio> |
|
33 |
-
| Nicole (`af_nicole`) | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/sS8U5lQHkhgX7rwTmy-5w.wav"></audio> |
|
34 |
-
| Sarah (`af_sarah`) | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/SokkBiqEqwxLLx_pqvf1p.wav"></audio> |
|
35 |
-
| Sky (`af_sky`) | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/IzySGHUtl5mYeFxx1oaRf.wav"></audio> |
|
36 |
-
| Adam (`am_adam`) | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/9n6myE6--ZsEuF5xDv5eC.wav"></audio> |
|
37 |
-
| Michael (`am_michael`) | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/EPFciGtTU1YUXu8MAw7DX.wav"></audio> |
|
38 |
-
| Emma (`bf_emma`) | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/AGEsXs-gyJq3dsyo7PjHo.wav"></audio> |
|
39 |
-
| Isabella (`bf_isabella`) | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/JEzrrXYJSDcmlEzI7tE0c.wav"></audio> |
|
40 |
-
| George (`bm_george`) | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/nsv4zKB4MX2TvXRxv504k.wav"></audio> |
|
41 |
-
| Lewis (`bm_lewis`) | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/g_mcBl2xTbQl0sbrpZt48.wav"></audio> |
|
42 |
-
|
43 |
|
44 |
## Usage
|
45 |
|
@@ -105,6 +85,45 @@ import scipy.io.wavfile as wavfile
|
|
105 |
wavfile.write('audio.wav', 24000, audio[0])
|
106 |
```
|
107 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
108 |
## Quantizations
|
109 |
|
110 |
The model is resilient to quantization, enabling efficient high-quality speech synthesis at a fraction of the original model size.
|
@@ -121,4 +140,4 @@ The model is resilient to quantization, enabling efficient high-quality speech s
|
|
121 |
| model_uint8.onnx (8-bit & mixed precision) | 177 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/tpOWRHIWwEb0PJX46dCWQ.wav"></audio> |
|
122 |
| model_uint8f16.onnx (Mixed precision) | 114 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/vtZhABzjP0pvGD7dRb5Vr.wav"></audio> |
|
123 |
| model_q4.onnx (4-bit matmul) | 305 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/8FVn0IJIUfccEBWq8Fnw_.wav"></audio> |
|
124 |
-
| model_q4f16.onnx (4-bit matmul & fp16 weights) | 154 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/7DrgWC_1q00s-wUJuG44X.wav"></audio> |
|
|
|
14 |
|
15 |
## Table of contents
|
16 |
|
|
|
17 |
- [Usage](#usage)
|
18 |
- [JavaScript](#javascript)
|
19 |
- [Python](#python)
|
20 |
+
- [Voices/Samples](#voicessamples)
|
21 |
- [Quantizations](#quantizations)
|
22 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
23 |
|
24 |
## Usage
|
25 |
|
|
|
85 |
wavfile.write('audio.wav', 24000, audio[0])
|
86 |
```
|
87 |
|
88 |
+
|
89 |
+
## Voices/Samples
|
90 |
+
|
91 |
+
|
92 |
+
> Life is like a box of chocolates. You never know what you're gonna get.
|
93 |
+
|
94 |
+
|
95 |
+
| Name | Nationality | Gender | Sample |
|
96 |
+
| ------------ | ----------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------- |
|
97 |
+
| **af_heart** | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/S_9tkA75BT_QHKOzSX6S-.wav"></audio> |
|
98 |
+
| af_alloy | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/wiZ3gvlL--p5pRItO4YRE.wav"></audio> |
|
99 |
+
| af_aoede | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/Nv1xMwzjTdF9MR8v0oEEJ.wav"></audio> |
|
100 |
+
| af_bella | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/sWN0rnKU6TlLsVdGqRktF.wav"></audio> |
|
101 |
+
| af_jessica | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/2Oa4wITWAmiCXJ_Q97-7R.wav"></audio> |
|
102 |
+
| af_kore | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/AOIgyspzZWDGpn7oQgwtu.wav"></audio> |
|
103 |
+
| af_nicole | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/EY_V2OGr-hzmtTGrTCTyf.wav"></audio> |
|
104 |
+
| af_nova | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/X-xdEkx3GPlQG5DK8Gsqd.wav"></audio> |
|
105 |
+
| af_river | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/ZqaV2-xGUZdBQmZAF1Xqy.wav"></audio> |
|
106 |
+
| af_sarah | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/xzoJBl1HCvkE8Fl8Xu2R4.wav"></audio> |
|
107 |
+
| af_sky | American | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/ubebYQoaseyQk-jDLeWX7.wav"></audio> |
|
108 |
+
| am_adam | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/tvauhDVRGvGK98I-4wv3H.wav"></audio> |
|
109 |
+
| am_echo | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/qy_KuUB0hXsu-u8XaJJ_Z.wav"></audio> |
|
110 |
+
| am_eric | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/JhqPjbpMhraUv5nTSPpwD.wav"></audio> |
|
111 |
+
| am_fenrir | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/c0R9caBdBiNjGUUalI_DQ.wav"></audio> |
|
112 |
+
| am_liam | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/DFHvulaLeOjXIDKecvNG3.wav"></audio> |
|
113 |
+
| am_michael | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/IPKhsnjq1tPh3JmHH8nEg.wav"></audio> |
|
114 |
+
| am_onyx | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/ov0pFDfE8NNKZ80LqW6Di.wav"></audio> |
|
115 |
+
| am_puck | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/MOC654sLMHWI64g8HWesV.wav"></audio> |
|
116 |
+
| am_santa | American | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/LzA6JmHBvQlhOviy8qVfJ.wav"></audio> |
|
117 |
+
| bf_alice | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/9mnYZ3JWq7f6U12plXilA.wav"></audio> |
|
118 |
+
| bf_emma | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/_fvGtKMttRI0cZVGqxMh8.wav"></audio> |
|
119 |
+
| bf_isabella | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/VzlcJpqGEND_Q3duYnhiu.wav"></audio> |
|
120 |
+
| bf_lily | British | Female | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/qZCoartohiRlVamY8Xpok.wav"></audio> |
|
121 |
+
| bm_daniel | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/Eb0TLnLXHDRYOA3TJQKq3.wav"></audio> |
|
122 |
+
| bm_fable | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/NT9XkmvlezQ0FJ6Th5hoZ.wav"></audio> |
|
123 |
+
| bm_george | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/y6VJbCESszLZGupPoqNkF.wav"></audio> |
|
124 |
+
| bm_lewis | British | Male | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/RlB5BRvLt-IFvTjzQNxCh.wav"></audio> |
|
125 |
+
|
126 |
+
|
127 |
## Quantizations
|
128 |
|
129 |
The model is resilient to quantization, enabling efficient high-quality speech synthesis at a fraction of the original model size.
|
|
|
140 |
| model_uint8.onnx (8-bit & mixed precision) | 177 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/tpOWRHIWwEb0PJX46dCWQ.wav"></audio> |
|
141 |
| model_uint8f16.onnx (Mixed precision) | 114 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/vtZhABzjP0pvGD7dRb5Vr.wav"></audio> |
|
142 |
| model_q4.onnx (4-bit matmul) | 305 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/8FVn0IJIUfccEBWq8Fnw_.wav"></audio> |
|
143 |
+
| model_q4f16.onnx (4-bit matmul & fp16 weights) | 154 | <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/7DrgWC_1q00s-wUJuG44X.wav"></audio> |
|