thanks to facebook ❤

Files changed (17)

README.md +336 -0
added_tokens.json +100 -0
config.json +117 -0
generation_config.json +0 -0
m4t_v2_multitask_unity2.pt +3 -0
model-00001-of-00002.safetensors +3 -0
model-00002-of-00002.safetensors +3 -0
model.safetensors.index.json +0 -0
preprocessor_config.json +111 -0
seamlessM4T_v2_large.pt +3 -0
seamlessm4t_arch.svg +1088 -0
sentencepiece.bpe.model +3 -0
special_tokens_map.json +144 -0
spm_char_lang38_tc.model +3 -0
tokenizer.model +3 -0
tokenizer_config.json +933 -0
vocoder_v2.pt +3 -0

README.md ADDED

	@@ -0,0 +1,336 @@

+---
+license: cc-by-nc-4.0
+language:
+- af
+- am
+- ar
+- as
+- az
+- be
+- bn
+- bs
+- bg
+- ca
+- cs
+- zh
+- cy
+- da
+- de
+- el
+- en
+- et
+- fi
+- fr
+- or
+- om
+- ga
+- gl
+- gu
+- ha
+- he
+- hi
+- hr
+- hu
+- hy
+- ig
+- id
+- is
+- it
+- jv
+- ja
+- kn
+- ka
+- kk
+- mn
+- km
+- ky
+- ko
+- lo
+- ln
+- lt
+- lb
+- lg
+- lv
+- ml
+- mr
+- mk
+- mt
+- mi
+- my
+- nl
+- nb
+- ne
+- ny
+- oc
+- pa
+- ps
+- fa
+- pl
+- pt
+- ro
+- ru
+- sk
+- sl
+- sn
+- sd
+- so
+- es
+- sr
+- sv
+- sw
+- ta
+- te
+- tg
+- tl
+- th
+- tr
+- uk
+- ur
+- uz
+- vi
+- wo
+- xh
+- yo
+- ms
+- zu
+- ary
+- arz
+- yue
+- kea
+metrics:
+- bleu
+- wer
+- chrf
+inference: False
+tags:
+  - automatic-speech-recognition
+  - audio-to-audio
+  - text-to-speech
+library_name: seamless_communication
+---
+# SeamlessM4T v2
+**SeamlessM4T** is our foundational all-in-one **M**assively **M**ultilingual and **M**ultimodal **M**achine **T**ranslation model delivering high-quality translation for speech and text in nearly 100 languages.
+SeamlessM4T models support the tasks of:
+- Speech-to-speech translation (S2ST)
+- Speech-to-text translation (S2TT)
+- Text-to-speech translation (T2ST)
+- Text-to-text translation (T2TT)
+- Automatic speech recognition (ASR).
+SeamlessM4T models support:
+- 🎤 101 languages for speech input.
+- 💬 96 languages for text input/output.
+- 🔊 35 languages for speech output.
+🌟 We are releasing SeamlessM4T v2, an updated version built on our novel *UnitY2* architecture.
+SeamlessM4T v2 is a multitask adaptation of *UnitY2*.
+With its hierarchical character-to-unit upsampling and non-autoregressive text-to-unit decoding, *UnitY2* considerably improves over SeamlessM4T v1 in quality and in inference speed for speech generation tasks.
+**SeamlessM4T v2 is also supported by 🤗 Transformers, more on it [in the dedicated section below](#transformers-usage).**
+![SeamlessM4T architectures](seamlessm4t_arch.svg)
+## SeamlessM4T models
+| Model Name         | #params | checkpoint                                                                              | metrics                                                                              |
+| ------------------ | ------- | --------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
+| [SeamlessM4T-Large v2](https://huggingface.co/facebook/seamless-m4t-v2-large)  | 2.3B    | [checkpoint](https://huggingface.co/facebook/seamless-m4t-v2-large/blob/main/seamlessM4T_v2_large.pt)   | [metrics](https://dl.fbaipublicfiles.com/seamless/metrics/seamlessM4T_large_v2.zip)  |
+| [SeamlessM4T-Large (v1)](https://huggingface.co/facebook/seamless-m4t-large) | 2.3B    | [checkpoint](https://huggingface.co/facebook/seamless-m4t-large/blob/main/multitask_unity_large.pt)   | [metrics](https://dl.fbaipublicfiles.com/seamless/metrics/seamlessM4T_large.zip)  |
+| [SeamlessM4T-Medium (v1)](https://huggingface.co/facebook/seamless-m4t-medium) | 1.2B    | [checkpoint](https://huggingface.co/facebook/seamless-m4t-medium/blob/main/multitask_unity_medium.pt) | [metrics](https://dl.fbaipublicfiles.com/seamless/metrics/seamlessM4T_medium.zip) |
+We provide the extensive evaluation results of SeamlessM4T-Large and SeamlessM4T-Medium reported in the paper (as averages) in the `metrics` files above.
+The evaluation data IDs for FLEURS, CoVoST2 and CVSS-C can be found [here](https://dl.fbaipublicfiles.com/seamless/metrics/evaluation_data_ids.zip).
+## Evaluating SeamlessM4T models
+To reproduce our results or to evaluate using the same metrics over your own test sets, please check out the [Evaluation README here](https://github.com/facebookresearch/seamless_communication/tree/main/src/seamless_communication/cli/m4t/evaluate).
+## Finetuning SeamlessM4T models
+Please check out the [Finetuning README here](https://github.com/facebookresearch/seamless_communication/tree/main/src/seamless_communication/cli/m4t/finetune).
+## Transformers usage
+SeamlessM4T is available in the 🤗 Transformers library, requiring minimal dependencies. Steps to get started:
+1. First install the 🤗 [Transformers library](https://github.com/huggingface/transformers) from main and [sentencepiece](https://github.com/google/sentencepiece):
+```
+pip install git+https://github.com/huggingface/transformers.git sentencepiece
+```
+2. Run the following Python code to generate speech samples. Here the target language is Russian:
+```py
+from transformers import AutoProcessor, SeamlessM4Tv2Model
+import torchaudio
+processor = AutoProcessor.from_pretrained("facebook/seamless-m4t-v2-large")
+model = SeamlessM4Tv2Model.from_pretrained("facebook/seamless-m4t-v2-large")
+# from text
+text_inputs = processor(text="Hello, my dog is cute", src_lang="eng", return_tensors="pt")
+audio_array_from_text = model.generate(**text_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
+# from audio
+audio, orig_freq = torchaudio.load("https://www2.cs.uic.edu/~i101/SoundFiles/preamble10.wav")
+audio = torchaudio.functional.resample(audio, orig_freq=orig_freq, new_freq=16_000)  # must be a 16 kHz waveform array
+audio_inputs = processor(audios=audio, return_tensors="pt")
+audio_array_from_audio = model.generate(**audio_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
+```
+3. Listen to the audio samples either in an ipynb notebook:
+```py
+from IPython.display import Audio
+sample_rate = model.sampling_rate
+Audio(audio_array_from_text, rate=sample_rate)
+# Audio(audio_array_from_audio, rate=sample_rate)
+```
+Or save them as a `.wav` file using a third-party library, e.g. `scipy`:
+```py
+import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile is always available
+sample_rate = model.sampling_rate
+scipy.io.wavfile.write("out_from_text.wav", rate=sample_rate, data=audio_array_from_text)
+# scipy.io.wavfile.write("out_from_audio.wav", rate=sample_rate, data=audio_array_from_audio)
+```
+For more details on running inference with SeamlessM4T in the 🤗 Transformers library, refer to the
+**[SeamlessM4T v2 docs](https://huggingface.co/docs/transformers/main/en/model_doc/seamless_m4t_v2)** or to this **hands-on [Google Colab](https://colab.research.google.com/github/ylacombe/scripts_and_notebooks/blob/main/v2_seamless_m4t_hugging_face.ipynb).**
+## Supported Languages
+Listed below are the languages supported by SeamlessM4T-Large (v1/v2).
+The `Source` column specifies whether a language is supported as source speech (`Sp`) and/or source text (`Tx`).
+The `Target` column specifies whether a language is supported as target speech (`Sp`) and/or target text (`Tx`).
+| code | language               | script     | Source | Target |
+| ---- | ---------------------- | ---------- | ------ | ------ |
+| afr  | Afrikaans              | Latn       | Sp, Tx | Tx     |
+| amh  | Amharic                | Ethi       | Sp, Tx | Tx     |
+| arb  | Modern Standard Arabic | Arab       | Sp, Tx | Sp, Tx |
+| ary  | Moroccan Arabic        | Arab       | Sp, Tx | Tx     |
+| arz  | Egyptian Arabic        | Arab       | Sp, Tx | Tx     |
+| asm  | Assamese               | Beng       | Sp, Tx | Tx     |
+| ast  | Asturian               | Latn       | Sp     | \--    |
+| azj  | North Azerbaijani      | Latn       | Sp, Tx | Tx     |
+| bel  | Belarusian             | Cyrl       | Sp, Tx | Tx     |
+| ben  | Bengali                | Beng       | Sp, Tx | Sp, Tx |
+| bos  | Bosnian                | Latn       | Sp, Tx | Tx     |
+| bul  | Bulgarian              | Cyrl       | Sp, Tx | Tx     |
+| cat  | Catalan                | Latn       | Sp, Tx | Sp, Tx |
+| ceb  | Cebuano                | Latn       | Sp, Tx | Tx     |
+| ces  | Czech                  | Latn       | Sp, Tx | Sp, Tx |
+| ckb  | Central Kurdish        | Arab       | Sp, Tx | Tx     |
+| cmn  | Mandarin Chinese       | Hans       | Sp, Tx | Sp, Tx |
+| cmn_Hant  | Mandarin Chinese  | Hant       | Sp, Tx | Sp, Tx |
+| cym  | Welsh                  | Latn       | Sp, Tx | Sp, Tx |
+| dan  | Danish                 | Latn       | Sp, Tx | Sp, Tx |
+| deu  | German                 | Latn       | Sp, Tx | Sp, Tx |
+| ell  | Greek                  | Grek       | Sp, Tx | Tx     |
+| eng  | English                | Latn       | Sp, Tx | Sp, Tx |
+| est  | Estonian               | Latn       | Sp, Tx | Sp, Tx |
+| eus  | Basque                 | Latn       | Sp, Tx | Tx     |
+| fin  | Finnish                | Latn       | Sp, Tx | Sp, Tx |
+| fra  | French                 | Latn       | Sp, Tx | Sp, Tx |
+| fuv  | Nigerian Fulfulde      | Latn       | Sp, Tx | Tx     |
+| gaz  | West Central Oromo     | Latn       | Sp, Tx | Tx     |
+| gle  | Irish                  | Latn       | Sp, Tx | Tx     |
+| glg  | Galician               | Latn       | Sp, Tx | Tx     |
+| guj  | Gujarati               | Gujr       | Sp, Tx | Tx     |
+| heb  | Hebrew                 | Hebr       | Sp, Tx | Tx     |
+| hin  | Hindi                  | Deva       | Sp, Tx | Sp, Tx |
+| hrv  | Croatian               | Latn       | Sp, Tx | Tx     |
+| hun  | Hungarian              | Latn       | Sp, Tx | Tx     |
+| hye  | Armenian               | Armn       | Sp, Tx | Tx     |
+| ibo  | Igbo                   | Latn       | Sp, Tx | Tx     |
+| ind  | Indonesian             | Latn       | Sp, Tx | Sp, Tx |
+| isl  | Icelandic              | Latn       | Sp, Tx | Tx     |
+| ita  | Italian                | Latn       | Sp, Tx | Sp, Tx |
+| jav  | Javanese               | Latn       | Sp, Tx | Tx     |
+| jpn  | Japanese               | Jpan       | Sp, Tx | Sp, Tx |
+| kam  | Kamba                  | Latn       | Sp     | \--    |
+| kan  | Kannada                | Knda       | Sp, Tx | Tx     |
+| kat  | Georgian               | Geor       | Sp, Tx | Tx     |
+| kaz  | Kazakh                 | Cyrl       | Sp, Tx | Tx     |
+| kea  | Kabuverdianu           | Latn       | Sp     | \--    |
+| khk  | Halh Mongolian         | Cyrl       | Sp, Tx | Tx     |
+| khm  | Khmer                  | Khmr       | Sp, Tx | Tx     |
+| kir  | Kyrgyz                 | Cyrl       | Sp, Tx | Tx     |
+| kor  | Korean                 | Kore       | Sp, Tx | Sp, Tx |
+| lao  | Lao                    | Laoo       | Sp, Tx | Tx     |
+| lit  | Lithuanian             | Latn       | Sp, Tx | Tx     |
+| ltz  | Luxembourgish          | Latn       | Sp     | \--    |
+| lug  | Ganda                  | Latn       | Sp, Tx | Tx     |
+| luo  | Luo                    | Latn       | Sp, Tx | Tx     |
+| lvs  | Standard Latvian       | Latn       | Sp, Tx | Tx     |
+| mai  | Maithili               | Deva       | Sp, Tx | Tx     |
+| mal  | Malayalam              | Mlym       | Sp, Tx | Tx     |
+| mar  | Marathi                | Deva       | Sp, Tx | Tx     |
+| mkd  | Macedonian             | Cyrl       | Sp, Tx | Tx     |
+| mlt  | Maltese                | Latn       | Sp, Tx | Sp, Tx |
+| mni  | Meitei                 | Beng       | Sp, Tx | Tx     |
+| mya  | Burmese                | Mymr       | Sp, Tx | Tx     |
+| nld  | Dutch                  | Latn       | Sp, Tx | Sp, Tx |
+| nno  | Norwegian Nynorsk      | Latn       | Sp, Tx | Tx     |
+| nob  | Norwegian Bokmål       | Latn       | Sp, Tx | Tx     |
+| npi  | Nepali                 | Deva       | Sp, Tx | Tx     |
+| nya  | Nyanja                 | Latn       | Sp, Tx | Tx     |
+| oci  | Occitan                | Latn       | Sp     | \--    |
+| ory  | Odia                   | Orya       | Sp, Tx | Tx     |
+| pan  | Punjabi                | Guru       | Sp, Tx | Tx     |
+| pbt  | Southern Pashto        | Arab       | Sp, Tx | Tx     |
+| pes  | Western Persian        | Arab       | Sp, Tx | Sp, Tx |
+| pol  | Polish                 | Latn       | Sp, Tx | Sp, Tx |
+| por  | Portuguese             | Latn       | Sp, Tx | Sp, Tx |
+| ron  | Romanian               | Latn       | Sp, Tx | Sp, Tx |
+| rus  | Russian                | Cyrl       | Sp, Tx | Sp, Tx |
+| slk  | Slovak                 | Latn       | Sp, Tx | Sp, Tx |
+| slv  | Slovenian              | Latn       | Sp, Tx | Tx     |
+| sna  | Shona                  | Latn       | Sp, Tx | Tx     |
+| snd  | Sindhi                 | Arab       | Sp, Tx | Tx     |
+| som  | Somali                 | Latn       | Sp, Tx | Tx     |
+| spa  | Spanish                | Latn       | Sp, Tx | Sp, Tx |
+| srp  | Serbian                | Cyrl       | Sp, Tx | Tx     |
+| swe  | Swedish                | Latn       | Sp, Tx | Sp, Tx |
+| swh  | Swahili                | Latn       | Sp, Tx | Sp, Tx |
+| tam  | Tamil                  | Taml       | Sp, Tx | Tx     |
+| tel  | Telugu                 | Telu       | Sp, Tx | Sp, Tx |
+| tgk  | Tajik                  | Cyrl       | Sp, Tx | Tx     |
+| tgl  | Tagalog                | Latn       | Sp, Tx | Sp, Tx |
+| tha  | Thai                   | Thai       | Sp, Tx | Sp, Tx |
+| tur  | Turkish                | Latn       | Sp, Tx | Sp, Tx |
+| ukr  | Ukrainian              | Cyrl       | Sp, Tx | Sp, Tx |
+| urd  | Urdu                   | Arab       | Sp, Tx | Sp, Tx |
+| uzn  | Northern Uzbek         | Latn       | Sp, Tx | Sp, Tx |
+| vie  | Vietnamese             | Latn       | Sp, Tx | Sp, Tx |
+| xho  | Xhosa                  | Latn       | Sp     | \--    |
+| yor  | Yoruba                 | Latn       | Sp, Tx | Tx     |
+| yue  | Cantonese              | Hant       | Sp, Tx | Tx     |
+| zlm  | Colloquial Malay       | Latn       | Sp     | \--    |
+| zsm  | Standard Malay         | Latn       | Tx     | Tx     |
+| zul  | Zulu                   | Latn       | Sp, Tx | Tx     |
+Note that SeamlessM4T-Medium supports 200 languages in the text modality, and is based on NLLB-200 (see the full list in the [asset card](https://github.com/facebookresearch/seamless_communication/blob/main/src/seamless_communication/cards/unity_nllb-200.yaml)).
+## Citation
+For SeamlessM4T v2, please cite:
+```bibtex
+@inproceedings{seamless2023,
+   title="Seamless: Multilingual Expressive and Streaming Speech Translation",
+   author="{Seamless Communication}, Lo{\"i}c Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, Yilin Yang, Ethan Ye, Ivan Evtimov, Pierre Fernandez, Cynthia Gao, Prangthip Hansanti, Elahe Kalbassi, Amanda Kallet, Artyom Kozhevnikov, Gabriel Mejia, Robin San Roman, Christophe Touret, Corinne Wong, Carleigh Wood, Bokai Yu, Pierre Andrews, Can Balioglu, Peng-Jen Chen, Marta R. Costa-juss{\`a}, Maha Elbayad, Hongyu Gong, Francisco Guzm{\'a}n, Kevin Heffernan, Somya Jain, Justine Kao, Ann Lee, Xutai Ma, Alex Mourachko, Benjamin Peloquin, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Anna Sun, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang, Mary Williamson",
+  journal={ArXiv},
+  year={2023}
+}
+```
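The Supported Languages table in the README above determines which of the five tasks are available for a given language pair. A minimal sketch of that lookup follows, using only a handful of rows copied from the table. The `SUPPORT` dictionary and the `supported_tasks` helper are hypothetical names introduced for illustration and are not part of `transformers` or `seamless_communication`. Treating ASR as the same-language speech-to-text case is an assumption of this sketch.

```python
# A few rows copied from the Supported Languages table: code -> (source, target).
# "Sp" = speech, "Tx" = text. Only a subset is shown; extend it from the table.
SUPPORT = {
    "eng": ({"Sp", "Tx"}, {"Sp", "Tx"}),
    "rus": ({"Sp", "Tx"}, {"Sp", "Tx"}),
    "ell": ({"Sp", "Tx"}, {"Tx"}),
    "yue": ({"Sp", "Tx"}, {"Tx"}),
    "ast": ({"Sp"}, set()),          # speech input only, no output modality
    "zsm": ({"Tx"}, {"Tx"}),         # text only
}

# (task name, required source modality, required target modality)
TRANSLATION_TASKS = (
    ("S2ST", "Sp", "Sp"),
    ("S2TT", "Sp", "Tx"),
    ("T2ST", "Tx", "Sp"),
    ("T2TT", "Tx", "Tx"),
)


def supported_tasks(src_lang, tgt_lang):
    """Return the task names the table allows for a source/target pair."""
    src_modes, _ = SUPPORT[src_lang]
    _, tgt_modes = SUPPORT[tgt_lang]
    if src_lang == tgt_lang:
        # Assumption: same-language speech-to-text is treated as ASR.
        return ["ASR"] if "Sp" in src_modes and "Tx" in tgt_modes else []
    return [
        name
        for name, src_mode, tgt_mode in TRANSLATION_TASKS
        if src_mode in src_modes and tgt_mode in tgt_modes
    ]


if __name__ == "__main__":
    for pair in (("eng", "rus"), ("eng", "ell"), ("ast", "eng"), ("eng", "eng")):
        print(pair, supported_tasks(*pair))
```

Checking the pair before calling `generate` avoids requesting speech output for a target such as Greek (`ell`), which the table lists as text-only.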

added_tokens.json ADDED

	@@ -0,0 +1,100 @@

+{
+  "__afr__": 256001,
+  "__amh__": 256002,
+  "__arb__": 256003,
+  "__ary__": 256004,
+  "__arz__": 256005,
+  "__asm__": 256006,
+  "__azj__": 256007,
+  "__bel__": 256008,
+  "__ben__": 256009,
+  "__bos__": 256010,
+  "__bul__": 256011,
+  "__cat__": 256012,
+  "__ceb__": 256013,
+  "__ces__": 256014,
+  "__ckb__": 256015,
+  "__cmn_Hant__": 256017,
+  "__cmn__": 256016,
+  "__cym__": 256018,
+  "__dan__": 256019,
+  "__deu__": 256020,
+  "__ell__": 256021,
+  "__eng__": 256022,
+  "__est__": 256023,
+  "__eus__": 256024,
+  "__fin__": 256025,
+  "__fra__": 256026,
+  "__fuv__": 256027,
+  "__gaz__": 256028,
+  "__gle__": 256029,
+  "__glg__": 256030,
+  "__guj__": 256031,
+  "__heb__": 256032,
+  "__hin__": 256033,
+  "__hrv__": 256034,
+  "__hun__": 256035,
+  "__hye__": 256036,
+  "__ibo__": 256037,
+  "__ind__": 256038,
+  "__isl__": 256039,
+  "__ita__": 256040,
+  "__jav__": 256041,
+  "__jpn__": 256042,
+  "__kan__": 256043,
+  "__kat__": 256044,
+  "__kaz__": 256045,
+  "__khk__": 256046,
+  "__khm__": 256047,
+  "__kir__": 256048,
+  "__kor__": 256049,
+  "__lao__": 256050,
+  "__lit__": 256051,
+  "__lug__": 256052,
+  "__luo__": 256053,
+  "__lvs__": 256054,
+  "__mai__": 256055,
+  "__mal__": 256056,
+  "__mar__": 256057,
+  "__mkd__": 256058,
+  "__mlt__": 256059,
+  "__mni__": 256060,
+  "__mya__": 256061,
+  "__nld__": 256062,
+  "__nno__": 256063,
+  "__nob__": 256064,
+  "__npi__": 256065,
+  "__nya__": 256066,
+  "__ory__": 256067,
+  "__pan__": 256068,
+  "__pbt__": 256069,
+  "__pes__": 256070,
+  "__pol__": 256071,
+  "__por__": 256072,
+  "__ron__": 256073,
+  "__rus__": 256074,
+  "__sat__": 256075,
+  "__slk__": 256076,
+  "__slv__": 256077,
+  "__sna__": 256078,
+  "__snd__": 256079,
+  "__som__": 256080,
+  "__spa__": 256081,
+  "__srp__": 256082,
+  "__swe__": 256083,
+  "__swh__": 256084,
+  "__tam__": 256085,
+  "__tel__": 256086,
+  "__tgk__": 256087,
+  "__tgl__": 256088,
+  "__tha__": 256089,
+  "__tur__": 256090,
+  "__ukr__": 256091,
+  "__urd__": 256092,
+  "__uzn__": 256093,
+  "__vie__": 256094,
+  "__yor__": 256095,
+  "__yue__": 256096,
+  "__zlm__": 256097,
+  "__zul__": 256098
+}
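The file above maps each language token of the form `__xxx__` to an ID placed after the base vocabulary. The sketch below shows how a three-letter code such as `eng` could be turned into its token and ID, and checks that the 98 entries form one contiguous block. `ADDED_TOKENS` holds only a few entries copied from the file, and `lang_token` and `lang_token_id` are hypothetical helpers. That the tokenizer wraps codes in double underscores this way is an assumption based on the token names shown here.

```python
# A few entries copied from added_tokens.json (the full file has 98).
ADDED_TOKENS = {
    "__afr__": 256001,
    "__cmn__": 256016,
    "__cmn_Hant__": 256017,
    "__eng__": 256022,
    "__rus__": 256074,
    "__zul__": 256098,
}

FIRST_LANG_ID = 256001   # ID of "__afr__", the first language token
NUM_LANG_TOKENS = 98     # number of entries in the file


def lang_token(code):
    """Wrap a language code in the double-underscore token format."""
    return f"__{code}__"


def lang_token_id(code):
    """Look up the ID of a language token, failing loudly if it is unknown."""
    token = lang_token(code)
    if token not in ADDED_TOKENS:
        raise KeyError(f"no language token for {code!r}")
    return ADDED_TOKENS[token]


# The IDs run without gaps from the first to the last language token.
last_lang_id = FIRST_LANG_ID + NUM_LANG_TOKENS - 1

if __name__ == "__main__":
    print(lang_token("eng"), lang_token_id("eng"))
    print("last language ID:", last_lang_id)
```

Note that the file lists `__cmn_Hant__` (256017) before `__cmn__` (256016). The keys are in a different order than the IDs, so lookups should go by key rather than by position.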

config.json ADDED

	@@ -0,0 +1,117 @@

+{
+  "activation_dropout": 0.0,
+  "activation_function": "relu",
+  "adaptor_dropout": 0.1,
+  "adaptor_kernel_size": 8,
+  "adaptor_stride": 8,
+  "add_adapter": true,
+  "architectures": [
+    "SeamlessM4Tv2Model"
+  ],
+  "attention_dropout": 0.1,
+  "bos_token_id": 2,
+  "char_vocab_size": 10943,
+  "conv_depthwise_kernel_size": 31,
+  "decoder_attention_heads": 16,
+  "decoder_ffn_dim": 8192,
+  "decoder_layerdrop": 0.05,
+  "decoder_layers": 24,
+  "decoder_start_token_id": 3,
+  "dropout": 0.1,
+  "encoder_attention_heads": 16,
+  "encoder_ffn_dim": 8192,
+  "encoder_layerdrop": 0.05,
+  "encoder_layers": 24,
+  "eos_token_id": 3,
+  "feature_projection_input_dim": 160,
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "is_encoder_decoder": true,
+  "lang_embed_dim": 256,
+  "layer_norm_eps": 1e-05,
+  "leaky_relu_slope": 0.1,
+  "left_max_position_embeddings": 64,
+  "max_new_tokens": 256,
+  "max_position_embeddings": 4096,
+  "model_type": "seamless_m4t_v2",
+  "num_adapter_layers": 1,
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "position_embeddings_type": "relative_key",
+  "resblock_dilation_sizes": [
+    [
+      1,
+      3,
+      5
+    ],
+    [
+      1,
+      3,
+      5
+    ],
+    [
+      1,
+      3,
+      5
+    ]
+  ],
+  "resblock_kernel_sizes": [
+    3,
+    7,
+    11
+  ],
+  "right_max_position_embeddings": 8,
+  "sampling_rate": 16000,
+  "scale_embedding": true,
+  "speech_encoder_attention_heads": 16,
+  "speech_encoder_chunk_size": 20000,
+  "speech_encoder_dropout": 0.0,
+  "speech_encoder_hidden_act": "swish",
+  "speech_encoder_intermediate_size": 4096,
+  "speech_encoder_layerdrop": 0.1,
+  "speech_encoder_layers": 24,
+  "speech_encoder_left_chunk_num": 128,
+  "spkr_embed_dim": 256,
+  "t2u_bos_token_id": 0,
+  "t2u_decoder_attention_heads": 16,
+  "t2u_decoder_ffn_dim": 8192,
+  "t2u_decoder_layers": 6,
+  "t2u_encoder_attention_heads": 16,
+  "t2u_encoder_ffn_dim": 8192,
+  "t2u_encoder_layers": 6,
+  "t2u_eos_token_id": 2,
+  "t2u_max_position_embeddings": 4096,
+  "t2u_pad_token_id": 1,
+  "t2u_variance_pred_dropout": 0.5,
+  "t2u_variance_predictor_embed_dim": 1024,
+  "t2u_variance_predictor_hidden_dim": 256,
+  "t2u_variance_predictor_kernel_size": 3,
+  "t2u_vocab_size": 10082,
+  "torch_dtype": "float32",
+  "transformers_version": "4.36.0.dev0",
+  "unit_embed_dim": 1280,
+  "unit_hifi_gan_vocab_size": 10000,
+  "upsample_initial_channel": 512,
+  "upsample_kernel_sizes": [
+    11,
+    8,
+    8,
+    4,
+    4
+  ],
+  "upsample_rates": [
+    5,
+    4,
+    4,
+    2,
+    2
+  ],
+  "use_cache": true,
+  "var_pred_dropout": 0.5,
+  "variance_predictor_kernel_size": 3,
+  "vocab_size": 256102,
+  "vocoder_num_langs": 36,
+  "vocoder_num_spkrs": 200,
+  "vocoder_offset": 4
+}
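A few quantities can be derived directly from the values in this config. The sketch below computes the per-head attention dimension and the total upsampling factor of the vocoder. It assumes the vocoder follows the usual HiFi-GAN layout, where each input frame is expanded by the product of `upsample_rates`. That reading is an assumption and is not stated in the file. The `CONFIG` dictionary copies only the fields it needs.

```python
import math

# Values copied from config.json above.
CONFIG = {
    "hidden_size": 1024,
    "num_attention_heads": 16,
    "sampling_rate": 16000,
    "upsample_rates": [5, 4, 4, 2, 2],
}

# Width of each attention head in the text encoder/decoder.
head_dim = CONFIG["hidden_size"] // CONFIG["num_attention_heads"]

# Assumption: as in a standard HiFi-GAN, every input frame of the vocoder
# becomes prod(upsample_rates) output samples.
total_upsample = math.prod(CONFIG["upsample_rates"])

# Duration of audio produced per vocoder input frame, in milliseconds.
ms_per_frame = 1000 * total_upsample / CONFIG["sampling_rate"]

if __name__ == "__main__":
    print("head_dim:", head_dim)
    print("samples per vocoder frame:", total_upsample)
    print("ms per vocoder frame:", ms_per_frame)
```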

generation_config.json ADDED

The diff for this file is too large to render. See raw diff

m4t_v2_multitask_unity2.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2ebd8fe5fbf99dc2363f7da7fa6e318cf9317f682cfc0a252ade38e0a3fbacab
+size 11386771266
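The three lines above are a Git LFS pointer rather than the checkpoint itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded file against it is shown below. `parse_lfs_pointer`, `sha256_of` and `matches_pointer` are hypothetical helper names, and the path passed to them would be wherever the checkpoint was downloaded.

```python
import hashlib
import os

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2ebd8fe5fbf99dc2363f7da7fa6e318cf9317f682cfc0a252ade38e0a3fbacab
size 11386771266
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte checkpoints fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and hash in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first, avoids hashing a truncated download
    return sha256_of(path) == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)

if __name__ == "__main__":
    print(pointer["algo"], pointer["size"], "bytes")
```

Comparing the size before hashing catches an interrupted download without reading the whole file.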

model-00001-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:85cab984fbc111f8713827c440499453b9e66f262862866eeed8725c302ba2ac
+size 4999163080

model-00002-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9536dc05892a6ca8410bcfea763dde422e2430c3cc1f3acd26c55182c8989017
+size 4238114628
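The two shard sizes give a quick sanity check on the parameter count quoted in the README table (2.3B). The sketch below assumes every tensor is stored as 4-byte `float32`, in line with `"torch_dtype": "float32"` in config.json, and ignores the small safetensors header. The result is therefore only an approximation.

```python
# Sizes in bytes, copied from the two safetensors LFS pointers above.
SHARD_SIZES = [4999163080, 4238114628]

BYTES_PER_PARAM = 4  # assumption: all weights stored as float32

total_bytes = sum(SHARD_SIZES)

# Rough parameter count; header bytes and any non-float32 tensors are ignored.
approx_params = total_bytes / BYTES_PER_PARAM
approx_params_billions = approx_params / 1e9

if __name__ == "__main__":
    print("total bytes:", total_bytes)
    print(f"approx. parameters: {approx_params_billions:.2f}B")
```

The estimate lands close to the 2.3B figure in the README table, which suggests the shards hold a single full-precision copy of the model.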

model.safetensors.index.json ADDED

The diff for this file is too large to render. See raw diff

preprocessor_config.json ADDED

	@@ -0,0 +1,111 @@

+{
+  "feature_extractor_type": "SeamlessM4TFeatureExtractor",
+  "feature_size": 80,
+  "language_code": [
+    "__afr__",
+    "__amh__",
+    "__arb__",
+    "__ary__",
+    "__arz__",
+    "__asm__",
+    "__azj__",
+    "__bel__",
+    "__ben__",
+    "__bos__",
+    "__bul__",
+    "__cat__",
+    "__ceb__",
+    "__ces__",
+    "__ckb__",
+    "__cmn__",
+    "__cmn_Hant__",
+    "__cym__",
+    "__dan__",
+    "__deu__",
+    "__ell__",
+    "__eng__",
+    "__est__",
+    "__eus__",
+    "__fin__",
+    "__fra__",
+    "__fuv__",
+    "__gaz__",
+    "__gle__",
+    "__glg__",
+    "__guj__",
+    "__heb__",
+    "__hin__",
+    "__hrv__",
+    "__hun__",
+    "__hye__",
+    "__ibo__",
+    "__ind__",
+    "__isl__",
+    "__ita__",
+    "__jav__",
+    "__jpn__",
+    "__kan__",
+    "__kat__",
+    "__kaz__",
+    "__khk__",
+    "__khm__",
+    "__kir__",
+    "__kor__",
+    "__lao__",
+    "__lit__",
+    "__lug__",
+    "__luo__",
+    "__lvs__",
+    "__mai__",
+    "__mal__",
+    "__mar__",
+    "__mkd__",
+    "__mlt__",
+    "__mni__",
+    "__mya__",
+    "__nld__",
+    "__nno__",
+    "__nob__",
+    "__npi__",
+    "__nya__",
+    "__ory__",
+    "__pan__",
+    "__pbt__",
+    "__pes__",
+    "__pol__",
+    "__por__",
+    "__ron__",
+    "__rus__",
+    "__sat__",
+    "__slk__",
+    "__slv__",
+    "__sna__",
+    "__snd__",
+    "__som__",
+    "__spa__",
+    "__srp__",
+    "__swe__",
+    "__swh__",
+    "__tam__",
+    "__tel__",
+    "__tgk__",
+    "__tgl__",
+    "__tha__",
+    "__tur__",
+    "__ukr__",
+    "__urd__",
+    "__uzn__",
+    "__vie__",
+    "__yor__",
+    "__yue__",
+    "__zlm__",
+    "__zul__"
+  ],
+  "num_mel_bins": 80,
+  "padding_side": "right",
+  "padding_value": 0.0,
+  "processor_class": "SeamlessM4TProcessor",
+  "return_attention_mask": true,
+  "sampling_rate": 16000,
+  "stride": 2
+}
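The values `"feature_size": 80` and `"stride": 2` here line up with `"feature_projection_input_dim": 160` in config.json, since 80 × 2 = 160. One plausible reading is that the feature extractor concatenates every `stride` consecutive mel frames into one wider frame. The sketch below illustrates that stacking in plain Python. It is an assumption about the behaviour and not a copy of the library code, and `stack_frames` is a hypothetical helper.

```python
FEATURE_SIZE = 80  # "feature_size" / "num_mel_bins" in preprocessor_config.json
STRIDE = 2         # "stride" in preprocessor_config.json


def stack_frames(frames, stride=STRIDE):
    """Concatenate each group of `stride` consecutive frames into one frame.

    Trailing frames that do not fill a complete group are dropped.
    """
    usable = len(frames) - len(frames) % stride
    stacked = []
    for start in range(0, usable, stride):
        merged = []
        for frame in frames[start:start + stride]:
            merged.extend(frame)
        stacked.append(merged)
    return stacked


# Five dummy 80-dimensional frames; frame t is filled with the value t.
frames = [[float(t)] * FEATURE_SIZE for t in range(5)]
stacked = stack_frames(frames)

if __name__ == "__main__":
    print(len(frames), "frames of size", len(frames[0]))
    print(len(stacked), "stacked frames of size", len(stacked[0]))
```

Stacking halves the sequence length seen by the speech encoder while doubling the width of each input vector.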

seamlessM4T_v2_large.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f2122cedcffc532d6048847092414e638cfb4db402881cb8146a606008e9ff56
+size 9070060243

seamlessm4t_arch.svg ADDED

sentencepiece.bpe.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:026a76827537db9f1348e4d5aaa127bb10a2f2ff633243f3a52d16be82d73f9d
+size 5165809

special_tokens_map.json ADDED

	@@ -0,0 +1,144 @@

+{
+  "additional_special_tokens": [
+    "__afr__",
+    "__amh__",
+    "__arb__",
+    "__ary__",
+    "__arz__",
+    "__asm__",
+    "__azj__",
+    "__bel__",
+    "__ben__",
+    "__bos__",
+    "__bul__",
+    "__cat__",
+    "__ceb__",
+    "__ces__",
+    "__ckb__",
+    "__cmn__",
+    "__cmn_Hant__",
+    "__cym__",
+    "__dan__",
+    "__deu__",
+    "__ell__",
+    "__eng__",
+    "__est__",
+    "__eus__",
+    "__fin__",
+    "__fra__",
+    "__fuv__",
+    "__gaz__",
+    "__gle__",
+    "__glg__",
+    "__guj__",
+    "__heb__",
+    "__hin__",
+    "__hrv__",
+    "__hun__",
+    "__hye__",
+    "__ibo__",
+    "__ind__",
+    "__isl__",
+    "__ita__",
+    "__jav__",
+    "__jpn__",
+    "__kan__",
+    "__kat__",
+    "__kaz__",
+    "__khk__",
+    "__khm__",
+    "__kir__",
+    "__kor__",
+    "__lao__",
+    "__lit__",
+    "__lug__",
+    "__luo__",
+    "__lvs__",
+    "__mai__",
+    "__mal__",
+    "__mar__",
+    "__mkd__",
+    "__mlt__",
+    "__mni__",
+    "__mya__",
+    "__nld__",
+    "__nno__",
+    "__nob__",
+    "__npi__",
+    "__nya__",
+    "__ory__",
+    "__pan__",
+    "__pbt__",
+    "__pes__",
+    "__pol__",
+    "__por__",
+    "__ron__",
+    "__rus__",
+    "__sat__",
+    "__slk__",
+    "__slv__",
+    "__sna__",
+    "__snd__",
+    "__som__",
+    "__spa__",
+    "__srp__",
+    "__swe__",
+    "__swh__",
+    "__tam__",
+    "__tel__",
+    "__tgk__",
+    "__tgl__",
+    "__tha__",
+    "__tur__",
+    "__ukr__",
+    "__urd__",
+    "__uzn__",
+    "__vie__",
+    "__yor__",
+    "__yue__",
+    "__zlm__",
+    "__zul__"
+  ],
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
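The special tokens declared here should agree with the token IDs in config.json and with the first four entries of `added_tokens_decoder` in tokenizer_config.json. A minimal consistency check over values copied from those three files might look like the sketch below. The dictionary names and the `check_special_tokens` helper are hypothetical and introduced only for illustration.

```python
# From tokenizer_config.json ("added_tokens_decoder", IDs 0-3).
ID_TO_TOKEN = {0: "<pad>", 1: "<unk>", 2: "<s>", 3: "</s>"}

# From config.json.
CONFIG_IDS = {
    "pad_token_id": 0,
    "bos_token_id": 2,
    "eos_token_id": 3,
    "decoder_start_token_id": 3,
}

# From special_tokens_map.json (the "content" field of each entry).
SPECIAL_TOKENS = {
    "bos_token": "<s>",
    "cls_token": "<s>",
    "eos_token": "</s>",
    "sep_token": "</s>",
    "pad_token": "<pad>",
    "unk_token": "<unk>",
}


def check_special_tokens():
    """Return a list of mismatches between the three files (empty if none)."""
    problems = []
    for name in ("pad", "bos", "eos"):
        token_id = CONFIG_IDS[f"{name}_token_id"]
        expected = SPECIAL_TOKENS[f"{name}_token"]
        if ID_TO_TOKEN.get(token_id) != expected:
            problems.append(f"{name}: id {token_id} is not {expected!r}")
    return problems


# Token the decoder is primed with, according to config.json.
decoder_start_token = ID_TO_TOKEN[CONFIG_IDS["decoder_start_token_id"]]

if __name__ == "__main__":
    print("mismatches:", check_special_tokens())
    print("decoder starts with:", decoder_start_token)
```

The decoder start ID (3) resolves to `</s>`, the same token used for `eos_token` and `sep_token`.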

spm_char_lang38_tc.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e7f2075dbc38dbe11d2414bfa4fb8e900022e87bbff4f74c97817e32a7ab493
+size 368901

tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:026a76827537db9f1348e4d5aaa127bb10a2f2ff633243f3a52d16be82d73f9d
+size 5165809

tokenizer_config.json ADDED

	@@ -0,0 +1,933 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256001": {
+      "content": "__afr__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256002": {
+      "content": "__amh__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256003": {
+      "content": "__arb__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256004": {
+      "content": "__ary__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256005": {
+      "content": "__arz__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256006": {
+      "content": "__asm__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256007": {
+      "content": "__azj__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256008": {
+      "content": "__bel__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256009": {
+      "content": "__ben__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256010": {
+      "content": "__bos__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256011": {
+      "content": "__bul__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256012": {
+      "content": "__cat__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256013": {
+      "content": "__ceb__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256014": {
+      "content": "__ces__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256015": {
+      "content": "__ckb__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256016": {
+      "content": "__cmn__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256017": {
+      "content": "__cmn_Hant__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256018": {
+      "content": "__cym__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256019": {
+      "content": "__dan__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256020": {
+      "content": "__deu__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256021": {
+      "content": "__ell__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256022": {
+      "content": "__eng__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256023": {
+      "content": "__est__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256024": {
+      "content": "__eus__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256025": {
+      "content": "__fin__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256026": {
+      "content": "__fra__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256027": {
+      "content": "__fuv__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256028": {
+      "content": "__gaz__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256029": {
+      "content": "__gle__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256030": {
+      "content": "__glg__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256031": {
+      "content": "__guj__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256032": {
+      "content": "__heb__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256033": {
+      "content": "__hin__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256034": {
+      "content": "__hrv__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256035": {
+      "content": "__hun__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256036": {
+      "content": "__hye__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256037": {
+      "content": "__ibo__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256038": {
+      "content": "__ind__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256039": {
+      "content": "__isl__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256040": {
+      "content": "__ita__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256041": {
+      "content": "__jav__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256042": {
+      "content": "__jpn__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256043": {
+      "content": "__kan__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256044": {
+      "content": "__kat__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256045": {
+      "content": "__kaz__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256046": {
+      "content": "__khk__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256047": {
+      "content": "__khm__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256048": {
+      "content": "__kir__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256049": {
+      "content": "__kor__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256050": {
+      "content": "__lao__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256051": {
+      "content": "__lit__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256052": {
+      "content": "__lug__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256053": {
+      "content": "__luo__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256054": {
+      "content": "__lvs__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256055": {
+      "content": "__mai__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256056": {
+      "content": "__mal__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256057": {
+      "content": "__mar__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256058": {
+      "content": "__mkd__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256059": {
+      "content": "__mlt__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256060": {
+      "content": "__mni__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256061": {
+      "content": "__mya__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256062": {
+      "content": "__nld__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256063": {
+      "content": "__nno__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256064": {
+      "content": "__nob__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256065": {
+      "content": "__npi__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256066": {
+      "content": "__nya__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256067": {
+      "content": "__ory__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256068": {
+      "content": "__pan__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256069": {
+      "content": "__pbt__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256070": {
+      "content": "__pes__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256071": {
+      "content": "__pol__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256072": {
+      "content": "__por__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256073": {
+      "content": "__ron__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256074": {
+      "content": "__rus__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256075": {
+      "content": "__sat__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256076": {
+      "content": "__slk__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256077": {
+      "content": "__slv__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256078": {
+      "content": "__sna__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256079": {
+      "content": "__snd__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256080": {
+      "content": "__som__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256081": {
+      "content": "__spa__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256082": {
+      "content": "__srp__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256083": {
+      "content": "__swe__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256084": {
+      "content": "__swh__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256085": {
+      "content": "__tam__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256086": {
+      "content": "__tel__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256087": {
+      "content": "__tgk__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256088": {
+      "content": "__tgl__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256089": {
+      "content": "__tha__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256090": {
+      "content": "__tur__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256091": {
+      "content": "__ukr__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256092": {
+      "content": "__urd__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256093": {
+      "content": "__uzn__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256094": {
+      "content": "__vie__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256095": {
+      "content": "__yor__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256096": {
+      "content": "__yue__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256097": {
+      "content": "__zlm__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "256098": {
+      "content": "__zul__",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [
+    "__afr__",
+    "__amh__",
+    "__arb__",
+    "__ary__",
+    "__arz__",
+    "__asm__",
+    "__azj__",
+    "__bel__",
+    "__ben__",
+    "__bos__",
+    "__bul__",
+    "__cat__",
+    "__ceb__",
+    "__ces__",
+    "__ckb__",
+    "__cmn__",
+    "__cmn_Hant__",
+    "__cym__",
+    "__dan__",
+    "__deu__",
+    "__ell__",
+    "__eng__",
+    "__est__",
+    "__eus__",
+    "__fin__",
+    "__fra__",
+    "__fuv__",
+    "__gaz__",
+    "__gle__",
+    "__glg__",
+    "__guj__",
+    "__heb__",
+    "__hin__",
+    "__hrv__",
+    "__hun__",
+    "__hye__",
+    "__ibo__",
+    "__ind__",
+    "__isl__",
+    "__ita__",
+    "__jav__",
+    "__jpn__",
+    "__kan__",
+    "__kat__",
+    "__kaz__",
+    "__khk__",
+    "__khm__",
+    "__kir__",
+    "__kor__",
+    "__lao__",
+    "__lit__",
+    "__lug__",
+    "__luo__",
+    "__lvs__",
+    "__mai__",
+    "__mal__",
+    "__mar__",
+    "__mkd__",
+    "__mlt__",
+    "__mni__",
+    "__mya__",
+    "__nld__",
+    "__nno__",
+    "__nob__",
+    "__npi__",
+    "__nya__",
+    "__ory__",
+    "__pan__",
+    "__pbt__",
+    "__pes__",
+    "__pol__",
+    "__por__",
+    "__ron__",
+    "__rus__",
+    "__sat__",
+    "__slk__",
+    "__slv__",
+    "__sna__",
+    "__snd__",
+    "__som__",
+    "__spa__",
+    "__srp__",
+    "__swe__",
+    "__swh__",
+    "__tam__",
+    "__tel__",
+    "__tgk__",
+    "__tgl__",
+    "__tha__",
+    "__tur__",
+    "__ukr__",
+    "__urd__",
+    "__uzn__",
+    "__vie__",
+    "__yor__",
+    "__yue__",
+    "__zlm__",
+    "__zul__"
+  ],
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "<s>",
+  "eos_token": "</s>",
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "<pad>",
+  "processor_class": "SeamlessM4TProcessor",
+  "sep_token": "</s>",
+  "sp_model_kwargs": {},
+  "src_lang": "__eng__",
+  "tgt_lang": "__fra__",
+  "tokenizer_class": "SeamlessM4TTokenizer",
+  "unk_token": "<unk>"
+}
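
Reviewer note: the language codes in `additional_special_tokens` are the same strings that appear as `content` in the ID-keyed entries above, and `src_lang` / `tgt_lang` default to two of them. The sketch below is one way to recover the token-to-ID mapping using only the standard library. It runs on a two-entry excerpt rather than the full file, and it assumes the enclosing key of the ID-keyed block is `added_tokens_decoder`, which is not visible in this part of the diff. The helper name `language_token_ids` is hypothetical.

```python
import json

# Two-entry excerpt mirroring the structure in the diff above.
# ASSUMPTION: the enclosing key is "added_tokens_decoder"; that key
# lies outside the lines shown here.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "256022": {"content": "__eng__", "special": true},
    "256026": {"content": "__fra__", "special": true}
  },
  "additional_special_tokens": ["__eng__", "__fra__"],
  "src_lang": "__eng__",
  "tgt_lang": "__fra__"
}
"""


def language_token_ids(config: dict) -> dict:
    """Map each listed language token (e.g. "__eng__") to its integer ID."""
    listed = set(config["additional_special_tokens"])
    return {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
        if entry["content"] in listed
    }


config = json.loads(CONFIG_EXCERPT)
ids = language_token_ids(config)

# Look up the IDs of the default source and target language tokens.
src_id = ids[config["src_lang"]]
tgt_id = ids[config["tgt_lang"]]
print(src_id, tgt_id)
```

Against the full file, the same function would be called on `json.load(open("tokenizer_config.json"))`, provided the key assumption above holds.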

vocoder_v2.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:20c50c3edd3fb08704c10542bf3ec72e8a96aaba4ec09fb6ac1fa64172c8ca13
+size 167785015
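
Reviewer note: the three lines above are a Git LFS pointer, not the vocoder weights themselves. `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. A minimal sketch of checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check takes the payload as in-memory bytes for brevity.

```python
import hashlib

# Pointer text as it appears in the diff, including the leading "+" markers.
POINTER_TEXT = """
+version https://git-lfs.github.com/spec/v1
+oid sha256:20c50c3edd3fb08704c10542bf3ec72e8a96aaba4ec09fb6ac1fa64172c8ca13
+size 167785015
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' pointer lines, tolerating a leading diff '+'."""
    fields = {}
    for raw in text.strip().splitlines():
        line = raw.strip().lstrip("+")
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

For a file of this size (about 168 MB), feeding `hashlib.sha256()` in chunks from an open file handle would avoid loading everything into memory; the comparison logic stays the same.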