Update app.py
app.py
CHANGED
@@ -175,12 +175,16 @@ with gr.Blocks(theme=theme, css=css) as demo:
 ## 🎶YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
 ## Model card:
 - Model name: `{model_name}`
+<details>
+<summary>Details</summary>
+
 - Encoder backbone: Perceiver-TF + Mixture of Experts (2/8)
 - Decoder backbone: Multi-channel T5-small
 - Tokenizer: MT3 tokens with Singing extension
 - Dataset: YourMT3 dataset
 - Augmentation strategy: Intra-/Cross dataset stem augment, No Pitch-shifting
 - FP Precision: BF16-mixed for training, FP16 for inference
+</details>
 
 ## Caution:
 - Currently running on CPU, and it takes longer than 3 minutes for a 30-second input. Please try [GPU-HuggingFace-demo](mimbres/YourMT3) for fast inference.
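The change wraps the model-card bullets in a collapsible `<details>` element. A minimal sketch of how such a Markdown string could be assembled is shown here; the helper name `build_model_card` and the sample model name are hypothetical and do not come from `app.py`, and only part of the bullet list is reproduced. The blank line after `<summary>` mirrors the one added in the diff. It is presumably there so that the Markdown renderer leaves the HTML block and parses the bullets as a list, though that depends on the renderer in use.

```python
from textwrap import dedent


def build_model_card(model_name: str) -> str:
    """Return a Markdown model card with its details collapsed.

    Hypothetical helper for illustration; app.py may build the string differently.
    """
    return dedent(f"""\
        ## Model card:
        - Model name: `{model_name}`
        <details>
        <summary>Details</summary>

        - Encoder backbone: Perceiver-TF + Mixture of Experts (2/8)
        - Decoder backbone: Multi-channel T5-small
        - FP Precision: BF16-mixed for training, FP16 for inference
        </details>
        """)


if __name__ == "__main__":
    # "demo-model" is a placeholder, not a real checkpoint name.
    print(build_model_card("demo-model"))
```

In the app, a string like this would most likely be handed to a Markdown component inside the `gr.Blocks` context named in the hunk header, but that call is not visible in this diff.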