devasheeshG/whisper_medium_fp16_transformers

Automatic Speech Recognition

Inference Endpoints

devasheeshG committed on Jul 2, 2023

Commit 803f441 · 1 Parent(s): 967796b

added model card

Files changed (4)

.gitattributes +1 -0
README.md +58 -0
requirements.txt +5 -0
test.wav +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+test.wav filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,61 @@
 ---
 license: apache-2.0
 ---
+## Versions:
+- CUDA: 12.1
+- cuDNN Version: 8.9.2.26_1.0-1_amd64
+- tensorflow Version: 2.12.0
+- torch Version: 2.1.0.dev20230606+cu121
+- transformers Version: 4.30.2
+- accelerate Version: 0.20.3
+## BENCHMARK:
+- RAM: 2.8 GB (Original_Model: 5.5GB)
+- VRAM: 1812 MB (Original_Model: 6GB)
+- test.wav: 23 s (Multilingual Speech i.e. English+Hindi)
+  | Device Name       | float32 (Original)   | float16 | CudaCores | TensorCores |
+  | ----------------- | -------------------- | ------- | --------- | ----------- |
+  | 3060              | 1.7                  | 1.1     | 3,584     | 112         |
+  | 1660 Super        | can't use this model | 3.3     | 1,408     | -           |
+  | Colab (Tesla T4)  | 2.8                  | 2.2     | 2,560     | 320         |
+  | CPU               | -                    | -       | -         | -           |
+  - CPU: torch.float16 is not supported on CPU (AMD Ryzen 5 3600 or Colab GPU)
+- Punctuation: True
+## Usage
+This repo includes an ``__init__.py`` file that contains all the code needed to use this model.
+First, clone this repo and place all of its files inside a folder.
+**Please try it in a Jupyter notebook.**
+```python
+# Import the Model
+from whisper_medium_fp16_transformers import Model
+```
+```python
+# Initialize the model
+model = Model(
+    model_name_or_path='whisper_medium_fp16_transformers',
+    cuda_visible_device="0",
+    device='cuda',
+)
+```
+```python
+# Load Audio
+audio = model.load_audio('test.wav')
+```
+```python
+# Transcribe (the first transcription takes longer than later ones)
+model.transcribe(audio)
+```
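
The README notes that the first transcription is slower than later ones. A minimal sketch of how that warm-up cost could be measured is shown here. It is not part of this commit: `time_calls` is a hypothetical helper, and the usage comment assumes the `model` and `audio` objects from the README snippets.

```python
import time
from typing import Any, Callable, List


def time_calls(fn: Callable[[], Any], repeats: int = 3) -> List[float]:
    """Run fn() `repeats` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical usage with the objects from the README snippets:
#   durations = time_calls(lambda: model.transcribe(audio))
#   warm_up, steady = durations[0], durations[1:]
```

Comparing the first duration against the rest separates one-time setup cost from steady-state transcription time, which is likely the more meaningful figure to compare across devices.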

requirements.txt ADDED

	@@ -0,0 +1,5 @@

+ffmpeg_python==0.2.0
+numpy==1.23.5
+torch==2.1.0.dev20230606+cu121
+transformers==4.30.2
+accelerate==0.20.3
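
Because every dependency is pinned to an exact version (including a nightly `torch` build), it can help to check the local environment against these pins before running the model. The following is a sketch only, not part of this commit: `parse_pins` and `installed_version` are hypothetical helpers, and the pin list is copied from the lines above.

```python
from importlib import metadata
from typing import Dict, Optional


def parse_pins(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines into a {name: version} mapping."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Pins copied from requirements.txt in this commit.
REQUIREMENTS = """\
ffmpeg_python==0.2.0
numpy==1.23.5
torch==2.1.0.dev20230606+cu121
transformers==4.30.2
accelerate==0.20.3
"""

if __name__ == "__main__":
    for name, wanted in parse_pins(REQUIREMENTS).items():
        found = installed_version(name)
        status = "ok" if found == wanted else f"mismatch (installed: {found})"
        print(f"{name}=={wanted}: {status}")
```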

test.wav ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1483a4b2c200e9c0fd9c3006158665740f739c81c20da572afbbf33e3a5a3fd6
+size 4452466
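
The three lines above are a Git LFS pointer, not the audio itself: the repository stores only the spec version, the SHA-256 object ID and the byte size, and the real `test.wav` is fetched separately. As a sketch under that reading, a downloaded file could be checked against the pointer as follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers and are not part of this commit.

```python
import hashlib
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    return fields


def matches_pointer(path: str, pointer: Dict[str, str]) -> bool:
    """Check a local file's size and SHA-256 digest against the pointer."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    algo, _, expected = pointer["oid"].partition(":")
    return (
        algo == "sha256"
        and size == int(pointer["size"])
        and digest.hexdigest() == expected
    )


# Pointer contents copied from the test.wav entry in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1483a4b2c200e9c0fd9c3006158665740f739c81c20da572afbbf33e3a5a3fd6
size 4452466
"""

# Hypothetical usage after cloning with LFS enabled:
#   ok = matches_pointer("test.wav", parse_lfs_pointer(POINTER))
```

If the check fails on a fresh clone, the likely cause is that the checkout still holds the small pointer text rather than the 4,452,466-byte audio file.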