devasheeshG/whisper_medium_fp16_transformers

license: apache-2.0

Versions:

CUDA: 12.1
cuDNN Version: 8.9.2.26_1.0-1_amd64

tensorflow Version: 2.12.0
torch Version: 2.1.0.dev20230606+cu121
transformers Version: 4.30.2
accelerate Version: 0.20.3
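
To compare a local environment against the versions listed above, a minimal sketch using only the standard library could look like this. The helper name `installed_versions` is made up for illustration; the package names are the ones from the list.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the version list in this README.
    wanted = ["tensorflow", "torch", "transformers", "accelerate"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

CUDA and cuDNN are system libraries rather than Python packages, so they are not covered by this check.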

BENCHMARK:

RAM: 2.8 GB (original model: 5.5 GB)
VRAM: 1812 MB (original model: 6 GB)
test.wav: 23 s (multilingual speech, i.e. English + Hindi)
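
As a rough sanity check on the memory figures above, the size of the weights scales with the number of bytes per parameter. The sketch below assumes roughly 769 million parameters for Whisper medium (a figure from the original Whisper release, not stated in this README) and counts weights only. Activations, the CUDA context, and framework overhead are not included, so measured usage will be higher.

```python
# Assumption: approximate parameter count of Whisper medium (not from this README).
PARAMS_WHISPER_MEDIUM = 769_000_000

BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_mb(num_params, dtype):
    """Approximate memory taken by the weights alone, in MB (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_mb(PARAMS_WHISPER_MEDIUM, dtype):.0f} MB")
```

Halving the bytes per parameter is in line with the roughly 2x RAM reduction reported above.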

| Device Name | float32 (Original) | float16 | CudaCores | TensorCores |
|---|---|---|---|---|
| 3060 | 1.7 | 1.1 | 3,584 | 112 |
| 1660 Super | can't use this model | 3.3 | 1,408 | - |
| Colab (Tesla T4) | 2.8 | 2.2 | 2,560 | 320 |
| CPU | - | - | - | - |

- CPU: torch.float16 is not supported on CPU (AMD Ryzen 5 3600 or Colab)
Punctuation: True
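
Because float16 inference is only usable on the GPU here, a caller may want to choose the precision from the device. The following is a sketch of that decision, not code from this repo. `pick_dtype_name` is a hypothetical helper, and it returns a dtype name rather than a dtype object so that the block runs without PyTorch installed.

```python
def pick_dtype_name(device: str) -> str:
    """Choose a precision name for a device string such as 'cuda', 'cuda:0' or 'cpu'."""
    # float16 is reported above as unsupported on CPU, so fall back to float32 there.
    return "float16" if device.startswith("cuda") else "float32"


print(pick_dtype_name("cuda"))
print(pick_dtype_name("cpu"))
```

With PyTorch available, `getattr(torch, pick_dtype_name(device))` turns the name into `torch.float16` or `torch.float32`. Whether this repo's `Model` class accepts a dtype argument is not documented here.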

Usage

This repo contains an __init__.py file with all the code needed to use this model.

First, clone this repo and place all the files inside a folder.

Running the example in a Jupyter notebook is recommended.
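
The import below only works if Python can find the cloned folder as a package, that is, if the folder's parent directory is on `sys.path`. A notebook started in that parent directory satisfies this already. If the clone lives elsewhere, a sketch like the following could help. The helper name `add_parent_to_path` and the example path are hypothetical.

```python
import sys
from pathlib import Path


def add_parent_to_path(repo_dir):
    """Put the directory that contains repo_dir on sys.path and return it."""
    parent = str(Path(repo_dir).resolve().parent)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


# Hypothetical location of the clone; adjust to your setup.
add_parent_to_path("models/whisper_medium_fp16_transformers")
```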

# Import the Model
from whisper_medium_fp16_transformers import Model

# Initialise the model
model = Model(
    model_name_or_path='whisper_medium_fp16_transformers',
    cuda_visible_device="0",
    device='cuda',
)

# Load Audio
audio = model.load_audio('test.wav')

# Transcribe (the first call takes longer than later ones)
model.transcribe(audio)
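
Since the first transcription is slower than later ones, it helps to time the warm-up call separately when measuring speed. A minimal sketch of such a measurement follows. `time_calls` is a hypothetical helper, and a stand-in function replaces `model.transcribe(audio)` so the block runs without the model.

```python
import time


def time_calls(fn, repeats=3):
    """Call fn once as warm-up, then `repeats` more times.

    Returns (warmup_seconds, [run_seconds, ...]).
    """
    start = time.perf_counter()
    fn()
    warmup = time.perf_counter() - start

    runs = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        runs.append(time.perf_counter() - start)
    return warmup, runs


# Stand-in workload; with the model loaded, pass `lambda: model.transcribe(audio)` instead.
warmup, runs = time_calls(lambda: sum(range(100_000)))
print(f"warm-up: {warmup:.4f} s, later runs: {[round(r, 4) for r in runs]}")
```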