|
--- |
|
license: cc |
|
datasets: |
|
- speechlab/SPRING_INX_R1 |
|
tags: |
|
- ASR |
|
- speech-recognition |
|
--- |
|
|
|
|
|
# Fairseq Inference Setup and Usage |
|
|
|
This repository provides a streamlined setup and usage guide for running inference with Fairseq models, tailored for automatic speech recognition (ASR). |
|
|
|
## Table of Contents |
|
|
|
1. [Setup Instructions](#setup-instructions) |
|
2. [Download Required Models](#download-required-models) |
|
3. [Running Inference](#running-inference) |
|
4. [Getting Transcripts](#getting-transcripts) |
|
|
|
--- |
|
|
|
### Setup Instructions |
|
|
|
To set up the environment and install necessary dependencies for Fairseq inference, follow these steps. |
|
|
|
#### 1. Create and Activate a Virtual Environment |
|
|
|
Choose between Python's `venv` or Conda for environment management. |
|
|
|
Using `venv`: |
|
```bash |
|
python3.8 -m venv lm_env # use python3.8 or adjust for your preferred version |
|
source lm_env/bin/activate |
|
``` |
|
|
|
Using Conda: |
|
```bash |
|
conda create -n fairseq_inference python=3.8.10 |
|
conda activate fairseq_inference |
|
``` |
|
|
|
#### 2. Install PyTorch and CUDA |
|
|
|
Install the appropriate version of PyTorch and CUDA for your setup: |
|
```bash |
|
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html |
|
``` |
|
|
|
If using Python 3.10.15 and CUDA 12.4, install matching wheels from the cu124 package index instead: |

```bash |

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 |

``` |
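
Before moving on, it can help to confirm the installed packages are actually importable. The helper below is a minimal sketch of our own (the function name is not part of Fairseq or PyTorch):

```python
import importlib.util

def find_missing(packages=("torch", "torchvision", "torchaudio")):
    """Return the subset of packages that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

missing = find_missing()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All inference dependencies are importable.")
```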
|
|
|
#### 3. Install Additional Packages |
|
|
|
```bash |
|
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX |
|
``` |
|
|
|
#### 4. Clone the Fairseq Inference Repository |
|
|
|
```bash |
|
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git |
|
cd Fairseq-Inference/fairseq-0.12.2 |
|
pip install --editable ./ |
|
python setup.py build develop |

``` |
|
|
|
--- |
|
|
|
### Download Required Models |
|
|
|
Download the models required for your ASR task and place them in a directory of your choice; that directory is passed to the inference script as `model_path`. |
|
|
|
### Running Inference |
|
|
|
Once setup is complete and models are downloaded, use the following command to run inference: |
|
|
|
```bash |
|
python3 infer.py model_path audio_path |
|
``` |
|
|
|
The script takes the model directory (`model_path`) and an audio file (`audio_path`) and generates a transcription. |
|
|
|
### Getting Transcripts |
|
|
|
After running the inference script, the transcript for the provided audio file appears in the script's output. |