---
license: cc
datasets:
- speechlab/SPRING_INX_R1
tags:
- ASR
- speech-recognition
---
# Fairseq Inference Setup and Usage
This repository provides a streamlined setup and guide for performing inference with Fairseq models, tailored for automatic speech recognition.
## Table of Contents
1. [Setup Instructions](#setup-instructions)
2. [Download Required Models](#download-required-models)
3. [Running Inference](#running-inference)
4. [Getting Transcripts](#getting-transcripts)
---
### Setup Instructions
To set up the environment and install necessary dependencies for Fairseq inference, follow these steps.
#### 1. Create and Activate a Virtual Environment
Choose between Python's `venv` or Conda for environment management.
Using `venv`:
```bash
python3.8 -m venv lm_env # use python3.8 or adjust for your preferred version
source lm_env/bin/activate
```
Using Conda:
```bash
conda create -n fairseq_inference python=3.8.10
conda activate fairseq_inference
```
#### 2. Install PyTorch and CUDA
Install the appropriate version of PyTorch and CUDA for your setup:
```bash
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```
If using Python 3.10 and CUDA 12.4, install from the cu124 wheel index instead:
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```
#### 3. Install Additional Packages
```bash
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX
```
#### 4. Clone the Fairseq Inference Repository
```bash
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git
cd Fairseq-Inference/fairseq-0.12.2
pip install --editable ./
python setup.py build develop
```
---
### Download Required Models
Download the model checkpoints required for your ASR task and place them in a single directory; the path to this directory is passed to the inference script as `model_path`.
---
### Running Inference
Once setup is complete and models are downloaded, use the following command to run inference:
```bash
python3 infer.py model_path audio_path
```
This script takes the model directory and an audio file as positional arguments and generates a transcription.
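To transcribe many files, the same CLI can be driven from a short Python wrapper. This is a minimal sketch, assuming `infer.py` accepts exactly the two positional arguments shown above; `build_infer_cmd` and `transcribe_all` are illustrative helper names, not part of the repository:

```python
import subprocess
from pathlib import Path

def build_infer_cmd(model_path, audio_path):
    # Mirrors the CLI call above: python3 infer.py model_path audio_path
    return ["python3", "infer.py", str(model_path), str(audio_path)]

def transcribe_all(model_path, audio_dir):
    # Run inference once per .wav file in audio_dir (hypothetical batch loop)
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        subprocess.run(build_infer_cmd(model_path, wav), check=True)
```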
### Getting Transcripts
After running the inference script, the transcript for the provided audio file appears in the script's output.
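To gauge transcript quality against a reference text, word error rate can be computed with the `editdistance` package installed earlier. The sketch below uses a pure-Python Levenshtein implementation instead so it has no dependencies; `edit_distance` and `wer` are illustrative names, not functions provided by this repository:

```python
def edit_distance(ref, hyp):
    # Classic single-row dynamic-programming Levenshtein distance over word lists
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            # deletion, insertion, or substitution/match
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (ref[i - 1] != hyp[j - 1]))
            prev = cur
    return dp[n]

def wer(reference, hypothesis):
    # Word error rate: word-level edit distance divided by reference length
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

For example, `wer("hello world", "hello word")` counts one substitution over two reference words, giving 0.5.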