---
license: cc
datasets:
- speechlab/SPRING_INX_R1
tags:
- ASR
- speech-recognition
---
# Fairseq Inference Setup and Usage
This repository provides a streamlined setup and guide for performing inference with Fairseq models, tailored for automatic speech recognition.
## Table of Contents
1. [Setup Instructions](#setup-instructions)
2. [Download Required Models](#download-required-models)
3. [Running Inference](#running-inference)
4. [Getting Transcripts](#getting-transcripts)
---
### Setup Instructions
To set up the environment and install necessary dependencies for Fairseq inference, follow these steps.
#### 1. Create and Activate a Virtual Environment
Choose between Python's `venv` or Conda for environment management.
Using `venv`:
```bash
python3.8 -m venv lm_env # use python3.8 or adjust for your preferred version
source lm_env/bin/activate
```
Using Conda:
```bash
conda create -n fairseq_inference python=3.8.10
conda activate fairseq_inference
```
#### 2. Install PyTorch and CUDA
Install the appropriate version of PyTorch and CUDA for your setup:
```bash
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```
If using Python 3.10 and CUDA 12.4, install from the cu124 wheel index instead:
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```
#### 3. Install Additional Packages
```bash
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX
```
#### 4. Clone the Fairseq Inference Repository
```bash
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git
cd Fairseq-Inference/fairseq-0.12.2
pip install --editable ./
python setup.py build develop
```
---
### Download Required Models
Download the model checkpoints required for your ASR task and place them in a single directory; the path to this directory is passed to the inference script as `model_path`.
---
### Running Inference
Once setup is complete and models are downloaded, use the following command to run inference:
```bash
python3 infer.py model_path audio_path
```
This script takes the model directory and an audio file as positional arguments and generates a transcription.
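To transcribe many files, the same CLI can be driven from a short Python wrapper. This is a minimal sketch, assuming `infer.py` accepts exactly the two positional arguments shown above; `build_infer_cmd` and `transcribe_all` are illustrative helper names, not part of the repository:

```python
import subprocess
from pathlib import Path

def build_infer_cmd(model_path, audio_path):
    # Mirrors the CLI call above: python3 infer.py model_path audio_path
    return ["python3", "infer.py", str(model_path), str(audio_path)]

def transcribe_all(model_path, audio_dir):
    # Run inference once per .wav file in audio_dir (hypothetical batch loop)
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        subprocess.run(build_infer_cmd(model_path, wav), check=True)
```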
### Getting Transcripts
After running the inference script, the transcript for the provided audio file appears in the script's output.
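To gauge transcript quality against a reference text, word error rate can be computed with the `editdistance` package installed earlier. The sketch below uses a pure-Python Levenshtein implementation instead so it has no dependencies; `edit_distance` and `wer` are illustrative names, not functions provided by this repository:

```python
def edit_distance(ref, hyp):
    # Classic single-row dynamic-programming Levenshtein distance over word lists
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            # deletion, insertion, or substitution/match
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (ref[i - 1] != hyp[j - 1]))
            prev = cur
    return dp[n]

def wer(reference, hypothesis):
    # Word error rate: word-level edit distance divided by reference length
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

For example, `wer("hello world", "hello word")` counts one substitution over two reference words, giving 0.5.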