1step ASR-NL for two-person patient-doctor medical exams.

This model is adapted from a NeMo-based ASR model. It takes a turn-taking doctor-patient conversation as streaming input and outputs speaker roles, transcription, key phrases, intent, and speaker emotion.

This model is suitable for efficient and accurate transcription of two-person patient-doctor medical exams.

Model Details

Model type: NeMo ASR
Architecture: Conformer CTC
Language: English
Training data: Speech Simulated Medical Exams dataset
Performance metrics: [Metrics]

Usage

To use this model, you need to install the NeMo library:

pip install "nemo_toolkit[asr]"

How to run

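The command below assumes you are inside the NeMo Docker container or have nemo_toolkit installed, and that the NeMo repository is checked out so the example script is available.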

# Make sure you set chunk_len_in_secs=4.0, total_buffer_in_secs=7.2, model_stride=4

python NeMo/examples/asr/asr_chunked_inference/ctc/speech_to_text_buffered_infer_ctc.py \
          model_path=<path-to-nemo-file> \
          audio_dir=<folder-with-audio-files>
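
To get a feel for what the chunk and buffer settings above imply, here is a minimal sketch of the arithmetic behind buffered (chunked) inference. It is not taken from the NeMo script. The function and field names are hypothetical, the 10 ms feature frame shift is an assumption (it is a common preprocessor setting but is not stated in this card), and the extra buffer audio is assumed to be split evenly before and after each chunk. Times are kept in integer milliseconds to avoid floating-point rounding.

```python
from dataclasses import dataclass


def ceil_div(a: int, b: int) -> int:
    """Ceiling division on integers (no floating-point rounding)."""
    return -(-a // b)


@dataclass(frozen=True)
class BufferPlan:
    token_ms: int             # audio covered by one encoder output step
    context_ms_per_side: int  # extra audio on each side of a chunk
    tokens_per_chunk: int     # output steps kept for each chunk
    tokens_per_buffer: int    # output steps produced for a whole buffer
    mid_delay_tokens: int     # steps spanning one context side plus the chunk


def plan_buffered_inference(
    chunk_ms: int,
    buffer_ms: int,
    model_stride: int,
    frame_shift_ms: int = 10,  # ASSUMPTION: 10 ms feature frames
) -> BufferPlan:
    """Derive per-chunk token counts from the chunk/buffer settings."""
    if buffer_ms < chunk_ms:
        raise ValueError("buffer must be at least as long as the chunk")
    if model_stride <= 0 or frame_shift_ms <= 0:
        raise ValueError("model_stride and frame_shift_ms must be positive")

    # Each encoder output step covers model_stride feature frames.
    token_ms = frame_shift_ms * model_stride
    # ASSUMPTION: the surplus buffer audio is split evenly left and right.
    context_ms = (buffer_ms - chunk_ms) // 2

    return BufferPlan(
        token_ms=token_ms,
        context_ms_per_side=context_ms,
        tokens_per_chunk=ceil_div(chunk_ms, token_ms),
        tokens_per_buffer=ceil_div(buffer_ms, token_ms),
        mid_delay_tokens=ceil_div(chunk_ms + context_ms, token_ms),
    )


if __name__ == "__main__":
    # Settings from this card: chunk 4.0 s, buffer 7.2 s, model stride 4.
    print(plan_buffered_inference(chunk_ms=4000, buffer_ms=7200, model_stride=4))
```

Under these assumptions, each 4.0 s chunk is decoded with 1.6 s of surrounding audio on either side, and only the output steps that fall inside the chunk itself are kept. A larger buffer gives the model more context per chunk at the cost of more computation and latency.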