eustlb HF staff committed on
Commit 82daa1b · verified · 1 Parent(s): fec8e73

Update README.md

Files changed (1)
  1. README.md +0 -38
README.md CHANGED
@@ -93,44 +93,6 @@ We anticipate that Moonshine models’ transcription capabilities may be used fo
 
  There are also potential dual-use concerns that come with releasing Moonshine. While we hope the technology will be used primarily for beneficial purposes, making ASR technology more accessible could enable more actors to build capable surveillance technologies or scale up existing surveillance efforts, as the speed and accuracy allow for affordable automatic transcription and translation of large volumes of audio communication. Moreover, these models may have some capabilities to recognize specific individuals out of the box, which in turn presents safety concerns related both to dual use and disparate performance. In practice, we expect that the cost of transcription is not the limiting factor of scaling up surveillance projects.
 
- ## Setup
-
- * Install `uv` for Python environment management
-
- - Follow instructions [here](https://github.com/astral-sh/uv)
-
- * Create and activate virtual environment
-
- ```shell
- uv venv env_moonshine
- source env_moonshine/bin/activate
- ```
-
- * Install the `useful-moonshine` package from this github repo
-
- ```shell
- uv pip install transformers torchaudio
- ```
-
- * Test transcribing an audio file
-
- ```python
- from transformers import AutoModelForSpeechSeq2Seq, AutoConfig, PreTrainedTokenizerFast
-
- import torchaudio
- import sys
-
- audio, sr = torchaudio.load(sys.argv[1])
- if sr != 16000:
-     audio = torchaudio.functional.resample(audio, sr, 16000)
-
- model = AutoModelForSpeechSeq2Seq.from_pretrained('usefulsensors/moonshine-tiny', trust_remote_code=True)
- tokenizer = PreTrainedTokenizerFast.from_pretrained('usefulsensors/moonshine-tiny')
-
- tokens = model(audio)
- print(tokenizer.decode(tokens[0], skip_special_tokens=True))
- ```
-
  ## Citation
  If you benefit from our work, please cite us:
  ```
 