RafaG committed on
Commit cc8a95a · verified · 1 Parent(s): 5fa5566

Update README.md

  1. README.md +164 -152
README.md CHANGED
@@ -1,152 +1,164 @@
1
- # LeGen
2
-
3
- ![legen-wide](https://github.com/matheusbach/legen/assets/35426162/05a7acd2-52d5-43e0-8f31-7da7d6aa7c3c)
4
-
5
-
6
- LeGen is a Python script that uses Whisper/WhisperX AI to locally transcribes speech from media files, generating subtitle files, can translates the generated subtitles, inserts them into the mp4 container, and burns them directly into video
7
-
8
- This is very useful for making it available in another language, or even just subtitling any video that belongs to you or that you have the proper authorization to do so, be it a film, lecture, course, presentation, interview, etc.
9
-
10
- ## Run on Colab
11
-
12
- LeGen works on Google Colab, using their computing power to do the work. Aceess the link to [run on Google Colab](https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb)
13
-
14
- <a href='https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb' style='padding-left: 0.5rem;'><img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Google Colab'></a>
15
-
16
- ## Install locally:
17
-
18
- Install FFMpeg from [FFMPeg Oficial Site](https://ffmpeg.org/download.html) or from your linux package manager. _If using windows, prefer gyan_dev release full `choco install ffmpeg-full`_
19
-
20
- Install [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
21
-
22
- Install [Python](https://www.python.org/downloads/) 3.8 or up. _If using windows, select "Add to PATH" option when installing_
23
-
24
- Clone LeGen using git
25
- ```sh
26
- git clone https://github.com/matheusbach/legen.git
27
- cd legen
28
- ```
29
-
30
- Install requirements using pip. Is recommended to create a virtual environment (venv) as a good practice
31
- ```sh
32
- pip3 install -r requirements.txt --upgrade
33
- ```
34
-
35
- ### GPU compatibility
36
-
37
- If having troubles with GPU compatibility, get [PyTorch](https://pytorch.org/get-started/locally/) for your GPU.
38
-
39
- _And done. Now you can use LeGen_
40
-
41
- ### Update
42
-
43
- For dry-run update, use in legen folder:
44
- ```sh
45
- git fetch && git reset --hard origin/main && git pull
46
- pip3 install -r requirements.txt --upgrade --force-reinstall
47
- ```
48
-
49
- ## Run locally:
50
-
51
- To use LeGen, run the following command:
52
-
53
- The minimum comand line is:
54
-
55
- ```sh
56
- python3 legen.py -i [input_path]
57
- ```
58
-
59
- Users could for example also translate generated subtitles for other language like portuguese (pt) adding `--translate pt` to the command line
60
-
61
-
62
- Full options list are described bellow:
63
-
64
- - `-i`, `--input_path`: Specifies the path to the media files. This can be a folder containing files or an individual file. Example: `LeGen -i /path/to/media/files`.
65
-
66
- - `--norm`: Normalizes folder times and runs vidqa on the input path before starting to process files. Useful for synchronizing timestamps across multiple media files.
67
-
68
- - `-ts:e`, `--transcription_engine`: Specifies the transcription engine to use. Possible values are "whisperx" and "whisper". Default is "whisperx".
69
-
70
- - `-ts:m`, `--transcription_model`: Specifies the path or name of the Whisper transcription model. A larger model will consume more resources and be slower, but with better transcription quality. Possible values: tiny, base, small, medium (default), large, ...
71
-
72
- - `-ts:d`, `--transcription_device`: Specifies the device to run the transcription through Whisper. Possible values: auto (default), cpu, cuda.
73
-
74
- - `-ts:c`, `--transcription_compute_type`: Specifies the quantization for the neural network. Possible values: auto (default), int8, int8_float32, int8_float16, int8_bfloat16, int16, float16, bfloat16, float32.
75
-
76
- - `-ts:b`, `--transcription_batch`: Specifies the number of simultaneous segments being transcribed. Higher values will speed up processing. If you have low RAM/VRAM, long duration media files or have buggy subtitles, reduce this value to avoid issues. Only works using transcription_engine whisperx. Default is 4.
77
-
78
- - `--translate`: Translates subtitles to a language code if they are not the same as the original. The language code should be specified after the equals sign. For example, `LeGen --translate=fr` would translate the subtitles to French.
79
-
80
- - `--input_lang`: Indicates (forces) the language of the voice in the input media. Default is "auto".
81
-
82
- - `-c:v`, `--codec_video`: Specifies the target video codec. Can be used to set acceleration via GPU or another video API [codec_api], if supported (ffmpeg -encoders). Examples include h264, libx264, h264_vaapi, h264_nvenc, hevc, libx265 hevc_vaapi, hevc_nvenc, hevc_cuvid, hevc_qsv, hevc_amf. Default is h264.
83
-
84
- - `-c:a`, `--codec_audio`: Specifies the target audio codec. Default is aac. Examples include aac, libopus, mp3, vorbis.
85
-
86
- - `-o:s`, `--output_softsubs`: Specifies the path to the folder or output file for the video files with embedded softsub (embedded in the mp4 container and .srt files). Default is "softsubs_" followed by the input path.
87
-
88
- - `-o:h`, `--output_hardsubs`: Specifies the output folder path for video files with burned-in captions and embedded in the mp4 container. Default is "hardsubs_" followed by the input path.
89
-
90
- - `--overwrite`: Overwrites existing files in output directories. By default, this option is false.
91
-
92
- - `--disable_srt`: Disables .srt file generation and doesn't insert subtitles in the mp4 container of output_softsubs. By default, this option is false.
93
-
94
- - `--disable_softsubs`: Doesn't insert subtitles in the mp4 container of output_softsubs. This option continues generating .srt files. By default, this option is false.
95
-
96
- - `--disable_hardsubs`: Disables subtitle burn in output_hardsubs. By default, this option is false.
97
-
98
- - `--copy_files`: Copies other (non-video) files present in the input directory to output directories. Only generates the subtitles and videos. By default, this option is false.
99
-
100
- Each of these options provides control over various aspects of the video processing workflow. Make sure to refer to the documentation or help message (`LeGen --help`) for more details on each option[Source 0](https://docs.python.org/3/library/argparse.html)[Source 2](https://realpython.com/command-line-interfaces-python-argparse/).
101
-
102
- ## Dependencies
103
-
104
- LeGen requires the following **pip** dependencies to be installed:
105
- - deep_translator
106
- - ffmpeg_progress_yield
107
- - openai_whisper
108
- - pysrt
109
- - torch
110
- - tqdm
111
- - whisper
112
- - vidqa
113
- - matheusbach/whisperx (fork from m-bain/whisperx)
114
-
115
- This dependencies can be installed and updated with ```pip install -r requirements.txt --upgrade```
116
-
117
- You also need to [install FFmpeg](https://ffmpeg.org/download.html)
118
-
119
- ## Contributing
120
-
121
- Contributions are welcome. Submit your pull request ❤️
122
-
123
- ## Issues, Doubts
124
-
125
- Not being able to use the software, or encountering an error? open an [issue](https://github.com/matheusbach/legen/issues/new)
126
-
127
- ## Telegram Group
128
-
129
- Welcome and don't be a sick. We are brazilian, but you can write in other language if you want. https://t.me/+c0VRonlcd9Q2YTAx
130
-
131
- ## Video Tutorials
132
-
133
- [PT-BR] [SEMI-OUTDATED] [**Tutorial - LeGen no Google Colab**](https://odysee.com/@legen_software:d/legen_no_colab:0)
134
-
135
- ## Donations
136
-
137
- You can donate to project using:
138
- Monero (XMR): ```86HjTCsiaELEoNhH96rTf3ezGMXgKmHjqFrNmca2tesCESdCTZvRvQ9QWQXPGDtmaZhKz4ryHCdZXFzdbmtGahVa5VMLJnx```
139
- LivePix: https://livepix.gg/legendonate
140
-
141
- ### Donators
142
- - Picasso Neves
143
- - Erasmo de Souza Mora
144
- - viniciuspro
145
- - Igor
146
- - NiNi
147
- - PopularC
148
-
149
-
150
- ## License
151
-
152
- This project is licensed under the terms of the [GNU GPLv3](https://choosealicense.com/licenses/gpl-3.0/).
 
1
+ ---
2
+ title: Legen
3
+ emoji: 💻
4
+ colorFrom: blue
5
+ colorTo: green
6
+ sdk: gradio
7
+ sdk_version: 5.0.1
8
+ app_file: app_hf.py
9
+ pinned: false
10
+ license: gpl-3.0
11
+ ---
12
+
13
+ # LeGen
14
+
15
+ ![legen-wide](https://github.com/matheusbach/legen/assets/35426162/05a7acd2-52d5-43e0-8f31-7da7d6aa7c3c)
16
+
17
+
18
+ LeGen is a Python script that uses Whisper/WhisperX AI to transcribe speech from media files locally and generate subtitle files. It can also translate the generated subtitles, insert them into the mp4 container, and burn them directly into the video.
19
+
20
+ This is useful for making a video available in another language, or simply for subtitling any video that belongs to you or that you are properly authorized to subtitle, be it a film, lecture, course, presentation, interview, etc.
21
+
22
+ ## Run on Colab
23
+
24
+ LeGen works on Google Colab, using its computing power to do the work. Access the link to [run on Google Colab](https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb)
25
+
26
+ <a href='https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb' style='padding-left: 0.5rem;'><img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Google Colab'></a>
27
+
28
+ ## Install locally:
29
+
30
+ Install FFmpeg from the [official FFmpeg site](https://ffmpeg.org/download.html) or from your Linux package manager. _If using Windows, prefer the gyan_dev full release: `choco install ffmpeg-full`_
31
+
32
+ Install [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
33
+
34
+ Install [Python](https://www.python.org/downloads/) 3.8 or later. _If using Windows, select the "Add to PATH" option when installing_
35
+
36
+ Clone LeGen using git
37
+ ```sh
38
+ git clone https://github.com/matheusbach/legen.git
39
+ cd legen
40
+ ```
41
+
42
+ Install the requirements using pip. It is recommended to create a virtual environment (venv) first as a good practice
43
+ ```sh
44
+ pip3 install -r requirements.txt --upgrade
45
+ ```
46
+
47
+ ### GPU compatibility
48
+
49
+ If you have trouble with GPU compatibility, install the [PyTorch](https://pytorch.org/get-started/locally/) build for your GPU.
50
+
51
+ _And done. Now you can use LeGen_
52
+
53
+ ### Update
54
+
55
+ To force a clean update (this discards any local changes in the repository), run in the legen folder:
56
+ ```sh
57
+ git fetch && git reset --hard origin/main && git pull
58
+ pip3 install -r requirements.txt --upgrade --force-reinstall
59
+ ```
60
+
61
+ ## Run locally:
62
+
63
+ To use LeGen, run the following command:
64
+
65
+ The minimum command line is:
66
+
67
+ ```sh
68
+ python3 legen.py -i [input_path]
69
+ ```
70
+
71
+ You can also translate the generated subtitles into another language, for example Portuguese (pt), by adding `--translate pt` to the command line
72
+
73
+
74
+ The full list of options is described below:
75
+
76
+ - `-i`, `--input_path`: Specifies the path to the media files. This can be a folder containing files or an individual file. Example: `LeGen -i /path/to/media/files`.
77
+
78
+ - `--norm`: Normalizes folder times and runs vidqa on the input path before starting to process files. Useful for synchronizing timestamps across multiple media files.
79
+
80
+ - `-ts:e`, `--transcription_engine`: Specifies the transcription engine to use. Possible values are "whisperx" and "whisper". Default is "whisperx".
81
+
82
+ - `-ts:m`, `--transcription_model`: Specifies the path or name of the Whisper transcription model. A larger model will consume more resources and be slower, but with better transcription quality. Possible values: tiny, base, small, medium (default), large, ...
83
+
84
+ - `-ts:d`, `--transcription_device`: Specifies the device to run the transcription through Whisper. Possible values: auto (default), cpu, cuda.
85
+
86
+ - `-ts:c`, `--transcription_compute_type`: Specifies the quantization for the neural network. Possible values: auto (default), int8, int8_float32, int8_float16, int8_bfloat16, int16, float16, bfloat16, float32.
87
+
88
+ - `-ts:b`, `--transcription_batch`: Specifies the number of simultaneous segments being transcribed. Higher values will speed up processing. If you have low RAM/VRAM, long duration media files or have buggy subtitles, reduce this value to avoid issues. Only works using transcription_engine whisperx. Default is 4.
89
+
90
+ - `--translate`: Translates the subtitles to the given language code if it differs from the original language. The language code can be given after an equals sign or a space. For example, `LeGen --translate=fr` would translate the subtitles to French.
91
+
92
+ - `--input_lang`: Indicates (forces) the language of the voice in the input media. Default is "auto".
93
+
94
+ - `-c:v`, `--codec_video`: Specifies the target video codec. Can be used to set acceleration via GPU or another video API [codec_api], if supported (ffmpeg -encoders). Examples include h264, libx264, h264_vaapi, h264_nvenc, hevc, libx265, hevc_vaapi, hevc_nvenc, hevc_cuvid, hevc_qsv, hevc_amf. Default is h264.
95
+
96
+ - `-c:a`, `--codec_audio`: Specifies the target audio codec. Default is aac. Examples include aac, libopus, mp3, vorbis.
97
+
98
+ - `-o:s`, `--output_softsubs`: Specifies the path to the folder or output file for the video files with embedded softsub (embedded in the mp4 container and .srt files). Default is "softsubs_" followed by the input path.
99
+
100
+ - `-o:h`, `--output_hardsubs`: Specifies the output folder path for video files with burned-in captions and embedded in the mp4 container. Default is "hardsubs_" followed by the input path.
101
+
102
+ - `--overwrite`: Overwrites existing files in output directories. By default, this option is false.
103
+
104
+ - `--disable_srt`: Disables .srt file generation and doesn't insert subtitles in the mp4 container of output_softsubs. By default, this option is false.
105
+
106
+ - `--disable_softsubs`: Doesn't insert subtitles in the mp4 container of output_softsubs. This option continues generating .srt files. By default, this option is false.
107
+
108
+ - `--disable_hardsubs`: Disables subtitle burn in output_hardsubs. By default, this option is false.
109
+
110
+ - `--copy_files`: Copies other (non-video) files present in the input directory to the output directories. Without it, only the subtitles and videos are generated. By default, this option is false.
111
+
112
+ Each of these options provides control over various aspects of the video processing workflow. Make sure to refer to the documentation or help message (`LeGen --help`) for more details on each option ([Source 0](https://docs.python.org/3/library/argparse.html), [Source 2](https://realpython.com/command-line-interfaces-python-argparse/)).
113
+
114
+ ## Dependencies
115
+
116
+ LeGen requires the following **pip** dependencies to be installed:
117
+ - deep_translator
118
+ - ffmpeg_progress_yield
119
+ - openai_whisper
120
+ - pysrt
121
+ - torch
122
+ - tqdm
123
+ - whisper
124
+ - vidqa
125
+ - matheusbach/whisperx (fork from m-bain/whisperx)
126
+
127
+ These dependencies can be installed and updated with ```pip install -r requirements.txt --upgrade```
128
+
129
+ You also need to [install FFmpeg](https://ffmpeg.org/download.html)
130
+
131
+ ## Contributing
132
+
133
+ Contributions are welcome. Submit your pull request ❤️
134
+
135
+ ## Issues, Doubts
136
+
137
+ Unable to use the software, or encountering an error? Open an [issue](https://github.com/matheusbach/legen/issues/new)
138
+
139
+ ## Telegram Group
140
+
141
+ Welcome, and please be respectful. We are Brazilian, but you can write in another language if you want. https://t.me/+c0VRonlcd9Q2YTAx
142
+
143
+ ## Video Tutorials
144
+
145
+ [PT-BR] [SEMI-OUTDATED] [**Tutorial - LeGen no Google Colab**](https://odysee.com/@legen_software:d/legen_no_colab:0)
146
+
147
+ ## Donations
148
+
149
+ You can donate to the project using:
150
+ Monero (XMR): ```86HjTCsiaELEoNhH96rTf3ezGMXgKmHjqFrNmca2tesCESdCTZvRvQ9QWQXPGDtmaZhKz4ryHCdZXFzdbmtGahVa5VMLJnx```
151
+ LivePix: https://livepix.gg/legendonate
152
+
153
+ ### Donators
154
+ - Picasso Neves
155
+ - Erasmo de Souza Mora
156
+ - viniciuspro
157
+ - Igor
158
+ - NiNi
159
+ - PopularC
160
+
161
+
162
+ ## License
163
+
164
+ This project is licensed under the terms of the [GNU GPLv3](https://choosealicense.com/licenses/gpl-3.0/).
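
The README writes the translation flag as `--translate pt` in one place and `--translate=fr` in another, and uses short flags containing a colon such as `-ts:e`. The sketch below shows how such options behave under Python's `argparse`. It assumes LeGen parses its command line with `argparse` (the README links to the argparse documentation); `build_parser`, the `legen-sketch` program name, and the `None` default for `--translate` are hypothetical and are not taken from LeGen's source. Only a subset of the documented options is reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a few options from the README."""
    parser = argparse.ArgumentParser(prog="legen-sketch")
    parser.add_argument("-i", "--input_path", required=True)
    # argparse accepts a colon inside an option string; the attribute name
    # is derived from the long option (transcription_engine).
    parser.add_argument(
        "-ts:e", "--transcription_engine",
        choices=["whisperx", "whisper"], default="whisperx",
    )
    parser.add_argument("-ts:m", "--transcription_model", default="medium")
    parser.add_argument("-ts:b", "--transcription_batch", type=int, default=4)
    # Assumption: no translation unless a language code is given.
    parser.add_argument("--translate", default=None)
    parser.add_argument("--input_lang", default="auto")
    parser.add_argument("--overwrite", action="store_true")
    return parser


# Space-separated and equals-sign forms are parsed the same way.
spaced = build_parser().parse_args(["-i", "videos", "--translate", "pt"])
equals = build_parser().parse_args(
    ["-i", "videos", "--translate=fr", "-ts:e", "whisper"]
)

print(spaced.translate, spaced.transcription_engine, spaced.transcription_batch)
print(equals.translate, equals.transcription_engine, equals.overwrite)
```

Under that assumption, either spelling of `--translate` is fine, and any option left off the command line falls back to the default listed in the README.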