README.md update (#414)

Co-authored-by: N Corentin Jemine <corentin.jemine@gmail.com>

6edc39eb · blue-fish · GitHub · 91ab270d · 6edc39eb

Showing 1 changed file with 16 additions and 11 deletions

README.md +16 -11

--- a/README.md
+++ b/README.md
@@ -31,31 +31,29 @@ SV2TTS is a three-stage deep learning framework that allows to create a numerica
 ## Setup
-Note: setup up this project is a lot of work. Somebody took the time to make [a better guide](https://poorlydocumented.com/2019/11/installing-corentinjs-real-time-voice-cloning-project-on-windows-10-from-scratch/) on how to install everything. I recommend using it. 
-### Requirements
+### 1. Install Requirements
-You will need the following whether you plan to use the toolbox only or to retrain the models.
-**Python 3.6+**.
+**Python 3.6 or 3.7** is needed to run the toolbox.
-Run `pip install -r requirements.txt` to install the necessary packages. Additionally you will need [PyTorch](https://pytorch.org/get-started/locally/) (>=1.0.1).
+* Install [PyTorch](https://pytorch.org/get-started/locally/) (>=1.0.1).
+* Install [ffmpeg](https://ffmpeg.org/download.html#get-packages).
+* Run `pip install -r requirements.txt` to install the remaining necessary packages.
-If you have a GPU, run `pip install -r requirements_gpu.txt` to enable GPU support. A GPU is recommended, but it is not required to use the toolbox.
+### 2. Download Pretrained Models
-### Pretrained models
 Download the latest [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Pretrained-models).
-### Preliminary
+### 3. (Optional) Test Configuration
 Before you download any dataset, you can begin by testing your configuration with:
 `python demo_cli.py`
 If all tests pass, you're good to go.
-### Datasets
+### 4. (Optional) Download Datasets
 For playing with the toolbox alone, I only recommend downloading [`LibriSpeech/train-clean-100`](http://www.openslr.org/resources/12/train-clean-100.tar.gz). Extract the contents as `<datasets_root>/LibriSpeech/train-clean-100` where `<datasets_root>` is a directory of your choosing. Other datasets are supported in the toolbox, see [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training#datasets). You're free not to download any dataset, but then you will need your own data as audio files or you will have to record it with the toolbox.
-### Toolbox
+### 5. Launch the Toolbox
 You can then try the toolbox:
 `python demo_toolbox.py -d <datasets_root>`  
@@ -63,3 +61,10 @@ or
 `python demo_toolbox.py`  
 depending on whether you downloaded any datasets. If you are running an X-server or if you have the error `Aborted (core dumped)`, see [this issue](https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/11#issuecomment-504733590).
+### 6. (Optional) Enable GPU Support
+Note: Enabling GPU support is a lot of work. You will want to set this up if you are going to train your own models. Somebody took the time to make [a better guide](https://poorlydocumented.com/2019/11/installing-corentinjs-real-time-voice-cloning-project-on-windows-10-from-scratch/) on how to install everything. I recommend using it.
+This command installs additional GPU dependencies and recommended packages: `pip install -r requirements_gpu.txt`
+Additionally, you will need to ensure GPU drivers are properly installed and that your CUDA version matches your PyTorch and Tensorflow installations.
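
The requirements listed in the updated README (Python 3.6 or 3.7, PyTorch, ffmpeg, and optionally a working CUDA setup) can be checked before running `python demo_cli.py`. The script below is a minimal sketch of such a check and is not part of this commit or of the repository. The function names and report keys are hypothetical, and it assumes ffmpeg is installed as an `ffmpeg` executable on the `PATH`. It does not verify the PyTorch version or the packages in `requirements.txt`.

```python
import importlib.util
import shutil
import sys


def python_version_supported(version=None):
    """Return True for Python 3.6 or 3.7, the versions the updated README names."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in {(3, 6), (3, 7)}


def module_installed(name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def environment_report():
    """Collect the checks into a dict (key names are illustrative only)."""
    report = {
        "python_3.6_or_3.7": python_version_supported(),
        # Assumption: ffmpeg is exposed as an `ffmpeg` executable on the PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "torch_installed": module_installed("torch"),
    }
    if report["torch_installed"]:
        try:
            import torch
            # Only relevant for the optional GPU step; False is fine on CPU.
            report["cuda_available"] = torch.cuda.is_available()
        except Exception:
            # A broken install can fail at import time; treat it as unusable.
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for check, ok in environment_report().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

A `no` for `cuda_available` only matters for the optional GPU step; the README states that a GPU is not needed to use the toolbox.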