# Speech Application based on PaddleSpeech

([简体中文](./README_cn.md)|English)

This directory contains speech application demos for a range of scenarios:

* audio searching - large-scale audio similarity retrieval
* audio tagging - multi-label tagging of an audio file
* automatic_video_subtitles - generate subtitles from a video
* metaverse - 2D AR with TTS  
* punctuation_restoration - restore punctuation from raw text
* speech recognition - transcribe an audio file to text
* speech server - server for speech tasks, e.g. ASR, TTS, CLS
* streaming asr server - receive an audio stream over WebSocket and transcribe it to text
* streaming tts server - receive text over HTTP or WebSocket and stream back the synthesized audio
* speech translation - end to end speech translation  
* story talker - book reader based on OCR and TTS  
* style_fs2 - multi-style control for the FastSpeech2 model  
* text_to_speech - convert text into speech 
* self-supervised pretraining - speech feature extraction and speech recognition based on wav2vec2
* Whisper - speech recognition and translation based on the Whisper model