add streaming asr demo, test=doc

1a0c2bea · xiongxinlei · 0b46e423
36 changed files
--- a/demos/streaming_asr_server/README.md
+++ b/demos/streaming_asr_server/README.md
+([简体中文](./README_cn.md)|English)
+
+# Speech Server
+
+## Introduction
+This demo shows how to start the speech service and how to access it. Both can be done with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of Python code.
+
+
+## Usage
+### 1. Installation
+See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
+
+It is recommended to use **paddlepaddle 2.2.1** or above.
+You can install PaddleSpeech using either the medium or the hard method.
+
+### 2. Prepare config File
+The configuration file can be found in `conf/application.yaml`.
+In it, `engine_list` specifies the speech engines to include in the service being started, in the format `<speech task>_<engine type>`.
+At present, the speech tasks integrated by the service include: asr (speech recognition), tts (text to speech) and cls (audio classification).
+Currently two engine types are supported: python and inference (Paddle Inference).
+
+
+The input of the ASR client demo should be a WAV file (`.wav`), and its sample rate must match that of the model.
+
+Sample files for this ASR client demo can be downloaded:
+```bash
+wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav
+```
+
+### 3. Server Usage
+- Command Line (Recommended)
+
+  ```bash
+  # start the service
+  paddlespeech_server start --config_file ./conf/application.yaml
+  ```
+
+  Usage:
+  
+  ```bash
+  paddlespeech_server start --help
+  ```
+  Arguments:
+  - `config_file`: YAML file of the app. Default: ./conf/application.yaml
+  - `log_file`: log file. Default: ./log/paddlespeech.log
+
+  Output:
+  ```bash
+  [2022-02-23 11:17:32] [INFO] [server.py:64] Started server process [6384]
+  INFO:     Waiting for application startup.
+  [2022-02-23 11:17:32] [INFO] [on.py:26] Waiting for application startup.
+  INFO:     Application startup complete.
+  [2022-02-23 11:17:32] [INFO] [on.py:38] Application startup complete.
+  INFO:     Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
+  [2022-02-23 11:17:32] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
+
+  ```
+
+- Python API
+  ```python
+  from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
+
+  server_executor = ServerExecutor()
+  server_executor(
+      config_file="./conf/application.yaml", 
+      log_file="./log/paddlespeech.log")
+  ```
+
+  Output:
+  ```bash
+  INFO:     Started server process [529]
+  [2022-02-23 14:57:56] [INFO] [server.py:64] Started server process [529]
+  INFO:     Waiting for application startup.
+  [2022-02-23 14:57:56] [INFO] [on.py:26] Waiting for application startup.
+  INFO:     Application startup complete.
+  [2022-02-23 14:57:56] [INFO] [on.py:38] Application startup complete.
+  INFO:     Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
+  [2022-02-23 14:57:56] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
+
+  ```
+
+
+### 4. ASR Client Usage
+**Note:** The response time will be slightly longer when using the client for the first time.
+- Command Line (Recommended)
+   ```bash
+   paddlespeech_client asr --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
+   ```
+
+  Usage:
+  
+  ```bash
+  paddlespeech_client asr --help
+  ```
+  Arguments:
+  - `server_ip`: server ip. Default: 127.0.0.1
+  - `port`: server port. Default: 8090
+  - `input`(required): Audio file to be recognized.
+  - `sample_rate`: Audio sampling rate. Default: 16000.
+  - `lang`: Language. Default: "zh_cn".
+  - `audio_format`: Audio format. Default: "wav".
+
+  Output:
+  ```bash
+  [2022-02-23 18:11:22,819] [    INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
+  [2022-02-23 18:11:22,820] [    INFO] - time cost 0.689145 s.
+
+  ```
+
+- Python API
+  ```python
+  from paddlespeech.server.bin.paddlespeech_client import ASRClientExecutor
+  import json
+
+  asrclient_executor = ASRClientExecutor()
+  res = asrclient_executor(
+      input="./zh.wav",
+      server_ip="127.0.0.1",
+      port=8090,
+      sample_rate=16000,
+      lang="zh_cn",
+      audio_format="wav")
+  print(res.json())
+  ```
+
+  Output:
+  ```bash
+  {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
+  ```
+ 
+### 5. TTS Client Usage
+**Note:** The response time will be slightly longer when using the client for the first time.
+- Command Line (Recommended)
+   ```bash
+   paddlespeech_client tts --server_ip 127.0.0.1 --port 8090 --input "您好，欢迎使用百度飞桨语音合成服务。" --output output.wav
+   ```
+     Usage:
+  
+    ```bash
+    paddlespeech_client tts --help
+    ```
+    Arguments:
+    - `server_ip`: server ip. Default: 127.0.0.1
+    - `port`: server port. Default: 8090
+    - `input`(required): Input text to generate.
+    - `spk_id`: Speaker id for multi-speaker text to speech. Default: 0
+    - `speed`: Audio speed, the value should be set between 0 and 3. Default: 1.0
+    - `volume`: Audio volume, the value should be set between 0 and 3. Default: 1.0
+    - `sample_rate`: Sampling rate, choice: [0, 8000, 16000], the default is the same as the model. Default: 0
+    - `output`: Output wave filepath. Default: None, which means not to save the audio to the local.
+
+    Output:
+    ```bash
+    [2022-02-23 15:20:37,875] [    INFO] - {'description': 'success.'}
+    [2022-02-23 15:20:37,875] [    INFO] - Save synthesized audio successfully on output.wav.
+    [2022-02-23 15:20:37,875] [    INFO] - Audio duration: 3.612500 s.
+    [2022-02-23 15:20:37,875] [    INFO] - Response time: 0.348050 s.
+
+    ```
+
+- Python API
+  ```python
+  from paddlespeech.server.bin.paddlespeech_client import TTSClientExecutor
+  import json
+
+  ttsclient_executor = TTSClientExecutor()
+  res = ttsclient_executor(
+      input="您好，欢迎使用百度飞桨语音合成服务。",
+      server_ip="127.0.0.1",
+      port=8090,
+      spk_id=0,
+      speed=1.0,
+      volume=1.0,
+      sample_rate=0,
+      output="./output.wav")
+
+  response_dict = res.json()
+  print(response_dict["message"])
+  print("Save synthesized audio successfully on %s." % (response_dict['result']['save_path']))
+  print("Audio duration: %f s." %(response_dict['result']['duration']))
+  ```
+
+  Output:
+  ```bash
+  {'description': 'success.'}
+  Save synthesized audio successfully on ./output.wav.
+  Audio duration: 3.612500 s.
+
+  ```
+
+### 6. CLS Client Usage
+**Note:** The response time will be slightly longer when using the client for the first time.
+- Command Line (Recommended)
+   ```bash
+   paddlespeech_client cls --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
+   ```
+
+  Usage:
+  
+  ```bash
+  paddlespeech_client cls --help
+  ```
+  Arguments:
+  - `server_ip`: server ip. Default: 127.0.0.1
+  - `port`: server port. Default: 8090
+  - `input`(required): Audio file to be classified.
+  - `topk`: topk scores of classification result.
+
+  Output:
+  ```bash
+  [2022-03-09 20:44:39,974] [    INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
+  [2022-03-09 20:44:39,975] [    INFO] - Response time 0.104360 s.
+
+
+  ```
+
+- Python API
+  ```python
+  from paddlespeech.server.bin.paddlespeech_client import CLSClientExecutor
+  import json
+
+  clsclient_executor = CLSClientExecutor()
+  res = clsclient_executor(
+      input="./zh.wav",
+      server_ip="127.0.0.1",
+      port=8090,
+      topk=1)
+  print(res.json())
+  ```
+
+  Output:
+  ```bash
+  {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
+
+  ```
+
+
+## Models supported by the service
+### ASR model
+Get all models supported by the ASR service via `paddlespeech_server stats --task asr`; the static models listed there can be used with Paddle Inference.
+
+### TTS model
+Get all models supported by the TTS service via `paddlespeech_server stats --task tts`; the static models listed there can be used with Paddle Inference.
+
+### CLS model
+Get all models supported by the CLS service via `paddlespeech_server stats --task cls`; the static models listed there can be used with Paddle Inference.
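
The `engine_list` naming convention described in the README above, `<speech task>_<engine type>`, can be illustrated with a short sketch. The helpers `split_engine_name` and `parse_engine_list` are hypothetical and not part of PaddleSpeech. They only assume that the task name itself contains no underscore, which holds for the asr, tts and cls tasks named in the README.

```python
from typing import List, Tuple


def split_engine_name(name: str) -> Tuple[str, str]:
    """Split '<speech task>_<engine type>' at the first underscore."""
    task, sep, engine_type = name.partition("_")
    if not sep or not task or not engine_type:
        raise ValueError(
            f"expected '<speech task>_<engine type>', got {name!r}")
    return task, engine_type


def parse_engine_list(engine_list: List[str]) -> List[Tuple[str, str]]:
    """Parse every entry of an engine_list into (task, engine type) pairs."""
    return [split_engine_name(name) for name in engine_list]


if __name__ == "__main__":
    # Entry names follow the pattern from the README; the list is illustrative.
    print(parse_engine_list(["asr_python", "tts_inference", "asr_online"]))
```

Splitting at the first underscore only keeps any later underscores inside the engine type rather than discarding them.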
--- a/demos/streaming_asr_server/README_cn.md
+++ b/demos/streaming_asr_server/README_cn.md
--- a/demos/streaming_asr_server/conf/ws_application.yaml
+++ b/demos/streaming_asr_server/conf/ws_application.yaml
+# This is the parameter configuration file for PaddleSpeech Serving.
+
+#################################################################################
+#                             SERVER SETTING                                    #
+#################################################################################
+host: 0.0.0.0
+port: 8090
+
+# The task format in the engine_list is: <speech task>_<engine type>
+# task choices = ['asr_online', 'tts_online']
+# protocol = ['websocket', 'http'] (only one can be selected).
+# websocket only supports the online engine type.
+protocol: 'websocket'
+engine_list: ['asr_online']
+
+
+#################################################################################
+#                                ENGINE CONFIG                                  #
+#################################################################################
+
+################################### ASR #########################################
+################### speech task: asr; engine_type: online #######################
+asr_online:
+    model_type: 'deepspeech2online_aishell'
+    am_model: # the pdmodel file of am static model [optional]
+    am_params:  # the pdiparams file of am static model [optional]
+    lang: 'zh'
+    sample_rate: 16000
+    cfg_path: 
+    decode_method: 
+    force_yes: True
+
+    am_predictor_conf:
+        device:  # set 'gpu:id' or 'cpu'
+        switch_ir_optim: True
+        glog_info: False  # True -> print glog
+        summary: True  # False -> do not show predictor config
+
+    chunk_buffer_conf:
+        frame_duration_ms: 80
+        shift_ms: 40
+        sample_rate: 16000
+        sample_width: 2
+        window_n: 7     # frame
+        shift_n: 4      # frame
+        window_ms: 20   # ms
+        shift_ms: 10    # ms
--- a/demos/streaming_asr_server/conf/ws_conformer_application.yaml
+++ b/demos/streaming_asr_server/conf/ws_conformer_application.yaml
+# This is the parameter configuration file for PaddleSpeech Serving.
+
+#################################################################################
+#                             SERVER SETTING                                    #
+#################################################################################
+host: 0.0.0.0
+port: 8090
+
+# The task format in the engine_list is: <speech task>_<engine type>
+# task choices = ['asr_online', 'tts_online']
+# protocol = ['websocket', 'http'] (only one can be selected).
+# websocket only supports the online engine type.
+protocol: 'websocket'
+engine_list: ['asr_online']
+
+
+#################################################################################
+#                                ENGINE CONFIG                                  #
+#################################################################################
+
+################################### ASR #########################################
+################### speech task: asr; engine_type: online #######################
+asr_online:
+    model_type: 'conformer_online_multicn'
+    am_model: # the pdmodel file of am static model [optional]
+    am_params:  # the pdiparams file of am static model [optional]
+    lang: 'zh'
+    sample_rate: 16000
+    cfg_path: 
+    decode_method: 
+    force_yes: True
+    device: 'cpu' # cpu or gpu:id
+    am_predictor_conf:
+        device:  # set 'gpu:id' or 'cpu'
+        switch_ir_optim: True
+        glog_info: False  # True -> print glog
+        summary: True  # False -> do not show predictor config
+
+    chunk_buffer_conf:
+        window_n: 7     # frame
+        shift_n: 4      # frame
+        window_ms: 25   # ms
+        shift_ms: 10    # ms
+        sample_rate: 16000
+        sample_width: 2
\ No newline at end of file
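
To see how the `chunk_buffer_conf` values relate to the amount of audio per chunk, here is a rough sketch. It assumes that a window of `window_n` frames spans `(window_n - 1) * shift_ms + window_ms` milliseconds and that consecutive windows advance by `shift_n * shift_ms` milliseconds. That reading is an assumption about how the server interprets these keys, not something the config files state. The function name `chunk_sizes` is hypothetical.

```python
from typing import Dict


def chunk_sizes(window_n: int, shift_n: int, window_ms: int, shift_ms: int,
                sample_rate: int, sample_width: int) -> Dict[str, int]:
    """Derive window and shift sizes from chunk_buffer_conf-style values.

    Integer arithmetic is used throughout to avoid float rounding.
    """
    # Assumed layout: window_n frames, each shifted by shift_ms from the last.
    window_total_ms = (window_n - 1) * shift_ms + window_ms
    shift_total_ms = shift_n * shift_ms
    window_samples = window_total_ms * sample_rate // 1000
    shift_samples = shift_total_ms * sample_rate // 1000
    return {
        "window_ms": window_total_ms,
        "shift_ms": shift_total_ms,
        "window_samples": window_samples,
        "shift_samples": shift_samples,
        "window_bytes": window_samples * sample_width,
        "shift_bytes": shift_samples * sample_width,
    }


if __name__ == "__main__":
    # Values taken from ws_conformer_application.yaml above.
    print(chunk_sizes(window_n=7, shift_n=4, window_ms=25, shift_ms=10,
                      sample_rate=16000, sample_width=2))
```

Under this assumption the conformer config gives an 85 ms window (1360 samples at 16 kHz) advancing by 40 ms. With `window_ms: 20` from `ws_application.yaml`, the same arithmetic gives 80 ms and 40 ms, which agrees with the `frame_duration_ms: 80` and `shift_ms: 40` entries in that file.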
--- a/demos/streaming_asr_server/run.sh
+++ b/demos/streaming_asr_server/run.sh
+# start the streaming asr service
+paddlespeech_server start --config_file ./conf/ws_conformer_application.yaml
\ No newline at end of file
--- a/demos/streaming_asr_server/test.sh
+++ b/demos/streaming_asr_server/test.sh
+# download the test wav
+wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav 
+
+# read the wav and pass it to service
+python3 websocket_client.py --wavfile ./zh.wav
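
The test script above streams a WAV file to the service through `websocket_client.py`. The message sequence that client follows, as seen in the `ASRAudioHandler.run` and `read_wave` code removed from `websocket_client.py` in this diff, is: a JSON "start" message, zero-padded fixed-size audio chunks as bytes, then a JSON "end" message. The sketch below reproduces only that framing, without any network code, so it runs with the standard library alone. The function names are hypothetical, and the little-endian int16 packing is an assumption (the removed code sends `numpy` int16 bytes in native order).

```python
import json
import struct
from typing import Iterator, List, Union


def control_message(signal: str, name: str = "test.wav", nbest: int = 5) -> str:
    """Build the JSON control message sent before and after the audio."""
    return json.dumps({"name": name, "signal": signal, "nbest": nbest},
                      sort_keys=True)


def pad_and_chunk(samples: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Zero-pad samples to a multiple of chunk_size and yield each chunk."""
    remainder = len(samples) % chunk_size
    padding = chunk_size - remainder if remainder else 0
    padded = samples + [0] * padding
    for start in range(0, len(padded), chunk_size):
        yield padded[start:start + chunk_size]


def frames_to_send(samples: List[int],
                   chunk_size: int) -> Iterator[Union[str, bytes]]:
    """Yield the frames in the order a streaming client would send them."""
    yield control_message("start")
    for chunk in pad_and_chunk(samples, chunk_size):
        # Assumption: 16-bit little-endian PCM, one 'h' per sample.
        yield struct.pack(f"<{len(chunk)}h", *chunk)
    yield control_message("end")


if __name__ == "__main__":
    for frame in frames_to_send(list(range(10)), chunk_size=4):
        print(type(frame).__name__, len(frame))
```

In a real client each yielded frame would be passed to the websocket's send call, and a server reply read after each one, as the removed `run` coroutine does.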
--- a/paddlespeech/server/tests/asr/online/web/app.py
+++ b/paddlespeech/server/tests/asr/online/web/app.py
--- a/paddlespeech/server/tests/asr/online/web/paddle_web_demo.png
+++ b/paddlespeech/server/tests/asr/online/web/paddle_web_demo.png
--- a/paddlespeech/server/tests/asr/online/web/readme.md
+++ b/paddlespeech/server/tests/asr/online/web/readme.md
--- a/paddlespeech/server/tests/asr/online/web/static/css/font-awesome.min.css
+++ b/paddlespeech/server/tests/asr/online/web/static/css/font-awesome.min.css
--- a/paddlespeech/server/tests/asr/online/web/static/css/style.css
+++ b/paddlespeech/server/tests/asr/online/web/static/css/style.css
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/FontAwesome.otf
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/FontAwesome.otf
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.eot
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.eot
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.svg
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.svg
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.ttf
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.ttf
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.woff
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.woff
--- a/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.woff2
+++ b/paddlespeech/server/tests/asr/online/web/static/fonts/fontawesome-webfont.woff2
--- a/paddlespeech/server/tests/asr/online/web/static/image/PaddleSpeech_logo.png
+++ b/paddlespeech/server/tests/asr/online/web/static/image/PaddleSpeech_logo.png
--- a/paddlespeech/server/tests/asr/online/web/static/image/voice-dictation.svg
+++ b/paddlespeech/server/tests/asr/online/web/static/image/voice-dictation.svg
--- a/paddlespeech/server/tests/asr/online/web/static/js/SoundRecognizer.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/SoundRecognizer.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/jquery-3.2.1.min.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/jquery-3.2.1.min.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/mp3.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/mp3.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/pcm.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/pcm.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/wav.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/engine/wav.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/extensions/frequency.histogram.view.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/extensions/frequency.histogram.view.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/extensions/lib.fft.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/extensions/lib.fft.js
--- a/paddlespeech/server/tests/asr/online/web/static/js/recorder/recorder-core.js
+++ b/paddlespeech/server/tests/asr/online/web/static/js/recorder/recorder-core.js
--- a/paddlespeech/server/tests/asr/online/web/static/paddle.ico
+++ b/paddlespeech/server/tests/asr/online/web/static/paddle.ico
--- a/paddlespeech/server/tests/asr/online/web/templates/index.html
+++ b/paddlespeech/server/tests/asr/online/web/templates/index.html
--- a/paddlespeech/server/tests/asr/online/websocket_client.py
+++ b/paddlespeech/server/tests/asr/online/websocket_client.py
@@ -16,102 +16,24 @@
 import argparse
 import asyncio
 import codecs
-import json
 import logging
 import os

-import numpy as np
-import soundfile
-import websockets
-
-
-class ASRAudioHandler:
-    def __init__(self, url="127.0.0.1", port=8090):
-        self.url = url
-        self.port = port
-        self.url = "ws://" + self.url + ":" + str(self.port) + "/ws/asr"
-
-    def read_wave(self, wavfile_path: str):
-        samples, sample_rate = soundfile.read(wavfile_path, dtype='int16')
-        x_len = len(samples)
-
-        chunk_size = 85 * 16  #80ms, sample_rate = 16kHz
-        if x_len % chunk_size!= 0:
-            padding_len_x = chunk_size - x_len % chunk_size
-        else:
-            padding_len_x = 0
-
-        padding = np.zeros((padding_len_x), dtype=samples.dtype)
-        padded_x = np.concatenate([samples, padding], axis=0)
-
-        assert (x_len + padding_len_x) % chunk_size == 0
-        num_chunk = (x_len + padding_len_x) / chunk_size
-        num_chunk = int(num_chunk)
-        for i in range(0, num_chunk):
-            start = i * chunk_size
-            end = start + chunk_size
-            x_chunk = padded_x[start:end]
-            yield x_chunk
-
-    async def run(self, wavfile_path: str):
-        logging.info("send a message to the server")
-        # self.read_wave()
-        # send websocket handshake protocal
-        async with websockets.connect(self.url) as ws:
-            # server has already received handshake protocal
-            # client start to send the command
-            audio_info = json.dumps(
-                {
-                    "name": "test.wav",
-                    "signal": "start",
-                    "nbest": 5
-                },
-                sort_keys=True,
-                indent=4,
-                separators=(',', ': '))
-            await ws.send(audio_info)
-            msg = await ws.recv()
-            logging.info("receive msg={}".format(msg))
-
-            # send chunk audio data to engine
-            for chunk_data in self.read_wave(wavfile_path):
-                await ws.send(chunk_data.tobytes())
-                msg = await ws.recv()
-                msg = json.loads(msg)
-                logging.info("receive msg={}".format(msg))
-
-            # finished 
-            audio_info = json.dumps(
-                {
-                    "name": "test.wav",
-                    "signal": "end",
-                    "nbest": 5
-                },
-                sort_keys=True,
-                indent=4,
-                separators=(',', ': '))
-            await ws.send(audio_info)
-            msg = await ws.recv()
-            
-            # decode the bytes to str
-            msg = json.loads(msg)
-            logging.info("final receive msg={}".format(msg))
-            result = msg
-            return result
+from paddlespeech.cli.log import logger
+from paddlespeech.server.utils.audio_handler import ASRAudioHandler


 def main(args):
-    logging.basicConfig(level=logging.INFO)
-    logging.info("asr websocket client start")
+    logger.info("asr websocket client start")
    handler = ASRAudioHandler("127.0.0.1", 8090)
    loop = asyncio.get_event_loop()

    # support to process single audio file
    if args.wavfile and os.path.exists(args.wavfile):
-        logging.info(f"start to process the wavscp: {args.wavfile}")
+        logger.info(f"start to process the wavscp: {args.wavfile}")
        result = loop.run_until_complete(handler.run(args.wavfile))
        result = result["asr_results"]
-        logging.info(f"asr websocket client finished : {result}")
+        logger.info(f"asr websocket client finished : {result}")

    # support to process batch audios from wav.scp 
    if args.wavscp and os.path.exists(args.wavscp):
@@ -126,6 +48,7 @@ def main(args):


 if __name__ == "__main__":
+    logger.info("Start to do streaming asr client")
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--wavfile",

--- a/paddlespeech/server/bin/paddlespeech_client.py
+++ b/paddlespeech/server/bin/paddlespeech_client.py
@@ -30,11 +30,14 @@ from ..executor import BaseExecutor
 from ..util import cli_client_register
 from ..util import stats_wrapper
 from paddlespeech.cli.log import logger
-from paddlespeech.server.tests.asr.online.websocket_client import ASRAudioHandler
+from paddlespeech.server.utils.audio_handler import ASRAudioHandler
 from paddlespeech.server.utils.audio_process import wav2pcm
 from paddlespeech.server.utils.util import wav2base64

-__all__ = ['TTSClientExecutor', 'ASRClientExecutor', 'CLSClientExecutor']
+__all__ = [
+    'TTSClientExecutor', 'ASRClientExecutor', 'ASROnlineClientExecutor',
+    'CLSClientExecutor'
+]


 @cli_client_register(
@@ -236,11 +239,11 @@ class ASRClientExecutor(BaseExecutor):
 @cli_client_register(
    name='paddlespeech_client.asr_online',
    description='visit asr online service')
-class ASRClientExecutor(BaseExecutor):
+class ASROnlineClientExecutor(BaseExecutor):
    def __init__(self):
-        super(ASRClientExecutor, self).__init__()
+        super(ASROnlineClientExecutor, self).__init__()
        self.parser = argparse.ArgumentParser(
-            prog='paddlespeech_client.asr', add_help=True)
+            prog='paddlespeech_client.asr_online', add_help=True)
        self.parser.add_argument(
            '--server_ip', type=str, default='127.0.0.1', help='server ip')
        self.parser.add_argument(
@@ -305,6 +308,7 @@ class ASRClientExecutor(BaseExecutor):

        return res['asr_results']

+
 @cli_client_register(
    name='paddlespeech_client.cls', description='visit cls service')
 class CLSClientExecutor(BaseExecutor):

--- a/paddlespeech/server/conf/ws_conformer_application.yaml
+++ b/paddlespeech/server/conf/ws_conformer_application.yaml
@@ -29,7 +29,7 @@ asr_online:
    cfg_path: 
    decode_method: 
    force_yes: True
-
+    device: 'cpu' # cpu or gpu:id
    am_predictor_conf:
        device:  # set 'gpu:id' or 'cpu'
        switch_ir_optim: True

--- a/paddlespeech/server/engine/asr/online/asr_engine.py
+++ b/paddlespeech/server/engine/asr/online/asr_engine.py
@@ -1028,6 +1028,17 @@ class ASREngine(BaseEngine):
        self.output = ""
        self.executor = ASRServerExecutor()
        self.config = config
+        try:
+            if self.config.get("device", None):
+                self.device = self.config.device
+            else:
+                self.device = paddle.get_device()
+            logger.info(f"paddlespeech_server set the device: {self.device}")
+            paddle.set_device(self.device)
+        except BaseException:
+            logger.error(
+                "Set device failed, please check if device is already used and the parameter 'device' in the yaml file"
+            )

        self.executor._init_from_path(
            model_type=self.config.model_type,
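
The device selection added to `ASREngine` in the hunk above prefers the `device` key from the yaml config and otherwise falls back to Paddle's current device. Below is a minimal sketch of that fallback logic, with Paddle replaced by an injected callable so the snippet runs without it. `resolve_device` and `default_device` are hypothetical names used only for illustration.

```python
from typing import Callable, Mapping, Optional


def resolve_device(config: Mapping[str, Optional[str]],
                   default_device: Callable[[], str]) -> str:
    """Return config['device'] if it is set and non-empty, else the default."""
    device = config.get("device")
    return device if device else default_device()


if __name__ == "__main__":
    # An explicit value in the config wins.
    print(resolve_device({"device": "gpu:0"}, lambda: "cpu"))
    # A key left empty in yaml parses as None, so the default is used.
    print(resolve_device({"device": None}, lambda: "cpu"))
```

In the engine itself, the default would come from `paddle.get_device()` and the result would be passed to `paddle.set_device()`, as the added lines show.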

--- a/paddlespeech/server/tests/asr/online/README_cn.md
+++ b/paddlespeech/server/tests/asr/online/README_cn.md
-([简体中文](./README_cn.md)|English)
-
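-# Speech Server
-
-## Introduction
-This document describes how to use three different clients for streaming ASR: a web page, a microphone, and a Python-simulated streaming service.
-
-
-## Usage
-### 1. Installation
-See the [installation guide](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
-
-It is recommended to use **paddlepaddle 2.2.1** or above.
-You can install PaddleSpeech using either the medium or the hard method.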
-
-
-### 2. Prepare the Test Files
-
-The input of this ASR client should be a WAV file (`.wav`), and its sample rate must match that of the model.
-
-Sample audio for this ASR client can be downloaded:
-```bash
-wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav
-```
-
-### 3. Streaming ASR Client Usage
-
- Command line for the Python-simulated streaming service
-   ```
-
-   # streaming ASR
-   paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8091 --input ./zh.wav
-
-   ```
-
-
- Microphone
-   ```
-   # call the microphone device directly
-   python microphone_client.py
-
-   ```
-
-
- Web page
-   ```
-   # enter the web directory and refer to its readme.md
-
-   ```
--- a/paddlespeech/server/tests/asr/online/__init__.py
+++ b/paddlespeech/server/tests/asr/online/__init__.py
-# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
--- a/paddlespeech/server/tests/asr/online/microphone_client.py
+++ b/paddlespeech/server/tests/asr/online/microphone_client.py
-# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-"""
-record wave from the mic
-"""
-import asyncio
-import json
-import logging
-import threading
-import wave
-from signal import SIGINT
-from signal import SIGTERM
-
-import pyaudio
-import websockets
-
-
-class ASRAudioHandler(threading.Thread):
-    def __init__(self, url="127.0.0.1", port=8091):
-        threading.Thread.__init__(self)
-        self.url = url
-        self.port = port
-        self.url = "ws://" + self.url + ":" + str(self.port) + "/ws/asr"
-        self.fileName = "./output.wav"
-        self.chunk = 5120
-        self.format = pyaudio.paInt16
-        self.channels = 1
-        self.rate = 16000
-        self._running = True
-        self._frames = []
-        self.data_backup = []
-
-    def startrecord(self):
-        """
-        start a new thread to record wave
-        """
-        threading._start_new_thread(self.recording, ())
-
-    def recording(self):
-        """
-        recording wave
-        """
-        self._running = True
-        self._frames = []
-        p = pyaudio.PyAudio()
-        stream = p.open(
-            format=self.format,
-            channels=self.channels,
-            rate=self.rate,
-            input=True,
-            frames_per_buffer=self.chunk)
-        while (self._running):
-            data = stream.read(self.chunk)
-            self._frames.append(data)
-            self.data_backup.append(data)
-
-        stream.stop_stream()
-        stream.close()
-        p.terminate()
-
-    def save(self):
-        """
-        save wave data
-        """
-        p = pyaudio.PyAudio()
-        wf = wave.open(self.fileName, 'wb')
-        wf.setnchannels(self.channels)
-        wf.setsampwidth(p.get_sample_size(self.format))
-        wf.setframerate(self.rate)
-        wf.writeframes(b''.join(self.data_backup))
-        wf.close()
-        p.terminate()
-
-    def stoprecord(self):
-        """
-        stop recording
-        """
-        self._running = False
-
-    async def run(self):
-        aa = input("Start recording?   (y/n)")
-        if aa.strip() == "y":
-            self.startrecord()
-            logging.info("*" * 10 + "Recording started, please speak")
-
-            async with websockets.connect(self.url) as ws:
-                # send the start command
-                audio_info = json.dumps(
-                    {
-                        "name": "test.wav",
-                        "signal": "start",
-                        "nbest": 5
-                    },
-                    sort_keys=True,
-                    indent=4,
-                    separators=(',', ': '))
-                await ws.send(audio_info)
-                msg = await ws.recv()
-                logging.info("receive msg={}".format(msg))
-
-                # send bytes data
-                logging.info("Press Ctrl + C to stop recording. Press Enter to continue.")
-                try:
-                    while True:
-                        while len(self._frames) > 0:
-                            await ws.send(self._frames.pop(0))
-                            msg = await ws.recv()
-                            logging.info("receive msg={}".format(msg))
-                except asyncio.CancelledError:
-                    # quit
-                    # send finished 
-                    audio_info = json.dumps(
-                        {
-                            "name": "test.wav",
-                            "signal": "end",
-                            "nbest": 5
-                        },
-                        sort_keys=True,
-                        indent=4,
-                        separators=(',', ': '))
-                    await ws.send(audio_info)
-                    msg = await ws.recv()
-                    logging.info("receive msg={}".format(msg))
-
-                    self.stoprecord()
-                    logging.info("*" * 10 + "Recording finished")
-                    self.save()
-        elif aa.strip() == "n":
-            exit()
-        else:
-            print("Invalid input!")
-            exit()
-
-
-if __name__ == "__main__":
-
-    logging.basicConfig(level=logging.INFO)
-    logging.info("asr websocket client start")
-
-    handler = ASRAudioHandler("127.0.0.1", 8091)
-    loop = asyncio.get_event_loop()
-    main_task = asyncio.ensure_future(handler.run())
-    for signal in [SIGINT, SIGTERM]:
-        loop.add_signal_handler(signal, main_task.cancel)
-    try:
-        loop.run_until_complete(main_task)
-    finally:
-        loop.close()
-
-    logging.info("asr websocket client finished")