diff --git a/doc/LOW_PRECISION_DEPLOYMENT.md b/doc/LOW_PRECISION_DEPLOYMENT.md
index cb08a88f2f3b2435f3b270575652217b1d956fbf..86d9c5d82b6d4e5c4ed57524fa3b37d93e1d0d2c 100644
--- a/doc/LOW_PRECISION_DEPLOYMENT.md
+++ b/doc/LOW_PRECISION_DEPLOYMENT.md
@@ -17,7 +17,7 @@ python -m paddle_serving_client.convert --dirname ResNet50_quant
 ```
 Start RPC service, specify the GPU id and precision mode
 ```
-python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0 --use_gpu --use_trt --precision int8
+python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0 --use_trt --precision int8
 ```
 Request the serving service with Client
 ```
@@ -44,4 +44,4 @@ print(fetch_map["score"].reshape(-1))
 ## Reference
 * [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
 * [Deploy the quantized model Using Paddle Inference on Intel CPU](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_x86_cpu_int8.html)
-* [Deploy the quantized model Using Paddle Inference on Nvidia GPU](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_trt.html)
\ No newline at end of file
+* [Deploy the quantized model Using Paddle Inference on Nvidia GPU](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_trt.html)
diff --git a/doc/LOW_PRECISION_DEPLOYMENT_CN.md b/doc/LOW_PRECISION_DEPLOYMENT_CN.md
index e543db94396eecbe64a61d7a9362369d02ab42de..f77f4e241f3f4b95574d22b9ca55788b5abc968e 100644
--- a/doc/LOW_PRECISION_DEPLOYMENT_CN.md
+++ b/doc/LOW_PRECISION_DEPLOYMENT_CN.md
@@ -16,7 +16,7 @@ python -m paddle_serving_client.convert --dirname ResNet50_quant
 ```
 启动rpc服务, 设定所选GPU id、部署模型精度
 ```
-python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0 --use_gpu --use_trt --precision int8
+python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0 --use_trt --precision int8
 ```
 使用client进行请求
 ```
@@ -43,4 +43,4 @@ print(fetch_map["score"].reshape(-1))
 ## 参考文档
 * [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
 * PaddleInference Intel CPU部署量化模型[文档](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_x86_cpu_int8.html)
-* PaddleInference NV GPU部署量化模型[文档](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_trt.html)
\ No newline at end of file
+* PaddleInference NV GPU部署量化模型[文档](https://paddle-inference.readthedocs.io/en/latest/optimize/paddle_trt.html)
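
For convenience, the corrected deployment flow from the patched docs can be collected into one sketch. This is a minimal outline, not a definitive recipe: it assumes `paddle-serving-server-gpu` and `paddle-serving-client` are installed, a CUDA GPU with TensorRT is available, and the PaddleSlim-quantized `ResNet50_quant` model directory sits in the working directory. Per this patch, `--use_trt` is sufficient to select GPU prediction, so the redundant `--use_gpu` flag is dropped.

```shell
# Convert the PaddleSlim quantized model into Serving format
# (produces serving_server/ and serving_client/ directories by default)
python -m paddle_serving_client.convert --dirname ResNet50_quant

# Start the RPC service on GPU 0 with TensorRT int8 precision.
# Note: --use_gpu is intentionally omitted, as in the patched command.
python -m paddle_serving_server.serve \
    --model serving_server \
    --port 9393 \
    --gpu_ids 0 \
    --use_trt \
    --precision int8
```

A client then connects to `127.0.0.1:9393` and receives a `fetch_map` whose `"score"` entry is printed in the docs' example via `fetch_map["score"].reshape(-1)`.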