Commit b561864f authored by zhangjun

update

Parent b675d5cb
# Low-Precision Deployment for Paddle Serving
Intel CPU supports int8 and bfloat16 models; NVIDIA TensorRT supports int8 and float16 models.
## Obtain the quantized model using the PaddleSlim tool
To train low-precision models, please refer to [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/tutorials/quant/overview.html).
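As a hedged illustration of this step, the sketch below applies PaddleSlim's post-training quantization (`quant_post_static`) to an FP32 inference model. The directory names (`ResNet50_fp32`, `ResNet50_quant`) and the random calibration reader are assumptions for illustration, not from this document; in practice the reader must yield real preprocessed samples, and the PaddleSlim docs linked above cover the full training-aware and post-training flows.

```python
# Minimal post-training quantization sketch, assuming an FP32 inference
# model saved under "ResNet50_fp32" (placeholder name). Produces an int8
# model under "ResNet50_quant".
import numpy as np
import paddle
from paddleslim.quant import quant_post_static

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

def calib_reader():
    # Placeholder calibration data: yields one sample (a tuple of input
    # fields) per iteration; shapes must match the model's feed inputs.
    # Replace the random tensors with real preprocessed images so the
    # collected quantization scales are meaningful.
    for _ in range(32):
        yield (np.random.random((3, 224, 224)).astype("float32"),)

quant_post_static(
    executor=exe,
    model_dir="ResNet50_fp32",             # assumed FP32 inference model dir
    quantize_model_path="ResNet50_quant",  # output dir for the int8 model
    sample_generator=calib_reader,
    batch_size=16,
    batch_nums=2)
```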
## Deploy the quantized model from PaddleSlim using Paddle Serving with NVIDIA TensorRT int8 mode
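The body of this section is truncated in this diff. As a hedged sketch of the launch step only: assuming the quantized model has already been converted into a `serving_server` directory (e.g., via `paddle_serving_client.convert`) and that the installed `paddle_serving_server` GPU build supports the `--use_trt` and `--precision` flags, the server could be started like this:

```python
# Launch the Paddle Serving GPU server with the TensorRT engine in int8
# mode. The model directory, port, and GPU id are illustrative assumptions.
import subprocess

subprocess.run(
    [
        "python", "-m", "paddle_serving_server.serve",
        "--model", "serving_server",  # dir produced by model conversion
        "--port", "9393",
        "--gpu_ids", "0",
        "--use_trt",                  # enable the TensorRT sub-graph engine
        "--precision", "int8",        # run the quantized model in int8
    ],
    check=True,
)
```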
......