Commit b561864f authored by: Z zhangjun

update

Parent b675d5cb
# Low-Precision Deployment for Paddle Serving
Intel CPU supports int8 and bfloat16 models; NVIDIA TensorRT supports int8 and float16 models.
## Obtain the quantized model using the PaddleSlim tool
To train low-precision models, please refer to [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/tutorials/quant/overview.html).
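As an illustration, a post-training quantization pass with PaddleSlim might look like the sketch below. The model directory, output path, input shape, and calibration reader are placeholders chosen for this example, not values taken from this document.

```python
# A minimal post-training quantization sketch, assuming PaddleSlim is
# installed and 'serving_model' holds a float32 static-graph inference
# model; paths and the (3, 224, 224) input layout are placeholders.
import numpy as np
import paddle
from paddleslim.quant import quant_post_static

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

def sample_generator():
    # Replace with a reader over a real calibration dataset; random
    # data here only demonstrates the expected per-sample layout.
    for _ in range(32):
        yield [np.random.random((3, 224, 224)).astype('float32')]

quant_post_static(
    executor=exe,
    model_dir='serving_model',          # float32 model directory (placeholder)
    quantize_model_path='quant_model',  # output directory for the int8 model
    sample_generator=sample_generator,
    batch_size=16,
    batch_nums=2)
```

Post-training quantization only needs a small calibration set to compute activation scales, which is why no retraining loop appears above; for higher accuracy, PaddleSlim also offers quantization-aware training.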
## Deploy the quantized model from PaddleSlim using Paddle Serving with NVIDIA TensorRT int8 mode
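A hedged sketch of the typical workflow: convert the quantized model into Serving's server/client format, then launch the server with TensorRT enabled at int8 precision. The `quant_model` directory name is an assumption carried over from the sketch above.

```python
# A minimal conversion sketch, assuming 'quant_model' is the PaddleSlim
# output directory from the previous step; directory names are placeholders.
from paddle_serving_client.io import inference_model_to_serving

# Convert the quantized inference model into Serving's server and
# client configuration directories.
inference_model_to_serving(
    dirname='quant_model',
    serving_server='serving_server',
    serving_client='serving_client')

# The server is then typically launched from the shell with TensorRT
# enabled and int8 precision selected, e.g.:
#   python -m paddle_serving_server.serve --model serving_server \
#       --port 9393 --gpu_ids 0 --use_trt --precision int8
```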