    Paddle Serving





    Motivation

    We consider an online deep learning inference service to be a key user-facing application in the future. The goal of this project is that once you have trained a deep neural network with Paddle, you can also deploy the model online easily.

    Some Key Features

    • Seamless integration with the Paddle training pipeline; most Paddle models can be deployed with a single command.
    • Industrial serving features such as model management, online loading, and online A/B testing.
    • Distributed key-value indexing, which is especially useful for large-scale sparse features as model inputs.
    • Highly concurrent and efficient communication between clients and servers.
    • Multiple client-side programming languages, including Golang, C++, and Python.
    • An extensible framework design that can support model serving beyond Paddle.

    Installation

    We highly recommend running Paddle Serving in Docker; please see Run in Docker.

    # Run CPU Docker
    docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0
    docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0
    docker exec -it test bash
    # Run GPU Docker
    nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
    nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:0.2.0-gpu
    nvidia-docker exec -it test bash
    pip install paddle-serving-client 
    pip install paddle-serving-server # CPU
    pip install paddle-serving-server-gpu # GPU

    You may need to use a local mirror to speed up the download; in China, for example, you can use the Tsinghua mirror by adding -i https://pypi.tuna.tsinghua.edu.cn/simple to the pip command.

    The client package supports CentOS 7 and Ubuntu 18. Alternatively, you can use the HTTP service without installing the client.

    Quick Start Example

    Boston House Price Prediction model

    wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
    tar -xzf uci_housing.tar.gz

    Paddle Serving provides both HTTP- and RPC-based services for users to access.

    HTTP service

    Paddle Serving provides a built-in Python module called paddle_serving_server.serve that can start an RPC service or an HTTP service with a one-line command. If we specify the argument --name uci, an HTTP service will be available at $IP:$PORT/uci/prediction.

    python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci

    Argument   Type   Default   Description
    ---------  -----  --------  ---------------------------------------------------
    thread     int    4         Concurrency of the current service
    port       int    9292      Exposed port of the current service to users
    name       str    ""        Service name, used to generate the HTTP request URL
    model      str    ""        Path of the Paddle model directory to be served
    mem_optim  bool   False     Enable memory optimization

    Here, we use curl to send an HTTP POST request to the service we just started. Users can use any Python library to send HTTP POST requests as well, e.g., requests.

    curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
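
    As a reference, here is a minimal sketch of the same request using the Python requests library, assuming the HTTP service started above is listening locally on port 9292:

    import requests

    # The same feed/fetch payload as the curl example above.
    payload = {
        "feed": [{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}],
        "fetch": ["price"]
    }
    # `json=payload` sends the dict as a JSON body with Content-Type: application/json.
    resp = requests.post("http://127.0.0.1:9292/uci/prediction", json=payload)
    # The response body is JSON containing the fetched "price" value.
    print(resp.json())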

    RPC service

    A user can also start an RPC service with paddle_serving_server.serve. The RPC service is usually faster than the HTTP service, although the user needs to do some coding based on Paddle Serving's Python client API. Note that we do not specify --name here.

    python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292

    # A user can access the RPC service through the paddle_serving_client API
    from paddle_serving_client import Client
    
    client = Client()
    client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
    client.connect(["127.0.0.1:9292"])
    data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
            -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
    fetch_map = client.predict(feed={"x": data}, fetch=["price"])
    print(fetch_map)
    

    Here, the client.predict function has two arguments: feed is a Python dict mapping the model's input variable alias names to values, and fetch specifies the prediction variables to be returned from the server. In this example, the names "x" and "price" were assigned when the servable model was saved during training.
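
    For illustration, a minimal sketch of consuming the returned fetch_map, assuming it behaves like a Python dict keyed by the fetch variable names:

    # fetch_map maps each fetched variable name to its predicted value(s);
    # "price" was listed in the fetch argument above (dict-like access is an assumption).
    predicted_price = fetch_map["price"]
    print("Predicted price:", predicted_price)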

    Pre-built services with Paddle Serving

    Chinese Word Segmentation
    • Description:
    A Chinese word segmentation HTTP service that can be deployed with a one-line command.
    • Download Servable Package:
    wget --no-check-certificate https://paddle-serving.bj.bcebos.com/lac/lac_model_jieba_web.tar.gz
    • Host web service:
    tar -xzf lac_model_jieba_web.tar.gz
    python lac_web_service.py jieba_server_model/ lac_workdir 9292
    • Request sample:
    curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9292/lac/prediction
    • Request result:
    {"word_seg":"我|爱|北京|天安门"}

    Image Classification
    • Description:
    An image classification service trained on the ImageNet dataset. A label and its corresponding probability will be returned.
    Note: This demo needs paddle-serving-server-gpu. 
    • Download Servable Package:
    wget --no-check-certificate https://paddle-serving.bj.bcebos.com/imagenet-example/imagenet_demo.tar.gz
    • Host web service:
    tar -xzf imagenet_demo.tar.gz
    python image_classification_service_demo.py resnet50_serving_model
    • Request sample:
    curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"url": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
    • Request result:
    {"label":"daisy","prob":0.9341403245925903}

    More Demos

    Model Name: Bert-Base-Baike
    URL: https://paddle-serving.bj.bcebos.com/bert_example/bert_seq128.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert
    Description: Get the semantic representation of a Chinese sentence

    Model Name: Resnet50-Imagenet
    URL: https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet50_vd.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet
    Description: Get the semantic representation of an image

    Model Name: Resnet101-Imagenet
    URL: https://paddle-serving.bj.bcebos.com/imagenet-example/ResNet101_vd.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imagenet
    Description: Get the semantic representation of an image

    Model Name: CNN-IMDB
    URL: https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb
    Description: Get the category probability of an English sentence

    Model Name: LSTM-IMDB
    URL: https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb
    Description: Get the category probability of an English sentence

    Model Name: BOW-IMDB
    URL: https://paddle-serving.bj.bcebos.com/imdb-demo/imdb_model.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/imdb
    Description: Get the category probability of an English sentence

    Model Name: Jieba-LAC
    URL: https://paddle-serving.bj.bcebos.com/lac/lac_model.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/lac
    Description: Get the word segmentation of a Chinese sentence

    Model Name: DNN-CTR
    URL: https://paddle-serving.bj.bcebos.com/criteo_ctr_example/criteo_ctr_demo_model.tar.gz
    Client/Server Code: https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/criteo_ctr
    Description: Get the click probability of an item from its feature vector

    Document

    New to Paddle Serving

    Developers

    About Efficiency

    FAQ

    Design

    Community

    Slack

    To connect with other users and contributors, you are welcome to join our Slack channel.

    Contribution

    If you want to contribute code to Paddle Serving, please refer to the Contribution Guidelines.

    Feedback

    For any feedback, or to report a bug, please open a GitHub Issue.

    License

    Apache 2.0 License
