<p align="center">
<img src=".github/PARL-logo.png" alt="PARL" width="500"/>
</p>

English | [简体中文](./README.cn.md)   
[**Documentation**](https://parl.readthedocs.io/en/stable/index.html) | [**中文文档**](./docs/zh_CN/Overview.md)

> PARL is a flexible and highly efficient reinforcement learning framework.

# Features
**Reproducible**. We provide algorithms that stably reproduce the results of many influential reinforcement learning algorithms.

**Large Scale**. Supports high-performance parallelized training with thousands of CPUs and multiple GPUs.

**Reusable**. Algorithms provided in the repository can be directly adapted to a new task by simply defining a forward network; the training mechanism is built automatically.

**Extensible**. Build new algorithms quickly by inheriting the abstract class in the framework.


# Abstractions
<img src=".github/abstractions.png" alt="abstractions" width="400"/>
PARL aims to build an agent for training algorithms to perform complex tasks.   
The main abstractions introduced by PARL, which are used to build an agent recursively, are the following:

### Model
`Model` is abstracted to construct the forward network, which defines a policy network or critic network given the state as input.

### Algorithm
`Algorithm` describes the mechanism to update parameters in `Model` and often contains at least one model.

### Agent
`Agent`, a data bridge between the environment and the algorithm, is responsible for data I/O with the outside environment and describes data preprocessing before feeding data into the training process.  

Note: For more information about base classes, please visit our [tutorial](https://parl.readthedocs.io/en/latest/getting_started.html) and [API documentation](https://parl.readthedocs.io/en/latest/model.html).
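To make the layering concrete, here is a minimal, framework-free sketch of how the three abstractions compose. The class bodies below are illustrative stand-ins (a single scalar "weight" instead of a real network), not PARL's actual API; see the API documentation for the real base classes.

```python
# Illustrative sketch of the Model / Algorithm / Agent layering.
# Names and bodies are simplified stand-ins, not PARL's real classes.

class Model:
    """Defines the forward network (e.g. a policy or critic network)."""
    def __init__(self):
        self.weight = 0.5  # stand-in for real network parameters

    def forward(self, obs):
        return self.weight * obs  # stand-in for a real forward pass

class Algorithm:
    """Describes how to update the parameters in Model."""
    def __init__(self, model, lr=0.1):
        self.model = model
        self.lr = lr

    def predict(self, obs):
        return self.model.forward(obs)

    def learn(self, grad):
        self.model.weight -= self.lr * grad  # stand-in for a real update rule

class Agent:
    """Bridges the environment and the algorithm: data I/O and preprocessing."""
    def __init__(self, algorithm):
        self.alg = algorithm

    def predict(self, obs):
        obs = float(obs)  # stand-in for data preprocessing
        return self.alg.predict(obs)

    def learn(self, grad):
        return self.alg.learn(grad)

# An agent is built recursively: Agent wraps Algorithm, which wraps Model.
agent = Agent(Algorithm(Model()))
print(agent.predict(2.0))  # 1.0
```

The point of the layering is separation of concerns: swapping the update rule means replacing only the `Algorithm`, and adapting to a new task means replacing only the `Model`.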

# Parallelization
PARL provides a compact API for distributed training, allowing users to convert their code into a parallelized version by simply adding a decorator. For more information about our APIs for parallel training, please visit our [documentation](https://parl.readthedocs.io/en/latest/parallel_training/setup.html).  
Here is a `Hello World` example demonstrating how easy it is to leverage external computation resources.
```python
#============Agent.py=================
import parl

@parl.remote_class
class Agent(object):

    def say_hello(self):
        print("Hello World!")

    def sum(self, a, b):
        return a + b

parl.connect('localhost:8037')
agent = Agent()
agent.say_hello()
ans = agent.sum(1, 5)  # runs remotely, without consuming any local computation resources
```
Two steps are needed to use external computation resources:
1. Use the `parl.remote_class` decorator to decorate a class; it is then transformed into a new class whose instances can run on other CPUs or machines.
2. Call `parl.connect` to initialize parallel communication before creating an object. Calling any method of such an object **does not** consume local computation resources, since the work is executed elsewhere.
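The idea behind the decorator can be sketched without PARL at all. The toy version below forwards method calls to a single-worker thread pool standing in for a remote machine; `toy_remote_class` is a hypothetical name for illustration only and is **not** PARL's implementation, which ships work across processes and machines.

```python
# Toy sketch of what a remote-class decorator conceptually does:
# method calls on the wrapped object are proxied to a "worker".
# This is NOT PARL's implementation; a thread pool stands in for a remote machine.

from concurrent.futures import ThreadPoolExecutor

def toy_remote_class(cls):
    class Wrapper:
        def __init__(self, *args, **kwargs):
            self._obj = cls(*args, **kwargs)
            self._pool = ThreadPoolExecutor(max_workers=1)  # stand-in for a remote worker

        def __getattr__(self, name):
            attr = getattr(self._obj, name)
            if not callable(attr):
                return attr
            def proxy(*args, **kwargs):
                # Submit the call to the "worker" and block for the result,
                # mirroring how a remote proxy would forward method calls.
                return self._pool.submit(attr, *args, **kwargs).result()
            return proxy
    return Wrapper

@toy_remote_class
class Agent:
    def sum(self, a, b):
        return a + b

agent = Agent()
print(agent.sum(1, 5))  # 6
```

The caller's code looks identical to ordinary object-oriented code; only the decorator changes where the work runs.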

<img src=".github/decorator.png" alt="PARL" width="450"/>
As shown in the figure above, the real actors (orange circles) run on a CPU cluster, while the learner (blue circle) runs on a local GPU together with several remote actor proxies (yellow circles with dotted edges).  

Users can write code in a simple way, just as they would write multi-threaded code, but with the actors consuming remote resources. We have also provided examples of parallelized algorithms such as [IMPALA](examples/IMPALA), [A2C](examples/A2C) and [GA3C](examples/GA3C). Please refer to these examples for usage details.  


# Install
### Dependencies
- Python 2.7 or 3.5+ (on **Windows**, PARL only supports Python 3.6+).
- [paddlepaddle>=1.6.1](https://github.com/PaddlePaddle/Paddle) (**optional**: not needed if you only want to use the APIs related to parallelization)


```
pip install parl
```

# Examples
- [QuickStart](examples/QuickStart/)
- [DQN](examples/DQN/)
- [ES](examples/ES/)
- [DDPG](examples/DDPG/)
- [PPO](examples/PPO/)
- [IMPALA](examples/IMPALA/)
- [A2C](examples/A2C/)
- [TD3](examples/TD3/)
- [SAC](examples/SAC/)
- [MADDPG](examples/MADDPG/)
- [Winning Solution for NIPS2018: AI for Prosthetics Challenge](examples/NeurIPS2018-AI-for-Prosthetics-Challenge/)
- [Winning Solution for NIPS2019: Learn to Move Challenge](examples/NeurIPS2019-Learn-to-Move-Challenge/)

<img src="examples/NeurIPS2019-Learn-to-Move-Challenge/image/performance.gif" width = "300" height ="200" alt="NeurlIPS2018"/> <img src=".github/Half-Cheetah.gif" width = "300" height ="200" alt="Half-Cheetah"/> <img src=".github/Breakout.gif" width = "200" height ="200" alt="Breakout"/>
<br>
<img src=".github/Aircraft.gif"  width = "808" height ="300"  alt="NeurlIPS2018"/>