Unverified commit 7f2abd56 · Author: rical730 · Committer: GitHub

update parl.maddpg without import gym (#208)

* add maddpg example

* format with yapf

* fix coding style

* fix coding style

* unittest without import multiagent env

* update maddpg code

* update maddpg readme

* add copyright comments

* update parl.maddpg without import gym

* update NeurlIPS2018.gif to NeurlIPS2019.gif

* update readme and comments
Parent 450a4a34
@@ -82,6 +82,6 @@ pip install parl
 - [Champion Solution: NIPS2018 Reinforcement Learning Prosthetics Challenge](examples/NeurIPS2018-AI-for-Prosthetics-Challenge/)
 - [Champion Solution: NIPS2019 Reinforcement Learning Humanoid Control Competition](examples/NeurIPS2019-Learn-to-Move-Challenge/)
-<img src=".github/NeurlIPS2018.gif" width = "300" height ="200" alt="NeurlIPS2018"/> <img src=".github/Half-Cheetah.gif" width = "300" height ="200" alt="Half-Cheetah"/> <img src=".github/Breakout.gif" width = "200" height ="200" alt="Breakout"/>
+<img src="examples/NeurIPS2019-Learn-to-Move-Challenge/image/performance.gif" width = "300" height ="200" alt="NeurlIPS2018"/> <img src=".github/Half-Cheetah.gif" width = "300" height ="200" alt="Half-Cheetah"/> <img src=".github/Breakout.gif" width = "200" height ="200" alt="Breakout"/>
 <br>
 <img src=".github/Aircraft.gif" width = "808" height ="300" alt="NeurlIPS2018"/>
@@ -85,6 +85,6 @@ pip install parl
 - [Winning Solution for NIPS2018: AI for Prosthetics Challenge](examples/NeurIPS2018-AI-for-Prosthetics-Challenge/)
 - [Winning Solution for NIPS2019: Learn to Move Challenge](examples/NeurIPS2019-Learn-to-Move-Challenge/)
-<img src=".github/NeurlIPS2018.gif" width = "300" height ="200" alt="NeurlIPS2018"/> <img src=".github/Half-Cheetah.gif" width = "300" height ="200" alt="Half-Cheetah"/> <img src=".github/Breakout.gif" width = "200" height ="200" alt="Breakout"/>
+<img src="examples/NeurIPS2019-Learn-to-Move-Challenge/image/performance.gif" width = "300" height ="200" alt="NeurlIPS2018"/> <img src=".github/Half-Cheetah.gif" width = "300" height ="200" alt="Half-Cheetah"/> <img src=".github/Breakout.gif" width = "200" height ="200" alt="Breakout"/>
 <br>
 <img src=".github/Aircraft.gif" width = "808" height ="300" alt="NeurlIPS2018"/>
@@ -22,20 +22,27 @@ from parl.core.fluid.algorithm import Algorithm
 __all__ = ['MADDPG']
-from gym import spaces
 from parl.core.fluid.policy_distribution import SoftCategoricalDistribution
 from parl.core.fluid.policy_distribution import SoftMultiCategoricalDistribution
 def SoftPDistribution(logits, act_space):
-    if (isinstance(act_space, spaces.Discrete)):
+    """input:
+        logits: the output of policy model
+        act_space: action space, must be gym.spaces.Discrete or multiagent.multi_discrete.MultiDiscrete
+    output:
+        instance of SoftCategoricalDistribution or SoftMultiCategoricalDistribution
+    """
+    # is instance of gym.spaces.Discrete
+    if (hasattr(act_space, 'n')):
         return SoftCategoricalDistribution(logits)
     # is instance of multiagent.multi_discrete.MultiDiscrete
     elif (hasattr(act_space, 'num_discrete_space')):
         return SoftMultiCategoricalDistribution(logits, act_space.low,
                                                 act_space.high)
     else:
-        raise NotImplementedError
+        raise AssertionError("act_space must be instance of \
+            gym.spaces.Discrete or multiagent.multi_discrete.MultiDiscrete")
 class MADDPG(Algorithm):
......
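The last hunk swaps the `isinstance` check for attribute checks, so the module no longer needs `from gym import spaces`. The sketch below shows that dispatch pattern in isolation. It is not the PARL implementation: `CategoricalStub`, `MultiCategoricalStub`, and `soft_p_distribution` are hypothetical stand-ins so the snippet runs without PARL, gym, or the multiagent environment installed, and the `SimpleNamespace` objects only mimic the attributes the diff inspects (`n`, `num_discrete_space`, `low`, `high`). It also builds the error message from adjacent string literals, which is one way to keep source indentation out of the message text.

```python
from types import SimpleNamespace


class CategoricalStub:
    """Hypothetical stand-in for SoftCategoricalDistribution."""

    def __init__(self, logits):
        self.logits = logits


class MultiCategoricalStub:
    """Hypothetical stand-in for SoftMultiCategoricalDistribution."""

    def __init__(self, logits, low, high):
        self.logits = logits
        self.low = low
        self.high = high


def soft_p_distribution(logits, act_space):
    """Choose a distribution by inspecting attributes instead of types."""
    # A gym.spaces.Discrete-like space exposes `n`.
    if hasattr(act_space, 'n'):
        return CategoricalStub(logits)
    # A MultiDiscrete-like space exposes `num_discrete_space`, `low`, `high`.
    if hasattr(act_space, 'num_discrete_space'):
        return MultiCategoricalStub(logits, act_space.low, act_space.high)
    # Adjacent literals are joined at compile time, so no indentation
    # from the source file ends up inside the message.
    raise AssertionError(
        "act_space must be instance of gym.spaces.Discrete "
        "or multiagent.multi_discrete.MultiDiscrete")


# Objects that only mimic the inspected attributes (assumed shapes).
discrete_like = SimpleNamespace(n=5)
multi_like = SimpleNamespace(num_discrete_space=2, low=[0, 0], high=[4, 2])

single = soft_p_distribution([0.2, 0.8], discrete_like)
multi = soft_p_distribution([0.2, 0.8], multi_like)
```

The trade-off of this style is that any object with an `n` attribute is treated as a discrete space, so the check is looser than `isinstance`; in exchange, the module carries no import-time dependency on gym.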