OpenDILab Open-Source Decision Intelligence Platform / DI-engine
Commit 118cc673
Authored Dec 30, 2021 by niuyazhe
polish(nyz): move actor_head_type to action_space field in qac and update readme new repo link
Parent: a0435286
Showing 46 changed files with 54 additions and 74 deletions (+54 -74)
README.md +2 -0
ding/model/template/maqac.py +7 -27
ding/model/template/qac.py +2 -4
dizoo/box2d/bipedalwalker/config/bipedalwalker_sac_config.py +1 -1
dizoo/box2d/bipedalwalker/config/bipedalwalker_td3_config.py +1 -1
dizoo/d4rl/config/hopper_cql_default_config.py +1 -1
dizoo/d4rl/config/hopper_expert_cql_default_config.py +1 -1
dizoo/d4rl/config/hopper_medium_cql_default_config.py +1 -1
dizoo/mujoco/config/ant_ddpg_default_config.py +1 -1
dizoo/mujoco/config/ant_sac_default_config.py +1 -1
dizoo/mujoco/config/ant_td3_default_config.py +1 -1
dizoo/mujoco/config/ant_trex_sac_default_config.py +1 -1
dizoo/mujoco/config/halfcheetah_ddpg_default_config.py +1 -1
dizoo/mujoco/config/halfcheetah_gcl_config.py +1 -1
dizoo/mujoco/config/halfcheetah_sac_default_config.py +1 -1
dizoo/mujoco/config/halfcheetah_td3_default_config.py +1 -1
dizoo/mujoco/config/halfcheetah_trex_sac_default_config.py +1 -1
dizoo/mujoco/config/hopper_cql_default_config.py +1 -1
dizoo/mujoco/config/hopper_d4pg_default_config.py +1 -1
dizoo/mujoco/config/hopper_ddpg_default_config.py +1 -1
dizoo/mujoco/config/hopper_sac_data_generation_default_config.py +1 -1
dizoo/mujoco/config/hopper_sac_default_config.py +1 -1
dizoo/mujoco/config/hopper_td3_bc_default_config.py +1 -1
dizoo/mujoco/config/hopper_td3_data_generation_config.py +1 -1
dizoo/mujoco/config/hopper_td3_default_config.py +1 -1
dizoo/mujoco/config/hopper_trex_sac_default_config.py +1 -1
dizoo/mujoco/config/sac_halfcheetah_mbpo_default_config.py +1 -1
dizoo/mujoco/config/sac_hopper_mbpo_default_config.py +1 -1
dizoo/mujoco/config/walker2d_ddpg_default_config.py +1 -1
dizoo/mujoco/config/walker2d_ddpg_gail_config.py +1 -1
dizoo/mujoco/config/walker2d_sac_default_config.py +1 -1
dizoo/mujoco/config/walker2d_td3_default_config.py +1 -1
dizoo/mujoco/config/walker2d_trex_sac_default_config.py +1 -1
dizoo/multiagent_mujoco/config/ant_masac_default_config.py +1 -1
dizoo/pybullet/config/ant_ddpg_default_config.py +1 -1
dizoo/pybullet/config/ant_sac_default_config.py +1 -1
dizoo/pybullet/config/ant_td3_default_config.py +1 -1
dizoo/pybullet/config/halfcheetah_ddpg_default_config.py +1 -1
dizoo/pybullet/config/halfcheetah_sac_default_config.py +1 -1
dizoo/pybullet/config/halfcheetah_td3_default_config.py +1 -1
dizoo/pybullet/config/hopper_ddpg_default_config.py +1 -1
dizoo/pybullet/config/hopper_sac_default_config.py +1 -1
dizoo/pybullet/config/hopper_td3_default_config.py +1 -1
dizoo/pybullet/config/walker2d_ddpg_default_config.py +1 -1
dizoo/pybullet/config/walker2d_sac_default_config.py +1 -1
dizoo/pybullet/config/walker2d_td3_default_config.py +1 -1
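Configs written against the previous field name would need the same one-line rename that this commit applies to every file listed above. A minimal sketch of doing that on an in-memory config, assuming plain nested `dict`s; `rename_actor_head_type` is a hypothetical helper for illustration, not part of DI-engine:

```python
from typing import Any

OLD_KEY = 'actor_head_type'  # field name before this commit
NEW_KEY = 'action_space'     # field name after this commit


def rename_actor_head_type(cfg: Any) -> Any:
    """Return a copy of ``cfg`` with every ``actor_head_type`` key renamed.

    Hypothetical helper: walks nested dicts, lists and tuples and leaves
    every other value untouched. The input object is not modified.
    """
    if isinstance(cfg, dict):
        renamed = {}
        for key, value in cfg.items():
            new_key = NEW_KEY if key == OLD_KEY else key
            renamed[new_key] = rename_actor_head_type(value)
        return renamed
    if isinstance(cfg, (list, tuple)):
        return type(cfg)(rename_actor_head_type(item) for item in cfg)
    return cfg


# Example shaped like the SAC configs touched by this commit.
old_cfg = dict(
    policy=dict(
        model=dict(
            obs_shape=11,
            action_shape=3,
            twin_critic=True,
            actor_head_type='reparameterization',
        ),
    ),
)
new_cfg = rename_actor_head_type(old_cfg)
print(new_cfg['policy']['model'])
```

The helper returns plain `dict` objects, so a config stored in a `dict` subclass would need to be re-wrapped by the caller.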
README.md
@@ -54,11 +54,13 @@ Updated on 2021.12.03 DI-engine-v0.2.2 (beta)
    - [DI-star](https://github.com/opendilab/DI-star): Decision AI in StarCraftII
    - [DI-drive](https://github.com/opendilab/DI-drive): Auto-driving platform
    - [GoBigger](https://github.com/opendilab/GoBigger): Multi-Agent Decision Intelligence Environment
    - [DI-smartcross](https://github.com/opendilab/DI-smartcross): Decision AI in Traffic Light Control
- General nested data lib
    - [treevalue](https://github.com/opendilab/treevalue): Tree-nested data structure
    - [DI-treetensor](https://github.com/opendilab/DI-treetensor): Tree-nested PyTorch tensor Lib
- Docs and Tutorials
    - [DI-engine-docs](https://github.com/opendilab/DI-engine-docs)
    - [awesome-model-based-RL](https://github.com/opendilab/awesome-model-based-RL): A curated list of awesome Model-Based RL resources

**DI-engine** also has some **system optimization and design** for efficient and robust large-scale RL training:
ding/model/template/maqac.py
@@ -24,7 +24,6 @@ class MAQAC(nn.Module):
             agent_obs_shape: Union[int, SequenceType],
             global_obs_shape: Union[int, SequenceType],
             action_shape: Union[int, SequenceType],
-            # actor_head_type: str,
             twin_critic: bool = False,
             actor_head_hidden_size: int = 64,
             actor_head_layer_num: int = 1,
@@ -39,7 +38,6 @@ class MAQAC(nn.Module):
             Arguments:
                 - obs_shape (:obj:`Union[int, SequenceType]`): Observation's space.
                 - action_shape (:obj:`Union[int, SequenceType]`): Action's space.
-                - actor_head_type (:obj:`str`): Whether choose ``regression`` or ``reparameterization``.
                 - twin_critic (:obj:`bool`): Whether include twin critic.
                 - actor_head_hidden_size (:obj:`Optional[int]`): The ``hidden_size`` to pass to actor-nn's ``Head``.
                 - actor_head_layer_num (:obj:`int`):
@@ -179,11 +177,6 @@ class MAQAC(nn.Module):
                 - obs (:obj:`torch.Tensor`): :math:`(B, N1)`, where B is batch size and N1 is ``obs_shape``
                 - action (:obj:`torch.Tensor`): :math:`(B, N2)`, where B is batch size and N2 is ``action_shape``
                 - q_value (:obj:`torch.FloatTensor`): :math:`(B, )`, where B is batch size.
-            Examples:
-                >>> inputs = {'obs': torch.randn(4, N), 'action': torch.randn(4, 1)}
-                >>> model = QAC(obs_shape=(N, ),action_shape=1,actor_head_type='regression')
-                >>> model(inputs, mode='compute_critic')['q_value'] # q value
-                tensor([0.0773, 0.1639, 0.0917, 0.0370], grad_fn=<SqueezeBackward1>)
         """
         if self.twin_critic:
@@ -208,7 +201,7 @@ class ContinuousMAQAC(nn.Module):
             agent_obs_shape: Union[int, SequenceType],
             global_obs_shape: Union[int, SequenceType],
             action_shape: Union[int, SequenceType, EasyDict],
-            actor_head_type: str,
+            action_space: str,
             twin_critic: bool = False,
             actor_head_hidden_size: int = 64,
             actor_head_layer_num: int = 1,
@@ -222,9 +215,8 @@ class ContinuousMAQAC(nn.Module):
             Init the QAC Model according to arguments.
             Arguments:
                 - obs_shape (:obj:`Union[int, SequenceType]`): Observation's space.
-                - action_shape (:obj:`Union[int, SequenceType, EasyDict]`): Action's space, such as 4, (3, ),
-                    EasyDict({'action_type_shape': 3, 'action_args_shape': 4}).
-                - actor_head_type (:obj:`str`): Whether choose ``regression`` or ``reparameterization`` or ``hybrid`` .
+                - action_shape (:obj:`Union[int, SequenceType, EasyDict]`): Action's space, such as 4, (3, )
+                - action_space (:obj:`str`): Whether choose ``regression`` or ``reparameterization``.
                 - twin_critic (:obj:`bool`): Whether include twin critic.
                 - actor_head_hidden_size (:obj:`Optional[int]`): The ``hidden_size`` to pass to actor-nn's ``Head``.
                 - actor_head_layer_num (:obj:`int`):
@@ -243,9 +235,9 @@ class ContinuousMAQAC(nn.Module):
         global_obs_shape: int = squeeze(global_obs_shape)
         action_shape = squeeze(action_shape)
         self.action_shape = action_shape
-        self.actor_head_type = actor_head_type
-        assert self.actor_head_type in ['regression', 'reparameterization']
-        if self.actor_head_type == 'regression':  # DDPG, TD3
+        self.action_space = action_space
+        assert self.action_space in ['regression', 'reparameterization']
+        if self.action_space == 'regression':  # DDPG, TD3
             self.actor = nn.Sequential(
                 nn.Linear(obs_shape, actor_head_hidden_size), activation,
                 RegressionHead(
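The hunk above validates `action_space` against two allowed values and then branches on it to build the actor. A framework-free sketch of that dispatch, written only to illustrate the control flow: `describe_actor` is a hypothetical helper, and the head name `ReparameterizationHead` for the second branch is an assumption, since this diff shows only `RegressionHead`. The output keys follow the `compute_actor` code and docstrings elsewhere in this file's diff.

```python
from typing import Dict

# The two values accepted by the assertion in ContinuousMAQAC.__init__.
SUPPORTED_ACTION_SPACES = ('regression', 'reparameterization')


def describe_actor(action_space: str) -> Dict[str, str]:
    """Hypothetical helper mirroring the branch on ``action_space``.

    Returns the actor head class name and the key under which
    ``compute_actor`` is expected to return its result.
    """
    assert action_space in SUPPORTED_ACTION_SPACES, \
        "unsupported action_space: {}".format(action_space)
    if action_space == 'regression':
        # DDPG, TD3: deterministic actor, output returned as {'action': ...}
        return {'head': 'RegressionHead', 'output_key': 'action'}
    # Stochastic actor (assumed head name), output returned as {'logit': [mu, sigma]}
    return {'head': 'ReparameterizationHead', 'output_key': 'logit'}


for space in SUPPORTED_ACTION_SPACES:
    print(space, describe_actor(space))
```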
@@ -350,12 +342,6 @@ class ContinuousMAQAC(nn.Module):
                 >>> actor_outputs['logit'][1].shape # sigma
                 >>> torch.Size([4, 64])
-            Critic Examples:
-                >>> inputs = {'obs': torch.randn(4,N), 'action': torch.randn(4,1)}
-                >>> model = QAC(obs_shape=(N, ),action_shape=1,actor_head_type='regression')
-                >>> model(inputs, mode='compute_critic')['q_value'] # q value
-                tensor([0.0773, 0.1639, 0.0917, 0.0370], grad_fn=<SqueezeBackward1>)
         """
         assert mode in self.mode, "not support forward mode: {}/{}".format(mode, self.mode)
         return getattr(self, mode)(inputs)
@@ -404,7 +390,7 @@ class ContinuousMAQAC(nn.Module):
                 >>> torch.Size([4, 64])
         """
         inputs = inputs['agent_state']
-        if self.actor_head_type == 'regression':
+        if self.action_space == 'regression':
             x = self.actor(inputs)
             return {'action': x['pred']}
         else:
@@ -434,12 +420,6 @@ class ContinuousMAQAC(nn.Module):
                 - obs (:obj:`torch.Tensor`): :math:`(B, N1)`, where B is batch size and N1 is ``obs_shape``
                 - action (:obj:`torch.Tensor`): :math:`(B, N2)`, where B is batch size and N2 is ``action_shape``
                 - q_value (:obj:`torch.FloatTensor`): :math:`(B, )`, where B is batch size.
-            Examples:
-                >>> inputs = {'obs': torch.randn(4, N), 'action': torch.randn(4, 1)}
-                >>> model = QAC(obs_shape=(N, ),action_shape=1,actor_head_type='regression')
-                >>> model(inputs, mode='compute_critic')['q_value'] # q value
-                >>> tensor([0.0773, 0.1639, 0.0917, 0.0370], grad_fn=<SqueezeBackward1>)
         """
         obs, action = inputs['obs']['global_state'], inputs['action']
ding/model/template/qac.py
@@ -325,7 +325,6 @@ class DiscreteQAC(nn.Module):
             global_obs_shape: Union[int, SequenceType],
             action_shape: Union[int, SequenceType],
             encoder_hidden_size_list: SequenceType = [64],
-            #actor_head_type: str,
             twin_critic: bool = False,
             actor_head_hidden_size: int = 64,
             actor_head_layer_num: int = 1,
@@ -340,7 +339,6 @@ class DiscreteQAC(nn.Module):
             Arguments:
                 - obs_shape (:obj:`Union[int, SequenceType]`): Observation's space.
                 - action_shape (:obj:`Union[int, SequenceType]`): Action's space.
-                - actor_head_type (:obj:`str`): Whether choose ``regression`` or ``reparameterization``.
                 - twin_critic (:obj:`bool`): Whether include twin critic.
                 - actor_head_hidden_size (:obj:`Optional[int]`): The ``hidden_size`` to pass to actor-nn's ``Head``.
                 - actor_head_layer_num (:obj:`int`):
@@ -468,7 +466,7 @@ class DiscreteQAC(nn.Module):
             Critic Examples:
                 >>> inputs = {'obs': torch.randn(4,N), 'action': torch.randn(4,1)}
-                >>> model = QAC(obs_shape=(N, ),action_shape=1,actor_head_type='regression')
+                >>> model = QAC(obs_shape=(N, ), action_shape=1, action_space='regression')
                 >>> model(inputs, mode='compute_critic')['q_value'] # q value
                 tensor([0.0773, 0.1639, 0.0917, 0.0370], grad_fn=<SqueezeBackward1>)
@@ -537,7 +535,7 @@ class DiscreteQAC(nn.Module):
             Examples:
                 >>> inputs = {'obs': torch.randn(4, N), 'action': torch.randn(4, 1)}
-                >>> model = QAC(obs_shape=(N, ),action_shape=1,actor_head_type='regression')
+                >>> model = QAC(obs_shape=(N, ),action_shape=1,action_space='regression')
                 >>> model(inputs, mode='compute_critic')['q_value'] # q value
                 tensor([0.0773, 0.1639, 0.0917, 0.0370], grad_fn=<SqueezeBackward1>)
dizoo/box2d/bipedalwalker/config/bipedalwalker_sac_config.py
@@ -21,7 +21,7 @@ bipedalwalker_sac_config = dict(
         obs_shape=24,
         action_shape=4,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=128,
         critic_head_hidden_size=128,
     ),
dizoo/box2d/bipedalwalker/config/bipedalwalker_td3_config.py
@@ -20,7 +20,7 @@ bipedalwalker_td3_config = dict(
         twin_critic=True,
         actor_head_hidden_size=400,
         critic_head_hidden_size=400,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=4,
dizoo/d4rl/config/hopper_cql_default_config.py
@@ -17,7 +17,7 @@ hopper_cql_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/d4rl/config/hopper_expert_cql_default_config.py
@@ -17,7 +17,7 @@ hopper_expert_cql_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/d4rl/config/hopper_medium_cql_default_config.py
@@ -17,7 +17,7 @@ hopper_medium_cql_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/ant_ddpg_default_config.py
@@ -20,7 +20,7 @@ ant_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/ant_sac_default_config.py
@@ -19,7 +19,7 @@ ant_sac_default_config = dict(
         obs_shape=111,
         action_shape=8,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/ant_td3_default_config.py
@@ -20,7 +20,7 @@ ant_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/ant_trex_sac_default_config.py
@@ -36,7 +36,7 @@ ant_trex_sac_default_config = dict(
         obs_shape=111,
         action_shape=8,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/halfcheetah_ddpg_default_config.py
@@ -20,7 +20,7 @@ halfcheetah_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/halfcheetah_gcl_config.py
@@ -28,7 +28,7 @@ halfcheetah_gcl_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/halfcheetah_sac_default_config.py
@@ -19,7 +19,7 @@ halfcheetah_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/halfcheetah_td3_default_config.py
@@ -20,7 +20,7 @@ halfcheetah_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/halfcheetah_trex_sac_default_config.py
@@ -36,7 +36,7 @@ halfcheetah_trex_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/hopper_cql_default_config.py
@@ -17,7 +17,7 @@ hopper_cql_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/hopper_d4pg_default_config.py
@@ -21,7 +21,7 @@ hopper_d4pg_default_config = dict(
         action_shape=3,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
         critic_head_type='categorical',
         v_min=-100,
         v_max=100,
dizoo/mujoco/config/hopper_ddpg_default_config.py
@@ -20,7 +20,7 @@ hopper_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/hopper_sac_data_generation_default_config.py
@@ -18,7 +18,7 @@ hopper_sac_data_genearation_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/hopper_sac_default_config.py
@@ -19,7 +19,7 @@ hopper_sac_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/hopper_td3_bc_default_config.py
@@ -19,7 +19,7 @@ hopper_td3_bc_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         normalize_states=True,
dizoo/mujoco/config/hopper_td3_data_generation_config.py
@@ -20,7 +20,7 @@ halfcheetah_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/hopper_td3_default_config.py
@@ -20,7 +20,7 @@ hopper_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/hopper_trex_sac_default_config.py
@@ -36,7 +36,7 @@ hopper_trex_sac_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/sac_halfcheetah_mbpo_default_config.py
@@ -44,7 +44,7 @@ main_config = dict(
         obs_shape=obs_shape,
         action_shape=action_shape,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/sac_hopper_mbpo_default_config.py
@@ -44,7 +44,7 @@ main_config = dict(
         obs_shape=obs_shape,
         action_shape=action_shape,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/walker2d_ddpg_default_config.py
@@ -21,7 +21,7 @@ walker2d_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/walker2d_ddpg_gail_config.py
@@ -34,7 +34,7 @@ walker2d_ddpg_gail_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/walker2d_sac_default_config.py
@@ -18,7 +18,7 @@ walker2d_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/mujoco/config/walker2d_td3_default_config.py
@@ -20,7 +20,7 @@ walker2d_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/mujoco/config/walker2d_trex_sac_default_config.py
@@ -36,7 +36,7 @@ walker2d_trex_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/multiagent_mujoco/config/ant_masac_default_config.py
@@ -22,7 +22,7 @@ ant_sac_default_config = dict(
         global_obs_shape=111,
         action_shape=4,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/pybullet/config/ant_ddpg_default_config.py
@@ -20,7 +20,7 @@ ant_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/ant_sac_default_config.py
@@ -18,7 +18,7 @@ ant_sac_default_config = dict(
         obs_shape=111,
         action_shape=8,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/pybullet/config/ant_td3_default_config.py
@@ -20,7 +20,7 @@ ant_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/halfcheetah_ddpg_default_config.py
@@ -20,7 +20,7 @@ halfcheetah_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/halfcheetah_sac_default_config.py
@@ -18,7 +18,7 @@ halfcheetah_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/pybullet/config/halfcheetah_td3_default_config.py
@@ -20,7 +20,7 @@ halfcheetah_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/hopper_ddpg_default_config.py
@@ -20,7 +20,7 @@ hopper_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/hopper_sac_default_config.py
@@ -18,7 +18,7 @@ hopper_sac_default_config = dict(
         obs_shape=11,
         action_shape=3,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/pybullet/config/hopper_td3_default_config.py
@@ -20,7 +20,7 @@ hopper_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/walker2d_ddpg_default_config.py
@@ -20,7 +20,7 @@ walker2d_ddpg_default_config = dict(
         twin_critic=False,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
dizoo/pybullet/config/walker2d_sac_default_config.py
@@ -18,7 +18,7 @@ walker2d_sac_default_config = dict(
         obs_shape=17,
         action_shape=6,
         twin_critic=True,
-        actor_head_type='reparameterization',
+        action_space='reparameterization',
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
     ),
dizoo/pybullet/config/walker2d_td3_default_config.py
@@ -20,7 +20,7 @@ walker2d_td3_default_config = dict(
         twin_critic=True,
         actor_head_hidden_size=256,
         critic_head_hidden_size=256,
-        actor_head_type='regression',
+        action_space='regression',
     ),
     learn=dict(
         update_per_collect=1,
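Because the same keyword is renamed across 43 config files, a quick scan for leftover uses of the old name can confirm that none was missed. A minimal sketch using only the standard library; `find_stale_configs` is a hypothetical helper, and the `*_config.py` glob is an assumption based on the file names in this commit:

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches the old name followed by "=", i.e. keyword arguments,
# assignments and "==" comparisons that still use actor_head_type.
STALE_USAGE = re.compile(r'\bactor_head_type\s*=')


def find_stale_configs(root: str) -> List[Tuple[str, int]]:
    """Return (path, line number) pairs where the old field name is still used.

    Hypothetical helper: scans every ``*_config.py`` file under ``root``.
    """
    hits = []
    for path in sorted(Path(root).rglob('*_config.py')):
        lines = path.read_text(encoding='utf-8').splitlines()
        for lineno, line in enumerate(lines, start=1):
            if STALE_USAGE.search(line):
                hits.append((str(path), lineno))
    return hits
```

Calling `find_stale_configs('dizoo')` from the repository root would be expected to return an empty list once every config uses `action_space`.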
OpenDILab Open-Source Decision Intelligence Platform (@m0_55289267) mentioned in commit c5c988cd3c94ee0e812aa94ca5f0a928a2b18679 · Dec 31, 2021