Commits · main · OpenDILab Open-Source Decision Intelligence Platform / DI-engine

01 Jan, 2022 · 2 commits
- fix(nyz): fix exp_name seedx name bug with data generation path · ae6ab6c7
  Committed by niuyazhe on Jan 01, 2022
- feature(nyz): add vim in docker and add multiple seed cli · 35241df3
  Committed by niuyazhe on Jan 01, 2022
31 Dec, 2021 · 1 commit
- feature(gg15): add gumbel softmax (#169) · 58084df3
  Committed by qiuqiu on Dec 31, 2021
```
* feature(gg15): add gumbel softmax

* add ww

* fix format
```
30 Dec, 2021 · 1 commit
- polish(nyz): move actor_head_type to action_space field in qac and update readme new repo link · 118cc673
  Committed by niuyazhe on Dec 30, 2021
29 Dec, 2021 · 2 commits
- polish(nyz): update multi-discrete policies (#167) · a0435286
  Committed by Robin Chen on Dec 29, 2021
- fix(xjx): fix deps mpire (#168) · 2699aa5e
  Committed by Xu Jingxin on Dec 29, 2021
27 Dec, 2021 · 1 commit
- doc(xjx): add docs, wrap decorators for framework (#166) · 88658ea3
  Committed by Xu Jingxin on Dec 27, 2021
24 Dec, 2021 · 1 commit

feature(nyz): add H-PPO hybrid action space algorithm (#140) · 0b71fc4e

Committed by Swain on Dec 24, 2021

* feature(nyz): add hybrid ppo, unify action_space field and use dict type mu sigma

* polish(nyz): polish ppo config continous field, move to action_space field

* fix(nyz): fix ppo action_space field compatibility bug

* fix(nyz): fix ppg/sac/cql action_space field compatibility bug

* demo(nyz): update gym hybrid hppo config

* polish(pu): polish hppo hyper-para, use tanh and fixed sigma 0.3 in actor_action_args, use clamp [0,1] and [-1,1] for acceleration_value and rotation_value correspondingly after sample from the pi distri. in collect phase

* polish(pu):polish as review

* polish(pu): polish hppo config

* polish(pu): entropy weight=0.03 performs best empirically

* fix(nyz): fix unittest compatibility bugs

* polish(nyz): remove atari env unused print(ci skip)
Co-authored-by: puyuan1996 <2402552459@qq.com>

23 Dec, 2021 · 2 commits
- fix(nyz): fix default max step dtype bug (#163) · eb6c60cc
  Committed by Xu Jingxin on Dec 23, 2021
- feature(xjx): cli in new pipeline (#160) · 954d3100
  Committed by Xu Jingxin on Dec 23, 2021
```
* Cli ditask

* Import ditask in init

* Add current path as default package path

* Fix style

* Add topology on ditask
```
22 Dec, 2021 · 2 commits

feature(xjx): multiprocess tblogger, fix circular reference problem (#156) · 92d973c1

Committed by Xu Jingxin on Dec 22, 2021

* Fix recur reference in task and parallel, add distributed logger

* Update logger

* Clear ref list when exit task/parallel

* Put task in with statment

* Fix test

* FFix test

* Test is hard

* More comments

fix(pu): fix dqfd compatibility (#161) · 1040c9fc

Committed by 蒲源 on Dec 22, 2021

* polish(pu): polish r2d3

* polish(pu): first abs then sum each item in td-error

* fix(pu): fix dqfd compatibility

21 Dec, 2021 · 3 commits
- polish(pu): polish r2d3 for abs priority (#158) · b532b4cd
  Committed by 蒲源 on Dec 21, 2021
```
* polish(pu): polish r2d3

* polish(pu): first abs then sum each item in td-error
```
- style(nyz): fix rl intro link problem(ci skip) · b7cd6751
  Committed by niuyazhe on Dec 21, 2021
- style(nyz): update en doc link and add how to migrate a new env · 2f4d53be
  Committed by niuyazhe on Dec 21, 2021
19 Dec, 2021 · 1 commit
- style(nyz): update issue template doc link and polish comment doc · ad71feba
  Committed by niuyazhe on Dec 19, 2021
17 Dec, 2021 · 2 commits

polish(pu): polish eps_greedy_multinomial_sample in model_wrapper (#154) · 16833c62

Committed by 蒲源 on Dec 17, 2021

* polish(pu):polish eps_greedy_multinomial_sample in model_wrappers

* polish(pu): delete masac wrapper

* polish(pu): delete sql wrapper

fix(nyz): fix doc generate bug with enum_tools (#155) · e6604502

Committed by Swain on Dec 17, 2021

* fix(nyz): fix doc generation python version(ci skip)

* fix(nyz): modify dev doc branch trigger(ci skip)

16 Dec, 2021 · 1 commit

feature(nyp): add residual in R2D2(#150) · ab94376c

Committed by Will-Nie on Dec 16, 2021

* add comments for r2d2

* sort style

* revise according to the comments

* fix style

* add r2d2 residual link + commnets

* revise according to comments, add spaceinvader

* add test for the model and fix test bugs

15 Dec, 2021 · 4 commits

feature(wyh): multi agent mujoco environment (#146) · b040b1c3

Committed by Weiyuhong-1998 on Dec 15, 2021

* ma mujoco env and masac code

* env(wyh):ma mujoco agent id

* feature(wyh):maqac continuous

* fix(wyh):multi-mujoco add readme

* fix(wyh): td error

* fix(wyh)style

* fix(wyh):multi agent mujoco test

fix(nyz): fix test_ppo same dir bug · 02bd3300
Committed by niuyazhe on Dec 15, 2021

feature(xjx): new main framework and profile helper (#142) · d8bde45c

Committed by Xu Jingxin on Dec 15, 2021

* Init base buffer and storage

* Use ratelimit as middleware

* Pass style check

* Keep the return original return value

* Add buffer.view

* Add replace flag on sample, rewrite middleware processing

* Test slicing

* Add buffer copy middleware

* Add update/delete api in buffer, rename middleware

* Implement update and delete api of buffer

* add naive use time count middleware in buffer

* Rename next to chain

* feature(nyz): add staleness check middleware and polish buffer

* feature(nyz): add naive priority experience replay

* Sample by indices

* Combine buffer and storage layers

* Support indices when deleting items from the queue

* Use dataclass to save buffered data, remove return_index and return_meta

* Add ignore_insufficient

* polish(nyz): add return index in push and copy same data in sample

* Drop useless import

* Fix sample with indices, ensure return size is equal to input size or indices size

* Make sure sampled data in buffer is different from each other

* Support sample by grouped meta key

* Support sample by rolling window

* Add import/export data in buffer

* Padding after sampling from buffer

* Polish use_time_check

* Use buffer as dataset

* Set collate_fn in buffer test

* Init framework

* Remove set_default, add keep

* Move backward_stack to task

* Fix total_step

* Pydash pick is too slow

* Add step records

* Add async mode

* Reuse forward and backward functions in sequence

* Fix sample profile

* demo(nyz): add atari pong runnable demo

* Fix forward bug

* Add task test

* Test pong

* feature(nyz): add deque buffer compatibility wrapper and demo

* polish(nyz): polish code style and add pong dqn new deque buffer demo

* Use sync mode

* Config worker number

* Init parallel mode

* Add prev property on context

* Mesh workers

* First version of parallel mode

* Make send rpc async

* Dont pickle prev

* Support tcp

* More cleanup on system exit

* Test parallel and task

* Enable task copy

* Test attach mode

* Add with statment

* Polish code

* Raise exception when timeout in attach mode

* Add event listeners

* feature(nyz): add pendulum sac new pipeline demo

* Fix main

* Add profiler and step profiler

* Rewrite parallel, cleanup res after task finished

* Add comments

* Remove ctx.prev

* Enable standalone parallel mode

* Remove hooks on ctx

* Add max mean

* demo(nyz): add pong dqn new pipeline demo

* Ensure parallel sock closed before program exit

* Fix parallel test

* Fix pong

* feature(zjow): add feature of profile in ding (#135)

* add profiling feature in ding cli.

* fix ding --profile cli.

* reformat files.

* reformat files again.

* reformat files again.

* Remove flameprof

* Change kept_keys to set

* Use finish as a properity

* Use wrapper

* Reformat step timer output

* Test random seed

* Revert learning rate

* Add topology on parallel

* Use labels on task

* Star in parallel mode

* Don't use daemon process

* Auto sync finish state

* Return logvars

* Fix test wrapper

* Fix test profiler helper

* Pass flake_check

* Lazy launch

* Reporter

* Replace main with main_sac

* Fix parallel ctx

* Fix test

* Fix merge issues
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
Co-authored-by: zjowowen <93968541+zjowowen@users.noreply.github.com>

fix(lk): fix port conflict in gym_soccer (#139) · aa612443

Committed by Ke Li on Dec 15, 2021

* feature(lk): fix port conflict

* polish(lk): polish code style and format

* fix(lk): change to subprocess

14 Dec, 2021 · 4 commits

fix(nyz): fix PER indice repeat unittest bug · ff31a86b
Committed by niuyazhe on Dec 14, 2021

polish(nyp): fix unittest for trex training and collecting (#144) · f089d02a

Committed by Will-Nie on Dec 14, 2021

* add trex algorithm for pong

* sort style

* add atari, ll,cp; fix device, collision; add_ppo

* add accuracy evaluation

* correct style

* add seed to make sure results are replicable

* remove useless part in cum return  of model part

* add mujoco onppo training pipeline; ppo config

* improve style

* add sac training config for mujoco

* add log, add save data; polish config

* logger; hyperparameter;walker

* correct style

* modify else condition

* change rnd to trex

* revise according to comments, add eposode collect

* new collect mode for trex, fix all bugs, commnets

* final change

* polish after the final comment

* add readme/test

* add test for serial entry of trex/gcl

* sort style

* change mujoco to cartpole for test for trex_onppo

* remove files generated by testing

* revise tests for entry

* sort style

* revise tests

* modify pytest

* fix(nyz): speed up ppg/ppo and marl algo unittest

* polish(nyz): speed up trex unittest and fix trex entry default config bug

* fix(nyz): fix same name bug

* fix(nyz): fix remove conflict bug(ci skip)
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

style(nyz): update zh doc link and add more env tutorial zh(ci skip) · 973e33e2
Committed by niuyazhe on Dec 14, 2021

polish(nyp):add R2d2 comments (#149) · a2edf6a2
Committed by Will-Nie on Dec 14, 2021
```
* add comments for r2d2

* sort style

* revise according to the comments

* fix style
```

13 Dec, 2021 · 1 commit

feature(nyz): add delay reward mujoco env (#145) · 490691fb

Committed by Swain on Dec 13, 2021

* feature(nyz): add delay reward mujoco env

* test(nyz): add delay reward mujoco env test and fix bug

12 Dec, 2021 · 1 commit
- style(zm): add conda auto release (#148) · bc0102ba
  Committed by Ming Zhang on Dec 12, 2021
09 Dec, 2021 · 2 commits

style(nyz): update intro and env doc link(ci skip) · 147d56f3
Committed by niuyazhe on Dec 09, 2021

feature(xjx): refactor buffer (#129) · a490729f

Committed by Xu Jingxin on Dec 09, 2021

* Init base buffer and storage

* Use ratelimit as middleware

* Pass style check

* Keep the return original return value

* Add buffer.view

* Add replace flag on sample, rewrite middleware processing

* Test slicing

* Add buffer copy middleware

* Add update/delete api in buffer, rename middleware

* Implement update and delete api of buffer

* add naive use time count middleware in buffer

* Rename next to chain

* feature(nyz): add staleness check middleware and polish buffer

* feature(nyz): add naive priority experience replay

* Sample by indices

* Combine buffer and storage layers

* Support indices when deleting items from the queue

* Use dataclass to save buffered data, remove return_index and return_meta

* Add ignore_insufficient

* polish(nyz): add return index in push and copy same data in sample

* Drop useless import

* Fix sample with indices, ensure return size is equal to input size or indices size

* Make sure sampled data in buffer is different from each other

* Support sample by grouped meta key

* Support sample by rolling window

* Add import/export data in buffer

* Padding after sampling from buffer

* Polish use_time_check

* Use buffer as dataset

* Set collate_fn in buffer test

* feature(nyz): add deque buffer compatibility wrapper and demo

* polish(nyz): polish code style and add pong dqn new deque buffer demo

* feature(nyz): add use_time_count compatibility in wrapper

* feature(nyz): add priority replay buffer compatibility in wrapper

* Improve performance of buffer.update

* polish(nyz): add priority max limit and correct flake8

* Use __call__ to rewrite middleware

* Rewrite buffer index

* Fix buffer delete

* Skip first item

* Rewrite buffer delete

* Use caller

* Use caller in priority

* Add group sample
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

08 Dec, 2021 · 4 commits

fix(nyz): disable trex unittest · a7de696a
Committed by niuyazhe on Dec 08, 2021

fix(nyz): fix trex unittest bugs · 234de26b
Committed by niuyazhe on Dec 08, 2021

feature(nyp): add Trex algorithm (#119) · 63105fef

Committed by Will-Nie on Dec 08, 2021

* add trex algorithm for pong

* sort style

* add atari, ll,cp; fix device, collision; add_ppo

* add accuracy evaluation

* correct style

* add seed to make sure results are replicable

* remove useless part in cum return  of model part

* add mujoco onppo training pipeline; ppo config

* improve style

* add sac training config for mujoco

* add log, add save data; polish config

* logger; hyperparameter;walker

* correct style

* modify else condition

* change rnd to trex

* revise according to comments, add eposode collect

* new collect mode for trex, fix all bugs, commnets

* final change

* polish after the final comment

* add readme/test

* add test for serial entry of trex/gcl

* sort style

feature(wyh):add masac algorithms (#112) · 18b3720a

Committed by Weiyuhong-1998 on Dec 08, 2021

* fix(wyh):masac

* feature(wyh):single agent discrete sac

* feature(wyh):single agent discrete sac td

* fix(wyh):fix pong bug

* fix(wyh):fix smac bug

* fix(wyh):masac_5m6m best config

* env(wyh):allow SMAC env return ippo/isac obs

* fix(wyh):masac polish

* fix(wyh):masac style

* fix(wyh):masac test

06 Dec, 2021 · 1 commit
- style(nyz): update kaggle link and algo table · 100ea314
  Committed by niuyazhe on Dec 06, 2021
03 Dec, 2021 · 4 commits

v0.2.2 · 312f274d
Committed by niuyazhe on Dec 03, 2021

fix(nyz): rename sum keepdims to keepdim for compatiblity and remove sql wrapper · 2b181eda
Committed by niuyazhe on Dec 03, 2021

feature(lk): implement multi pass DQN (#131) · f087d2c7

Committed by Ke Li on Dec 03, 2021

* feature(lk): add initial version of MP-PDQN

* fix(lk): fix expand function bug

* refactor(nyz): refactor mpdqn continuous args inputs module

* fix(nyz): fix pdqn scatter index generation

* fix(lk): fix pdqn scatter assignment bug

* feature(lk): polish mpdqn code and style format

* feature(lk): add mpdqn config and test file

* feature(lk): polish mpdqn code and style format

* fix(lk): fix import bug

* polish(lk): add test for mpdqn

* polish(lk): polish code style and format

* polish(lk): rm print debug info

* polish(lk): rm print debug info

* polish(lk): polish code style and format

* polish(lk): add MPDQN in readme.md
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

benchmark(davide): Bsuite memory benchmark (#138) · 5ee17ad1

Committed by Davide Liu on Dec 03, 2021

* added r2d2 + a2c configs

* changed convergence reward for some env

* removed configs that don't converge

* removed 'on_policy' param in 2rd2 configs
