1. 09 Dec, 2021 1 commit
    • feature(xjx): refactor buffer (#129) · a490729f
      Committed by Xu Jingxin
      * Init base buffer and storage
      
      * Use ratelimit as middleware
      
      * Pass style check
      
      * Keep the original return value
      
      * Add buffer.view
      
      * Add replace flag on sample, rewrite middleware processing
      
      * Test slicing
      
      * Add buffer copy middleware
      
      * Add update/delete api in buffer, rename middleware
      
      * Implement update and delete api of buffer
      
      * add naive use time count middleware in buffer
      
      * Rename next to chain
      
      * feature(nyz): add staleness check middleware and polish buffer
      
      * feature(nyz): add naive priority experience replay
      
      * Sample by indices
      
      * Combine buffer and storage layers
      
      * Support indices when deleting items from the queue
      
      * Use dataclass to save buffered data, remove return_index and return_meta
      
      * Add ignore_insufficient
      
      * polish(nyz): add return index in push and copy same data in sample
      
      * Drop useless import
      
      * Fix sample with indices, ensure return size is equal to input size or indices size
      
      * Make sure sampled data in buffer is different from each other
      
      * Support sample by grouped meta key
      
      * Support sample by rolling window
      
      * Add import/export data in buffer
      
      * Padding after sampling from buffer
      
      * Polish use_time_check
      
      * Use buffer as dataset
      
      * Set collate_fn in buffer test
      
      * feature(nyz): add deque buffer compatibility wrapper and demo
      
      * polish(nyz): polish code style and add pong dqn new deque buffer demo
      
      * feature(nyz): add use_time_count compatibility in wrapper
      
      * feature(nyz): add priority replay buffer compatibility in wrapper
      
      * Improve performance of buffer.update
      
      * polish(nyz): add priority max limit and correct flake8
      
      * Use __call__ to rewrite middleware
      
      * Rewrite buffer index
      
      * Fix buffer delete
      
      * Skip first item
      
      * Rewrite buffer delete
      
      * Use caller
      
      * Use caller in priority
      
      * Add group sample
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
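
The commit messages above describe buffer operations passing through a chain of middleware ("Use ratelimit as middleware", "Add buffer copy middleware", "Rename next to chain"). The following is a minimal sketch of that pattern, not the project's actual code: `SketchBuffer`, `use`, `_invoke`, and `clone_object` are hypothetical names chosen for illustration, and the middleware signature `(action, chain, *args)` is an assumption.

```python
import copy
import random
from collections import deque


class SketchBuffer:
    """Hypothetical minimal buffer; names and signatures are assumptions."""

    def __init__(self, size):
        self.storage = deque(maxlen=size)  # oldest items are dropped when full
        self.middleware = []

    def use(self, func):
        # Register a middleware; returns self so calls can be chained.
        self.middleware.append(func)
        return self

    def push(self, data):
        return self._invoke("push", self._push, data)

    def sample(self, size):
        return self._invoke("sample", self._sample, size)

    def _push(self, data):
        self.storage.append(data)
        return data

    def _sample(self, size):
        # Sampling without replacement, so returned items are distinct entries.
        return random.sample(list(self.storage), size)

    def _invoke(self, action, base, *args):
        # Wrap the base operation so that each middleware receives a `chain`
        # callable that runs the rest of the pipeline.
        def build(i):
            if i == len(self.middleware):
                return base

            def chain(*a):
                return self.middleware[i](action, build(i + 1), *a)

            return chain

        return build(0)(*args)


def clone_object(action, chain, *args):
    # Illustrative copy middleware: deep-copy data on the way in (push)
    # and on the way out (sample), so callers cannot mutate stored items.
    if action == "push":
        return chain(*(copy.deepcopy(a) for a in args))
    result = chain(*args)
    return copy.deepcopy(result) if action == "sample" else result


buffer = SketchBuffer(size=4).use(clone_object)
item = {"obs": 1}
buffer.push(item)
item["obs"] = 2  # mutating the original does not affect the stored copy
sampled = buffer.sample(1)
```

In this sketch a middleware decides whether and how to call `chain`, so a rate limiter or staleness check could skip the call or filter its result without the buffer itself changing.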
  2. 19 Nov, 2021 1 commit
    • polish(davide) add example of GAIL entry + config for Mujoco and Cartpole (#114) · d1bc1387
      Committed by Davide Liu
      * added gail entry
      
      * added lunarlander and cartpole config
      
      * added gail mujoco config
      
      * added mujoco exp
      
      * update22-10
      
      * added third exp
      
      * added metric to evaluate policies
      
      * added GAIL entry and config for Cartpole and Walker2d
      
      * checked style and unittest
      
      * restored lunarlander env
      
      * style problems
      
      * bug correction
      
      * Delete expert_data_train.pkl
      
      * changed loss of GAIL
      
      * Update walker2d_ddpg_gail_config.py
      
      * changed gail reward from -D(s, a) to -log(D(s, a))
      
      * added small constant to reward function
      
      * added comment to clarify config
      
      * Update walker2d_ddpg_gail_config.py
      
      * added lunarlander entry + config
      
      * Added Atari discriminator + Pong entry config
      
      * Update gail_irl_model.py
      
      * Update gail_irl_model.py
      
      * added gail serial pipeline and onehot actions for gail atari
      
      * related to previous commit
      
      * removed main files
      
      * removed old comment
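
Two of the messages above mention changing the GAIL reward from -D(s, a) to -log(D(s, a)) and adding a small constant to the reward function. A minimal sketch of that change follows, assuming D(s, a) is a discriminator output in [0, 1]; the function names and the value of `EPS` are hypothetical and not taken from the commit.

```python
import math

EPS = 1e-8  # assumed small constant; the actual value in the commit is not stated


def gail_reward_linear(d_prob: float) -> float:
    # Earlier form described in the log: reward = -D(s, a).
    return -d_prob


def gail_reward_log(d_prob: float, eps: float = EPS) -> float:
    # Later form described in the log: reward = -log(D(s, a)),
    # with eps added so that D(s, a) == 0 does not produce log(0).
    return -math.log(d_prob + eps)


low = gail_reward_log(0.1)   # low discriminator output
high = gail_reward_log(0.9)  # high discriminator output
edge = gail_reward_log(0.0)  # finite thanks to eps
```

Under this sign convention a lower discriminator output gives a higher reward, and the logarithmic form grows much faster than the linear one as D(s, a) approaches 0.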
  3. 31 Oct, 2021 1 commit
  4. 16 Oct, 2021 1 commit
    • feature(nyp): add DQfD algorithm (#48) · e2ca8738
      Committed by Will-Nie
      * add_dqfd
      
      * Is_expert to is_expert
      
      * modify according to the last comments
      
      * value_gamma; done; marginloss; sqil compatibility
      
      * finally shorten the code, revise config
      
      * revise config, style
      
      * add_readme/two_more_config
      
      * correct format
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
  5. 07 Sep, 2021 1 commit
  6. 11 Aug, 2021 1 commit
  7. 01 Aug, 2021 1 commit
    • add ACER algorithm(szj) (#14) · dd4de1a0
      Committed by simonat2011
      * add enduro env config. add enduro's ppo, dqn, drdqn, rainbow, impala config.
      
      * modified as reviewer mentions
      
      * add qacd network
      
      * fix bugs
      
      * fix bugs
      
      * update acer algorithm
      
      * update ACER code
      
      * update acer config
      
      * fix bug
      
      * update pong acer's config
      
      * edit commit
      
      * update code as mention
      
      * fix the comment table and trust region
      
      * fix format
      
      * fix typing lint
      
      * fix format,flake8
      
      * fix format
      
      * fix whitespace problem
      
      * test(nyz): add acer unittest and algotest
      
      * style(nyz): correct flake8 style
      Co-authored-by: shenziju <simonshen2011@foxmail.com>
      Co-authored-by: Swain <niuyazhe314@outlook.com>
  8. 08 Jul, 2021 1 commit