Committed by Will-Nie
* add trex algorithm for pong
* sort style
* add atari, ll, cp; fix device, collision; add_ppo
* add accuracy evaluation
* correct style
* add seed to make sure results are replicable
* remove useless part in cum return of model part
* add mujoco onppo training pipeline; ppo config
* improve style
* add sac training config for mujoco
* add log, add save data; polish config
* logger; hyperparameter; walker
* correct style
* modify else condition
* change rnd to trex
* revise according to comments, add episode collect
* new collect mode for trex, fix all bugs, comments
* final change
* polish after the final comment
* add readme/test
* add test for serial entry of trex/gcl
* sort style
* change mujoco to cartpole for test for trex_onppo
* remove files generated by testing
* revise tests for entry
* sort style
* revise tests
* modify pytest
* fix(nyz): speed up ppg/ppo and marl algo unittest
* polish(nyz): speed up trex unittest and fix trex entry default config bug
* fix(nyz): fix same name bug
* fix(nyz): fix remove conflict bug (ci skip)

Co-authored-by: Nniuyazhe <niuyazhe@sensetime.com>
f089d02a