Paper download link:
https://arxiv.org/pdf/2310.02635
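This paper addresses two problems that arise when applying RL in the real world:
- It requires a large amount of training data
- It requires hand-designed reward functions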
Reinforcement Learning with Foundation Priors (RLFP)
FAC Algorithm
Results
Across 5 dexterous tasks with real robots, FAC achieves an average success rate of 86% after one hour of real-time learning. Across 8 tasks in the simulated Metaworld, FAC achieves 100% success rates in 7/8 tasks under less than 100k frames (about 1-hour training), outperforming baseline methods with manual-designed rewards in 1M frames.