Extremum-Seeking Action Selection for Accelerating Policy Optimization
April 3, 2024, 4:41 a.m. | Ya-Chien Chang, Sicun Gao
cs.LG updates on arXiv.org
Abstract: Reinforcement learning for control over continuous spaces typically uses high-entropy stochastic policies, such as Gaussian distributions, for local exploration and estimating policy gradient to optimize performance. Many robotic control problems deal with complex unstable dynamics, where applying actions that are off the feasible control manifolds can quickly lead to undesirable divergence. In such cases, most samples taken from the ambient action space generate low-value trajectories that hardly contribute to policy improvement, resulting in slow or …
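The point about Gaussian exploration on unstable dynamics can be illustrated with a toy example. The sketch below is not the authors' method. It assumes a hypothetical scalar system x' = A*x + u with A > 1, a hypothetical stabilizing gain K, and a Gaussian policy centered on the stabilizing action. It compares average returns for a narrow and a wide action distribution, under the assumption that reward is the negative squared state.

```python
import random

A = 1.5   # assumed open-loop gain > 1: the state grows if left uncontrolled
K = 0.6   # hypothetical feedback gain; the closed-loop factor is A - K = 0.9


def rollout_return(sigma, rng, horizon=50, x0=1.0):
    """Run one episode with a Gaussian policy u ~ N(-K*x, sigma^2)."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = rng.gauss(-K * x, sigma)   # sample an action around the stabilizing one
        x = A * x + u                  # unstable dynamics amplify action noise
        total -= x * x                 # assumed reward: negative squared state
    return total


def mean_return(sigma, n_rollouts=200, seed=0):
    """Average return over several rollouts with a fixed seed."""
    rng = random.Random(seed)
    returns = [rollout_return(sigma, rng) for _ in range(n_rollouts)]
    return sum(returns) / n_rollouts


low_noise = mean_return(0.05)   # narrow exploration around the stabilizing action
high_noise = mean_return(1.0)   # high-entropy exploration
print(f"sigma=0.05: {low_noise:.2f}   sigma=1.0: {high_noise:.2f}")
```

In this toy setting, the wider distribution yields much lower average return, because most sampled actions push the state away from the origin. That is the kind of low-value sampling the abstract describes.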