Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning. (arXiv:2209.11596v2 [cs.LG] UPDATED)
Sept. 29, 2022, 1:12 a.m. | Kang Xu, Yan Ma, Wei Li
cs.LG updates on arXiv.org arxiv.org
Training a robust policy is critical for deploying policies in real-world
systems and for handling unknown dynamics mismatch across different dynamic systems.
Domain Randomization (DR) is a simple and elegant approach that trains a
conservative policy to cope with different dynamic systems without expert
knowledge of the target system's parameters. However, existing work reveals
that policies trained through DR tend to be over-conservative and perform
poorly in target domains. Our key insight is that dynamic systems with
different parameters provide different …
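To make the Domain Randomization idea concrete, here is a minimal sketch: a hypothetical 1-D point-mass environment whose dynamics parameter (the mass) is resampled every episode, and a policy selected for its average return across those randomized dynamics. The environment, the PD-style controller, and the gain-selection loop are all illustrative assumptions, not the paper's method.

```python
import random

class PointMassEnv:
    """Toy 1-D point mass; `mass` is the randomized dynamics parameter."""
    def __init__(self, mass):
        self.mass = mass
        self.pos = 1.0   # start away from the goal at the origin
        self.vel = 0.0

    def step(self, force, dt=0.1):
        # F = m * a; one Euler integration step toward the origin.
        self.vel += (force / self.mass) * dt
        self.pos += self.vel * dt
        return self.pos, -abs(self.pos)   # reward: closer to 0 is better

def rollout(env, gain, horizon=50):
    """Run a simple PD controller and return the total reward."""
    total = 0.0
    for _ in range(horizon):
        force = -gain * env.pos - 2.0 * env.vel
        _, r = env.step(force)
        total += r
    return total

def train_with_dr(mass_range=(0.5, 2.0),
                  candidates=(0.5, 1.0, 2.0, 4.0),
                  episodes=40, seed=0):
    """The DR step: resample the mass uniformly each episode and pick
    the candidate gain with the best return averaged over dynamics."""
    rng = random.Random(seed)
    masses = [rng.uniform(*mass_range) for _ in range(episodes)]
    scores = {g: sum(rollout(PointMassEnv(m), g) for m in masses)
              for g in candidates}
    return max(scores, key=scores.get)

best_gain = train_with_dr()
print("gain selected under DR:", best_gain)
```

Because the gain is scored against many sampled masses rather than one fixed system, the selected policy tends toward conservative behavior that works across the whole range — the very over-conservatism the abstract says DR can suffer from in any single target domain.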
Tags: arxiv, dynamics, quantification, reinforcement learning