Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning. (arXiv:2201.04612v1 [cs.MA])
Jan. 13, 2022, 2:10 a.m. | Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran
cs.LG updates on arXiv.org
This paper considers multi-agent reinforcement learning (MARL) tasks where
agents receive a shared global reward at the end of an episode. The delayed
nature of this reward affects the ability of the agents to assess the quality
of their actions at intermediate time-steps. This paper focuses on developing
methods to learn a temporal redistribution of the episodic reward to obtain a
dense reward signal. Solving such MARL problems requires addressing two
challenges: identifying (1) relative importance of states along the …