Aug. 14, 2023, noon | code_your_own_AI

code_your_own_AI www.youtube.com

Two simple examples of optimizing transformer-based reward functions for reinforcement learning (RL): a fleet of taxis in New York learning from interactions with its environment, and multi-agent RL for the swarm intelligence of 100 drones exploring Jupiter's stormy atmosphere.

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
https://arxiv.org/pdf/2307.15217.pdf

#ai
#reinforcementlearning
#datascience

