Eureka: With GPT-4 overseeing training, robots can learn much faster

On Friday, researchers from Nvidia, UPenn, Caltech, and the University of Texas at Austin announced Eureka, an algorithm that uses OpenAI’s GPT-4 language model for designing training goals (called “reward functions”) to enhance robot dexterity. The work aims to bridge the gap between high-level reasoning and low-level motor control, allowing robots to learn complex tasks rapidly using massively parallel simulations that run through trials simultaneously. According to the team, Eureka outperforms human-written reward functions by a substantial margin.

“Leveraging state-of-the-art GPU-accelerated simulation in Nvidia Isaac Gym,” writes Nvidia on its demonstration page, “Eureka is able to quickly evaluate the quality of a large batch of reward candidates, enabling scalable search in the reward function space.

Blog