Learning while Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies

Even the best-trained robots struggle when they leave the lab. They face “distribution shifts”—situations they didn’t see in training, like a brand of cereal with a new box design or a human suddenly walking into their personal space. Static datasets (fixed instructions) simply can’t prepare a robot for every “what if” scenario.

To make sense of all this messy real-world data, the researchers introduced two key technical innovations to the robot’s “Vision-Language-Action” (VLA) brain.


Imagine bringing home a single robot to be your all-in-one kitchen assistant—you want it to brew your morning Gongfu tea, make fresh juice in the afternoon, and mix the perfect cocktail at night. While it might have been trained extensively in a lab, in your house, the counter is slightly higher, the fruit is shaped differently, and your cocktail shaker is transparent. Pre-trained Vision-Language-Action (VLA) models provide an incredible starting point, yet real-world deployment is never a fixed test distribution. This leaves a critical, unsolved challenge: how do we take the heterogeneous experience generated across a fleet of robots and use it to post-train a single, generalist model across a wide range of tasks simultaneously?

We present Learning While Deploying (LWD), a fleet-scale offline-to-online RL framework for continual post-training of generalist VLA policies. Instead of treating deployment as the finish line where a policy is merely evaluated, LWD turns it into a training loop through which the policy improves. A pre-trained policy is deployed across a robot fleet, and both autonomous rollouts and human interventions are aggregated into a shared replay buffer for offline and online updates. The updated policy is then redeployed, enabling continuous improvement by leveraging interaction data from the entire fleet.
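The deploy, aggregate, update, redeploy loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-parameter `ToyPolicy`, the scalar task (the ideal action equals the observation), the simulated interventions, and the reward-weighted regression update are hypothetical stand-ins, not the LWD algorithm or its VLA model. It only shows how autonomous rollouts and human corrections from several robots might feed one shared replay buffer that updates a single policy.

```python
import math
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    robot_id: int
    observation: float
    action: float
    reward: float
    human_intervention: bool


class ToyPolicy:
    """Hypothetical one-parameter policy: action = gain * observation."""

    def __init__(self):
        self.gain = 0.0
        self.version = 0

    def act(self, observation):
        return self.gain * observation

    def update(self, batch, lr=0.5):
        # Stand-in update (an assumption, not the paper's RL objective):
        # regression onto buffered actions, weighted by exp(reward),
        # with human corrections given full weight.
        grad = 0.0
        for t in batch:
            weight = 1.0 if t.human_intervention else math.exp(t.reward)
            grad += weight * (self.act(t.observation) - t.action) * t.observation
        self.gain -= lr * grad / len(batch)
        self.version += 1


def collect(policy, robot_id, steps, rng, intervention_rate=0.3):
    """Simulate one robot: autonomous actions, sometimes overridden by a human."""
    transitions = []
    for _ in range(steps):
        obs = rng.uniform(0.5, 1.5)
        intervened = rng.random() < intervention_rate
        # Toy task: the ideal action equals the observation.
        action = obs if intervened else policy.act(obs) + rng.gauss(0.0, 0.1)
        reward = -(action - obs) ** 2
        transitions.append(Transition(robot_id, obs, action, reward, intervened))
    return transitions


def run_fleet_loop(rounds=20, robots=4, steps=8, capacity=256, seed=0):
    rng = random.Random(seed)
    policy = ToyPolicy()
    buffer = deque(maxlen=capacity)  # shared replay buffer for the whole fleet
    for _ in range(rounds):
        # Deploy the current policy on every robot and aggregate its data.
        for robot_id in range(robots):
            buffer.extend(collect(policy, robot_id, steps, rng))
        # Update the single shared policy, which is redeployed next round.
        batch = rng.sample(list(buffer), min(32, len(buffer)))
        policy.update(batch)
    return policy, buffer


if __name__ == "__main__":
    policy, buffer = run_fleet_loop()
    print(f"policy version {policy.version}, gain {policy.gain:.2f}, "
          f"buffer size {len(buffer)}")
```

In this toy setting the gain drifts from 0 toward the ideal value of 1 as rounds accumulate, because both the human corrections and the higher-reward autonomous actions pull it in that direction.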

A Generalist Learns Beyond Demonstrations

Some robot learning systems have explored data flywheels: deploying a policy, collecting new robot data, extracting high-quality behaviors, and training the next policy to imitate them. While this supports scalable improvement, it still treats deployment mainly as a source of expert demonstrations. Prior post-training systems mainly focus on specialist policies, leaving fleet-scale post-training of a single generalist policy across diverse tasks unresolved.

Home security giant ADT data breach affects 5.5 million people

The ShinyHunters extortion group stole the personal information of 5.5 million individuals after breaching the systems of home security giant ADT earlier this month, according to data breach notification service Have I Been Pwned.

Founded in 1874 as American District Telegraph, ADT is the oldest and largest home security company in the United States, currently providing monitored security and smart home solutions to over 6 million residential and small-business customers.

ADT has previously disclosed two other data breaches in August 2024 and October 2024 that exposed employee and customer information.

Using Moon Regolith to Build Lunar Habitats

“Our results show that you can take a material that is inherently challenging and convert it into something structurally beneficial,” said Dr. Denizhan Yavas. [ https://www.labroots.com/trending/space/30488/using-moon-reg…habitats-2](https://www.labroots.com/trending/space/30488/using-moon-reg…habitats-2)


How can lunar dust (officially called regolith) be used to build future habitats on the Moon? This is what a recent study published in Advanced Engineering Materials hopes to address, as a pair of researchers investigated a novel technique in which lunar regolith could strengthen advanced composite materials. This study has the potential to help reduce the cost of shipping building materials to the Moon for future habitats by using available resources.

For the study, the researchers used lunar regolith simulant, a common substitute for lunar regolith since the latter is in low supply, to examine whether it could be used to reinforce polymer composites, a common aerospace building material. The motivation for this study came from previous lunar regolith research that explored repelling lunar dust using nanoscale polymer surfaces. Lunar dust is highly abrasive, as the Apollo astronauts found out, and repelling it could prove beneficial for future astronauts.

Now, the researchers sought to exploit this abrasiveness in developing next-generation building materials for the Moon. In the end, they found that the lunar regolith simulant increased both the impact resistance and the toughness of the polymers by 30 to 40 percent. Both attributes will be crucial to maintaining lunar habitats because the Moon’s environment is much harsher than Earth’s, specifically regarding micrometeorite strikes and solar radiation.

Gerard K. O’Neill Was Not Honored as Deserved, so Far… But Maybe It’s Not Too Late!

While doing research during the proceedings of the SRI 4th World Congress, I have been trying to deepen my knowledge of the immense work done by Gerard K. O’Neill and his Space Studies Institute (SSI) during the second half of the past century.

Gerry took up the work where Tsiolkovsky, Oberth, von Braun, and others had left it, on the great theme of rotating habitats in free space. Moreover, the SSI, which he founded, has produced an incredible number of very high-profile studies about space manufacturing [1], covering many aspects of living in free-space habitats, and not only scientific and technical issues. Following O’Neill’s teachings, and as his main references such as Krafft Ehricke had done, human requirements, attention to life and health protection, human rights, and social needs informed all of the studies and conceptual designs.

Great popularizers like Isaac Asimov, Arthur Clarke, and Stanley Kubrick were ready to follow O’Neill and promote his concepts in their works and in their interviews with TV and magazines.

Explainable Deep Reinforcement Learning for Anomaly Detection in IoT-Enabled Metaverse Healthcare: Toward Trustworthy Cyber Threat Intelligence

Toei Company launches publishing label Toei Games

Japanese entertainment company Toei has established Toei Games, an in-house publishing label.

The company aims to make its games business a new pillar alongside its film, television, and events divisions.

Toei Games will initially release titles on Steam, entering the PC market. The company plans to expand soon to home consoles such as the Nintendo Switch, PlayStation, and Xbox.

Underwater architects: Nest-building in cichlids reveals more than hardwired instinct

We associate nests with shelter, warmth, and a safe retreat—and usually picture a bird’s nest made out of twigs, grass and feathers. Yet many other animals take advantage of such refuges, with nests being built by a diversity of species ranging from termites to great apes, which impress with their hugely varied forms and the wide array of materials used to construct them.

For fish, nest-building comes with an added challenge as they must put together their underwater nests equipped with “only” their fins. Yet fish too have developed a remarkable variety of nest-building innovations, burrowing into sandy lake beds, creating masses of floating bubbles on the water’s surface, or setting up camp in abandoned snail shells repurposed as nests—as is the case with the shell-dwelling cichlid Lamprologus ocellatus.

Endemic to Lake Tanganyika in Africa, these cichlids use empty snail shells for shelter and to raise their young. To do so, the snail shell is positioned and covered in sand in a very specific way, leaving just the opening exposed—only then does it become the perfect home.

Analysis finds geometric thinking may come from wandering, not a human-only math module

Debates over how geometry is understood and learned date back at least to the days of Plato, with more recent scholars concluding that only humans possess the foundations of this understanding. However, a new analysis by New York University psychology professor Moira Dillon concludes that geometry’s foundations are shared by humans and a variety of other animals—from rats to chickens to fish.

“Our ability to think geometrically may not come from a built-in, uniquely human ‘math module’ in the brain, but rather from the same cognitive systems that help humans, as well as animals, find their way home,” explains Dillon, whose work appears in the journal Trends in Cognitive Sciences. “Put another way, our understanding of geometry may very well come from wandering rather than from worksheets.”

While Plato and, later, Descartes and Kant all debated the origins of geometry and the role of cognition in its beginnings, only in the latter half of the 20th century did scientists start testing how it is learned.
