Google began training robots using the Robotic Transformer neural network

In 2022, the Google DeepMind team presented Robotics Transformer (RT-1), a neural network system for teaching robots new tasks. Now the company has released RT-2 and begun using it to train its robots.

RT-1 was used to train the Everyday Robot on more than 700 tasks. The system was built on a database of 130,000 demonstrations, which, according to the DeepMind team, led to a 97% task success rate.

Now, DeepMind’s head of robotics, Vincent Vanhoucke, has revealed that RT-2 allows robots to effectively transfer concepts learned on relatively small data sets to different scenarios. It is a new version of what the company calls a Vision-Language-Action (VLA) model. The model teaches robots to better recognize visual and speech patterns, interpret instructions, and reason about which objects are best suited to a request. It was trained on web and robotics data, drawing on research advances in large language models such as Google’s Bard and combining them with robot demonstration data. The model also understands instructions in languages other than English.
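Conceptually, a VLA model maps a camera image plus a text instruction to a sequence of discretized action tokens that are decoded into robot motions. The sketch below illustrates that interface only; every class and method name is a hypothetical stand-in, not the actual RT-2 API.

```python
# Toy sketch of a Vision-Language-Action (VLA) style interface.
# All names here are hypothetical illustrations, not the RT-2 API.

from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """A discretized robot action: arm displacement and gripper state."""
    dx: float
    dy: float
    dz: float
    gripper_closed: bool


class ToyVLAPolicy:
    """Stands in for a trained vision-language model that emits action tokens."""

    def decode_tokens(self, image: bytes, instruction: str) -> List[int]:
        # A real model would run a transformer over image patches and
        # instruction tokens; here we just return a fixed token sequence.
        return [12, 7, 3, 1]

    def tokens_to_action(self, tokens: List[int]) -> Action:
        # Map discrete token bins back to continuous action values.
        dx, dy, dz, grip = tokens
        scale = 0.01  # 1 cm per bin, an arbitrary illustrative choice
        return Action(dx * scale, dy * scale, dz * scale, grip == 1)

    def act(self, image: bytes, instruction: str) -> Action:
        return self.tokens_to_action(self.decode_tokens(image, instruction))


policy = ToyVLAPolicy()
action = policy.act(b"<camera frame>", "pick up the empty chip bag")
print(action.gripper_closed)  # True
```

The key design point the article describes is that the token decoder is the same kind of transformer used for language, so web-scale vision-language pretraining transfers to action prediction.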

“RT-2 demonstrates improved generalization capabilities, as well as semantic and visual understanding beyond training data,” Google explains. The neural network can interpret new commands and respond by performing elementary reasoning, particularly about categories of objects, and can pick the best tool for a new task based on available contextual information.

Vanhoucke presents a scenario in which a robot is asked to take out the trash. With many models, the robot must first be trained to identify what counts as trash, then trained to pick up the waste and dispose of it. “RT-2 already has an idea of what trash is and can identify it without special training,” Vanhoucke writes. “It even has an idea of how to throw away the garbage, although it was never taught this action. And think about the abstract nature of trash—what was a bag of chips or a banana peel becomes trash after you eat the contents. RT-2 can understand this from its visual language training data and do its job.”

The researchers tested RT-2 with a manipulator in a kitchen, asking the robot to decide what would make a good improvised hammer (it chose a rock) and which drink to give an exhausted person (a Red Bull). They also asked the robot to move a Coca-Cola can onto a picture of Taylor Swift.

The team says the success rate on previously unseen tasks improved from 32% with RT-1 to 62% with RT-2.

In 2022, Google also introduced a robot that understands natural speech and writes its own control code. That project, Code as Policies (CaP), is built on Google’s Pathways Language Model (PaLM).
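The Code as Policies idea is that a language model turns a natural-language command into executable policy code, which the robot then runs. The toy below illustrates the loop only: the "language model" is a hard-coded template lookup, and all helper names are hypothetical, not Google's actual CaP stack.

```python
# Toy illustration of the Code-as-Policies pattern: a natural-language
# command becomes executable policy code. The "language model" here is a
# template lookup; a real system would prompt an LLM such as PaLM.

POLICY_TEMPLATES = {
    "stack the blocks": (
        "order = sorted(blocks, key=lambda b: b['size'], reverse=True)\n"
        "result = [b['name'] for b in order]"
    ),
}


def generate_policy(command: str) -> str:
    """Stand-in for LLM code generation: look up a canned policy."""
    return POLICY_TEMPLATES[command]


def run_policy(code: str, blocks):
    """Execute generated policy code against the current scene state."""
    scope = {"blocks": blocks}
    exec(code, {}, scope)  # run the generated snippet in an isolated scope
    return scope["result"]


blocks = [{"name": "small", "size": 1}, {"name": "large", "size": 3}]
print(run_policy(generate_policy("stack the blocks"), blocks))
# ['large', 'small']
```

The appeal of this design is that the generated code composes ordinary control flow (loops, conditionals) with robot primitives, so one language model can express many policies without per-task training.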

In June, DeepMind unveiled RoboCat, an AI-powered robot. The development team said it had achieved a breakthrough in how RoboCat learns new tasks, and improved its performance by having the robot generate its own training data.
