{"id":503062,"date":"2026-03-26T20:56:17","date_gmt":"2026-03-27T00:56:17","guid":{"rendered":"https:\/\/deultimominuto.net\/en\/uncategorized\/microsoft-develops-the-ai-that-most-robots-lack-the-ability-to-make-good-decisions\/"},"modified":"2026-03-26T20:56:17","modified_gmt":"2026-03-27T00:56:17","slug":"microsoft-develops-the-ai-that-most-robots-lack-the-ability-to-make-good-decisions","status":"publish","type":"post","link":"https:\/\/deultimominuto.net\/en\/uncategorized\/microsoft-develops-the-ai-that-most-robots-lack-the-ability-to-make-good-decisions\/","title":{"rendered":"Microsoft develops the AI that most robots lack: the ability to make good decisions"},"content":{"rendered":"Robotics is advancing rapidly, but most robots still face a fundamental limitation: difficulty making precise decisions about what action to take and where to carry it out. Microsoft, along with a consortium of academic researchers, has presented a new benchmark, GroundedPlanBench, which aims to address this challenge and bring robot AI closer to efficient, context-aware decision-making.\n\nIn conventional robotic systems, decision-making is split into two stages. First, a vision-language model generates a plan in natural language. Then a separate system translates that plan into physical actions. This fragmented approach causes frequent errors, because mistakes made in one stage carry over to the next.\n\nTypical errors include confusion about which object to manipulate and the invention of unnecessary steps. For example, a robot asked to discard paper cups may fail to identify which cup to pick up, or may perform actions nobody requested. These failures get worse in cluttered environments, where objects are numerous or look alike.\n\n\n


<h2><strong>GroundedPlanBench: A New Benchmark for Improving Decision-Making<\/strong><\/h2>\n\n\nTo address this challenge, Microsoft and its partners have developed GroundedPlanBench, a benchmark that evaluates whether AI models can plan tasks while accurately identifying where each action should be performed.\n\n\n

Unlike traditional approaches that rely on text alone, this benchmark links each action to a specific location in an image. Actions such as grasping, placing, opening, or closing are tied to specific objects or positions, forcing the AI to connect each decision to the real physical environment.\n\n\nThe benchmark includes more than a thousand tasks based on real robot interactions. Some instructions are explicit, such as placing a spoon on a plate, while others are open-ended, such as tidying a table. This variety matters because robots often fail when instructions are not explicit enough.\n\nIn one experiment, a robot had to place four napkins on a sofa. The lack of specificity in the instruction led the system to repeat the action on the same napkin, even when it was given seemingly more precise descriptions such as \u201cupper left napkin\u201d. This shows that ambiguous language remains an obstacle to the reliable execution of complex tasks.\n\n\n

<h2><strong>Learning Based on Real Tasks<\/strong><\/h2>\n\n\nTo improve decision-making, the team developed a training method called Video-to-Spatially Grounded Planning (V2GP). The method analyzes videos of robots performing tasks, detects interactions with objects, identifies those objects, and tracks their locations, generating structured plans that link each action to a specific point.\n\n\n

Using this approach, the researchers generated more than 40,000 \"grounded\" plans, ranging from simple actions to complex sequences of up to 26 steps. Models trained with this method were better at choosing appropriate actions and associating them with the correct objects, and they made fewer repetitive errors such as acting multiple times on the same item.\n\n\n\n

<h2><strong>A Paradigm Shift for Robotics<\/strong><\/h2>\n\n\nDespite these advances, challenges persist, especially with long tasks and indirect instructions. The researchers warn that models must be able to reason over long sequences and stay coherent across multiple steps. Compared with the new approach, traditional systems tend to assign multiple actions to the same object or place, especially when instructions are ambiguous.\n\nIntegrating planning and localization into a single process reduces these mismatches and allows for more precise decisions. The Microsoft team suggests that future research could combine this method with predictive models capable of anticipating the consequences of each action, which would help robots avoid errors in real time.\n\nThe study's conclusions point to a clear direction for the future of robotics: systems that jointly consider action and location are more likely to operate successfully in real environments. This work represents a key step toward robots that can decide and act reliably in everyday tasks, bringing them closer to truly applied artificial intelligence.","protected":false},"excerpt":{"rendered":"

Robotics is advancing rapidly, but most robots still face a fundamental limitation: difficulty making precise decisions about what action to take and where to carry it out. Microsoft, along with a consortium of academic researchers, has presented a new benchmark, GroundedPlanBench, which aims to address this challenge and bring the artificial intelligence of robots closer to efficient […]\n","protected":false},"author":133556,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":{"0":"post-503062","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-uncategorized"},"acf":[],"jetpack_featured_media_url":"","dum_api":{"author_name":"Yerandi Santana","author_image":"https:\/\/deultimominuto.net\/wp-content\/uploads\/2026\/02\/cropped-WhatsApp-Image-2026-02-13-at-5.35.07-PM-96x96.jpeg","categories_name":["Uncategorized"],"featured_media_url":null},"_links":{"self":[{"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/posts\/503062","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/users\/133556"}],"replies":[{"embeddable":true,"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/comments?post=503062"}],"version-history":[{"count":0,"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/posts\/503062\/revisions"}],"wp:attachment":[{"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/media?parent=503062"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/categories?post=503062"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/deultimominuto.net\/en\/wp-json\/wp\/v2\/tags?post=503062"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}