Hi everyone!
Amazon joins the agent competition!
Sharing Amazon Nova Act, a new AI model (currently in research preview) from Amazon's AGI lab. This model is designed to power agents that can reliably take actions inside a web browser.
While many AI agents struggle with complex web tasks, Nova Act focuses on reliable building blocks. Developers can use the associated Nova Act SDK to leverage the model, breaking down workflows into dependable "atomic" commands (like filling forms or picking dates). The SDK also allows combining these AI-driven commands with custom Python code, API calls, or direct Playwright manipulation for more control.
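To make the "atomic commands plus plain Python" idea concrete, here is a minimal sketch of the pattern. It does not use the Nova Act SDK: `StubBrowserAgent`, its `act` method, and `run_booking_workflow` are hypothetical stand-ins invented for illustration, and the real SDK's names and signatures may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StubBrowserAgent:
    """Hypothetical stand-in for a browser agent (NOT the Nova Act SDK API)."""

    log: List[str] = field(default_factory=list)

    def act(self, instruction: str) -> bool:
        # A real agent would drive the browser here; this stub only records
        # the instruction and reports success.
        self.log.append(instruction)
        return True


def run_booking_workflow(agent: StubBrowserAgent, dates: List[str]) -> List[str]:
    """Issue one small command at a time, with ordinary Python in between."""
    completed: List[str] = []
    agent.act("open the booking form")
    for date in dates:
        # Plain Python controls the loop; the agent handles one atomic step.
        if agent.act(f"pick the date {date}"):
            completed.append(date)
    agent.act("submit the form")
    return completed


if __name__ == "__main__":
    agent = StubBrowserAgent()
    print(run_booking_workflow(agent, ["2025-05-01", "2025-05-02"]))
    print(agent.log)
```

Keeping each instruction small means a failure can be checked and retried at a single step rather than rerunning the whole workflow.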
Key aspects:
🤖 Powers Browser Agents: The underlying AI model for web automation.
✅ Focus on Reliability: Designed for accurate execution of individual web interaction steps (Amazon claims scores above 90% on its internal tests).
💻 SDK for Developers: The primary way to build with Nova Act right now, offering fine-grained control.
📊 Strong Benchmarks: Amazon reports strong performance on UI interaction benchmarks like ScreenSpot and GroundUI Web.
🔬 Research Preview: This is an early release for experimentation.
Amazon sees this focus on reliable building blocks as key for future agent capabilities.
Really cool. https://browser-use.com/ definitely needs some competition. I'm really into UI automation myself and know how hard this is to pull off reliably. I do wonder, with the buzz around MCP for apps, whether there is a similar opportunity to make an MCP layer for websites, like a standard every website could implement: "We support MCP-web." Sort of like how RSS was built into every blog back in the day. Just some food for thought. 😸 As we say on PH, congrats on the launch, Amazon 🚀
@sentry_co Browser Use is definitely influential right now. Open-source multi-agent frameworks like OWL are also worth watching.
Grimo
Another actionable model, congrats! As a user of Operator, Proxy, and Manus, I can't wait to try it out and give a full comparison!
@stainlu Super cool. Manus also released an app recently!