Chris Messina

AI Diplomacy - We made AIs battle for world domination

We gave seven AIs command of Europe's great powers to battle for global supremacy. Would o3 betray Claude? Could Gemini outwit DeepSeek? In AI Diplomacy, language models lie, scheme, and form shaky alliances in a high-stakes strategy game.

Brandon Gell

Hey Product Hunt! 👋

We've been working on AI Diplomacy for months and are excited to make it public today.

We built this because traditional AI benchmarks are challenging to understand and don't actually reflect how we, as humans, interact with AI. We wanted a way to understand the quality of how AI communicates and its ability to strategize long term. The cherry on top was seeing if it was able to lie and betray!

So we tried something different: What if we just let AI models play Diplomacy against each other and we exposed their communication and thinking behind each move?

The results are both entertaining and insightful. We tested out 18 different models across countless games to understand how each AI performs. One of our favorite insights: OpenAI's o3 turned into a master manipulator, lying and backstabbing its way to victory. Meanwhile, Anthropic's Claude 4 Opus refused to betray anyone—even when losing.

It's completely open source, and we'd love your help making it better! Try different model combinations, suggest new features, or just enjoy watching AIs negotiate (and betray) each other.

Huge thanks to Alex Duffy, Tyler Marques, Sam Paech, The TextArena team, Oam Patel, and countless others for leading the build, and the entire team at Every for making this launch possible.

Nimalan Mahendran

@brandon_gell nice work! I love this! It looks like the link posted 404s 😓 I'd love to look at the code and see how it's built and/or help out. Even having an open source Diplomacy UI and game engine would be amazing.

Chris Messina
Hunter

This is awesome — I want to see the LLMs hooked up to Civilization next! 😜

Nimalan Mahendran

@chrismessina omg my second biggest gripe with the civ series is they make it harder at higher levels by giving the AIs ridiculous bonuses so I'm usually getting my swordsmen bodied by knights in 500AD 🫠 Real tunable AI would just actually make it fun.

Alex Cloudstar

Congratulations on the launch of AI Diplomacy! 🎉 It's intriguing to see how different AI models perform in complex interactions like negotiation and strategy. The insights you’ve shared about their varied behaviors are fascinating! Looking forward to exploring how these AIs negotiate and plot! 🚀

Evgenii Zaitsev

This is a really fun and unique approach to understanding AI behavior! The idea of using AI Diplomacy to test models’ strategic thinking, decision-making, and communication skills is brilliant.

Michael Taylor

How well do you think they would do against a human player?

Brandon Gell

@hammer_mt we're going to try and launch that next!

Nimalan Mahendran

I love this! Diplomacy is my favorite board game. It's ruthless, and the only element of chance is that players won't do what they said they'd do. I think that this would be an incredible benchmark/arena for alignment and reasoning, including reasoning about the actions of other agents.

It also lowers the barrier to playing Diplomacy for humans. I'd love to have AI players with tunable difficulty, especially because I wouldn't have to worry about ruining any IRL relationships. It reminds me of the chess bots on Chess.com that all have different difficulty levels, personalities, and play styles.

Brandon Gell

@nimalan Diplomacy is the best! Check out the article we just wrote on Every.to... we're thinking the same