Claude 3.5 vs. Claude 3.7: Fresh Test
Hey everyone! With today's release of Claude 3.7, a hybrid reasoning model, I decided to test it against Claude 3.5, using @Trickle to generate a website with each model from the same single prompt.
Check out the results:👇
Trickle + Claude 3.5: View Website
Trickle + Claude 3.7: View Website
Findings:🔎
Claude 3.7 creates a more polished, complete website with better details and interactivity.
In this test, Claude 3.7’s coding capabilities were clearly superior to 3.5’s: it handled more complex structures and finer details.
However, it’s more token-heavy than 3.5, using more tokens for the same prompt.
I’d love to hear if anyone else has tried Claude 3.7 yet or has insights into these trade-offs.
Let’s discuss!🙂
Replies
We added it to our AI-powered spreadsheet, and the results are insane! See here
@cole_at_quadratic can't wait to check out your launch! I'm dying for the robots to come for financial models!
Claude 3.7 is definitely groundbreaking and impressive, but I believe Claude can go even further and evolve even more! Let's go 🚀🚀
I used Claude 3.5 extensively during the development cycle of my latest app. Recently, I tried Claude 3.7 and noticed something important: Claude 3.5 actually performed better in one key area.
While 3.7 is more advanced in many ways, I've found it to be too docile: it rarely pushes back on questionable requests. Claude 3.5, on the other hand, would often suggest alternative (and frankly better) approaches when my initial idea wasn't optimal.
This constructive resistance from 3.5 led to better outcomes in my development process. I value an AI assistant that acts as a thoughtful collaborator rather than just following instructions without question.
Has anyone else noticed this shift in behavior between versions?
This update is great! When you say you used them in combination, how exactly did you do that?
Claude generates decent code for a Trickle-style website. However, I'm wondering how you put the two together, since Trickle seems to have its own chat AI assistant.
From testing 3.7 vs. 3.5 in Cursor Agent, I've found 3.7 is a lot better at judging whether it has enough context to implement a change or needs to check more files first. Looking forward to trying it some more.
The improved structure and detail sound amazing. I wonder how it handles more creative website designs.