
Three AI Tools, One Recommendation, No Clarifying Questions

02/05/2026

Three AI tools came back with the same recommendation this week.

I’d asked each of them — Claude Opus 4.7 in adaptive mode, Gemini, and Cursor in Ask mode with auto-premium models selected — to advise on a new mobile project. Frontend framework, architecture, an agentic AI workflow, complementary backend or BaaS. Three options each, pros and cons, a pick. Be thorough.

What I’d quietly hoped for was that at least one of them would push back. What frameworks are you most familiar with? What’s the timeline? Solo or team? The kind of clarifying question a peer engineer would ask before holding forth.

None of them did. Each produced a confident, structured answer immediately. And they all picked the same three options — React Native + Expo, Flutter, and Kotlin Multiplatform with Compose — and the same winner. React Native, every time.

That was the first interesting thing. Triple consensus is information. It isn’t telling you which framework is best; it’s telling you which framework the AI tooling layer has been most heavily trained on. Lovable, v0, Bolt — the vibe-coding tools all ship React. So when you ask any of them what to build a new app with, the gravitational pull is towards the framework where the synthetic training corpus is densest. It’s not wrong, exactly. It’s just not a recommendation about you.

I’m not a React expert. So I picked Flutter — the option I knew best — and asked each tool to rewrite the architecture and agentic workflow accordingly.

This is where things diverged.

Gemini got the brief wrong. “Agentic AI heavy workload” got read as “I’m building an AI product,” and half the project structure that came back was carved out for agent code that had no business being in a consumer app. I’d like to think the original prompt was unambiguous. To Gemini, it wasn’t.

Cursor and Claude both understood the question, but they answered it in oddly complementary ways.

Cursor leaned structural. It defined agent roles — Planner, Domain, Data, UI, QA, Ops — and a merge-gated sequence: contracts first, data second, UI third, tests fourth. It described task packets the agents should receive: goal, allowed file boundaries, acceptance criteria, do-not-touch list. Generic in the sense that none of it was Flutter-specific, but architecturally serious. The kind of scaffolding that keeps multiple AI agents from stepping on each other’s work.
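To make the task-packet idea concrete, here is a minimal sketch of how a packet and the merge gates might be modelled. It is an illustration rather than anything Cursor produced: the class name `TaskPacket`, the helper functions, and the glob-style path patterns are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional, Set

# Merge-gated order described above: each stage must merge before the next starts.
GATES = ["contracts", "data", "ui", "tests"]


@dataclass
class TaskPacket:
    """What one agent receives for one task (field names are hypothetical)."""
    role: str                       # e.g. "Planner", "Domain", "Data", "UI", "QA", "Ops"
    gate: str                       # which stage of GATES this task belongs to
    goal: str
    allowed_paths: List[str]        # glob patterns the agent may edit
    acceptance_criteria: List[str]
    do_not_touch: List[str] = field(default_factory=list)

    def may_edit(self, path: str) -> bool:
        # The do-not-touch list always wins over the allowed boundaries.
        if any(fnmatchcase(path, pattern) for pattern in self.do_not_touch):
            return False
        return any(fnmatchcase(path, pattern) for pattern in self.allowed_paths)


def next_gate(merged: Set[str]) -> Optional[str]:
    """Return the first gate that has not merged yet, or None when all have."""
    for gate in GATES:
        if gate not in merged:
            return gate
    return None


def can_start(packet: TaskPacket, merged: Set[str]) -> bool:
    """A task may start only when its gate is the next one in the sequence."""
    return packet.gate == next_gate(merged)
```

With this shape, a Data-role packet scoped to `lib/data/*` cannot start until the contracts stage has merged, and an edit outside its boundaries can be rejected before it reaches review.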

Claude leaned ecosystem. The response was dense with Flutter-world specifics: Riverpod over BLoC for state, freezed for value classes, melos for the monorepo, golden_toolkit for visual regression, Patrol over raw integration_test for native interactions. Most usefully, it surfaced an operational gotcha that anyone building a Flutter codebase with AI agents will hit eventually: agents will skip running build_runner after generating a new @freezed or @riverpod class, leaving the codebase in a broken state that doesn’t surface until compile time. That belongs in an AGENTS.md rules file on day one.
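A rule in AGENTS.md is easier to trust when something also checks it. The sketch below is one possible pre-merge check, not something any of the tools suggested: it flags Dart files that carry a `@freezed` or `@riverpod` annotation but have no generated counterpart. It assumes the usual output suffixes (`.freezed.dart` for freezed, `.g.dart` for riverpod_generator), and the function name `missing_generated_files` is made up for the example. It only detects missing files, not stale ones.

```python
import re
from typing import Dict, List

# Assumption: annotation -> suffix of the file build_runner is expected to emit.
GENERATED_SUFFIX = {
    "@freezed": ".freezed.dart",
    "@riverpod": ".g.dart",
}


def missing_generated_files(sources: Dict[str, str]) -> List[str]:
    """Given a mapping of file path -> file contents, return the generated
    files that annotated sources expect but that are absent from the mapping."""
    missing = []
    for path, text in sources.items():
        # Skip non-Dart files and files that are themselves generated output.
        if not path.endswith(".dart") or path.endswith((".g.dart", ".freezed.dart")):
            continue
        for annotation, suffix in GENERATED_SUFFIX.items():
            # Match the annotation at the start of a line; case-insensitive so
            # that @Freezed(...) and @Riverpod(...) are caught too.
            pattern = rf"^\s*{re.escape(annotation)}\b"
            if re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE):
                expected = path[: -len(".dart")] + suffix
                if expected not in sources:
                    missing.append(expected)
    return sorted(missing)
```

In practice the mapping would be built by walking `lib/`, and a non-empty result would fail the check with a reminder to run `dart run build_runner build` before committing.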

Neither response was complete on its own. Cursor told me how to organise an agentic workflow but not what to build it out of. Claude told me what to build it out of but underplayed the workflow structure. The right output was a Frankenstein of the two — Cursor’s agent roles and merge gates wrapped around Claude’s specific tool choices and ecosystem caveats.

Two takeaways from a Saturday morning of comparing AI architecture advice.

First: identical recommendations from multiple tools aren’t a stronger signal than one. They’re often the same signal, sourced from the same dominant slice of the training data. If you want a tool to recommend something other than the modal answer, you have to give it the constraints that pull it off-distribution — what you know, what you’ve shipped before, what the team will maintain. None of the three asked for any of that.

Second: the “best” answer to an architecture question doesn’t live in any one tool right now. It lives in the overlap between what each one is good at — and the engineer’s job is increasingly to be the one who can read three answers and assemble the right one. That’s not a coding job. That’s an architecture job, and a judgement job, and a have-I-shipped-this-kind-of-thing-before job. The AI is fast. It isn’t deciding anything.

If you’re staring at a fresh project and wondering whether the AI’s first answer is the right one, that’s a conversation I have most weeks.
