I went through 100 AI conversations — my own, my colleagues', and those of volunteers who let me look at their chat histories. I wasn't looking for prompt quality. I was looking for collaboration patterns.
Here's what I found.
TL;DR
Five patterns emerged. 78% of conversations were three exchanges or fewer. 84% were pure delegation. Only 11% produced a genuine surprise. But the single strongest predictor of insight quality was simple: did the human push back at least once?
The Data
I categorized each conversation across several dimensions: length (number of exchanges), primary mode (delegation, exploration, challenge, etc.), whether the human pushed back on AI's response, and whether the outcome included something the human described as "surprising" or "I hadn't thought of that."
The sample wasn't scientific — it was 100 conversations from about 30 people, mostly knowledge workers, product managers, engineers, and founders. But the patterns were striking.
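For concreteness, here's a minimal sketch of what a tagging record looks like in Python. The class and field names are illustrative, not a formal schema:

```python
from dataclasses import dataclass

# Hypothetical tagging record; the fields mirror the
# dimensions described above.
@dataclass
class TaggedConversation:
    exchanges: int     # number of back-and-forth exchanges
    mode: str          # "delegation", "exploration", "challenge", "synthesis", "other"
    pushed_back: bool  # did the human challenge the AI's response at least once?
    surprise: bool     # did the user describe the outcome as surprising?
```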
Finding 1: Most Conversations Are Absurdly Short
| Exchanges | % of conversations |
|---|---|
| 1–2 | 52% |
| 3–4 | 26% |
| 5–8 | 14% |
| 9+ | 8% |
Over half of all AI conversations were one or two exchanges. Ask, receive, done. These weren't trivial questions — many were about strategy, hiring, product decisions.
The problem isn't that short conversations are bad. Sometimes one exchange is enough. The problem is that people default to short conversations even when the topic deserves more depth.
Example
One conversation started with "Should we build feature X?" AI gave a nuanced 400-word response with pros, cons, and alternatives. The user's response: "Thanks." End of conversation. When I asked him about it later, he said: "Yeah, I think I just took the first answer and ran with it. I didn't even consider pushing back."
Finding 2: Delegation Dominates Everything
I tagged each conversation's primary mode:
| Mode | % of conversations |
|---|---|
| Delegation ("do this for me") | 84% |
| Exploration ("help me understand") | 8% |
| Challenge ("argue against me") | 3% |
| Synthesis ("connect these dots") | 3% |
| Other (metacognition, cross-domain, etc.) | 2% |
84% of conversations were pure delegation. AI as a production assistant. Write this, summarize that, generate these.
Again — delegation isn't wrong. But when 84% of your interactions with the most versatile thinking partner in history are "do this for me," you're leaving enormous value on the table.
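If you tag your own conversations the same way, the mode breakdown is a one-pass count. A sketch, assuming the illustrative TaggedConversation records from above:

```python
from collections import Counter

def mode_distribution(conversations) -> dict[str, float]:
    """Share of conversations in each primary mode, as percentages."""
    counts = Counter(c.mode for c in conversations)
    total = sum(counts.values())
    return {mode: 100 * n / total for mode, n in counts.items()}
```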
Finding 3: Surprise Is Rare
Only 11 out of 100 conversations produced something the user described as genuinely surprising — an insight they hadn't considered, a connection they hadn't made, a perspective that shifted their thinking.
This is the saddest number. Not because AI can't surprise — it can. But because the way most people interact with AI prevents surprise. You have to create the conditions for surprise: ask open questions, explore unfamiliar territory, invite challenge. If you only delegate, you'll only get what you already know you wanted.
Finding 4: Length Correlates With Insight
When I cross-referenced conversation length with surprise rate, the pattern was clear:
| Conversation length | Surprise rate |
|---|---|
| 1–2 exchanges | 3% |
| 3–4 exchanges | 8% |
| 5–8 exchanges | 29% |
| 9+ exchanges | 63% |
Conversations with 5+ exchanges were 10x more likely to produce a genuine surprise than 1–2 exchange conversations. Not because longer is inherently better — but because depth creates the conditions for novel connections.
The most valuable insights don't come from AI's first response. They come from the fourth or fifth exchange, after the initial framing has been questioned and refined.
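The cross-tab itself is just group-and-divide. A sketch, again assuming the illustrative records from above:

```python
from collections import defaultdict

def bucket(exchanges: int) -> str:
    """Map an exchange count to the length buckets used in the table."""
    if exchanges <= 2:
        return "1-2"
    if exchanges <= 4:
        return "3-4"
    if exchanges <= 8:
        return "5-8"
    return "9+"

def surprise_rate_by_length(conversations) -> dict[str, float]:
    """Fraction of conversations in each length bucket tagged as surprising."""
    totals, surprises = defaultdict(int), defaultdict(int)
    for c in conversations:
        b = bucket(c.exchanges)
        totals[b] += 1
        surprises[b] += c.surprise
    return {b: surprises[b] / totals[b] for b in totals}
```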
Finding 5: Pushback Is the Strongest Predictor
This was the most important finding.
In conversations where the human pushed back on AI's response at least once — challenged an assumption, asked for the opposite argument, said "I'm not sure that's right" — the surprise rate was 41%.
In conversations without any pushback: 4%.
That's a 10x difference. A single behavior — pushing back — predicted insight quality better than conversation length, topic complexity, or prompt sophistication.
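The pushback split is the same computation with a boolean filter instead of length buckets. A sketch on the same illustrative records:

```python
def surprise_rate(conversations, pushed_back: bool) -> float:
    """Surprise rate among conversations with or without pushback."""
    subset = [c for c in conversations if c.pushed_back == pushed_back]
    return sum(c.surprise for c in subset) / len(subset) if subset else 0.0

# With the rates reported above: 0.41 / 0.04 is roughly 10x.
```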
Example
A product manager got AI's recommendation for a pricing change. Instead of accepting, she said: "That assumes our users are price-sensitive. What if they're not? What if the real barrier is trust, not price?" That pushback led to a completely different analysis. AI mapped out a trust-building strategy that included transparent pricing (higher, but with guarantees). The new approach outperformed the original recommendation by 2x in conversion.
What This Means
The 100 conversations told a consistent story. Most people interact with AI in the shallowest possible way — short, delegation-focused, no pushback. And the conversations that produce the most value look completely different — longer, more exploratory, with active human engagement.
The gap between how most people use AI and how they could use it isn't small. It's a 10x gap in insight quality, sitting right there in the data.
What You Can Do With This
Three specific changes based on the data:
1. Add two more exchanges. Whatever your default conversation length is, add two. Ask a follow-up. Push on something.
2. Push back once. Before accepting AI's response, challenge one part of it. Even if you agree — challenge it anyway. See what happens.
3. Try a non-delegation mode once per day. Instead of "do this," try "what am I not seeing about this?" Just once.
If you want to see your own patterns across multiple conversations — that's exactly what we built the AI Leverage Mirror for. Paste a conversation. See where you fall on these dimensions.
But the three changes above will shift your results even without any tool.