How to Think Better

Is there a right way to think? It’s a fascinating question that I don’t believe many people consider. I’m not referring to ethics or philosophy, just to how we approach creativity, research, or decision-making. Is there a right way to do it?

I started asking this question seriously about two years ago. I was making decisions at work that felt right in the moment but looked obviously wrong three months later. Not because I was stupid but because I had no system for catching my own blind spots. I was just trusting my gut and calling it analysis.

So I went looking for what actually works. Not productivity hacks or mental models from Twitter threads. Peer-reviewed research. Stuff with measured effect sizes and honest caveats about when it doesn’t work. What I found surprised me. Most of the popular “thinking better” advice has no evidence behind it. But a handful of techniques have been tested rigorously and they’re not the ones people talk about.

Here’s everything I found that’s actually backed by science. It’s a lot of information so buckle up.

Techniques

Keep a Prediction Log

The single most robust finding across all thinking research is that reliable and accurate judgment requires feedback loops. Without this calibration step, hindsight bias slowly distorts memory. If you don’t explicitly record predictions before the outcome is known, you gradually come to believe that what happened is what you predicted all along, even when it wasn’t. A prediction log solves this.

How to Do It Write down any meaningful judgment or decision with an explicit probability (“I think X will happen; 70% confident”). Set a calendar reminder to review outcomes monthly. Over time, look for systematic patterns: Are you consistently overconfident at 80%? Do you underestimate how long things take?

Measured Effect A one-hour training module covering calibration basics improved forecasting accuracy by ~10%, with effects lasting at least a year.1

Caveat These results come from the superforecasting research (Tetlock’s Good Judgment Project), which scored forecasts of geopolitical events with clear, objectively scored outcomes. Most real decisions (career moves, strategic calls, relationship choices) are ambiguous and nearly impossible to score cleanly. The feedback loop that makes calibration work may be weaker for the decisions that feel most important.
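
For anyone who keeps their log in a file rather than a notebook, here is a minimal sketch of the monthly review step. It groups resolved predictions by stated confidence and compares each group with how often those predictions came true. The `Prediction` class, the `calibration_table` function, and every entry in the log are hypothetical, made up for illustration; this is one possible way to do it, not a standard tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Prediction:
    claim: str          # what you expect to happen
    confidence: float   # stated probability, 0.0 to 1.0
    came_true: bool     # filled in at the monthly review


def calibration_table(log):
    """Group resolved predictions by stated confidence and compare
    each group's stated confidence with its observed hit rate."""
    buckets = defaultdict(list)
    for p in log:
        # Round to the nearest 10% so that 0.68 and 0.72 share a bucket.
        buckets[round(p.confidence, 1)].append(p.came_true)
    return {
        stated: (sum(results) / len(results), len(results))
        for stated, results in sorted(buckets.items())
    }


# Hypothetical entries, invented purely for illustration.
log = [
    Prediction("Launch ships by end of quarter", 0.8, False),
    Prediction("Candidate accepts the offer", 0.8, True),
    Prediction("Migration takes under two weeks", 0.8, False),
    Prediction("Vendor renews the contract", 0.8, True),
    Prediction("Bug is in the caching layer", 0.6, True),
    Prediction("Meeting gets rescheduled", 0.6, False),
]

for stated, (hit_rate, n) in calibration_table(log).items():
    print(f"said {stated:.0%}, happened {hit_rate:.0%} of the time (n={n})")
```

In this invented log, the 80% predictions came true only half the time, which is the kind of systematic overconfidence the review is meant to surface. With only a handful of entries per bucket the numbers are noisy, so treat the table as a prompt for reflection rather than a verdict.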

Pre-Mortem Before Big Decisions

The pre-mortem technique inverts normal planning. Instead of asking “how will this succeed?”, you assume the project has already failed — it’s a year from now, and things went badly — and write down why. This framing shift is called prospective hindsight and it works for two reinforcing reasons.

First, imagining an event as already having occurred increases access to explanatory reasoning. Mitchell, Russo, and Pennington’s 1989 study found prospective hindsight increased identification of potential failure causes by approximately 30%.2 Second, it creates psychological cover for dissent: framing criticism as an intellectual exercise about an imagined failure legitimises pessimism in organisational contexts that would otherwise suppress it.3 Kahneman called it his single most valuable technique for combating the planning fallacy.

How to Do It Before committing to a plan, set aside 10–15 minutes. Write at the top of a page: “It is one year from now. This project failed completely. Why?” Generate as many reasons as you can. Share results with others before finalising the plan.
Measured Effect ~30% increase in identification of potential failure causes vs. standard planning (Mitchell et al., 1989).
Caveat The 30% improvement was measured in lab conditions on the task of generating reasons — not on actual decision quality outcomes. You might get better at articulating risks without meaningfully changing the decisions you make.

Sources

  • Mitchell, D.J., Russo, J.E., & Pennington, N. (1989). Back to the Future: Temporal Perspective in the Explanation of Events. Journal of Behavioral Decision Making, 2(1), 25–38.
  • Klein, G. (2007). Performing a Project Premortem. Harvard Business Review.
  • Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

Separate Idea Generation from Evaluation

Mixing divergent and convergent thinking degrades both. When you evaluate ideas as they emerge, two things happen: you prematurely discard options that would have been valuable if developed further, and you anchor the group to whoever spoke first. This is one of the most replicated findings in creativity research.

Mullen, Johnson, and Salas’s meta-analysis quantified the damage: individuals working alone and pooling ideas afterward outperformed interactive brainstorming groups by approximately 83%.4 The main culprit is production blocking — having to wait your turn interrupts idea flow and causes forgetting. A secondary culprit is evaluation apprehension — the fear of being judged that prevents people from voicing unpolished ideas in a group setting.

What actually works: brainwriting (silent individual generation, then sharing), electronic brainstorming allowing simultaneous anonymous contribution, and hybrid approaches where individual ideation strictly precedes group discussion.

How to Do It For creative or strategic work: generate ideas alone first, in writing, without filtering. Only then bring them together for evaluation. In group settings, have everyone write independently for 5–10 minutes before anyone speaks.
Measured Effect Individual-then-pooled ideation outperforms interactive brainstorming by ~83% in quantity of non-redundant ideas generated.
Caveat Most real thinking happens in meetings and fast-moving conversations where enforcing phase separation requires deliberate effort and organisational buy-in. The advice is structurally sound but practically fragile without intentional process design.

Sources

  • Mullen, B., Johnson, C., & Salas, E. (1991). Productivity Loss in Brainstorming Groups: A Meta-Analytic Integration. Basic and Applied Social Psychology, 12(1), 3–23.
  • Diehl, M. & Stroebe, W. (1987). Productivity Loss in Brainstorming Groups. Journal of Personality and Social Psychology, 53(3), 497–509.
  • Paulus, P.B. & Yang, H.C. (2000). Idea Generation in Groups. Organizational Behavior and Human Decision Processes, 82(1), 76–87.

Use Reference Class Forecasting

Reference class forecasting is the practice of anchoring estimates to how similar situations have actually played out historically, before adjusting for the specifics of your case. It directly counteracts the planning fallacy — the systematic tendency to underestimate time, cost, and risk by focusing exclusively on the internal details of a plan rather than the base rate of comparable projects.

Bent Flyvbjerg analysed more than 2,000 major infrastructure projects and found reference class forecasting roughly halved typical cost overrun rates.5 The technique has since been adopted as official UK government policy for transport project planning. Kahneman endorsed it as one of the few debiasing approaches with real-world institutional traction — he calls it the “outside view.”6

How to Do It Before estimating how long something will take or how well a plan will go, ask: “What is the reference class of projects like this one? How have they historically turned out?” Find actual base rates, anchor your estimate there, and only then adjust for the specific features of your case.
Measured Effect Reference class forecasting roughly halved cost overrun rates across 2,000+ infrastructure projects (Flyvbjerg, 2006). Now mandatory policy in the UK Treasury and Denmark’s Ministry of Finance.
Caveat Choosing the right reference class is genuinely difficult. Flyvbjerg’s more recent work acknowledges that “reference class selection” remains an open methodological challenge — if you pick the wrong comparison set, you’re anchoring to the wrong base rate.
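
The anchor-then-adjust order described above can be sketched in a few lines. This is a deliberately simplified illustration, not Flyvbjerg’s actual procedure: it assumes you have collected the ratio of actual to planned duration for a set of comparable past projects, and it uses the median of those ratios as the base rate. The function name, the ratios, and the 10% adjustment are all hypothetical.

```python
from statistics import median

# Hypothetical reference class: ratio of actual to planned duration for
# past projects similar to the one being estimated (1.0 = on schedule).
past_ratios = [1.1, 1.4, 1.0, 1.8, 1.3, 2.2, 1.5]


def reference_class_estimate(inside_view, ratios, adjustment=1.0):
    """Scale the inside-view estimate by the typical historical overrun,
    then apply a case-specific adjustment last."""
    base_rate = median(ratios)           # outside view: how these usually go
    anchored = inside_view * base_rate   # anchor on the base rate first
    return anchored * adjustment         # only then adjust for specifics


# Inside view: "this should take 10 weeks."
# Hypothetical adjustment: the team has done this before, so trim by 10%.
estimate = reference_class_estimate(10, past_ratios, adjustment=0.9)
print(f"median overrun: {median(past_ratios):.1f}x")
print(f"anchored and adjusted estimate: {estimate:.1f} weeks")
```

The point of writing it this way is the ordering: the base rate is applied before any case-specific reasoning, so the adjustment has to argue against history rather than ignore it.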

Walk When You’re Stuck on a Creative Problem

Walking increases activity in the default mode network — the brain’s associative processing system — while reducing prefrontal inhibition of unusual connections. This is why things that seem unrelated start linking up when you’re moving.

Oppezzo and Schwartz (2014) ran four experiments measuring creative output via the Alternate Uses Test before and after sitting vs. walking.7 Walking increased creative output in 81% of participants. Critically, walking on a treadmill in a boring room produced nearly as strong an effect as walking outdoors — suggesting movement itself, not scenery, drives the benefit. A residual creative boost also persisted after sitting back down.

Incubation research provides convergent support: Sio and Ormerod’s 2009 meta-analysis of 117 studies found that the most effective breaks involve low-demand activities — exactly what a walk provides. The leading mechanism is fixation forgetting: stepping away allows incorrect solution paths to fade from working memory, freeing access to new approaches.8

How to Do It When stuck on a creative or strategic problem, take a 20-minute walk without headphones. Do not switch to another demanding task. This works specifically for divergent thinking (generating options, new angles) — not for analytical or convergent tasks where a specific correct answer is required.
Measured Effect 81% of participants showed increased divergent thinking output while walking (Oppezzo & Schwartz, 2014). The incubation meta-analysis found d = 0.21 for problem-solving overall and a larger effect (d = 0.33) for divergent tasks specifically (Sio & Ormerod, 2009).
Caveat The Oppezzo & Schwartz findings come from a single lab and have not been independently replicated at scale. Effect sizes this clean in psychology often shrink under replication. Walking does not help with convergent thinking.

Quantify Your Uncertainty

Translating vague feelings (“I think probably…”) into specific probabilities forces you to confront your actual confidence — and routinely reveals overconfidence you weren’t aware of. It also enables calibration: you can track whether things you said were 65% likely happened about 65% of the time.

Vague verbal expressions of probability are interpreted wildly differently by different people. Mosteller and Youtz found “likely” maps to anywhere from 55% to 99% depending on the person — making numerical probabilities the only reliable shared language for uncertainty.9 Tetlock’s superforecasters distinguished not just between 60% and 40% but between 60% and 55%. That granularity forces genuine reasoning about what evidence actually supports.

How to Do It When you form a belief or make a prediction, assign a probability to it. Use the prediction log to track these. When making decisions in a group, have everyone write their probability estimate before sharing — otherwise anchoring dominates.
Measured Effect A one-hour calibration training module improved forecasting accuracy by ~10%, sustained over at least a year (Mellers et al., 2015). Superforecasters who use probabilistic language consistently outperform those relying on verbal uncertainty expressions.
Caveat Precise numbers create an illusion of rigour if the underlying reasoning is flawed. Saying 68% confident is just confident noise if your reference class is wrong or you’re missing key information. Quantifying uncertainty is a tool for surfacing and recording reasoning — not a substitute for it.
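
Once predictions carry numbers, they can be scored. One common scoring rule, which the text above does not name, is the Brier score: the average squared gap between the probability you stated and what actually happened. The sketch below assumes forecasts are stored as (probability, outcome) pairs; the function name and the sample forecasts are hypothetical.

```python
def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome
    (1 if it happened, 0 if not). 0.0 is perfect; always saying
    50% scores 0.25."""
    return sum((p - float(happened)) ** 2 for p, happened in forecasts) / len(forecasts)


# Hypothetical forecasts: (stated probability, did it happen?)
confident = [(0.9, True), (0.9, False), (0.9, True), (0.9, False)]
hedged = [(0.6, True), (0.6, False), (0.6, True), (0.6, False)]

print(f"always 90%: {brier_score(confident):.2f}")
print(f"always 60%: {brier_score(hedged):.2f}")
```

Both invented forecasters were right half the time, but the one who said 90% scores worse (0.41 versus 0.26) because the rule penalises confident misses heavily. That is the sense in which a number like 68% only means something if you later check it against outcomes.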

Sources

  • Tetlock, P. & Gardner, D. (2015). Superforecasting. Crown.
  • Galef, J. (2021). The Scout Mindset. Portfolio/Penguin.
  • Mosteller, F. & Youtz, C. (1990). Quantifying Probabilistic Expressions. Statistical Science, 5(1), 2–12.
  • Mellers, B. et al. (2015). The Psychology of Intelligence Analysis: Drivers of Prediction Accuracy in World Politics. Journal of Experimental Psychology: Applied, 21(1), 1–14.

Consider the Opposite — But Keep It Brief

Generating reasons why your initial judgment might be wrong is one of the few debiasing techniques with consistent empirical support across multiple biases. It reduces overconfidence, anchoring, and hindsight bias. The mechanism is straightforward: it forces engagement with disconfirming evidence that confirmation bias would otherwise cause you to ignore or underweight.

Lord, Ross, and Lepper’s seminal 1979 study showed that asking people to “be objective and unbiased” produced zero effect.10 What worked was the structured prompt to actively generate counterarguments. The critical dose caveat comes from Sanna et al.: asking people to generate ten counterarguments actually backfired — reinforcing the original belief. When generating counterarguments feels hard, that difficulty gets misread as evidence the original position is correct.11 Two to three is the effective sweet spot.

How to Do It Before finalising any significant judgment, write down 2–3 specific reasons you could be wrong. Not “maybe I’m missing something” — actual concrete reasons. Stop at three.
Measured Effect Structured consider-the-opposite prompts reduce anchoring bias and overconfidence across multiple studies (Mussweiler et al., 2000; Koriat et al., 1980). Most robust effects for anchoring and overconfidence specifically.
Caveat Effects are demonstrated in lab settings on isolated judgments. Real-world transfer — whether this practice changes reasoning in embedded, emotionally loaded decisions — is not well-established. The hard stop at 2–3 reasons is not a suggestion; exceeding it reliably backfires.

Sources

  • Lord, C.G., Ross, L., & Lepper, M.R. (1979). Biased Assimilation and Attitude Polarization. Journal of Personality and Social Psychology, 37(11).
  • Sanna, L.J. & Schwarz, N. (2003). Debiasing the Hindsight Bias. Psychological Science, 14(5).
  • Mussweiler, T., Strack, F., & Pfeiffer, T. (2000). Overcoming the Inevitable Anchoring Effect. Personality and Social Psychology Bulletin, 26(9).
  • Larrick, R.P. (2004). Debiasing. In Blackwell Handbook of Judgment and Decision Making.

Incubate Intentionally

Stepping away from a problem genuinely helps — but only under specific conditions that most people don’t observe. Sio and Ormerod’s 2009 meta-analysis of 117 studies confirmed incubation improves performance on divergent thinking tasks, with key conditions: substantial initial engagement before the break (short preparation yields no benefit), and a low-demand activity during the break (complete rest and high-demand tasks both underperform).

The leading mechanism is fixation forgetting — intensive work on a problem causes incorrect solution approaches to dominate working memory and block access to better ones. A low-demand break allows those fixated paths to fade.8 Gilhooly et al. found positive incubation occurs without intermittent conscious engagement, supporting the forgetting account over the folk theory that “your brain keeps working on it in the background.”12

How to Do It Work hard on a problem first — don’t skip this step, it’s required for incubation to work. Then do something genuinely low-demand: walk, shower, cook, do routine chores. Don’t switch to another demanding cognitive task. Return to the problem after at least 20 minutes.
Measured Effect Meta-analytic effect of d = 0.21 for incubation on problem-solving overall; stronger for divergent tasks (d = 0.33) than insight problems.
Caveat The effect is real but modest. It does not work for problems requiring fundamental structural restructuring — classical insight puzzles (nine-dot, matchstick) show no reliable incubation benefit. Most applicable to generative, associative thinking tasks.

Collect Independent Views Before Group Discussion

Group discussion systematically degrades judgment quality through two mechanisms: anchoring to whoever speaks first, and social conformity pressure that causes people to defer rather than disagree. Both effects are well-documented and large.

Larrick and Soll demonstrated that averaging independent estimates is at least as accurate as the typical individual estimate, and usually more accurate. But the wisdom of crowds effect only works when estimates are genuinely independent.13 Once people hear each other, estimates converge toward whoever spoke first, destroying the diversity of perspective that makes group judgment valuable. Lorenz et al.’s 2011 study found social influence can account for 50%+ of final judgment variance.14

Kahneman’s Noise (2021) argues that this variability from social influence may exceed the variability from systematic bias — making independent pre-collection one of the highest-leverage structural interventions available.15

How to Do It Before any high-stakes group discussion, ask everyone to write their assessment privately and submit it before the meeting. Reveal the distribution of views first, then discuss. Consider keeping at least one person explicitly in a devil’s advocate role throughout.
Measured Effect Independent aggregation consistently outperforms interactive group discussion (Larrick & Soll, 2006). Social influence accounts for 50%+ of final judgment variance in groups (Lorenz et al., 2011).
Caveat This requires deliberate process design and willingness to slow down. In practice, whoever called the meeting usually speaks first and frames the discussion. Overcoming this requires explicit commitment from whoever holds the senior role in the room.
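
A small worked example shows why collecting estimates privately and averaging them helps. The numbers are invented: five hypothetical people estimate a cost before the meeting, and the true figure later turns out to be 100. Nothing here comes from the cited studies.

```python
from statistics import mean

# Hypothetical scenario: five people privately estimate a project's cost
# (in thousands) before the meeting. The true figure turns out to be 100.
truth = 100
private_estimates = [70, 85, 120, 140, 95]

crowd_estimate = mean(private_estimates)
crowd_error = abs(crowd_estimate - truth)
individual_errors = [abs(e - truth) for e in private_estimates]
beaten = sum(err > crowd_error for err in individual_errors)

print(f"average of estimates: {crowd_estimate:.0f} (off by {crowd_error:.0f})")
print(f"typical individual error: {mean(individual_errors):.0f}")
print(f"people less accurate than the average: {beaten} of {len(individual_errors)}")
```

Averaging works here because the individual errors point in different directions and partly cancel. If everyone had erred the same way, say by all anchoring on the first number spoken aloud, the average would be no better than the typical individual, which is why the estimates need to be written down before anyone talks.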

Build Checklists for High-Stakes Recurring Processes

Checklists work in procedural domains by offloading memory and sequencing demands from working memory onto a reliable external system. They don’t make people think better — they ensure that things people already know how to do are actually done, consistently, under conditions of fatigue, time pressure, and distraction.

The evidence is strongest in medicine. Haynes et al.’s 2009 NEJM study found the WHO Surgical Safety Checklist reduced major complications by 36% and deaths by 47% across eight hospitals in eight countries.16 Pronovost et al.’s Michigan ICU study found a simple five-item central line checklist nearly eliminated catheter-related bloodstream infections, saving an estimated 1,500 lives and $200 million over 18 months.17

How to Do It Identify 2–3 recurring, high-stakes processes in your work where a missed step has real consequences — not novel decisions, established procedures. Build a short checklist of 5–9 items maximum. Items should be things people might actually forget under pressure, not things they couldn’t possibly overlook.
Measured Effect 36% reduction in surgical complications and 47% reduction in deaths (Haynes et al., NEJM 2009). Near-elimination of catheter bloodstream infections in Michigan ICUs (Pronovost et al., 2006).
Caveat A population-level Ontario study mandating checklist use across hospitals found no effect — implementation quality is the critical moderator. A checklist completed mindlessly or under compulsion is not the same as one people trust and use deliberately. Checklists also don’t help with novel, unstructured decisions — only with recurring procedural tasks.

What to Skip: Collecting Mental Models

The practice of systematically collecting frameworks across disciplines — inversion, second-order thinking, the map is not the territory, etc. — has no direct empirical evaluation and is undermined by the most consistent finding in cognitive training research: far transfer is essentially zero.

A second-order meta-analysis by Sala & Gobet (2019) aggregating 119 cognitive training studies found that when placebo effects and publication bias were controlled, the overall effect size for far transfer equalled zero.18 Thorndike established the “identical elements” constraint on transfer in 1901. Detterman’s 1993 review found near-universal agreement among reviewers that little transfer occurs without shared structural elements between training and application domains.19

What mental model practice may actually be doing is building metacognitive awareness — the habit of asking yourself how you’re thinking — rather than providing transferable cross-domain frameworks. If that’s the mechanism, then the model collection is not the active ingredient; the deliberate reflection is.

The Far Transfer Problem “Reviewers are in almost total agreement that little transfer occurs.” — Detterman, 1993. The evidence that reading about inversion or second-order thinking automatically improves unrelated decisions simply does not exist.

The Thread Running Through Everything

Across all of the above — and across cognitive science, decision-making research, and creativity studies broadly — the single most replicated principle is this:

Thinking improves when judgments are made explicit, recorded, and evaluated against outcomes.

The prediction log creates this structure for decisions. The pre-mortem makes implicit assumptions about failure explicit before a decision is made. Quantifying uncertainty creates a record to compare against reality. Collecting independent views before group discussion preserves what people actually thought before social influence changed it.

Everything else (walking, incubating, separating generation from evaluation, checklists) is in service of the same underlying structure: making the invisible visible, so that feedback can do its job.

I don’t do all of these perfectly. I keep a prediction log that I forget to review half the time. I run pre-mortems on big projects but skip them on medium ones that probably need them just as much. I still catch myself evaluating ideas the second they come out of someone’s mouth in a meeting.

But the handful of times I’ve actually followed through on these techniques, the results were noticeable. Not in a dramatic “I unlocked 10x thinking” way. More like I stopped making the same mistake twice. I started noticing when I was overconfident before the outcome proved me wrong instead of after. I got better at sitting with uncertainty instead of rushing to a conclusion just to stop the discomfort.

That’s what thinking better actually looks like. Not being smarter. Just being less wrong, more often, and noticing sooner when you are.


Footnotes

  1. AI Impacts. (2015). Evidence on Good Forecasting Practices from the Good Judgment Project. aiimpacts.org

  2. Mitchell, D.J., Russo, J.E., & Pennington, N. (1989). Back to the Future: Temporal Perspective in the Explanation of Events. Journal of Behavioral Decision Making, 2(1), 25–38.

  3. Klein, G. (2007). Performing a Project Premortem. Harvard Business Review.

  4. Mullen, B., Johnson, C., & Salas, E. (1991). Productivity Loss in Brainstorming Groups: A Meta-Analytic Integration. Basic and Applied Social Psychology, 12(1), 3–23.

  5. Flyvbjerg, B. (2006). From Nobel Prize to Project Management: Getting Risks Right. Project Management Journal, 37(3).

  6. Kahneman, D. & Lovallo, D. (1993). Timid Choices and Bold Forecasts. Management Science, 39(1).

  7. Oppezzo, M. & Schwartz, D.L. (2014). Give Your Ideas Some Legs: The Positive Effect of Walking on Creative Thinking. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40(4), 1142–1152.

  8. Sio, U.N. & Ormerod, T.C. (2009). Does Incubation Enhance Problem Solving? A Meta-Analytic Review. Psychological Bulletin, 135(1), 94–120.

  9. Mosteller, F. & Youtz, C. (1990). Quantifying Probabilistic Expressions. Statistical Science, 5(1), 2–12.

  10. Lord, C.G., Ross, L., & Lepper, M.R. (1979). Biased Assimilation and Attitude Polarization. Journal of Personality and Social Psychology, 37(11).

  11. Sanna, L.J. & Schwarz, N. (2003). Debiasing the Hindsight Bias. Psychological Science, 14(5).

  12. Gilhooly, K.J. et al. (2013). Incubation and Creativity: Do Something Different. Thinking & Reasoning, 19(2).

  13. Larrick, R.P. & Soll, J.B. (2006). Intuitions About Combining Opinions. Management Science, 52(1).

  14. Lorenz, J. et al. (2011). How Social Influence Can Undermine the Wisdom of Crowds. PNAS, 108(22).

  15. Kahneman, D., Sibony, O., & Sunstein, C.R. (2021). Noise: A Flaw in Human Judgment. Little, Brown Spark.

  16. Haynes, A.B. et al. (2009). A Surgical Safety Checklist to Reduce Morbidity and Mortality in a Global Population. NEJM, 360(5), 491–499.

  17. Pronovost, P. et al. (2006). An Intervention to Decrease Catheter-Related Bloodstream Infections in the ICU. NEJM, 355(26).

  18. Sala, G. & Gobet, F. (2019). Cognitive Training Does Not Enhance General Cognition. Trends in Cognitive Sciences, 23(1).

  19. Detterman, D.K. (1993). The Case for the Prosecution: Transfer as an Epiphenomenon. In Transfer on Trial. Ablex.