research-review 2026-03-24

Paper Review: Resumption Strategies for Interrupted Programming Tasks

Only 10% of sessions see a first edit within one minute of resuming

Primary paper: Parnin, C. & Rugaber, S. (2009/2011). “Resumption Strategies for Interrupted Programming Tasks.” Originally presented at ICPC 2009 (17th IEEE International Conference on Program Comprehension), pp. 80-89. Extended version published in Software Quality Journal, 19(1):5-34, 2011. DOI: 10.1007/s11219-010-9104-9

Companion paper: Parnin, C. & Rugaber, S. (2012). “Programmer Information Needs after Memory Failure.” ICPC 2012 (20th IEEE International Conference on Program Comprehension), pp. 123-132. Provides the cognitive neuroscience framework and proposes five memory aids.

Blog post: Parnin, C. (2013). “Programmer, Interrupted.” NINlabs Blog / Game Developer Magazine (Gamasutra top story of 2013). The practitioner summary that made the research famous.

Follow-up: Parnin, C. & DeLine, R. (2010). “Evaluating Cues for Resuming Interrupted Programming Tasks.” CHI ‘10, pp. 93-102. CHI Honorable Mention Best Paper. Microsoft Research collaboration testing actual resumption cues.

Status: Real data from real programmers in real settings. Not a framework, not a survey-only study. 10,000 sessions of instrumented IDE usage from 85 programmers + a 414-person survey. This is the empirical foundation for everything the “Programmer, Interrupted” meme rests on.

TL;DR

Programmers almost never resume coding immediately after an interruption. Only 10% of sessions see a first edit within one minute. Only 7.5% involve no navigation to other code locations before editing. In the typical case, programmers spend several minutes wandering through their codebase – visiting 2-12 locations, performing 15-150 selection events – just to rebuild enough context to make a single edit. The strategies programmers use to cope (TODO comments, intentional compile errors, sticky notes, source diffs) are all ad-hoc prosthetic memory devices that map cleanly onto five distinct types of human memory failure.

Why This Paper Matters

This is the missing bridge between the cognitive science papers (Altmann & Trafton’s goal activation model, Leroy’s attention residue) and practical tool design for developers. The cognitive papers tell us why interruptions hurt. This paper tells us what programmers actually do about it and where the tools fail them. Any proposed feature for interruption recovery – whether in an IDE, an AI coding assistant, or a standalone “Where Was I?” tool – can trace directly to a finding in this paper.

For anyone building developer tools or ADHD-aware software, this is the empirical ground truth. It tells you what information is lost during interruptions, what ad-hoc workarounds developers already use, and where automated tooling can close the gap.

Study Design

The Session Data (Quantitative)

Three datasets, all from instrumented IDE usage in natural settings:

Dataset         Users   Sessions   Filtered*   w/ Edits   Events
Visual Studio   12      1,972      1,561       1,213      573,998
Eclipse         73      7,927      5,931       3,962      3,937,526
Total           85      9,899      7,492       5,175      4,511,524

*Filtered = removed sessions <1 minute duration.

  • Eclipse dataset (2005): Collected by Murphy et al. using the Mylyn Monitor tool. Captures fine-grained IDE events – navigation, selection, editing – from volunteer open-source Eclipse developers.
  • Visual Studio dataset (2005): Collected by Parnin & Gorg at an industrial site. 12 professional developers over several months. Richer navigation data (within-file navigation events recorded, which Eclipse dataset lacked).
  • UDC dataset (Eclipse Usage Data Collector, 2008): Publicly available data from 10,000+ Eclipse Ganymede users. Counts command usage (refactoring, task tracking, etc.) across the population.

Session segmentation: A break of 15+ minutes defines a session boundary. Well-supported by the data – 98% of the 4.5 million events have inter-event times under one minute, creating natural tight clusters.
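The segmentation rule is simple enough to sketch directly. A minimal version (event format is illustrative, not from the paper's tooling):

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=15)  # a break of 15+ minutes starts a new session

def segment_sessions(timestamps):
    """Group a list of event timestamps into sessions, splitting wherever
    the gap between consecutive events is 15 minutes or more."""
    sessions = []
    current = []
    for ts in sorted(timestamps):
        if current and ts - current[-1] >= SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions
```

Because 98% of inter-event times are under a minute, almost any threshold in the 5-30 minute range produces the same clusters; 15 minutes is a conservative middle.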

Key methodological note: They can only observe IDE activity. If a developer spent 10 minutes reading docs in a browser, checking Slack, or staring at a whiteboard, that shows up as dead time. This means the measured “edit lag” is an upper bound – some of it is genuine task-related work outside the IDE. But the navigation patterns within the IDE are genuine resumption behavior.

The Survey (Qualitative, in extended SQJ version)

414 programmers surveyed about their interruption recovery practices. Key findings from the survey:

  • Developers estimated using compile errors as resumption cues in ~43% of resumptions
  • Source diffs viewed 39% of the time when resuming
  • TODO comments, sticky notes, and emails to self all commonly reported
  • Developers described intentionally leaving code in a broken state as a “roadblock” – forcing a visible prompt on return

The survey confirms what the session data implies: programmers are actively building ad-hoc memory prosthetics because the tools don’t do it for them.

The “Edit Lag” Metric

This is the paper’s key contribution – a specialized measure of resumption cost. Edit lag = time between returning to a programming task and making the first edit.

Standard resumption lag (from cognitive science) measures seconds – the time between being told to resume and the first mouse click. Edit lag measures minutes – the time until a programmer has rebuilt enough context to actually change code. This distinction matters: a programmer can click around for 15 minutes before they know enough to type a single character.
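Operationally, edit lag is just the offset of the first edit event within a resumed session. A minimal sketch, assuming events are (timestamp, kind) pairs (the event encoding here is illustrative):

```python
def edit_lag(events):
    """Edit lag = time from the first event of a session to the first edit.
    `events` is a session's chronologically ordered list of (timestamp, kind)
    pairs, where kind is e.g. 'navigate', 'select', or 'edit'.
    Returns None for sessions containing no edits."""
    start = events[0][0]
    for ts, kind in events:
        if kind == "edit":
            return ts - start
    return None
```

Everything before that first edit (navigation, selection, dead time) is the resumption cost the paper measures.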

Edit Lag Distribution (Figure 3)

Edit Lag % of Sessions
< 1 min ~10%
1-5 min ~20%
5-10 min ~15%
10-15 min ~12%
15-30 min ~13%
30-60 min ~15%
> 60 min ~15%

The famous “10-15 minutes” claim from the blog post is a simplification. The real picture is worse: the distribution is wide, with a heavy right tail. About 30% of sessions have edit lag over 30 minutes. The median falls somewhere around 10-15 minutes, which is where the blog post number comes from, but a substantial chunk of sessions involve half an hour or more of context recovery.
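The binned distribution above is enough to sanity-check the median claim: interpolating within the bin containing the 50th percentile puts the median near 12 minutes. A quick sketch (bin edges are approximate, and the open-ended top bin is capped arbitrarily for the interpolation):

```python
# Approximate the median edit lag from the binned distribution (Figure 3).
# Bins: (lower bound in minutes, upper bound, % of sessions).
bins = [(0, 1, 10), (1, 5, 20), (5, 10, 15), (10, 15, 12),
        (15, 30, 13), (30, 60, 15), (60, 120, 15)]

def binned_median(bins):
    cumulative = 0
    for lo, hi, pct in bins:
        if cumulative + pct >= 50:
            # Linear interpolation within the bin holding the 50th percentile
            return lo + (50 - cumulative) / pct * (hi - lo)
        cumulative += pct
    return None

print(round(binned_median(bins), 1))  # ~12.1: inside the 10-15 minute bin
```

The cumulative share reaches 45% at 10 minutes and 57% at 15 minutes, so the median has to land in the 10-15 minute bin, consistent with the blog post's headline number.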

The Resumption Strategies Taxonomy

Strategy 1: Return to Last Method Edited (7.5%)

The simplest possible approach: just pick up where you left off. Measured as sessions where the first edit happens with no navigation to other methods or classes.

Only 7.5% of sessions. This is remarkably low. It means the “just reopen where you were” approach that most IDEs offer is insufficient 92.5% of the time.

When this strategy does work, it works fast:

Edit Lag % of Sessions (last-method strategy)
< 1 min 35%
1-5 min 22%
5-15 min 23%
15-30 min 12%

35% resume within a minute – 3.5x the overall rate. The method itself serves as an environmental cue that triggers recall. This is exactly Altmann & Trafton’s prediction: when the cue is present and applicable, resumption is fast. But it’s only applicable 17% of the time.

Strategy 2: Navigate Then Return (17%)

In 209/1213 sessions (17%), programmers navigated to other locations first but eventually returned to edit the last method. In 118 of those 209 sessions (56%), the programmer navigated elsewhere even when they were going to end up back at the same method.

Why navigate away if you’re coming back? Because you need context from other parts of the code to understand what you were doing here. You can see the method, you can see the half-written code, but you can’t remember why you were writing it or what the next step was.

Strategy 3: Navigate to New Location (83%)

In the vast majority of sessions, the programmer navigated to a different method or class entirely. The previous work site was insufficient as a launching point.

Navigation scope before first edit (Visual Studio data):

Metric                                Range (75th percentile)   Mean
Distinct locations visited            2-12                      7
Navigation distance (code elements)   4-40                      27
Selection events (Eclipse)            15-150                    135

This is a lot of wandering. Seven distinct code locations visited, 27 navigation hops, 135 selection events – all before a single character is typed. This is the cost of rebuilding a mental model from scratch.
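These metrics fall out of the event stream by looking only at the prefix before the first edit. A minimal sketch, with an illustrative (kind, location) event encoding:

```python
def pre_edit_navigation(events):
    """Summarize the wandering before the first edit: distinct code
    locations visited and selection events, computed over the event
    prefix that precedes the first edit.
    `events` is a chronological list of (kind, location) pairs."""
    locations = set()
    selections = 0
    for kind, location in events:
        if kind == "edit":
            break
        locations.add(location)
        if kind == "select":
            selections += 1
    return {"distinct_locations": len(locations), "selection_events": selections}
```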

Strategy 4: Check Task/Bug Tracker (9%)

Developers viewed task information (Bugzilla, Mylyn Task List) in 9% of sessions. But the associated edit lag was very high – for 75% of these sessions, edit lag exceeded 30 minutes.

Strategy 5: Check Problem View / Compile Errors (9%)

Developers checked the Problem View (compile errors/warnings) in 9% of sessions. Same pattern: associated edit lag > 30 minutes for 75% of these sessions.

This is counterintuitive. Compile errors should be good resumption cues – they tell you exactly what’s broken. But sessions that start by checking errors take longer to reach an edit. Parnin hypothesizes these sessions involve planning/refining tasks or resolving configuration issues, not straightforward code resumption.

Strategy 6: Review Source History (4%)

Only 4% of sessions involved checking revision history (CVS/SVN) during the edit lag period. But relative to commit commands, history commands are used frequently, and equally likely during edit lag as during normal coding. Developers may be checking teammate activity or using diffs to remind themselves of their own changes.

Summary Table (Table 9 from paper)

Strategy Usage
Continue Last Edit (no navigation) 7.5%
Navigate Then Continue Last Edit 17%
Navigate to New Location 83%
View Revision History 4%
View Problem List 9%
View Task or Bug List 9%

The “Dabbling Sessions” Pattern

A pattern the authors observed in session visualizations: developers start the day with several short sessions (navigate, look around, make no changes). After “warming up” for an hour or two, they enter a long productive session. Very productive days with many long sessions are followed by several days of low productivity.

This matches ADHD patterns perfectly. The warm-up period is context rebuilding. The crash afterward is mental fatigue. The variability in day-to-day productivity maps directly to the Sankesara finding that variability, not slowness, is the ADHD signal.

Connection to Cognitive Science

Altmann & Trafton: Memory for Goals (2002)

The theoretical backbone. Their model has three constraints:

  1. Interference level: Old goals create residual activation that interferes with retrieving current goals
  2. Strengthening constraint: Encoding a new goal takes time
  3. Priming constraint: Environmental cues help retrieve suspended goals

Parnin’s data validates all three:

  • Interference: Developers navigate to many locations, fighting through interference from old contexts
  • Strengthening: The edit lag is the strengthening period – time needed to re-encode the current goal
  • Priming: When the last-method cue is applicable, resumption is 3.5x faster. Environmental cues work.

Leroy: Attention Residue (2009)

Leroy showed that part of your attention stays with a prior task after switching, especially if the prior task was incomplete. Parnin’s data shows the downstream effect: developers spend 10-15 minutes cleaning up the residue, navigating around to unstick their minds from whatever they were doing before.

Leroy’s “Ready to Resume Plan” intervention (writing down where you are and what you’ll do next before switching) maps directly to the tool proposals in Parnin 2012: explicit externalization of the information that would otherwise be lost.

Parnin 2012: The Five Memory Failures Framework

The ICPC 2012 paper (“Programmer Information Needs after Memory Failure”) provides the cognitive neuroscience explanation for what the 2009 data shows. Five memory systems, five failure modes, five tool proposals:

Memory Type    Programming Activity         Failure Mode                    Tool Proposal
Prospective    Resuming blocked tasks       Monitor/engage failure          Smart reminders
Attentive      Refactoring large code       Concentration/limit failure     Touch points
Associative    Navigating unfamiliar code   Retention/association failure   Associative links
Episodic       Learning new API             Source/recollection failure     Code narratives
Conceptual     Forming concepts             Activation/formation failure    Memlets

This framework is genuinely useful for tool design. Every ad-hoc strategy programmers use maps to a specific memory failure:

  • TODO comments = prospective memory prosthetic (but passive – no trigger mechanism, so engage failure persists)
  • Intentional compile errors = constrictive smart reminder (forces engagement, but blocks task-switching on same codebase)
  • Source diffs = episodic memory aid (but unstructured, verbose, cognitively demanding)
  • Sticky notes = prospective memory + attentive memory (but not linked to code, easy to lose)
  • Tab flipping = associative memory failure (tabs only provide lexical cues, not enough for association)

The CHI 2010 Follow-Up: Cues That Actually Work

Parnin & DeLine (2010) tested two types of automated cues against note-taking, combining a 371-respondent survey with controlled lab experiments:

  1. Aggregate cue: Summarized recent activity by program element
  2. Chronological cue: Timeline of recent actions (edits, searches, builds) in order of occurrence
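The two cue styles are just different projections of the same activity log. A minimal sketch of both, assuming an illustrative (action, element) event format:

```python
from collections import Counter

def aggregate_cue(events):
    """Aggregate cue: recent activity summarized per program element."""
    return Counter(element for _, element in events)

def chronological_cue(events, limit=10):
    """Chronological cue: the most recent actions in order of occurrence."""
    return [f"{action} {element}" for action, element in events[-limit:]]
```

The aggregate cue answers “where was the activity concentrated?”; the chronological cue answers “what did I do, in what order?” The CHI 2010 result is that the latter framing was particularly effective.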

Result: Both cues performed well – developers using either cue completed tasks with twice the success rate of note-taking alone. The chronological cue (timeline of actions) was particularly effective.

This is direct evidence that automated context capture works. The question is not whether to build it – it’s what to capture and how to present it.

The “10-15 Minutes” Claim in Context

The blog post headline – “a programmer takes 10-15 minutes to start editing code after resuming work” – is the most widely cited finding. Here’s what it actually means:

  1. It’s the median edit lag, not a universal constant
  2. About 30% of sessions have edit lag > 30 minutes
  3. The distribution has a fat right tail
  4. Some of the edit lag is productive non-coding work (debugging, reading docs), not pure context recovery
  5. When the last-method cue works, 35% resume in under a minute

The number is still shocking and useful. But it’s an average across all interruption types, all task complexities, and all programmers. For ADHD programmers – who have measurably worse working memory, prospective memory, and set-shifting – the real number is likely worse.

Design Implications for ADHD-Aware Tools

What a “Where Was I?” Implementation Should Capture

Based on Parnin’s findings, the ideal resumption aid needs to address all five memory systems:

1. Prospective memory: What was I going to do next?

  • The intent that was in working memory when the interruption hit.
  • AI coding assistants with persistent memory features can capture this automatically – session summaries, project documentation, auto-generated notes. But passive storage isn’t enough; the information must be surfaced proactively on re-entry, not buried in a file the developer has to remember to check.
  • The compile-error strategy tells us: make the “what’s next” prompt unavoidable, not something you have to remember to check.

2. Attentive memory: What locations was I tracking?

  • The 2-12 locations developers visit before resuming suggests they’re rebuilding a working set of relevant code locations.
  • git diff and git log are the modern source diffs. But raw diffs are what Parnin’s developers complained about – “unordered, verbose, time-consuming, and cognitively demanding.”
  • Better: a structured summary of recent changes, organized by intent rather than file. AI assistants with conversation history already capture this – it just needs to be surfaced at session start.

3. Associative memory: What does this code look like?

  • Developers flip through tabs because filenames alone don’t trigger recall. They need visual, structural, and operational cues.
  • This applies broadly to any tool that indexes past work. Conversations indexed by title alone have the same problem as tabs indexed by filename. Adding context (what files were edited, what was decided) improves recall.

4. Episodic memory: What happened in my last session?

  • The chronological timeline of actions was the most effective cue in the CHI 2010 study. Not a summary – a narrative.
  • Full conversation histories and activity logs store the raw material. The missing piece is a distilled narrative – “last session you were working on X, you hit a wall on Y, and decided to try Z next.” Persistent context features in AI tools can do this, but need to distinguish between long-settled context and what-happened-last-time.

5. Conceptual memory: What mental model was I building?

  • The most expensive thing to lose. The Parnin 2012 paper notes that experts use less brain activity than novices because they exploit conceptual memory – their mental models are pre-built.
  • Project documentation files serve as conceptual memory externalization. When they’re good, they let a returning developer (or a new AI session) start with the right mental model pre-loaded. When they’re stale, time is wasted rebuilding from scratch.

Specific Tool Ideas From This Research

Session-end hook: “Ready to Resume Plan.” Leroy’s intervention + Parnin’s findings point toward an automated hook that, when a coding session ends, writes a 3-line summary: (1) what was accomplished, (2) what was in progress, (3) what the next step is. Stored in persistent project context. Surfaced at next session start. This addresses prospective and episodic memory simultaneously.
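The hook itself is tiny; the hard part is triggering it reliably. A minimal sketch (the RESUME.md filename and the three fields are assumptions, not from the paper):

```python
from datetime import datetime
from pathlib import Path

def write_resume_plan(done, in_progress, next_step, path="RESUME.md"):
    """Write a three-line 'Ready to Resume Plan' at session end:
    what was accomplished, what was in progress, and the next step.
    `path` is a hypothetical location in the project's persistent context."""
    Path(path).write_text(
        f"# Ready to Resume ({datetime.now():%Y-%m-%d %H:%M})\n"
        f"1. Done: {done}\n"
        f"2. In progress: {in_progress}\n"
        f"3. Next: {next_step}\n"
    )
```

Wiring this into an editor-close or idle-timeout event makes the plan automatic rather than one more thing to remember.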

Structured diff on session start. Not git diff – an AI-summarized “here’s what changed since your last session, organized by intent.” Addresses attentive memory (what locations matter) and episodic memory (what happened).
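Even before adding AI summarization, the raw material can be structured. A minimal sketch that groups `git diff --name-status` output by change type, one step up from an unordered diff:

```python
def summarize_name_status(diff_output):
    """Group `git diff --name-status` output into a summary a returning
    developer can scan: files added (A), modified (M), deleted (D)."""
    groups = {"A": [], "M": [], "D": []}
    for line in diff_output.strip().splitlines():
        status, path = line.split("\t", 1)
        # status[0] also buckets rename/copy statuses like R100 under 'R'
        groups.setdefault(status[0], []).append(path)
    return groups
```

Organizing by intent rather than by file would require mapping changes back to tasks or commit messages; this sketch only covers the mechanical grouping layer.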

“Dabbling mode” detection. Track whether early-session behavior is navigation-heavy with no edits. If detected, offer to surface the last session’s context summary proactively. Don’t wait for the programmer to ask “where was I?”
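The detection heuristic can be as simple as counting event kinds in the session so far. A sketch (the threshold and event format are illustrative assumptions):

```python
def is_dabbling(events, min_navigations=10):
    """Heuristic: a session start looks like 'dabbling' if it is
    navigation-heavy with no edits yet -- a signal to proactively
    surface the last session's context summary.
    `events` is a chronological list of (kind, location) pairs."""
    kinds = [kind for kind, _ in events]
    return "edit" not in kinds and kinds.count("navigate") >= min_navigations
```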

Limitations

No interruption type data. They can’t distinguish between “was pulled into a meeting” (involuntary) and “decided to take a break” (voluntary). Self-interruptions probably have lower resumption cost because the developer had time to encode their state.

IDE-only observation. Anything happening outside the IDE (browser, terminal, whiteboard, conversation) is invisible. Some of the “dead time” may be productive work that just isn’t captured.

2005 data. The Eclipse and VS data are from 2005. Development practices have changed significantly (Git, modern IDEs, AI assistants, Slack). The UDC data is from 2008. The fundamental cognitive constraints haven’t changed, but the surface patterns might look different today.

No ADHD analysis. The study doesn’t stratify by neurodivergent status. Given that ADHD affects working memory, prospective memory, and set-shifting, the resumption costs for ADHD programmers are almost certainly higher than the population averages reported here.

Upper bound on edit lag. As the authors acknowledge, edit lag includes time that might be spent on a new task, not just resumption. The experimental values are an upper bound on genuine resumption cost.

The Broader Literature This Spawned

Parnin’s work sits at a critical intersection and has been cited extensively (500+ citations across the papers):

  • Kersten & Murphy (Mylyn): Used task context (interaction history) to filter IDE views, reducing navigation. Parnin’s data validates the approach but shows it’s insufficient for full resumption.
  • Ko, DeLine, Venolia (Microsoft Research): The information needs work at MSR, including the 62% of developers who say recovery from interruptions is a serious problem.
  • ContextKeeper (commercial tool): Auto-saves and restores IDE state (open files, breakpoints, window layout). Direct application of Parnin’s “environmental cues” finding.
  • Gloria Mark’s 23-minute finding: Mark measured total time to return to an interrupted task (including intervening tasks). Parnin measured the cost within a resumed session. Together they paint the full picture: 23 minutes to get back to the task + 10-15 minutes to start being productive in it.
  • The “Programmer, Interrupted” blog post became one of the most shared articles in software engineering, appearing in Game Developer Magazine (Gamasutra top story 2013) and cited by Paul Graham, Joel Spolsky, and countless engineering managers arguing for fewer meetings.

Bottom Line

This is the most directly actionable research for designing interruption recovery tools. Not because it tells us something surprising – we all know interruptions are bad – but because it tells us exactly what information is lost and what programmers do to try to recover it.

The five memory types from the 2012 paper give us a design checklist. The CHI 2010 cue evaluation gives us empirical evidence that automated cues work (2x success rate). The session data tells us that the simple approach (reopen last file) fails 92.5% of the time.

The key insight for ADHD tooling: every strategy programmers naturally use (TODO comments, compile errors, sticky notes, source diffs) is a manual externalization of volatile mental state. ADHD brains have more volatile mental state. The tools should do this externalization automatically, and a “Where Was I?” feature should surface it at the moment of maximum need – the start of the next session.