Why Vim :g//d Freezes on Large Files — Root Cause and Fixes

Why Vim :g//d · :v//d Freezes on Large Files


You open a multi-hundred-thousand-line log or generated text file in Vim, run :g/pattern/d to strip out unwanted lines, and watch a high-spec machine grind to a halt. The command looks deceptively simple — one line in the Ex buffer. The freeze, however, is real and reproducible. The actual culprit is tens or hundreds of thousands of hidden side-effect operations silently piggybacking on each individual deletion. The fix is not switching editors; it is changing how the operation is structured. This article walks through the root cause, ranked workarounds, and an honest Vim vs. Neovim comparison.

🧩 TL;DR — The freeze comes from d firing individually for every matched line, touching the clipboard, numbered registers, and undo history on each hit. The quickest fix is the blackhole register; the definitive fix for millions of lines is an external stream filter (grep, rg, or sed). Switching to Neovim does not solve this.

How :global Actually Works

:g/pattern/d reads like a single atomic operation — "delete every line matching this pattern." In reality, Vim's :global command runs as a two-pass algorithm. Understanding this structure is the prerequisite for understanding every performance pathology it can produce.

Pass 1 (Mark) — Vim scans the entire buffer once, setting an internal flag on every line that matches the pattern.

Pass 2 (Execute) — For each flagged line, Vim dispatches the specified Ex command (d) individually, in a sequential loop.

:v (equivalently :g!) simply inverts the match predicate — flagging lines that do not match — but the two-pass mechanism is identical. The key insight is in Pass 2: if 900,000 of 1,000,000 lines match, d is dispatched 900,000 times, one invocation per line, not once for the full selection.

This matters because each d invocation is not a lightweight pointer swap — it carries a full set of side-effect bookkeeping that is perfectly reasonable for interactive editing but catastrophic inside a batch loop at scale.


graph LR
  A[Full buffer scan] --> B[Matching lines<br/>flagged internally]
  B --> C[d dispatched<br/>per flagged line]
  C --> D[900k lines =<br/>900k iterations]
  style A fill:#e8f8f5,stroke:#16a085
  style B fill:#fef9e7,stroke:#f39c12
  style C fill:#fdedec,stroke:#e74c3c
  style D fill:#fdedec,stroke:#c0392b,color:#c0392b

🔗 Diagram summary: :global scans the buffer once to mark matching lines, then iterates through those marks dispatching the delete command one at a time. The iteration count scales linearly with matched-line count — 900k matches means 900k sequential d invocations.
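
Because Pass 2 runs once per flagged line, the number of d dispatches can be estimated before touching the buffer. A minimal sketch, assuming a POSIX shell with grep available; the sample data and the ERROR pattern are invented for illustration, and grep's regex dialect differs from Vim's, so the counts only line up for simple patterns:

```shell
#!/bin/sh
# Hypothetical sample data: 10 lines, 3 of which contain "ERROR".
sample=$(mktemp)
printf '%s\n' 'ok 1' 'ERROR a' 'ok 2' 'ok 3' 'ERROR b' \
              'ok 4' 'ok 5' 'ERROR c' 'ok 6' 'ok 7' > "$sample"

# Lines :g/ERROR/d would flag, i.e. how many times d gets dispatched.
g_iterations=$(grep -c 'ERROR' "$sample")

# Lines :v/ERROR/d would flag (the inverted predicate).
v_iterations=$(grep -vc 'ERROR' "$sample")

echo ":g iterations: $g_iterations"
echo ":v iterations: $v_iterations"
rm -f "$sample"
```

On a real file, replace the sample with the actual path. A count in the hundreds of thousands is a sign that the per-deletion side-effects described next will dominate.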

Root Cause Analysis: Four Layers of Hidden Overhead

The problem is not raw deletion count. The problem is that every d invocation drags along layered side-effects that are individually cheap but compound into unacceptable latency at scale. Here are the four culprits, ordered by impact.

① System Clipboard Synchronization — The Dominant Cost

When set clipboard=unnamed (or unnamedplus) is active, Vim copies the deleted text to the OS clipboard on every single deletion. Each write is an OS API round-trip — not a simple in-process memory copy. Serializing that call hundreds of thousands of times accumulates into wall-clock minutes. The following figures come from a documented case: 150,000 lines deleted from a 200,000-line buffer.

Same operation, one setting changed — elapsed time comparison (lower is faster):

🪟 clipboard=unnamed (Windows): minutes
🍎 clipboard=unnamed (macOS): ≈ 30 sec
⛔ clipboard="" (disabled): under 2 sec

A single config toggle produces a 100× or greater difference in elapsed time. gVim on Windows is particularly susceptible: its clipboard provider relies on Win32 API calls that carry non-trivial per-invocation overhead. Each round-trip blocks the loop until it returns, so that per-call latency is paid in full on every matched line and nothing can be batched or overlapped.

② Numbered Register Rotation ("1 through "9)

Vim maintains nine numbered registers ("1 through "9) that store recent deletion history, enabling the "2p, "3p retrieval workflow. On each d, Vim shifts the chain: the content of "1 moves to "2, and so on down to "9 (whose old content is dropped), and the newly deleted text lands in "1. Each shift is a lightweight in-memory operation, but 900,000 of them, each carrying a line that may run to kilobytes, add up to measurable overhead. This cost is secondary to clipboard sync but non-negligible once you cross tens of thousands of deletions.

③ Undo Tree Accumulation

Despite appearing to be a single reversible operation (one u undoes the whole :global), Vim internally records a separate undo entry per deletion. For millions of lines, the undo tree expands to tens or hundreds of MB, driving heap pressure and memory allocator contention that manifests as lag during the operation — not just at undo time. Each allocation happens inline, inside the :global loop, contributing latency proportional to the deletion count.

④ Syntax Re-evaluation and Swap File Writes

Each deletion shifts buffer line numbers, forcing Vim's regex-based syntax engine to re-evaluate highlighting context from the deletion point forward. On large files, locating the safe re-parse boundary is O(n) in the worst case. Concurrently, Vim flushes state changes to the swap file (.swp) for crash recovery, adding I/O pressure on every iteration.

These four layers stack inside a single-threaded synchronous loop — CPU-bound register work, memory-bound undo allocation, I/O-bound clipboard and swap writes — all serialized on a single core. The result is the paradox of a modern multi-core machine appearing completely frozen by a single editor command.

Workarounds, Ranked by Impact

Understanding the cause makes the fix obvious: either cut off the side-effects attached to each deletion, or move the work outside Vim's loop entirely. The following six approaches are ordered from quickest to most thorough.

Fix 1 · Blackhole Register (Apply Immediately)

The blackhole register "_ is a write-only sink: deleted text is discarded without being stored in any numbered register or synced to the clipboard. Naming it as the target register eliminates two of the four overhead sources at once. This is the recommended default for any :global deletion on files larger than a few thousand lines.

:g/pattern/normal "_dd
:g/pattern/d _    " shorter equivalent

Fix 2 · Temporarily Disable the Clipboard (Best for gVim on Windows)

:set clipboard=
:g/pattern/d
:set clipboard=unnamed  " restore afterward

This targets the single most expensive side-effect directly. On Windows gVim, where Win32 clipboard calls are particularly slow, this alone can reduce a multi-minute stall to seconds. Pair it with Fix 1 for maximum effect.

Fix 3 · Disable Undo Temporarily (Significant Memory Relief)

:set undolevels=-1
:g/pattern/d
:set undolevels=1000  " restore

⚠️ Changes made while undolevels=-1 is in effect cannot be reversed with u. Always back up the file before using this approach.

Fix 4 · External Filter via :%! (Optimal for Millions of Lines)

:%! pipes the entire buffer to an external process and replaces it with the output, bypassing all of Vim's internal per-line deletion machinery. Stream-processing tools like grep, ripgrep, and sed execute the equivalent operation orders of magnitude faster than Vim's internal loop. The tradeoff: Vim treats the result as a single buffer replacement, so the filter can only be undone as a whole, and the pattern must be written in the external tool's regex dialect rather than Vim's.

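" Comments go on their own lines: everything after :%! is handed to the shell.
" delete matching lines (≡ :g/.../d)
:%! grep -v "pattern"
" keep only matching lines (≡ :v/.../d)
:%! grep "pattern"
" ripgrep, fastest on large files (keeps matches; add -v to delete them)
:%! rg "pattern"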
:%! sed '/pattern/d'
:%! awk '!/pattern/'
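
The filters listed above are meant to be interchangeable for simple patterns. A quick sketch for confirming that on sample data before trusting one on a real buffer, assuming grep, sed, and awk are on PATH (rg is left out because it may not be installed); the sample lines and the DEBUG pattern are invented:

```shell
#!/bin/sh
sample=$(mktemp)
printf '%s\n' 'INFO start' 'DEBUG x=1' 'INFO work' 'DEBUG x=2' 'WARN slow' > "$sample"

# Three ways to drop lines containing DEBUG (what :g/DEBUG/d would do).
out_grep=$(grep -v 'DEBUG' "$sample")
out_sed=$(sed '/DEBUG/d' "$sample")
out_awk=$(awk '!/DEBUG/' "$sample")

# The inverse: keep only DEBUG lines (what :v/DEBUG/d would do).
kept=$(grep 'DEBUG' "$sample")

if [ "$out_grep" = "$out_sed" ] && [ "$out_sed" = "$out_awk" ]; then
    echo "filters agree"
fi
rm -f "$sample"
```

Agreement on a small sample does not guarantee agreement on every pattern: each tool has its own regex dialect, so anything beyond a plain substring is worth checking the same way.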

Fix 5 · Preprocess Outside the Editor (Definitive for Massive Files)

For truly large files — gigabytes of logs or generated data — the most robust approach is to filter before opening. A single sequential read to a new output file involves zero undo allocation, zero clipboard calls, and zero register bookkeeping. ripgrep has been benchmarked at approximately 6× faster than GNU grep on a 13.5 GB corpus.

grep -v "pattern" big.txt > filtered.txt
rg -v "pattern" big.txt > filtered.txt  # fastest option
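
When the filtered file is meant to replace the original, a failed or interrupted run should not leave a half-written result behind. A minimal sketch of one way to guard against that, assuming a POSIX shell; filter_out is a hypothetical helper name, and the demo data is invented:

```shell
#!/bin/sh
# filter_out PATTERN INPUT OUTPUT (hypothetical helper, not part of any tool)
# Writes every line of INPUT that does not match PATTERN to OUTPUT,
# going through a temp file so a failed run never leaves a partial OUTPUT.
filter_out() {
    pattern=$1; input=$2; output=$3
    tmp=$(mktemp) || return 1
    status=0
    grep -v -e "$pattern" -- "$input" > "$tmp" || status=$?
    # grep exits 1 when it selects no lines; only 2 or higher is a real error.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi
    mv "$tmp" "$output"
}

# Demo on throwaway data.
demo_in=$(mktemp)
demo_out=$(mktemp)
printf '%s\n' 'keep 1' 'drop me' 'keep 2' > "$demo_in"
filter_out 'drop' "$demo_in" "$demo_out"
result=$(cat "$demo_out")
echo "$result"
rm -f "$demo_in" "$demo_out"
```

The exit-status check matters because grep treats "no lines selected" as status 1, which is a legitimate outcome when every line matches the pattern.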

Fix 6 · Automatic Large-File Optimization in .vimrc

The LargeFile plugin (vim.org #1506) or a custom BufReadPre autocmd can automatically set undolevels=-1, noswapfile, and syntax off whenever a file exceeds a configured size threshold. This is a one-time setup that eliminates the need to remember manual toggles every time a large file is opened.
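
The same size-threshold idea can also be applied from outside the editor. The sketch below is a shell-side alternative, not the LargeFile plugin or an autocmd: a hypothetical helper prints extra Vim arguments when a file exceeds a byte limit. It assumes Vim's -n (no swap file) and -c (run an Ex command after loading) options; the threshold values are arbitrary demo numbers.

```shell
#!/bin/sh
# large_file_flags FILE LIMIT_BYTES (hypothetical helper)
# Prints the extra Vim arguments to use when FILE is larger than LIMIT_BYTES,
# and prints nothing otherwise.
large_file_flags() {
    file=$1; limit=$2
    size=$(( $(wc -c < "$file") ))   # arithmetic expansion strips padding
    if [ "$size" -gt "$limit" ]; then
        # -n: no swap file; the -c commands run after the file is loaded.
        printf '%s' "-n -c 'set undolevels=-1' -c 'syntax off'"
    fi
}

# Demo: a 5-byte file against a 3-byte and a 100-byte threshold.
demo=$(mktemp)
printf 'abcde' > "$demo"
flags_big=$(large_file_flags "$demo" 3)
flags_small=$(large_file_flags "$demo" 100)
echo "over threshold:  vim $flags_big $demo"
echo "under threshold: vim $flags_small $demo"
rm -f "$demo"
```

The helper only prints a command line, so it can be inspected before anything is opened. Because -c runs after the file is loaded, the initial highlighting pass still happens once; the autocmd route avoids that.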

Which fix to reach for depends almost entirely on line count:


flowchart TD
  A([Large-file :g//d stalls]) --> B{Line count?}
  B -->|tens to hundreds of thousands| C[Blackhole register<br/>:g/pattern/d _]
  B -->|millions+| D[External filter<br/>:%! grep -v]
  C --> E([Fast completion])
  D --> E
  style A fill:#3498db,stroke:#2980b9,color:#ffffff
  style B fill:#fef9e7,stroke:#f39c12
  style C fill:#eafaf1,stroke:#27ae60,color:#1e8449
  style D fill:#eafaf1,stroke:#27ae60,color:#1e8449
  style E fill:#3498db,stroke:#2980b9,color:#ffffff

🔁 Diagram summary: For tens to hundreds of thousands of lines, the blackhole register (:g/pattern/d _) is sufficient. For millions of lines, pipe through an external filter (:%! grep -v). Both paths cut the repeated side-effects and complete quickly.
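
The decision in the flowchart can be written down as a small helper. This is a sketch under one stated assumption: the one-million cutoff is a reading of "millions+", not a measured threshold, and suggest_fix is a hypothetical name.

```shell
#!/bin/sh
# suggest_fix LINE_COUNT (hypothetical helper)
# The 1000000 cutoff is an assumed reading of "millions+" in the flowchart.
suggest_fix() {
    if [ "$1" -ge 1000000 ]; then
        echo 'external filter: :%! grep -v "pattern"'
    else
        echo 'blackhole register: :g/pattern/d _'
    fi
}

suggest_fix 250000
suggest_fix 3000000
```

For a real file, the line count can come from wc, for example suggest_fix "$(( $(wc -l < big.txt) ))".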

Does Switching to Neovim Solve This?

A common intuition is that Neovim's architectural improvements would eliminate this class of problem. The answer is nuanced, and for :global specifically, it is effectively no.

| Aspect | Vim | Neovim |
| --- | --- | --- |
| :global implementation | Two-pass mark-and-execute | Identical (inherited codebase) |
| Clipboard handling | Built-in, direct OS calls | Delegated to external provider |
| Syntax highlighting | Regex, full re-evaluation | Tree-sitter, incremental parsing |
| Startup time | ~28 ms | ~12 ms |
| Plugin execution | Single-threaded | Async / Lua coroutines |
| :global on large files | Slow | Equally slow |

The critical row is the last one. Neovim inherited the :global implementation from Vim — the two-pass loop, the per-deletion side-effects, and all associated overhead are identical. Tree-sitter's incremental parsing is a display-layer optimization; it has no bearing on the :global execution loop. In fact, Neovim 0.3.0 was reported as 40–62% slower than Vim 8 on certain workloads (issue #8657), illustrating that migration can introduce regressions on specific operations rather than universally improving them.

💡 The performance bottleneck in :g//d is not a Vim-specific bug — it is a fundamental consequence of the two-pass algorithm combined with interactive-editor bookkeeping. Neovim carries the same algorithm. The fix lives in the approach, not the editor choice.

Quick Reference: Which Fix to Use When

| Situation | Recommended Fix |
| --- | --- |
| Need a fast fix right now | :g/pattern/d _ (blackhole register) |
| gVim on Windows, nearly frozen | :set clipboard= before operating |
| Millions of lines, speed critical | :%! grep -v "pattern" |
| File needs permanent cleanup | Preprocess with sed/rg in the shell |
| Regularly working with large files | Add LargeFile auto-config to .vimrc |

🧠 One-line takeaway — The freeze is not the command itself; it is clipboard sync, register rotation, and undo allocation repeating hundreds of thousands of times. The most reliable fix is an external stream filter (grep/rg/sed) that bypasses all of it in a single pass. Switching to Neovim does not change this equation.

References

Vim developer mailing list — clipboard synchronization benchmark data

LearnVim: The Global Command — two-pass structure explained

Vim Tips Wiki: LargeFile — automatic large-file optimization

ripgrep vs grep benchmark — 6× speed difference on 13.5 GB corpus

Neovim vs Vim 2026 — performance comparison

This article is based on publicly available Vim/Neovim documentation, developer mailing list discussions, and benchmark data. Actual performance may vary depending on your OS, editor version, and plugin configuration. Always back up your files before running bulk delete operations.

Software Development Notes

Collecting and curating software development resources, with a final review before every post.

Written based on publicly available data and sources. Last updated: June 8, 2026
