Systems Atlas

Chapter 4.7: Data Foundation

Deletes, Partial Updates & Reindexing

In search, "mutability" is a leaky abstraction: why deleting data can actually increase your disk usage, and why updates burn CPU.


The Lie: "Delete"

When you send a DELETE request, the search engine lies to you. It says "200 OK", but nothing was removed. Because search segments are highly compressed and optimized for read speed, they are Immutable (Write-Once). Modifying a file on disk to remove data is impossible without corrupting the index.

Instead, Lucene maintains a parallel file called the Bitset (Live Docs). A delete is simply flipping a bit from 1 to 0. The document is still on disk, still in memory, and still being processed by your search queries; it is merely "tombstoned" and filtered out at the very last step.

Lifecycle of a Deleted Doc

  • T+0s: Doc 123 (Active) → Alive
  • T+1s: Doc 123 (Deleted) → Tombstone ⚰️ (⚠️ Still searchable! Still taking disk space!)
  • T+4h: Segment Merge → Physically Removed

The Performance Tax

  • Latency: Your query matches 1,000,000 documents. The engine calculates scores for ALL of them. Only at the very end does it check the Bitset to hide the 500,000 deleted ones. You pay CPU for ghosts.
  • Heap: The Bitset must be loaded into JVM Heap for fast access. Heavy deletes = heavy Heap pressure.
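Both costs come from the same mechanism, which can be sketched as a toy model in Python (class and field names are illustrative, not Lucene's actual data structures):

```python
# Toy model of an immutable segment with a live-docs bitset.
# Names are illustrative, not Lucene internals.

class Segment:
    def __init__(self, docs):
        self.docs = list(docs)          # write-once storage: never modified
        self.live = [True] * len(docs)  # the "Live Docs" bitset
        self.scored = 0

    def delete(self, doc_id):
        self.live[doc_id] = False       # flip one bit; bytes on disk untouched

    def search(self, predicate):
        self.scored = 0
        hits = []
        for doc_id, doc in enumerate(self.docs):
            if predicate(doc):          # every match is scored, live or not...
                self.scored += 1
                if self.live[doc_id]:   # ...tombstones are hidden only here
                    hits.append(doc)
        return hits

seg = Segment([{"id": i} for i in range(4)])
seg.delete(2)
hits = seg.search(lambda d: True)       # scores 4 docs, returns 3
```

Even after the delete, `search` still touches every document; only the final bitset check shrinks the result.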

The Solution: Force Merge?

You can manually trigger `_forcemerge` to clean up, but be careful: it works like a "Garbage Collection" for disk and is extremely I/O intensive.

POST /index/_forcemerge?max_num_segments=1
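What the merge actually does can be sketched in a few lines (a toy model; real merges also rewrite compressed blocks and rebuild index structures, which is where the I/O cost comes from):

```python
# Toy segment merge: copy only live docs into a fresh segment,
# physically dropping tombstones. Everything is rewritten.

def force_merge(segments):
    """segments: list of (docs, live_bits) pairs -> one merged segment."""
    merged = [doc
              for docs, live in segments
              for doc, alive in zip(docs, live)
              if alive]
    return merged, [True] * len(merged)

seg_a = (["doc0", "doc1"], [True, False])   # doc1 tombstoned
seg_b = (["doc2", "doc3"], [False, True])   # doc2 tombstoned
docs, live = force_merge([seg_a, seg_b])    # tombstones gone only now
```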

Partial Updates

"I just want to update the view count. Why is my CPU hitting 100%?"

Because specific fields cannot be modified in place. Lucene stores documents in Compressed Blocks (LZ4/Deflate). You cannot just "seek and overwrite" a few bytes. To change even a single counter, the engine must decompress the whole block, reconstruct the JSON, apply the change, and re-index the result as a new document. This turns a tiny update into a heavy Read-Modify-Write cycle.

The "Update" Pipeline
1. GET: Retrieve the `_source` JSON from disk
2. MERGE: Parse the JSON + apply the diff in memory
3. DELETE: Soft-delete the old doc ID
4. INDEX: Write the NEW doc to the in-memory buffer
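The four steps can be condensed into one toy function (names are illustrative; a real engine also buffers the new doc and version-checks against concurrent writes):

```python
# Toy read-modify-write "partial" update: even a one-field diff
# forces a full fetch, parse, merge, soft-delete, and re-index.

def partial_update(store, live, doc_id, diff):
    source = dict(store[doc_id])   # 1. GET: retrieve the full _source
    source.update(diff)            # 2. MERGE: apply the diff in memory
    live[doc_id] = False           # 3. DELETE: soft-delete the old doc
    store.append(source)           # 4. INDEX: write a brand-new doc
    live.append(True)
    return len(store) - 1          # new doc ID

store = [{"title": "Post", "views": 41}]
live = [True]
new_id = partial_update(store, live, 0, {"views": 42})
```

Note that the "update" produced a second physical document; the original survives as a tombstone until the next merge.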

Constraint: If you disabled `_source` to save disk space, you CANNOT use the Update API. You must provide the full document from your application side every time.

Reindexing at Scale

Changing a data type (e.g., `string` → `date`) requires a full reindex because the Inverted Index is built once. You cannot "ALTER TABLE" on an inverted index. You must rebuild it from scratch. You have two choices: The way that causes downtime, or the way that doesn't.

The Rookie Way (In-Place)
1. Delete Index
   DELETE /products
   🚨 DOWNTIME STARTS (Search returns 404)

2. Create New Index
   PUT /products { "mappings": ... }
   Index exists but is empty.

3. Push Data
   Script running for 4 hours... users see 0 results.

The Pro Way (Aliasing)
Alias: products → products_v1
✅ Live traffic hits v1 (Old Data)

1. Background Build
   Reindexing into products_v2...
   Zero impact on users.

2. Atomic Switch
   POST /_aliases (remove v1, add v2 instantly)
   ✅ Users instantly see new data. No 404s.
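The atomic switch is a single `_aliases` call whose body removes the old binding and adds the new one in one transaction. A sketch of the request body as a Python dict (index names follow the example above):

```python
# Body for POST /_aliases: both actions apply atomically, so the alias
# never points at zero indices and clients never see a 404.
swap = {
    "actions": [
        {"remove": {"index": "products_v1", "alias": "products"}},
        {"add":    {"index": "products_v2", "alias": "products"}},
    ]
}
```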

Production Tuning Guide

wait_for_completion=false

Never hold a connection open for long jobs. Fire asynchronously and poll the Task API (`GET /_tasks/task_id`) to check progress.

slices=auto

Parallelizes the reindex by splitting the work into sub-slices (usually equal to the shard count). This speeds up large jobs significantly.

requests_per_second=500

Essential. Throttles the write rate to ensure the reindex job doesn't consume all I/O and CPU, starving live search traffic.
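Putting the three knobs together, a production reindex call might look like this (index names are illustrative; the parameters are the standard `_reindex` query options described above):

```python
# Assemble an async, sliced, throttled _reindex request.
params = {
    "wait_for_completion": "false",  # return a task ID immediately
    "slices": "auto",                # parallelize per shard (roughly)
    "requests_per_second": "500",    # throttle to protect live search traffic
}
body = {
    "source": {"index": "products_v1"},
    "dest":   {"index": "products_v2"},
}
query = "&".join(f"{k}={v}" for k, v in params.items())
url = f"/_reindex?{query}"   # POST the body as JSON, then poll the Task API
```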

The Hidden Trap: Nested Objects

Why "Nested" is mostly a trap

Developers love `type: nested` because it preserves object relationships (e.g. `comments.author` linked to `comments.text`). But Lucene doesn't actually support "nested" objects. It pulls a sleight of hand.

Logical View (What you see)
{
  "id": 1,
  "title": "Blog Post",
  "comments": [
    { "user": "Alice", "text": "Nice!" },
    { "user": "Bob",   "text": "Cool!" }
  ]
}
Physical View (On Disk)
Doc 1: { user: Alice, text: Nice, _root: 3 }
Doc 2: { user: Bob, text: Cool, _root: 3 }
Doc 3: { id: 1, title: Blog Post }

* Hidden "Shadow Documents" are created for every single list item.
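The flattening trick can be sketched as a toy function (in real Lucene block joins, the parent is likewise written last in the block so children can be joined back to it):

```python
# Toy block-join flattening: one hidden "shadow doc" per list item,
# with the parent written last so the block can be reassembled.

def to_block(doc, nested_field):
    children = [dict(child) for child in doc[nested_field]]
    parent = {k: v for k, v in doc.items() if k != nested_field}
    return children + [parent]   # parent closes the block

post = {
    "id": 1,
    "title": "Blog Post",
    "comments": [
        {"user": "Alice", "text": "Nice!"},
        {"user": "Bob",   "text": "Cool!"},
    ],
}
block = to_block(post, "comments")   # 3 physical docs for 1 logical doc
```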

The Amplification Factor

To update 1 comment in a post with 50,000 comments:
Cost = Reindex Parent + Reindex ALL 50,000 Children

Result: Massive CPU burn and eventual cluster instability.
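The arithmetic behind that cost is blunt (a toy calculation):

```python
# Write amplification for one edited comment: the whole block is rewritten.
comments = 50_000
docs_rewritten = comments + 1   # every shadow child + the parent
# 50,001 physical writes for a single logical change
```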

Key Takeaways

01. Deletes are Forever (Almost)
Deleted docs persist as "tombstones" until a Segment Merge event. They consume heap (bitsets) and slow down search.

02. Partial Updates = Full Rewrites
Updating 1 byte requires retrieving the full JSON, parsing it, modifying it, and indexing a whole new document.

03. The `_source` Tax
You cannot do partial updates if you disable the `_source` field to save disk. You must have the original JSON.

04. Reindex with Care
Use `slices` for parallel speed, but throttle `requests_per_second` to avoid taking down your primary cluster.