From 8d8f8751b4ab856515099eab230d67565bcb5965 Mon Sep 17 00:00:00 2001
From: Feror <contact@feror.fr>
Date: Sat, 30 May 2026 19:29:58 +0200
Subject: [PATCH] docs(perf): document incremental line index impact

---
 Docs/editor-investigation.md   | 72 +++++++++++++++++++++++++
 Docs/performance-report-5mb.md | 96 ++++++++++++++++++++++++++++++++++
 2 files changed, 168 insertions(+)

diff --git a/Docs/editor-investigation.md b/Docs/editor-investigation.md
index 35c0420..89416c6 100644
--- a/Docs/editor-investigation.md
+++ b/Docs/editor-investigation.md
@@ -601,6 +601,78 @@ Lesson:
 
 Future editor systems must not infer logical lines by searching Swift `String` values for newline characters. Editor line segmentation must go through `DocumentLineIndex`, which uses explicit UTF-16 line-ending handling compatible with TextKit selection ranges and `NSRange`.
 
+## Finding #11 — Incremental Line Index
+
+Milestone 2.9 implemented the highest-impact optimization identified in Milestone 2.8: a persistent line index used by active-line lookup, selection mapping, dirty invalidation, and dirty styling.
+
+Architecture:
+
+```mermaid
+flowchart TD
+    TextKit["NSTextView / UITextView edited range"] --> Edit["DocumentLineIndexEdit"]
+    Edit --> Index["Persistent DocumentLineIndex"]
+    Index --> Active["Active-line lookup"]
+    Index --> Dirty["Dirty-line invalidation"]
+    Index --> Styling["Dirty-line styling"]
+    Index --> State["EditorState line count and active column"]
+```
+
+`DocumentLineIndex` now provides:
+
+- line count
+- line boundary lookup
+- offset-to-line mapping
+- line-to-offset mapping
+- affected-line mapping with neighboring-line expansion
+- incremental local replacement using `DocumentLineIndexEdit`
+
+Update strategy:
+
+1. Capture the native edited range and replacement string from the platform text view delegate.
+2. Expand the scan window to the edited line and its adjacent lines, so that any line ending touching the edit boundary (for example, a CRLF pair split by the edit) is rescanned whole.
+3. Rescan only that local window using UTF-16 line-ending rules.
+4. Replace the affected boundary slice in the persistent index.
+5. Use the updated index for active-line lookup, dirty-line invalidation, and dirty styling.
+
+The editor still stores the full source string because TextKit and persistence require it. The optimized path now accepts TextKit's already-edited source instead of reconstructing it from the old source inside the index.
+
+Complexity impact:
+
+| Operation | Before | After |
+| --- | --- | --- |
+| Active-line lookup | O(document) scan | O(log line count) binary search |
+| Selection update | O(document) line rebuild | O(log line count) active-line lookup |
+| Dirty invalidation for typing | O(document) prefix/suffix diff + line rebuild | O(log line count + affected lines) |
+| Dirty styling setup | O(document) line-list rebuild | O(dirty lines) |
+| Source edit index update | O(document) full line rebuild | local rescan plus boundary offset maintenance |
+
+5 MB benchmark impact:
+
+| Measurement | Before Milestone 2.9 | After Milestone 2.9 | Improvement |
+| --- | ---: | ---: | ---: |
+| `active_line_lookup` | 191.751 ms | 0.001 ms | 99.999% |
+| `selection_update` | 398.262 ms | 0.002 ms | 99.999% |
+| `dirty_line_invalidation_click` | 224.418 ms | 0.001 ms | 99.999% |
+| `typing_state_update` | 416.540 ms | 0.102 ms | 99.976% |
+| `dirty_line_invalidation_typing` | 1,019.380 ms | 0.003 ms | 99.999% |
+| `render_update_typing_dirty` | 53.512 ms | 0.796 ms | 98.512% |
+
+Tradeoffs:
+
+- The line index is now stateful and must remain synchronized with native text edits.
+- Programmatic full-source replacement still falls back to a full index rebuild.
+- Correctness depends on preserving the Milestone 2.7 UTF-16 newline semantics for LF, CRLF, CR, mixed endings, and trailing blank lines.
+
+Validation:
+
+- Incremental line-index tests compare local edits against full rebuilds.
+- Tests cover insertion, deletion, replacement, CRLF boundary edits, mixed line endings, and a 10,000-line edit.
+- Cursor, dirty invalidation, scroll stability, and large-document tests continue to pass.
+
+Conclusion:
+
+The main Milestone 2.8 Sapling-side interaction bottlenecks have been removed from the measured 5 MB typing and selection paths. After this change, the dominant measured costs are TextKit full and cold layout plus full-document initial styling and render planning, not active-line tracking or dirty invalidation.
+
 ## AttributedString and NSAttributedString
 
 Swift `AttributedString` is useful for renderer-facing APIs and SwiftUI previews.
diff --git a/Docs/performance-report-5mb.md b/Docs/performance-report-5mb.md
index 912f383..d0947b4 100644
--- a/Docs/performance-report-5mb.md
+++ b/Docs/performance-report-5mb.md
@@ -542,6 +542,102 @@ The core scalability question has this answer:
 - Cached viewport glyph lookup scales with the viewport after layout exists.
 - Cold/deep TextKit layout queries scale with layout work required to reach the queried position.
 
+## Incremental Line Index Results
+
+Milestone 2.9 implemented a persistent incremental `DocumentLineIndex` and migrated active-line lookup, selection updates, dirty invalidation, and dirty styling setup onto it. The native text adapters now capture edited ranges and replacement text, then pass `DocumentLineIndexEdit` into editor state and dirty invalidation.
+
+This was an optimization milestone only. It did not replace `NSTextView`, redesign the editor, or add Markdown rendering features.
+
+### Architecture
+
+```mermaid
+flowchart TD
+    Edit["Native edited range + replacement"] --> Index["Persistent DocumentLineIndex"]
+    Index --> Active["Active-line lookup"]
+    Index --> Selection["Selection mapping"]
+    Index --> Dirty["Dirty-line invalidation"]
+    Dirty --> Styling["Dirty-line styling"]
+    Styling --> TextStorage["NSTextStorage attributes"]
+```
+
+The incremental update strategy is:
+
+1. Capture the native edit range and replacement string.
+2. Expand the affected scan window to adjacent line boundaries so CRLF, LF, CR, and mixed endings remain correct.
+3. Rescan only the local affected window.
+4. Replace that boundary slice in the cached index.
+5. Use the cached index for active-line lookup, dirty-line lookup, and dirty styling.
+
+Programmatic full-source replacement still rebuilds the index. User text edits use the incremental path.
+
+### Benchmark Run
+
+Release benchmark command:
+
+```sh
+swift run -c release SaplingEditorBenchmark
+```
+
+Run date: May 30, 2026.
+
+| Scenario | Lines | Measured total |
+| --- | ---: | ---: |
+| Sample document | 54 | 9.451 ms |
+| 2,100-line prototype | 2,101 | 290.723 ms |
+| 5 MB benchmark | 51,482 | 3,817.507 ms |
+
+Tracked interaction metrics after Milestone 2.9:
+
+| Scenario | Active-line lookup | Selection update | Dirty click invalidation | Typing state update | Dirty typing invalidation | Dirty typing render |
+| --- | ---: | ---: | ---: | ---: | ---: | ---: |
+| Sample document | 0.000 ms | 0.000 ms | 0.000 ms | 0.002 ms | 0.004 ms | 0.119 ms |
+| 2,100-line prototype | 0.000 ms | 0.001 ms | 0.001 ms | 0.012 ms | 0.003 ms | 0.211 ms |
+| 5 MB benchmark | 0.001 ms | 0.002 ms | 0.001 ms | 0.102 ms | 0.003 ms | 0.796 ms |
+
+### 5 MB Before / After
+
+Before values are the Milestone 2.8 corrected-CRLF baseline. After values are the Milestone 2.9 release benchmark.
+
+| Measurement | Before | After | Improvement |
+| --- | ---: | ---: | ---: |
+| `active_line_lookup` | 191.751 ms | 0.001 ms | 99.999% |
+| `selection_update` | 398.262 ms | 0.002 ms | 99.999% |
+| `dirty_line_invalidation_click` | 224.418 ms | 0.001 ms | 99.999% |
+| `typing_state_update` | 416.540 ms | 0.102 ms | 99.976% |
+| `dirty_line_invalidation_typing` | 1,019.380 ms | 0.003 ms | 99.999% |
+| `render_update_typing_dirty` | 53.512 ms | 0.796 ms | 98.512% |
+| Measured total | 6,537.243 ms | 3,817.507 ms | 41.604% |
+
+The measured total includes synthetic full-layout and cold-layout probes. The interaction-specific improvement is larger than the total improvement because full-document open/layout/render probes still dominate the benchmark.
+
+### Complexity Review
+
+| Subsystem | Before | After | Evidence |
+| --- | --- | --- | --- |
+| Active-line lookup | O(document) scan | O(log line count) binary search | 191.751 ms -> 0.001 ms |
+| Selection mapping | O(document) line rebuild | O(log line count) lookup | 398.262 ms -> 0.002 ms |
+| Dirty invalidation click | O(document) current-line rebuild | O(1) when source and active line are unchanged | 224.418 ms -> 0.001 ms |
+| Dirty invalidation typing | O(document) prefix/suffix diff + line rebuild | O(log line count + affected lines) | 1,019.380 ms -> 0.003 ms |
+| Dirty styling setup | O(document) line-list rebuild before styling dirty lines | O(dirty lines) | 53.512 ms -> 0.796 ms |
+| Programmatic full source replacement | O(document) | O(document) fallback | unchanged by design |
+| Initial render planning | O(line count) | O(line count) | still 578.083 ms |
+| Initial attributed styling | O(line count) | O(line count) | still 689.766 ms |
+| TextKit full layout | O(layout fragments) | O(layout fragments) | still 1,201.918 ms |
+
+### Remaining Bottlenecks
+
+After Milestone 2.9, the top 5 measured operations for the 5 MB benchmark are:
+
+| Rank | Operation | Time |
+| ---: | --- | ---: |
+| 1 | `layout_generation_full_document` | 1,201.918 ms |
+| 2 | `cold_line_fragment_calculation_midpoint` | 745.530 ms |
+| 3 | `attributed_string_generation_initial` | 689.766 ms |
+| 4 | `render_plan_generation_all_lines` | 578.083 ms |
+| 5 | `line_model_generation` | 190.920 ms |
+
+Validation answer: TextKit is now the dominant measured bottleneck for full-document layout and cold layout-to-position work. It is not the only remaining bottleneck: initial attributed styling and full render-plan generation are still substantial Sapling-side document-wide costs during open/full render paths. For core typing, cursor movement, selection updates, and dirty invalidation, the previous Sapling-side document-wide bottlenecks are no longer dominant in the benchmark.
+
 ## Recommendations For Future Work
 
 These are recommendations from evidence, not implemented changes: