Engineering Journal
Ginexys


2026-05-22

The Document That Wouldn't Stay Still

_Build in public — PDF Processor, Part 3_


The geometry extraction worked. Tables were coming out as real <table> elements. Headings were <h3> and <h4>. Columns were CSS Grid. I had a pipeline that turned PDFs into structured HTML you could actually read.

And then someone asked the question I'd been half-ignoring:

"Can I edit it after it's extracted?"

Of course you can. The Doc tab is contenteditable. Click in, type, format with the toolbar. Done.

"No, I mean can I move sections around. Like, the table extracted into the wrong column. Can I drag it to the right one?"

That question sent me somewhere I hadn't planned to go.


The first trap: thinking it needed a canvas

My first instinct was a canvas overlay. Draw bounding boxes over the rendered HTML, handle drag events on the canvas, update the DOM positions. This is how most "visual editor" tools work — Fabric.js, Konva, that kind of thing.

I got two paragraphs into the implementation before I stopped. The moment an element gets position: absolute, it falls out of the document flow. It doesn't affect other elements. It doesn't respond to the container. It doesn't export to Markdown or XML as a paragraph; it exports as nothing, or as a positioned artifact.

We're not building a poster editor. We're building a document editor. Absolute positioning is the wrong model.


The realization: the browser already did the work

Here's what I'd been overlooking. The extracted HTML is structured. It has zones (div.pdf-zone) with CSS Grid. It has regions (div.pdf-region) that are flow children. It has semantic elements inside those regions.

The browser already knows the layout. getBoundingClientRect() gives me every element's position without any math on my side. CSS Grid reflows automatically when I move a node with insertBefore. The DOM hierarchy is the coordinate system.

HTML5 drag-and-drop + DOM insertion is the entire implementation. No canvas. No coordinate math. No absolute positioning.
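
Here is a minimal sketch of that idea, not the actual implementation. The helper names dropSide and moveNode are hypothetical, and splitting the target at its vertical midpoint is my assumption about how "before" versus "after" could be decided. The bounding rect answers "where is the pointer relative to the target", and insertBefore does the move.

```javascript
// Hypothetical helpers; names and the midpoint rule are illustrative assumptions.

// Decide whether the pointer is in the top or bottom half of the target.
// `rect` is whatever target.getBoundingClientRect() returned.
function dropSide(rect, clientY) {
  const midpoint = rect.top + rect.height / 2;
  return clientY < midpoint ? 'before' : 'after';
}

// Move `dragged` next to `target` inside target's parent.
// insertBefore moves an existing node rather than copying it,
// so the browser reflows the grid on its own.
function moveNode(dragged, target, side) {
  // Refuse to drop an element onto itself or into its own subtree.
  if (dragged === target || dragged.contains(target)) return false;
  const reference = side === 'before' ? target : target.nextSibling;
  // A null reference makes insertBefore append at the end of the parent.
  target.parentNode.insertBefore(dragged, reference);
  return true;
}
```

In a drop handler this would look roughly like moveNode(draggedEl, targetEl, dropSide(targetEl.getBoundingClientRect(), event.clientY)).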


Building the mode toggle

The first thing I built was the mode switch. Two modes, same DOM surface:

Edit Mode: contentEditable = 'true', no drag handles. You type, format, use the toolbar. This is what was already there.

Selection Mode: contentEditable = 'false', drag handles injected on every zone and region. You grab elements and move them.

One button. One class on #html-preview. CSS does the rest.

The mistake I made on the first pass: I forgot to remove draggable = true when switching back to Edit Mode. You'd click into a paragraph to type and accidentally start dragging it. Fixed by setting el.draggable = false on every element when the mode is deactivated.
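
A rough sketch of the toggle, with the draggable reset included. The function name setMode and the class name selection-mode are assumptions for illustration, and this version leaves the drag handles to CSS rather than injecting them, which is a simplification.

```javascript
// Hypothetical sketch; `setMode` and 'selection-mode' are illustrative names.
const DRAGGABLE_SELECTOR = '.pdf-zone, .pdf-region';

// `preview` is the #html-preview element; `mode` is 'edit' or 'select'.
function setMode(preview, mode) {
  const selecting = mode === 'select';
  preview.contentEditable = selecting ? 'false' : 'true';
  // One class on the container; CSS shows or hides the drag affordances.
  preview.classList.toggle('selection-mode', selecting);
  // Set draggable in BOTH directions so Edit Mode never inherits it.
  for (const el of preview.querySelectorAll(DRAGGABLE_SELECTOR)) {
    el.draggable = selecting;
  }
}
```

Writing draggable from the same boolean as contentEditable means the two can't drift apart, which is exactly the bug described above.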


The 10,000 drop-indicator problem

The drop indicator — the little 2px line that shows where a dragged element will land — was trickier than it looked.

My first version created a new indicator div on every dragover event. dragover fires constantly. After two seconds of dragging, there were hundreds of indicator divs stacked in the DOM. The page lagged. Everything turned bright blue.

The fix: one indicator at a time. _removeIndicator() removes whatever currently exists before creating a new one. And dragend always removes it, even if drop wasn't called.
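
The single-indicator rule can be sketched like this. It is an illustration under assumptions, not the real _removeIndicator(): the factory name createDropIndicator and the drop-indicator class are made up, and the document object is passed in as doc so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch; `doc` is normally the global `document`.
function createDropIndicator(doc) {
  let indicator = null; // at most one indicator exists at any time

  function remove() {
    if (indicator) {
      indicator.remove();
      indicator = null;
    }
  }

  function show(target, side) {
    remove(); // dragover fires repeatedly; never let indicators pile up
    indicator = doc.createElement('div');
    indicator.className = 'drop-indicator';
    const reference = side === 'before' ? target : target.nextSibling;
    target.parentNode.insertBefore(indicator, reference);
  }

  return { show, remove };
}
```

The intended wiring is show() from dragover, and remove() from both drop and dragend, so a cancelled drag still cleans up.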


Marquee select: the part I almost skipped

I almost skipped marquee select. It seemed complicated and I wasn't sure it was necessary.

Then I tried to group three regions from different columns into a single zone without it. Click, Ctrl+click, Ctrl+click, find the Group button. Repeat. For a 20-page document with a bad two-column extraction, that's dozens of clicks.

Marquee select took an afternoon. The trick is that the marquee div lives inside #html-preview, which needs position: relative so the absolute-positioned marquee is scoped to the preview. On mouseup, intersect the marquee rect (from getBoundingClientRect()) against every .pdf-region. If they overlap — add to selection.
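
The overlap test itself is small. A sketch, with hypothetical function names, of the rect intersection described above:

```javascript
// Two rects overlap unless one lies entirely to one side of the other.
// Both arguments are shaped like the result of getBoundingClientRect().
function rectsOverlap(a, b) {
  return a.left < b.right && a.right > b.left &&
         a.top < b.bottom && a.bottom > b.top;
}

// `regions` is any iterable of elements,
// e.g. preview.querySelectorAll('.pdf-region').
function regionsInMarquee(marqueeRect, regions) {
  return Array.from(regions).filter((el) =>
    rectsOverlap(marqueeRect, el.getBoundingClientRect()));
}
```

Because both rects come from getBoundingClientRect(), they share viewport coordinates, so no scroll offsets need to be added.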

The Group button appears when two or more regions are selected. It wraps them in a new div.pdf-zone.pdf-zone--cols-1 and inserts it before the first selected region's parent zone. The zones and columns reflow around it.
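
A sketch of the grouping step, under a few assumptions: the function name groupRegions is hypothetical, the regions array is assumed to be in document order already, and nothing here cleans up a source zone that ends up empty.

```javascript
// Hypothetical sketch; `doc` is normally the global `document`.
function groupRegions(doc, regions) {
  if (regions.length < 2) return null; // grouping needs two or more regions
  const anchorZone = regions[0].closest('.pdf-zone');
  if (!anchorZone) return null; // first region is not inside a zone

  const zone = doc.createElement('div');
  zone.className = 'pdf-zone pdf-zone--cols-1';
  // Insert the new zone before the first selected region's parent zone.
  anchorZone.parentNode.insertBefore(zone, anchorZone);
  // appendChild moves each region out of its old zone into the new one.
  for (const region of regions) zone.appendChild(region);
  return zone;
}
```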


Edit Code: the escape hatch

Selection Mode handles layout. But what about content that extracted wrong? A heading classified as a paragraph. A table cell with garbled Unicode. A list item that kept its bullet character as literal text.

The solution was an "Edit Code" option in the right-click context menu. Click it, and a Monaco editor opens with the outerHTML of whatever element you right-clicked. Edit it. Click Apply. The element is replaced in the live DOM. The change syncs to the source editor and the visual diff tab automatically.
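
The Apply step could look something like the sketch below. This is my illustration rather than the shipped code: applyEditedHtml is a hypothetical name, parsing through a template element is one possible approach, and the sync to the source editor and diff tab is left out entirely.

```javascript
// Hypothetical sketch; `doc` is normally the global `document`.
function applyEditedHtml(doc, el, html) {
  // A <template> parses markup without running it or attaching it to the page.
  const tpl = doc.createElement('template');
  tpl.innerHTML = html.trim();
  // Require exactly one root element so the swap stays one-for-one.
  // Stray text outside that root element is ignored.
  if (tpl.content.childElementCount !== 1) return null;
  const replacement = tpl.content.firstElementChild;
  el.replaceWith(replacement);
  return replacement;
}
```

Returning null instead of replacing lets the dialog stay open when the edited markup is empty or has several root elements.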

This took me longer to get working than I expected. Two bugs:

Bug 1: window.monaco was undefined. I'd written window.monaco.editor.create(...) assuming Monaco was a global. But Vite bundles Monaco as an ES module. It's not on window. The fix was a one-line change: import * as monaco from 'monaco-editor'.

Bug 2: the Monaco editor was blank. I was calling editor.setValue(content) before calling editor.layout(). Monaco needs to measure its container to render correctly. The container has zero dimensions until the dialog is painted. The fix: requestAnimationFrame after the dialog opens, then layout(), then setValue(). This is the same deferred pattern that TAFNE uses for its multi-cell edit modal — I'd already solved this problem once and forgotten I'd solved it.
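
The ordering fix from Bug 2, as a sketch. showContent is a hypothetical name; layout() and setValue() are the Monaco editor methods mentioned above, and the editor is assumed to have been created already with monaco.editor.create after the ES module import from Bug 1.

```javascript
// `editor` is an existing Monaco editor instance.
// `raf` defaults to the browser's requestAnimationFrame; it is a parameter
// only so the ordering can be checked without a browser.
function showContent(editor, content, raf = (cb) => requestAnimationFrame(cb)) {
  // Wait one frame so the dialog has been painted and has real dimensions.
  raf(() => {
    editor.layout();          // measure the container first
    editor.setValue(content); // then fill it
  });
}
```

Call it right after opening the dialog, for example showContent(editor, element.outerHTML).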


Lists: the thing nobody talks about

While I was in pageAssembler.js to test the list wrapping, I noticed something annoying. Adjacent numbered lists in contenteditable collapse into one list when you backspace between them. The browser treats two adjacent <ol> elements as the same list and merges them.

The fix: wrap each extracted list in <div class="pdf-list-wrap">. The div is a block barrier. The browser can't merge across it.

While I was in there, I also noticed ordered lists weren't preserving their start numbers. A list that started at item 3 in the PDF was rendering starting at 1. The fix: read the numeric prefix from the first <li> text content, parse the start number, emit <ol start="3">. Two lines of regex.
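
Both list fixes fit in a few lines. A sketch with hypothetical function names, assuming numeric prefixes look like "3." or "3)" and that the li markup has already been built elsewhere:

```javascript
// Parse a leading "3." or "3)" from the first item's text; default to 1.
function parseListStart(firstItemText) {
  const match = /^\s*(\d+)[.)]/.exec(firstItemText);
  return match ? parseInt(match[1], 10) : 1;
}

// Emit the list inside a wrapper div so adjacent lists cannot merge
// in contenteditable. `itemsHtml` is assumed to be ready-made <li> markup.
function renderOrderedList(firstItemText, itemsHtml) {
  const start = parseListStart(firstItemText);
  const startAttr = start === 1 ? '' : ` start="${start}"`;
  return `<div class="pdf-list-wrap"><ol${startAttr}>${itemsHtml}</ol></div>`;
}
```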

Neither of these was in the plan. Both took less than an hour. Both made the output noticeably more correct.


What I learned

The best layout editors aren't the ones that give you the most control. They're the ones that make the common operations effortless and keep the escape hatches close.

Drag-to-reorder covers 80% of layout fixes. The Group button covers another 15%. Edit Code covers the rest. The DOM is the source of truth throughout. Every change flows through the same sync coordinator. What you see is what you export.

I didn't set out to build a CAD editor for HTML. I set out to make the extracted output editable. The CAD structure emerged from taking the extraction model seriously — zones, regions, columns — and realizing those were already the right primitives for a layout tool.


_Ginexys PDF Processor is free and offline at ginexys.com. No account required._
