Engineering Journal
Pdf Processor
Pdf Processor

Path Reconciler Deepdive

2026-05-30

Under the Hood: Building a Correct PDF Path Reconciler

TLDR: PDF path operators are not segments. They are instructions for a stateful drawing machine. Turning them into clean, classified segments requires a three-stage pipeline: capture the CTM per-subpath in the adapter, reconcile subpath geometry analytically in the reconciler, merge fragmented dashes via global partition. Every stage has a correctness trap that most implementations miss.


The Problem

PDF.js gives you an operator list: a flat array of drawing commands. moveTo, lineTo, curveTo, rectangle, fill, stroke, closePath, and their variants. The path state machine is implicit -- moveTo opens a new subpath, fill/stroke closes and paints it, save/restore pushes and pops the CTM.

To reconstruct table grids and figure bounding boxes, you need axis-aligned segments in viewport space. Getting there without losing provenance or correctness requires three non-obvious design decisions.


Stage 1: ctmAdapter -- SubpathRecords, Not Segments

The adapter's job is to read the operator list and emit SubpathRecord objects. The key constraint: do not convert to viewport space yet.

Every subpath captures the CTM at the time moveTo was called:

const openSubpath = (constructPathId) => {
    if (currentSubpath.segs.length > 0 || currentSubpath.curves.length > 0) {
        subpaths.push(currentSubpath);
    }
    currentSubpath = {
        segs: [],
        curves: [],
        closed: false,
        filled: false,
        strokeWidth,
        strokeColor: strokeColor.slice(),
        fillColor: fillColor.slice(),
        constructPathId,
        ctm: ctm.slice(),
        id: subpathIdCounter++
    };
};

Segments go into segs[] in PDF user-space (not viewport space):

const bufferSeg = (ax, ay, bx, by) => {
    currentSubpath.segs.push({ ax, ay, bx, by });
};

Curves go into curves[] as raw control points, never converted:

currentSubpath.curves.push({
    p0: [rawPendingX, rawPendingY],
    p1: [args[ai], args[ai+1]],
    p2: [args[ai+2], args[ai+3]],
    p3: [args[ai+4], args[ai+5]]
});

Why defer the viewport transform? Because the reconciler needs to classify subpaths (RECT vs FREE_PATH) and detect axis alignment. That classification is more reliable in PDF user-space before rotation is applied. The CTM is applied per-subpath in the reconciler where it is needed.

The constructPathId counter increments for every constructPath call. This gives compound paths (a single PDF op that draws multiple sub-shapes) a shared ID, which the reconciler can use to detect siblings without ever touching LatticeReconstructor.


Stage 2: pathReconciler -- Classification and Thin Rect Normalization

The reconciler receives all SubpathRecords and emits canonical segments in viewport space.

Classifying Subpaths

function classifySubpath(subpath, vpTransform) {
    const toViewport = (pdfX, pdfY) => {
        const [cx, cy] = applyMatrix(subpath.ctm, pdfX, pdfY);
        return [
            vpTransform[0]  cx + vpTransform[2]  cy + vpTransform[4],
            vpTransform[1]  cx + vpTransform[3]  cy + vpTransform[5],
        ];
    };
    // ... expand bbox from segs and curves
    // classify as RECT / ROUNDED_RECT / POLYGON / FREE_PATH
}

Each subpath gets its own CTM applied. This is the critical difference from a naive implementation that applies a single global transform to all segments.

Analytical Bezier Bounding Box

For curves[], the reconciler computes the exact bounding box by finding derivative roots:

export function curveBboxContrib(p0, p1, p2, p3) {
    const extrema = (a, b, c, d) => {
        const dA = -3a + 9b - 9c + 3d;
        const dB =  6a - 12b + 6*c;
        const dC = -3a + 3b;
        const roots = [];
        if (Math.abs(dA) > 1e-6) {
            const disc = dBdB - 4dA*dC;
            if (disc >= 0) {
                const sq = Math.sqrt(disc);
                const t1 = (-dB + sq) / (2*dA);
                const t2 = (-dB - sq) / (2*dA);
                if (t1 > 0 && t1 < 1) roots.push(t1);
                if (t2 > 0 && t2 < 1) roots.push(t2);
            }
        } else if (Math.abs(dB) > 1e-6) {
            roots.push(-dC / dB);
        }
        const eval_ = t => a(1-t)3 + 3bt(1-t)2 + 3ct2(1-t) + dt3;
        return [a, d, ...roots.map(eval_)];
    };
    const xs = extrema(p0[0], p1[0], p2[0], p3[0]);
    const ys = extrema(p0[1], p1[1], p2[1], p3[1]);
    return { xMin: Math.min(...xs), xMax: Math.max(...xs),
             yMin: Math.min(...ys), yMax: Math.max(...ys) };
}

This runs separately for X and Y. The derivative of a cubic Bezier B(t) is a quadratic. We solve it with the quadratic formula and evaluate B(t) at the real roots in (0, 1). Always correct, always O(1) per curve.

Thin Rect Normalization

A filled rectangle with h < eps is semantically a horizontal rule, not a box. The normalization:

function normalizeThinRect(classified, eps) {
    const w = classified.bbox.xMax - classified.bbox.xMin;
    const h = classified.bbox.yMax - classified.bbox.yMin;

if (h < eps) { const cy = (classified.bbox.yMin + classified.bbox.yMax) / 2; return [{ ax: classified.bbox.xMin, ay: cy, bx: classified.bbox.xMax, by: cy }]; } else if (w < eps) { const cx = (classified.bbox.xMin + classified.bbox.xMax) / 2; return [{ ax: cx, ay: classified.bbox.yMin, bx: cx, by: classified.bbox.yMax }]; } // return 4 edge segments from bbox corners }

The threshold is eps = 4 (the lattice reconstructor's default merge tolerance), not a hardcoded constant. Using eps means the normalization condition is exactly the condition under which the lattice would merge and average the two thin edges anyway.


Stage 3: Global Dash Partition

Dashed lines in PDF are not one path. They are many short single-segment subpaths, each drawn independently. The dash merge algorithm cannot assume adjacency.

function mergeDashes(classifiedSubpaths, eps) {
    const freePaths = classifiedSubpaths.filter(c => c.type === 'FREE_PATH');
    const partitions = new Map();

for (const c of freePaths) { if (c.segsViewport.length !== 1) continue; const s = c.segsViewport[0];

// determine orientation + band center const colorHex = c.strokeColor.map(v => Math.round(v * 255).toString(16).padStart(2, '0')).join(''); const swBucket = Math.round(c.strokeWidth * 2); const key = ${colorHex}|${swBucket}|${orientation}|${yBucket};

partitions.get(key).push({ c, s, posMin, posMax }); }

// sort each partition by posMin, gap-test, emit merged canonical segments }

The partition key has four components: strokeColor hex, strokeWidth bucketed to 0.5px precision, orientation (H or V), and Y-band rounded to eps. Within each partition, segments are sorted by start position, then gap-tested against max(8, 0.4 × avgDashLength). Runs with gaps smaller than this threshold are merged into one canonical segment.

This handles any number of non-adjacent dashes at arbitrary positions in the operator stream. It also correctly keeps apart lines that are collinear but have different stroke widths -- a situation that would have silently merged a thick border with a thin hairline under the original key.


What This Feeds

The output of reconcile() is { segments, filledRects }:

These go directly into contextClassifier.js for region typing, and into latticeReconstructor.js for table grid inference. The lattice reconstructor is completely unchanged -- it never needed to know about subpath provenance. The reconciler does all the isolation work upstream.

Read this post in the full Engineering Journal →