Path Reconciler Deepdive
Under the Hood: Building a Correct PDF Path Reconciler
TLDR: PDF path operators are not segments. They are instructions for a stateful drawing machine. Turning them into clean, classified segments requires a three-stage pipeline: capture the CTM per-subpath in the adapter, reconcile subpath geometry analytically in the reconciler, merge fragmented dashes via global partition. Every stage has a correctness trap that most implementations miss.
The Problem
PDF.js gives you an operator list: a flat array of drawing commands. moveTo, lineTo, curveTo, rectangle, fill, stroke, closePath, and their variants. The path state machine is implicit -- moveTo opens a new subpath, fill/stroke closes and paints it, save/restore pushes and pops the CTM.
To reconstruct table grids and figure bounding boxes, you need axis-aligned segments in viewport space. Getting there without losing provenance or correctness requires three non-obvious design decisions.
Stage 1: ctmAdapter -- SubpathRecords, Not Segments
The adapter's job is to read the operator list and emit SubpathRecord objects. The key constraint: do not convert to viewport space yet.
Every subpath captures the CTM at the time moveTo was called:
const openSubpath = (constructPathId) => {
if (currentSubpath.segs.length > 0 || currentSubpath.curves.length > 0) {
subpaths.push(currentSubpath);
}
currentSubpath = {
segs: [],
curves: [],
closed: false,
filled: false,
strokeWidth,
strokeColor: strokeColor.slice(),
fillColor: fillColor.slice(),
constructPathId,
ctm: ctm.slice(),
id: subpathIdCounter++
};
};
Segments go into segs[] in PDF user-space (not viewport space):
const bufferSeg = (ax, ay, bx, by) => {
currentSubpath.segs.push({ ax, ay, bx, by });
};
Curves go into curves[] as raw control points, never converted:
currentSubpath.curves.push({
p0: [rawPendingX, rawPendingY],
p1: [args[ai], args[ai+1]],
p2: [args[ai+2], args[ai+3]],
p3: [args[ai+4], args[ai+5]]
});
Why defer the viewport transform? Because the reconciler needs to classify subpaths (RECT vs FREE_PATH) and detect axis alignment. That classification is more reliable in PDF user-space before rotation is applied. The CTM is applied per-subpath in the reconciler where it is needed.
The constructPathId counter increments for every constructPath call. This gives compound paths (a single PDF op that draws multiple sub-shapes) a shared ID, which the reconciler can use to detect siblings without ever touching LatticeReconstructor.
Stage 2: pathReconciler -- Classification and Thin Rect Normalization
The reconciler receives all SubpathRecords and emits canonical segments in viewport space.
Classifying Subpaths
function classifySubpath(subpath, vpTransform) {
const toViewport = (pdfX, pdfY) => {
const [cx, cy] = applyMatrix(subpath.ctm, pdfX, pdfY);
return [
vpTransform[0] cx + vpTransform[2] cy + vpTransform[4],
vpTransform[1] cx + vpTransform[3] cy + vpTransform[5],
];
};
// ... expand bbox from segs and curves
// classify as RECT / ROUNDED_RECT / POLYGON / FREE_PATH
}
Each subpath gets its own CTM applied. This is the critical difference from a naive implementation that applies a single global transform to all segments.
Analytical Bezier Bounding Box
For curves[], the reconciler computes the exact bounding box by finding derivative roots:
export function curveBboxContrib(p0, p1, p2, p3) {
const extrema = (a, b, c, d) => {
const dA = -3a + 9b - 9c + 3d;
const dB = 6a - 12b + 6*c;
const dC = -3a + 3b;
const roots = [];
if (Math.abs(dA) > 1e-6) {
const disc = dBdB - 4dA*dC;
if (disc >= 0) {
const sq = Math.sqrt(disc);
const t1 = (-dB + sq) / (2*dA);
const t2 = (-dB - sq) / (2*dA);
if (t1 > 0 && t1 < 1) roots.push(t1);
if (t2 > 0 && t2 < 1) roots.push(t2);
}
} else if (Math.abs(dB) > 1e-6) {
roots.push(-dC / dB);
}
const eval_ = t => a(1-t)3 + 3bt(1-t)2 + 3ct2(1-t) + dt3;
return [a, d, ...roots.map(eval_)];
};
const xs = extrema(p0[0], p1[0], p2[0], p3[0]);
const ys = extrema(p0[1], p1[1], p2[1], p3[1]);
return { xMin: Math.min(...xs), xMax: Math.max(...xs),
yMin: Math.min(...ys), yMax: Math.max(...ys) };
}
This runs separately for X and Y. The derivative of a cubic Bezier B(t) is a quadratic. We solve it with the quadratic formula and evaluate B(t) at the real roots in (0, 1). Always correct, always O(1) per curve.
Thin Rect Normalization
A filled rectangle with h < eps is semantically a horizontal rule, not a box. The normalization:
function normalizeThinRect(classified, eps) {
const w = classified.bbox.xMax - classified.bbox.xMin;
const h = classified.bbox.yMax - classified.bbox.yMin;
if (h < eps) { const cy = (classified.bbox.yMin + classified.bbox.yMax) / 2; return [{ ax: classified.bbox.xMin, ay: cy, bx: classified.bbox.xMax, by: cy }]; } else if (w < eps) { const cx = (classified.bbox.xMin + classified.bbox.xMax) / 2; return [{ ax: cx, ay: classified.bbox.yMin, bx: cx, by: classified.bbox.yMax }]; } // return 4 edge segments from bbox corners }
The threshold is eps = 4 (the lattice reconstructor's default merge tolerance), not a hardcoded constant. Using eps means the normalization condition is exactly the condition under which the lattice would merge and average the two thin edges anyway.
Stage 3: Global Dash Partition
Dashed lines in PDF are not one path. They are many short single-segment subpaths, each drawn independently. The dash merge algorithm cannot assume adjacency.
function mergeDashes(classifiedSubpaths, eps) {
const freePaths = classifiedSubpaths.filter(c => c.type === 'FREE_PATH');
const partitions = new Map();
for (const c of freePaths) { if (c.segsViewport.length !== 1) continue; const s = c.segsViewport[0];
// determine orientation + band center const colorHex = c.strokeColor.map(v => Math.round(v * 255).toString(16).padStart(2, '0')).join(''); const swBucket = Math.round(c.strokeWidth * 2); const key = ${colorHex}|${swBucket}|${orientation}|${yBucket};
partitions.get(key).push({ c, s, posMin, posMax }); }
// sort each partition by posMin, gap-test, emit merged canonical segments }
The partition key has four components: strokeColor hex, strokeWidth bucketed to 0.5px precision, orientation (H or V), and Y-band rounded to eps. Within each partition, segments are sorted by start position, then gap-tested against max(8, 0.4 × avgDashLength). Runs with gaps smaller than this threshold are merged into one canonical segment.
This handles any number of non-adjacent dashes at arbitrary positions in the operator stream. It also correctly keeps apart lines that are collinear but have different stroke widths -- a situation that would have silently merged a thick border with a thin hairline under the original key.
What This Feeds
The output of reconcile() is { segments, filledRects }:
segments-- canonical H/V/diagonal segments in viewport coordinates, each withstrokeWidthandstrokeColorfilledRects-- closed filled rectangles (potential BOX regions, background fills, decorative panels)
contextClassifier.js for region typing, and into latticeReconstructor.js for table grid inference. The lattice reconstructor is completely unchanged -- it never needed to know about subpath provenance. The reconciler does all the isolation work upstream.