Image Extraction Errorfix
Error Fix: Cannot read properties of undefined (reading 'createElement') in a PDF.js Web Worker
TL;DR: PDF.js's default canvas factory calls document.createElement('canvas'), which crashes inside a Web Worker. The fix is to pass a class (not an instance) that implements the CanvasFactory interface using OffscreenCanvas. The option key is CanvasFactory with a capital C.
Symptom
Cannot read properties of undefined (reading 'createElement')
at DOMCanvasFactory.create (pdf.mjs:...)
at PDFPageProxy.render (pdf.mjs:...)
The error appears inside geometryWorker.js when calling page.render() to extract image pixel data.
Root Cause
PDF.js allocates scratch canvases internally during rendering via a canvas factory class. The default is DOMCanvasFactory, which does:
create(width, height) {
const canvas = document.createElement('canvas');
// ...
}
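A Web Worker has no document, so the document reference the factory uses is undefined, and reading createElement from it throws the TypeError shown above. The crash is synchronous and immediate.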
The fix exists: PDF.js's getDocument() accepts a custom factory. But there are two wrong ways to pass it.
Wrong way 1 — lowercase option:
// Silently ignored: PDF.js doesn't read this key
const canvasFactory = { create(w,h) { ... } };
pdfjsLib.getDocument({ data: bytes, canvasFactory });
Wrong way 2 — instance with capital C:
// Crashes: PDF.js does new CanvasFactory(...) internally
const factory = new OffscreenCanvasFactory();
pdfjsLib.getDocument({ data: bytes, CanvasFactory: factory });
PDF.js source does this:
const CanvasFactory = src.CanvasFactory || DefaultCanvasFactory;
// later:
this._canvasFactory = new CanvasFactory({ ownerDocument, enableHWA });
The value at CanvasFactory is used as a constructor. It must be a class.
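Both failure modes can be reproduced outside PDF.js. The sketch below is a stand-in, not PDF.js code: resolveFactory, the empty DefaultCanvasFactory placeholder, and MyFactory are hypothetical names that only mimic the two source lines quoted above.

```javascript
// Stand-in for the two PDF.js lines quoted above (hypothetical, not the real source).
class DefaultCanvasFactory {}

function resolveFactory(src) {
  const CanvasFactory = src.CanvasFactory || DefaultCanvasFactory;
  return new CanvasFactory({ ownerDocument: undefined, enableHWA: false });
}

class MyFactory {
  create(width, height) {
    return { width, height };
  }
}

// Capital C + class: the custom factory is constructed.
const fromClass = resolveFactory({ CanvasFactory: MyFactory });

// Lowercase key: never read, so the default is constructed instead.
const fromLowercase = resolveFactory({ canvasFactory: MyFactory });

// Capital C + instance: `new` on a non-constructor throws a TypeError.
let instanceError = null;
try {
  resolveFactory({ CanvasFactory: new MyFactory() });
} catch (err) {
  instanceError = err;
}

console.log(fromClass instanceof MyFactory);                // true
console.log(fromLowercase instanceof DefaultCanvasFactory); // true
console.log(instanceError instanceof TypeError);            // true
```

The lowercase case is the more dangerous of the two, because nothing fails until the default factory is used later during render.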
Fix
class OffscreenCanvasFactory {
create(width, height) {
const canvas = new OffscreenCanvas(width, height);
return { canvas, context: canvas.getContext('2d') };
}
reset(canvasAndCtx, width, height) {
canvasAndCtx.canvas.width = width;
canvasAndCtx.canvas.height = height;
}
destroy(canvasAndCtx) {
canvasAndCtx.canvas.width = 0;
canvasAndCtx.canvas.height = 0;
canvasAndCtx.canvas = null;
canvasAndCtx.context = null;
}
}
// Gate on OffscreenCanvas availability (absent in Node and in Safari before 16.4)
const canvasFactoryOpt = typeof OffscreenCanvas !== 'undefined'
  ? { CanvasFactory: OffscreenCanvasFactory }
  : {};
const pdf = await pdfjsLib.getDocument({ data: bytes, ...canvasFactoryOpt }).promise;
After this change, page.render() inside the worker completes without errors and produces a fully composited page canvas ready for image cropping.
Guard
The typeof OffscreenCanvas !== 'undefined' gate serves two purposes. First, it protects Node.js test environments where OffscreenCanvas is not available. Second, it documents that the image extraction path is a browser-only feature — which is correct, because the whole geometry pipeline runs in a browser Web Worker.
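Because the gate is a plain typeof check, it can be pulled into a small helper and exercised in Node by passing a fake global scope. This is only a sketch: canvasFactoryOption and its scope parameter are hypothetical names, not part of PDF.js or of the worker described here.

```javascript
// Hypothetical helper: returns the getDocument() option fragment, or {} when
// OffscreenCanvas is missing from the given global scope.
function canvasFactoryOption(FactoryClass, scope = globalThis) {
  if (typeof FactoryClass !== 'function') {
    // Catch the "passed an instance" mistake before PDF.js calls `new` on it.
    throw new TypeError('CanvasFactory must be a class, not an instance');
  }
  return typeof scope.OffscreenCanvas !== 'undefined'
    ? { CanvasFactory: FactoryClass }
    : {};
}

class StubFactory {}

// Simulated environments, so the gate can be tested without a browser.
const inWorker = canvasFactoryOption(StubFactory, { OffscreenCanvas: class {} });
const inNode = canvasFactoryOption(StubFactory, {});

console.log(Object.keys(inWorker)); // [ 'CanvasFactory' ]
console.log(Object.keys(inNode));   // []
```

In the worker, the result would be spread into the call in the same way as canvasFactoryOpt in the Fix section.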
The image extraction block also wraps the entire render in a try/catch:
try {
const imgViewport = page.getViewport({ scale: IMG_SCALE });
// ... render and crop
} catch (_) { /* render failed; no images for this page */ }
If render fails for any reason (corrupted page, unsupported PDF feature), the worker continues to the next page rather than aborting the entire document.
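The per-page isolation can be sketched as a loop in which each page gets its own try/catch. extractAllImages and the extractPage callback are hypothetical names standing in for the worker's real render-and-crop code, and the fake pages exist only so the sketch runs without PDF.js.

```javascript
// Hypothetical loop: one failing page yields an empty image list instead of
// rejecting the whole document.
async function extractAllImages(pages, extractPage) {
  const results = [];
  for (const page of pages) {
    try {
      results.push(await extractPage(page));
    } catch (_) {
      results.push([]); // render failed; no images for this page
    }
  }
  return results;
}

// Fake pages: the second one simulates a render failure.
const fakePages = [{ n: 1 }, { n: 2, broken: true }, { n: 3 }];
const fakeExtract = async (page) => {
  if (page.broken) throw new Error('render failed');
  return [`image-from-page-${page.n}`];
};

const extraction = extractAllImages(fakePages, fakeExtract);
extraction.then((results) => console.log(results));
```

Pushing an empty array for a failed page keeps the results aligned with page indices, which is a design choice of this sketch rather than something the original worker is known to do.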
What to Watch For
getTextContent() crashes on the same issue. If you ever call page.getTextContent({ includeMarkedContent: true }) and render on the same page in a worker context, the same factory replacement applies. The fix is the same.
Safari versions before 16.4 don't support OffscreenCanvas. On those browsers, the image extraction path is gated off and images render as placeholder divs. The geometry/text pipeline still works.
destroy() must null the references. PDF.js calls destroy() on scratch canvases after use. If you don't null the canvas and context, you will get memory leaks in long documents. The implementation above handles this correctly.
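One way to check the destroy() contract under Node is to run a create, reset, destroy sequence against a stand-in canvas. The sketch assumes PDF.js drives the factory in roughly that order. FakeCanvas, CanvasImpl, and LifecycleCanvasFactory are hypothetical names, and the factory body mirrors the one in the Fix section.

```javascript
// Minimal stand-in so the sketch runs where OffscreenCanvas does not exist.
class FakeCanvas {
  constructor(width, height) {
    this.width = width;
    this.height = height;
  }
  getContext(kind) {
    return { kind };
  }
}

// Use the real OffscreenCanvas when present, the stand-in otherwise.
const CanvasImpl = typeof OffscreenCanvas !== 'undefined' ? OffscreenCanvas : FakeCanvas;

class LifecycleCanvasFactory {
  create(width, height) {
    const canvas = new CanvasImpl(width, height);
    return { canvas, context: canvas.getContext('2d') };
  }
  reset(canvasAndCtx, width, height) {
    canvasAndCtx.canvas.width = width;
    canvasAndCtx.canvas.height = height;
  }
  destroy(canvasAndCtx) {
    // Zero the size first, then drop both references.
    canvasAndCtx.canvas.width = 0;
    canvasAndCtx.canvas.height = 0;
    canvasAndCtx.canvas = null;
    canvasAndCtx.context = null;
  }
}

const factory = new LifecycleCanvasFactory();
const scratch = factory.create(200, 100);
factory.reset(scratch, 50, 25);
const sizeAfterReset = [scratch.canvas.width, scratch.canvas.height];
factory.destroy(scratch);

console.log(sizeAfterReset);                  // [ 50, 25 ]
console.log(scratch.canvas, scratch.context); // null null
```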
Lesson
When passing configuration objects to large frameworks, always check whether the option expects a value or a constructor. In this case, the key CanvasFactory and the behavior new CanvasFactory(...) are documented only in the PDF.js source — not in any public API doc. If in doubt, grep the source for new followed by the option key.