SVG Pan/Zoom With CSS Transforms Breaks Every Coordinate API You Have
TLDR
CSS transform: scale() on an SVG wrapper looks like a simple pan/zoom shortcut. It breaks every SVG coordinate API the moment zoom is not 1. The fix is SVG-native viewBox manipulation backed by a single camera matrix object. One coordinate system, one source of truth.
The Problem Class
Any developer building an interactive SVG editor eventually needs pan and zoom. The naive path is a CSS transform on the wrapper element. It works visually. The diagram scales and translates correctly on screen.
The problem appears the moment you need to read coordinates back out: placing a selection handle, snapping a drawn shape to an existing element, mapping a mouse click to a world position, or rendering a pixel-perfect overlay. The SVG APIs you reach for (getBBox, getScreenCTM, createSVGPoint) are defined in terms of SVG's own coordinate system and transform stack, while getBoundingClientRect on an SVG child reports the final, CSS-transformed screen box. Your CSS transform lives outside the SVG stack, so these values stop agreeing with each other and with what is drawn.
The Naive Approach
CSS transform pan/zoom is everywhere because it works for the display case:
function applyCamera(zoom, tx, ty) {
  // The transform value must be a template literal (backticks).
  svgWrapper.style.transform =
    `scale(${zoom}) translate(${tx}px, ${ty}px)`;
}
This moves and scales the rendered output correctly. Mouse events fire at the right visual positions. Screenshots look right. The pattern appears in dozens of tutorials and it is not wrong for read-only SVG display.
The problem is that SVG's coordinate APIs do not know about CSS transforms applied to parent elements. They report coordinates in SVG space, which at zoom != 1 is no longer screen space.
Why It Breaks
Consider placing a selection handle at the corner of a selected element. The element's position in world space is correct. You call getBBox() to get its bounds. You convert those bounds to screen coordinates to position the handle. The conversion uses getScreenCTM() on the SVG element.
getScreenCTM() returns the matrix that maps SVG coordinates to screen coordinates. When pan/zoom lives in a CSS transform on the wrapper, you cannot rely on getScreenCTM() to include that transform: browser engines have historically differed on whether CSS transforms on HTML ancestors are reflected in it. The browser's CSS rendering and the SVG coordinate system are separate stacks.
The result: your handle appears at the position the element would be at zoom 1.0, pan 0,0. At any other zoom or pan, the handle is somewhere else on screen.
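To make the mismatch concrete, here is a minimal numeric sketch. It assumes the wrapper sits at the viewport origin with transform-origin: 0 0, and that one SVG unit equals one CSS pixel at zoom 1. The names drawnPosition and naiveHandlePosition are hypothetical, chosen only for this illustration.

```javascript
// Where the browser actually draws a world point when the wrapper has
// "scale(zoom) translate(tx, ty)". CSS applies the rightmost function
// first, so the point is translated and then scaled.
// Assumption: transform-origin is 0 0 and the wrapper is at the viewport origin.
function drawnPosition(p, zoom, tx, ty) {
  return { x: zoom * (p.x + tx), y: zoom * (p.y + ty) };
}

// Where a handle lands if it is positioned from SVG-space coordinates
// that know nothing about the wrapper's CSS transform.
function naiveHandlePosition(p) {
  return { x: p.x, y: p.y };
}

const corner = { x: 100, y: 50 };
const drawn = drawnPosition(corner, 2, 10, 20);   // { x: 220, y: 140 }
const handle = naiveHandlePosition(corner);       // { x: 100, y: 50 }
console.log(drawn, handle);
```

At zoom 1 with no pan the two positions coincide, which is why the bug tends to stay hidden until the first zoom.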
The same problem hits symbol placement (palette items dropped onto the canvas), snap targets (the snap algorithm computes in SVG space, the mouse is in screen space), and hit testing (clicking an element works until zoom changes).
Every fix attempt is a compensation: multiply by zoom here, subtract the wrapper offset there. These compensations are correct at the moment you write them and break the next time something changes the transform chain.
The Better Model
SVG has a built-in pan/zoom mechanism: viewBox. The viewBox defines the region of SVG world space that maps to the element's screen rectangle. Pan is a shift in the viewBox origin. Zoom is a change in the viewBox dimensions.
A camera matrix object owns zoom and pan state and converts them to a viewBox string on every frame:
const camera = {
  _zoom: 1,
  _tx: 0,
  _ty: 0,

  // Convert zoom and pan (pan is in screen pixels) into a viewBox string.
  toViewBox(containerW, containerH) {
    const vbW = containerW / this._zoom;
    const vbH = containerH / this._zoom;
    const vbX = -this._tx / this._zoom;
    const vbY = -this._ty / this._zoom;
    return `${vbX} ${vbY} ${vbW} ${vbH}`;
  },

  applyTo(svgEl, containerW, containerH) {
    svgEl.setAttribute('viewBox', this.toViewBox(containerW, containerH));
  }
};
No CSS transforms anywhere in the chain. getBBox, getScreenCTM, and createSVGPoint all return values in the correct coordinate space because SVG's own transform stack is intact.
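With this convention a container-relative screen point s maps to world space as (s - t) / zoom, which makes zooming around the cursor a small piece of arithmetic. The sketch below is one way to write it, not the only one. It works on plain { zoom, tx, ty } objects rather than the camera object itself, and the function names are hypothetical.

```javascript
// Inverse of the viewBox mapping: container-relative screen pixels -> world units.
function cameraScreenToWorld(cam, sx, sy) {
  return { x: (sx - cam.tx) / cam.zoom, y: (sy - cam.ty) / cam.zoom };
}

// Return a new camera state zoomed by `factor`, keeping the world point
// under (sx, sy) fixed on screen.
function zoomAt(cam, sx, sy, factor) {
  const world = cameraScreenToWorld(cam, sx, sy);
  const zoom = cam.zoom * factor;
  return {
    zoom,
    // Solve sx = world.x * zoom + tx for the new tx (same for ty).
    tx: sx - world.x * zoom,
    ty: sy - world.y * zoom
  };
}

const before = { zoom: 1, tx: 0, ty: 0 };
const after = zoomAt(before, 200, 100, 2);
console.log(after); // { zoom: 2, tx: -200, ty: -100 }
```

Because the result is just another zoom/pan triple, it feeds straight back into toViewBox without touching any other coordinate code.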
World-to-screen mapping for an overlay canvas uses getScreenCTM() directly, with no zoom or pan terms of its own:
function worldToScreen(svgEl, cameraGroup, wx, wy, containerRect) {
  const pt = svgEl.createSVGPoint();
  pt.x = wx;
  pt.y = wy;
  // Use the camera group's CTM, not the SVG root's,
  // so camera rotation is included correctly.
  const m = cameraGroup
    ? cameraGroup.getScreenCTM()
    : svgEl.getScreenCTM();
  const sp = pt.matrixTransform(m);
  // Convert viewport coordinates to container-relative coordinates.
  return {
    x: sp.x - containerRect.left,
    y: sp.y - containerRect.top
  };
}
This function works at any zoom, any pan, and any camera rotation without compensation math.
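The reverse direction, mapping a mouse event to a world position, can be sketched the same way by inverting the matrix. This is a minimal version under the assumption that clientX and clientY come straight from a pointer event; screenToWorld is a hypothetical name, while getScreenCTM, createSVGPoint, matrixTransform, and inverse are the standard SVG DOM calls.

```javascript
// Map viewport coordinates (e.g. event.clientX / event.clientY) to SVG world
// coordinates. Pass the camera group as `target` if one exists, otherwise
// the SVG root.
function screenToWorld(svgEl, target, clientX, clientY) {
  const ctm = (target || svgEl).getScreenCTM();
  // getScreenCTM() can return null when the element is not rendered.
  if (!ctm) return null;
  const pt = svgEl.createSVGPoint();
  pt.x = clientX;
  pt.y = clientY;
  const wp = pt.matrixTransform(ctm.inverse());
  return { x: wp.x, y: wp.y };
}

// Usage sketch inside a pointer handler:
// svgEl.addEventListener('pointerdown', (e) => {
//   const world = screenToWorld(svgEl, cameraGroup, e.clientX, e.clientY);
// });
```

Because the event's client coordinates and getScreenCTM() are both expressed in viewport space, no container offset needs to be subtracted here.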
Tradeoffs
The viewBox approach has one real cost: the SVG viewBox attribute must be updated on every pan/zoom gesture. For large diagrams with many elements, this can trigger a layout pass. In practice, coalescing updates with requestAnimationFrame, so the attribute is written at most once per frame, keeps the cost manageable.
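One way to do that coalescing is sketched below. createFrameCoalescer is a hypothetical helper, and the schedule parameter exists only so the scheduler can be swapped out (for example in tests); in a browser it defaults to requestAnimationFrame.

```javascript
// Collapse any number of update requests within a frame into a single
// call to `apply`, using the most recent state.
function createFrameCoalescer(apply, schedule = (cb) => requestAnimationFrame(cb)) {
  let latest = null;
  let queued = false;
  return function requestUpdate(state) {
    latest = state;          // always keep the newest state
    if (queued) return;      // a frame callback is already pending
    queued = true;
    schedule(() => {
      queued = false;
      apply(latest);
    });
  };
}

// Usage sketch (svgEl, width, and height are assumed to exist in your code):
// const requestUpdate = createFrameCoalescer(
//   (cam) => cam.applyTo(svgEl, width, height)
// );
// wheel and pointermove handlers mutate the camera, then call requestUpdate(camera).
```

Input handlers can then fire as often as the browser delivers them without writing the attribute more than once per frame.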
The CSS transform approach has one advantage: GPU compositing. CSS transforms can be composited by the GPU without triggering a paint. ViewBox changes always trigger a paint. For diagram editors where pan/zoom is a primary interaction, the paint cost is acceptable. For a background animation, it would not be.
CSS transforms remain correct for one use case: a static SVG that you want to scale visually without ever reading coordinates back out. If the SVG is interactive in any way, viewBox is the right tool.
The One Thing to Watch For
When you add a camera rotation group inside the SVG (a <g> that receives the rotation transform), always call getScreenCTM() on that group, not on the SVG root element. The SVG root's CTM does not include transforms applied to child groups. If your handles appear at the right position during straight pan/zoom but drift when the canvas is rotated, this is the cause.
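The drift is easy to reproduce with plain matrix arithmetic. The sketch below uses hand-written 2D affine helpers (multiply and applyMatrix are hypothetical names) and assumes a root matrix that scales by 2 and a camera group rotated by 90 degrees; the field names a through f follow the SVGMatrix layout.

```javascript
// Compose two affine matrices: the result applies m2 first, then m1.
function multiply(m1, m2) {
  return {
    a: m1.a * m2.a + m1.c * m2.b,
    b: m1.b * m2.a + m1.d * m2.b,
    c: m1.a * m2.c + m1.c * m2.d,
    d: m1.b * m2.c + m1.d * m2.d,
    e: m1.a * m2.e + m1.c * m2.f + m1.e,
    f: m1.b * m2.e + m1.d * m2.f + m1.f
  };
}

function applyMatrix(m, p) {
  return { x: m.a * p.x + m.c * p.y + m.e, y: m.b * p.x + m.d * p.y + m.f };
}

// Stand-in for the root's screen matrix: zoom 2, no pan (an assumption).
const rootCTM = { a: 2, b: 0, c: 0, d: 2, e: 0, f: 0 };
// transform="rotate(90)" on the camera group.
const groupRotation = { a: 0, b: 1, c: -1, d: 0, e: 0, f: 0 };
// What the group's getScreenCTM() would represent: root, then group transform.
const groupCTM = multiply(rootCTM, groupRotation);

const world = { x: 10, y: 0 };
console.log(applyMatrix(rootCTM, world));  // { x: 20, y: 0 }  ignores rotation
console.log(applyMatrix(groupCTM, world)); // { x: 0, y: 20 }  matches what is drawn
```

With no rotation the two matrices are identical, which is why the mistake only shows up once the canvas is rotated.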