CSS Transform Pan/Zoom on SVG Is Not a Shortcut. It Is a Trap.
TL;DR
CSS transform: scale() on an SVG wrapper is widely documented as the simple pan/zoom approach. It produces a coordinate system that your SVG APIs cannot read. The viewBox attribute does the same job, works correctly with every SVG API, and is not meaningfully harder to implement.
What the Industry Does
Search for "SVG pan zoom JavaScript" and the first five results use the same pattern: wrap the SVG in a div, apply transform: scale(zoom) translate(tx, ty) to the wrapper, listen to wheel and drag events, update the transform. Done.
This pattern is in CodePen examples, MDN-adjacent tutorials, and production charting libraries. It is fast to implement, visually correct, and GPU-accelerated. For a read-only SVG display, it is fine.
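For reference, the wrapper pattern boils down to something like the sketch below. The helper names are hypothetical, and it assumes the wrapper's transform-origin is set to the top-left corner. One detail worth noticing: with scale() written before translate(), the translation happens inside the scaled space.

```javascript
// Hypothetical helper: builds the wrapper's CSS transform for the common pattern.
// CSS translate() needs length units, so px is added here.
function wrapperTransform(zoom, tx, ty) {
  return `scale(${zoom}) translate(${tx}px, ${ty}px)`;
}

// Where a wrapper-local point lands, relative to the transform origin.
// Because scale() comes first, the translation is scaled too:
// screen = zoom * (local + t).
function wrapperPointToScreen(zoom, tx, ty, x, y) {
  return { x: zoom * (x + tx), y: zoom * (y + ty) };
}

// Browser usage (wrapperEl is an assumed reference to the wrapping div):
// wrapperEl.style.transformOrigin = '0 0';
// wrapperEl.style.transform = wrapperTransform(2, 10, 20);
```

Nothing in this code touches the SVG itself, which is exactly why the SVG's own coordinate APIs know nothing about the zoom.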
Why It Fails for Interactive Editors
Interactive SVG editors need to read coordinates back out of the diagram constantly. Place a selection handle at an element corner. Snap a dragged shape to an existing element's edge. Map a mouse click to the world position it hit. Convert a palette drop to SVG coordinates.
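Every SVG API that returns coordinates (getBBox, getScreenCTM, createSVGPoint) reports values in SVG coordinate space. CSS transforms applied to a parent element sit outside the SVG coordinate system, so these APIs cannot be relied on to reflect them: getBBox never does, and getScreenCTM has historically handled ancestor CSS transforms inconsistently across browsers.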
The result is a permanent split: SVG space and "CSS-transformed screen space" are two different things. Every coordinate operation that crosses that boundary requires a manual correction. Multiply by zoom here. Subtract the wrapper offset there. Read getBoundingClientRect on the wrapper and compensate.
These corrections are correct when written. They break when the layout changes. They interact with each other in non-obvious ways. They hide the root problem behind a layer of math.
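A typical correction of this kind looks like the sketch below. The function name is hypothetical, and the assumptions are doing a lot of work: the SVG fills the wrapper, its user units match unscaled CSS pixels, and the wrapper is only scaled and translated, never rotated or skewed.

```javascript
// Hypothetical manual correction for the CSS-transform pattern: map a mouse
// event's client coordinates to SVG coordinates by undoing the zoom by hand.
// `wrapperRect` is assumed to be wrapperEl.getBoundingClientRect(), which
// already reflects the wrapper's transform, so the translation is absorbed
// into left/top and only the scale has to be divided out.
function clientToSvgManual(clientX, clientY, wrapperRect, zoom) {
  return {
    x: (clientX - wrapperRect.left) / zoom,
    y: (clientY - wrapperRect.top) / zoom,
  };
}

// Browser usage (wrapperEl and zoom are assumed to live in your editor state):
// const p = clientToSvgManual(e.clientX, e.clientY,
//                             wrapperEl.getBoundingClientRect(), zoom);
```

Each of those assumptions is a place where a later layout change can silently invalidate the math.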
The Better Approach
SVG's viewBox attribute is the camera. It defines which region of SVG world space maps to the element's display rectangle. Changing the viewBox is pan and zoom. No wrapper div required, no CSS transforms, no coordinate split.
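// The entire camera implementation.
// zoom: scale factor; tx, ty: pan offset in screen pixels;
// cW, cH: size of the SVG element in CSS pixels.
function applyCamera(svgEl, zoom, tx, ty, cW, cH) {
  // viewBox is "min-x min-y width height", expressed in world units.
  svgEl.setAttribute('viewBox',
    `${-tx/zoom} ${-ty/zoom} ${cW/zoom} ${cH/zoom}`
  );
}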
With this, getScreenCTM() returns the correct matrix at any zoom. getBBox() values convert to screen coordinates with one line. No corrections needed anywhere because there is only one coordinate system.
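As a rough sketch of what those conversions can look like, the helpers below lean on getScreenCTM() and getBBox() only. The function names are hypothetical, and the bounding-box helper assumes no rotation or skew, so two corners are enough.

```javascript
// Apply a 2D affine matrix (DOMMatrix-style fields a..f) to a point.
function applyMatrix(m, x, y) {
  return { x: m.a * x + m.c * y + m.e, y: m.b * x + m.d * y + m.f };
}

// Screen (client) coordinates -> SVG world coordinates.
function clientToWorld(svgEl, clientX, clientY) {
  const ctm = svgEl.getScreenCTM(); // can be null if the element is not rendered
  if (!ctm) return null;
  return applyMatrix(ctm.inverse(), clientX, clientY);
}

// An element's bounding box -> screen rectangle, e.g. to place an overlay handle.
function bboxToClientRect(el) {
  const ctm = el.getScreenCTM();
  if (!ctm) return null;
  const b = el.getBBox(); // in the element's own user space
  const p0 = applyMatrix(ctm, b.x, b.y);
  const p1 = applyMatrix(ctm, b.x + b.width, b.y + b.height);
  return { left: p0.x, top: p0.y, width: p1.x - p0.x, height: p1.y - p0.y };
}

// Browser usage (svgEl and shapeEl are assumed element references):
// svgEl.addEventListener('pointerdown', (e) => {
//   const world = clientToWorld(svgEl, e.clientX, e.clientY);
// });
// const handleRect = bboxToClientRect(shapeEl);
```

Neither helper takes a zoom or offset argument. The camera state is already baked into the matrix the browser hands back.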
What You Give Up
ViewBox changes trigger a layout pass in the browser. CSS transforms can be GPU-composited without a paint. For smooth 60fps pan/zoom on large diagrams, the difference is measurable. Caching the container size with a ResizeObserver and batching viewBox writes with a requestAnimationFrame throttle eliminates most of the cost, but the CSS transform approach does have a real performance edge for camera-only motion.
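One way to wire up the ResizeObserver plus requestAnimationFrame idea is sketched below. The createCameraScheduler name and its shape are hypothetical, not a known library API. The point is simply that many wheel or drag events per frame collapse into a single viewBox write, and the container size is cached instead of being read from layout on every update.

```javascript
// Hypothetical scheduler: writes the viewBox at most once per animation frame.
// `schedule` is injectable so the logic can run outside a browser.
function createCameraScheduler(svgEl, schedule = (cb) => requestAnimationFrame(cb)) {
  let size = { width: 0, height: 0 };
  let camera = { zoom: 1, tx: 0, ty: 0 };
  let queued = false;

  function flush() {
    queued = false;
    // Skip until a real size is known; a zero-size viewBox disables rendering.
    if (size.width === 0 || size.height === 0) return;
    const { zoom, tx, ty } = camera;
    svgEl.setAttribute(
      'viewBox',
      `${-tx / zoom} ${-ty / zoom} ${size.width / zoom} ${size.height / zoom}`
    );
  }

  function request() {
    if (!queued) {
      queued = true;
      schedule(flush);
    }
  }

  return {
    setSize(width, height) { size = { width, height }; request(); },
    setCamera(zoom, tx, ty) { camera = { zoom, tx, ty }; request(); },
  };
}

// Browser wiring (assumed): keep the cached size current via ResizeObserver.
// const cam = createCameraScheduler(svgEl);
// new ResizeObserver(([entry]) => {
//   cam.setSize(entry.contentRect.width, entry.contentRect.height);
// }).observe(svgEl);
```

This does not make viewBox updates free. It only caps them at one per frame, which is the most the display can show anyway.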
If you are building a viewer, not an editor, CSS transform pan/zoom is a reasonable choice. You will never need to map coordinates back, so the split does not matter.
When the Common Pattern Is Right
Use CSS transform pan/zoom when: the SVG is display-only, no handles or overlays need to be positioned relative to SVG elements, and you need the smoothest possible pan/zoom at the cost of coordinate API correctness.
For any editor where users interact with diagram content, viewBox is not the advanced option. It is the only option that does not require compensating for a problem you introduced yourself.