Camera geometry

Pinhole camera

Terminology

A world point $(X, Y, Z)$ is projected onto the image plane at, say, $(x, y)$. The image plane is at a distance $f$ (the focal length) from the camera center. By similar triangles:

$$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}$$
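
The similar-triangles relation can be checked numerically. The snippet below is a minimal sketch, assuming the camera center sits at the origin and looks along the Z-axis; the function name `project_pinhole` and the sample values are made up for illustration.

```python
def project_pinhole(point, f):
    """Project a 3D point (X, Y, Z) to image coordinates (x, y) by similar triangles."""
    X, Y, Z = point
    if Z == 0:
        # A point at depth zero lies in the plane through the camera center.
        raise ValueError("Z = 0: the projection is undefined")
    return (f * X / Z, f * Y / Z)


# Hypothetical values: focal length 2, point at depth 4.
print(project_pinhole((2.0, 4.0, 4.0), f=2.0))  # (1.0, 2.0)
```

Doubling the depth `Z` halves both image coordinates, which is the usual perspective foreshortening.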

With the camera center at the world origin, the XY plane of the world coordinate system is called the principal plane, and the world Z-axis is called the principal axis. The intersection of the principal axis and the image plane is called the principal point.

By convention:

Mapping from 3D to 2D

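A pinhole camera essentially maps 3D space onto a 2D plane. In homogeneous coordinates this mapping is linear:

$$\begin{pmatrix} fX \\ fY \\ Z \end{pmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

where $(X, Y, Z, 1)^T$ is the world point in homogeneous coordinates, $f$ is the focal length, and dividing the result by its third component $Z$ gives the image point $(fX/Z, fY/Z)$.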
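
The 3D-to-2D mapping can be tried out as a matrix product in homogeneous coordinates. This is only a sketch: it assumes the standard 3x4 matrix with the focal length on the diagonal and a camera at the origin, and the helpers `matvec`, `pinhole_matrix` and `dehomogenize` are hypothetical names.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def pinhole_matrix(f):
    """3x4 projection matrix for a camera at the origin with focal length f."""
    return [
        [f, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]


def dehomogenize(v):
    """Divide by the last component to get ordinary image coordinates."""
    return tuple(c / v[-1] for c in v[:-1])


X_h = [2.0, 4.0, 4.0, 1.0]               # world point (2, 4, 4) in homogeneous form
x_h = matvec(pinhole_matrix(2.0), X_h)   # homogeneous image point (fX, fY, Z)
print(x_h, dehomogenize(x_h))            # [4.0, 8.0, 4.0] (1.0, 2.0)
```

The division by the last component is the only non-linear step, which is why the projection itself can be written as a single matrix.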

Shifting of principal point

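Now consider an offset to the principal point, so that it is no longer at the image origin but at $(p_x, p_y)$. The projection matrix then becomes:

$$\begin{bmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = K \, [I \mid 0]$$

Here,

$$K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}$$

is called the camera calibration matrix, with $f$ the focal length.

This is just a translation added on top of the original matrix, which contained only the focal length. See: Image processing > Affine transformation.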
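
To see the principal point offset act as a plain translation of the image coordinates, here is a rough sketch. It assumes a calibration matrix with a single focal length and no skew; `calibration_matrix`, `project_with_K` and the numbers are illustrative only.

```python
def calibration_matrix(f, px, py):
    """3x3 calibration matrix with focal length f and principal point (px, py)."""
    return [
        [f, 0.0, px],
        [0.0, f, py],
        [0.0, 0.0, 1.0],
    ]


def project_with_K(K, point):
    """Apply K to a point given in camera coordinates, then divide by depth."""
    u, v, w = (sum(k * c for k, c in zip(row, point)) for row in K)
    return (u / w, v / w)


# Hypothetical values: f = 2, principal point at (3, 1).
K = calibration_matrix(f=2.0, px=3.0, py=1.0)
print(project_with_K(K, (2.0, 4.0, 4.0)))  # (4.0, 3.0)
```

With a zero offset the same point lands at (1.0, 2.0), so the offset simply shifts the result by (3, 1).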

Shifting of camera center

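Let the coordinates of a world point $\tilde{X}$ after some translation and rotation be $\tilde{X}_{cam}$, where a tilde marks an ordinary (inhomogeneous) 3-vector. Then we have (here $\tilde{C}$ is the camera position in world coordinates, subtracted to apply the translation, and $R$ is the rotation)

$$\tilde{X}_{cam} = R \, (\tilde{X} - \tilde{C})$$

Basically, we translate and then rotate to express the point in the camera's coordinate frame. Now we can find the projected point $x$ for a world coordinate by applying the camera calibration matrix $K$ to the camera-frame coordinates, written in homogeneous form as $X_{cam}$:

$$x = K \, [I \mid 0] \, X_{cam}$$

Rewriting this in terms of the homogeneous world point $X$, we get:

$$x = K R \, [I \mid -\tilde{C}] \, X$$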
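
The translate-then-rotate step followed by the calibration matrix can be sketched as below. The rotation (a quarter turn about the Z-axis), the camera position and the helper names are all assumptions chosen to keep the arithmetic easy to follow by hand.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def world_to_camera(R, C, X):
    """Translate by the camera center C, then rotate by R."""
    return matvec(R, [x - c for x, c in zip(X, C)])


def project(K, R, C, X):
    """Project world point X: camera-frame coordinates, then K, then divide by depth."""
    u, v, w = matvec(K, world_to_camera(R, C, X))
    return (u / w, v / w)


K = [[2.0, 0.0, 3.0], [0.0, 2.0, 1.0], [0.0, 0.0, 1.0]]    # f = 2, principal point (3, 1)
R = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # quarter turn about Z
C = [1.0, 1.0, -2.0]                                       # hypothetical camera position
X = [3.0, 2.0, 2.0]                                        # hypothetical world point

print(world_to_camera(R, C, X))  # [1.0, -2.0, 4.0]
print(project(K, R, C, X))       # (3.5, 0.0)
```

The order matters: subtracting the camera center first and rotating second is what the formula with the camera position describes.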

Important

$K$ is often called the intrinsic matrix, since its parameters, such as the focal length and the principal point offset, are intrinsic to the camera itself. The other part of the projection matrix, $R \, [I \mid -\tilde{C}]$ (built from the rotation $R$ and the camera center $\tilde{C}$), is therefore referred to as the extrinsic matrix.

Projection matrix

Based on the previous math, consider the matrix $M = KR$. Since the calibration and rotation matrices are both 3x3, $M$ is also a 3x3 matrix. Let $P = [M \mid p_4]$, where $p_4 = -M\tilde{C}$ is the 4th column and is 3x1 ($\tilde{C}$ being the camera center).
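
As a sanity check, the full 3x4 projection matrix can be assembled from its 3x3 block and its 4th column and applied to a homogeneous world point. This is a sketch under the assumption that the 4th column equals minus the 3x3 block times the camera center; `projection_matrix` and the sample values are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def projection_matrix(K, R, C):
    """Assemble the 3x4 matrix from the 3x3 block K R and the column -(K R) C."""
    M = matmul(K, R)
    p4 = [-m for m in matvec(M, C)]
    return [row + [t] for row, t in zip(M, p4)]


K = [[2.0, 0.0, 3.0], [0.0, 2.0, 1.0], [0.0, 0.0, 1.0]]    # f = 2, principal point (3, 1)
R = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # quarter turn about Z
C = [1.0, 1.0, -2.0]                                       # hypothetical camera position

P = projection_matrix(K, R, C)
u, v, w = matvec(P, [3.0, 2.0, 2.0, 1.0])  # world point (3, 2, 2) in homogeneous form
print((u / w, v / w))                      # (3.5, 0.0)
```

A single matrix-vector product gives the same image point as translating, rotating and then applying the calibration matrix step by step.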