I was cleaning out some old notes from my previous job, found some math scribbles for computing CSS transforms, and thought I would share them. For some context, I was working on a page which had an image that looked like this:

I wanted to add an easter egg where I could use the screens of those devices to display arbitrary things. I thought it would just be a simple matter of pixel pushing with translate/rotate/scale/etc using CSS transforms but couldn’t really get things to line up perfectly.

Frustrated, I started trying to solve it analytically instead. That means for any given shape, I need to solve for the perspective transform that warps an element into that shape. Once that is solved, it is easy to write a WYSIWYG helper script for outputting the CSS. Here’s the final result:

See the Pen ifnqH by Franklin Ta (@fta) on CodePen.

See the CoffeeScript tab for the code. Or paste this gist into the console to try it out on any page that has jQuery. You will need to change the selector to whatever element you want to add the dots to.

Using that you can drag things into whatever (convex quadrilateral) shape you want:

I ended up not using this for anything but hope someone else will find it useful!

The rest of this post will explain how to derive the equation for the transform since I remember not being able to find much about it back then. Looking at the code you will see that the core logic is just a few lines to set up and solve a system of linear equations. We will now see how to derive that system.

So let’s say we have the 4 corners of the element we want to transform, \((x_i, y_i)\) where \(i \in \{0, 1, 2, 3\}\), and we want to map each \((x_i, y_i)\) to some \((u_i, v_i)\). From the docs of matrix3d, the transform we want is a homogeneous matrix, so we have to represent each point using homogeneous coordinates. In homogeneous coordinates, a point \((x, y)\) is represented by \((kx, ky, k)\) for any \(k \neq 0\). For example \((3, 2, 1)\) and \((6, 4, 2)\) both represent the point \((3, 2)\).
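As a quick illustration of that equivalence, here is a minimal Python sketch. The helper names `to_homogeneous` and `from_homogeneous` are made up for this example and are not part of the original script.

```python
def to_homogeneous(x, y, k=1.0):
    """Represent the 2D point (x, y) as (k*x, k*y, k) for a nonzero scale k."""
    if k == 0:
        raise ValueError("k must be nonzero")
    return (k * x, k * y, k)


def from_homogeneous(p):
    """Recover (x, y) by dividing the first two coordinates by the third."""
    a, b, k = p
    if k == 0:
        raise ValueError("third coordinate must be nonzero")
    return (a / k, b / k)


# Both triples from the text describe the same 2D point.
print(from_homogeneous((3, 2, 1)))  # (3.0, 2.0)
print(from_homogeneous((6, 4, 2)))  # (3.0, 2.0)
```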

Thus the transformation matrix \(H\) that we want to solve for must satisfy

\(\underbrace{

\begin{pmatrix}

h_0 & h_1 & h_2\\

h_3 & h_4 & h_5\\

h_6 & h_7 & h_8\\

\end{pmatrix}

}_{H}

\begin{pmatrix}

x_i\\

y_i\\

1\\

\end{pmatrix}

=

k_i

\begin{pmatrix}

u_i\\

v_i\\

1\\

\end{pmatrix}

\)

for each \(i\), where the knowns are \(x_i, y_i, u_i, v_i\).
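To make the equation concrete, here is a small Python sketch of what applying a given \(H\) to a point looks like. The function name `apply_homography` and the example matrix are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def apply_homography(H, x, y):
    """Multiply the 3x3 matrix H by (x, y, 1) and return ((u, v), k)."""
    a = H[0][0] * x + H[0][1] * y + H[0][2]  # equals k * u
    b = H[1][0] * x + H[1][1] * y + H[1][2]  # equals k * v
    k = H[2][0] * x + H[2][1] * y + H[2][2]  # the homogeneous scale k
    if k == 0:
        raise ValueError("point maps to infinity (k == 0)")
    return (a / k, b / k), k


# A made-up matrix, used only for illustration.
H = [
    [2, 0, 1],
    [0, 2, 3],
    [0.5, 0, 1],
]
print(apply_homography(H, 2, 1))  # ((2.5, 2.5), 2.0)
```

Here \(H (2, 1, 1)^T = (5, 5, 2)^T = 2 \cdot (2.5, 2.5, 1)^T\), so \(k = 2\) and the point lands on \((2.5, 2.5)\).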

Notice that the \(H\) that can satisfy this is not unique. For example you can scale \(H\) by some constant and the resulting matrix will still map the points correctly (since you can also scale the \(k_i\) by the same amount and still represent the same homogeneous point). So assuming \(h_8 \neq 0\) (see note [1]), we should always be able to scale both sides until \(h_8 = 1\), which will simplify the problem for us a little bit:

\(\begin{pmatrix}

h_0 & h_1 & h_2\\

h_3 & h_4 & h_5\\

h_6 & h_7 & 1\\

\end{pmatrix}

\begin{pmatrix}

x_i\\

y_i\\

1\\

\end{pmatrix}

=

k_i

\begin{pmatrix}

u_i\\

v_i\\

1\\

\end{pmatrix}

\)
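The scale invariance is easy to check numerically. The sketch below, with hypothetical helper names `project` and `normalize`, divides a made-up matrix by its \(h_8\) entry and confirms that the projected point does not move.

```python
def project(H, x, y):
    """Map (x, y) through the 3x3 matrix H and divide out the scale k."""
    k = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / k
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / k
    return (u, v)


def normalize(H):
    """Scale H so that its bottom-right entry h_8 equals 1 (needs h_8 != 0)."""
    h8 = H[2][2]
    if h8 == 0:
        raise ValueError("cannot normalize: h_8 is zero")
    return [[value / h8 for value in row] for row in H]


# A made-up matrix whose bottom-right entry is 2 rather than 1.
H = [
    [4, 0, 2],
    [0, 4, 6],
    [1, 0, 2],
]
H_normalized = normalize(H)

print(H_normalized[2][2])           # 1.0
print(project(H, 2, 1))             # (2.5, 2.5)
print(project(H_normalized, 2, 1))  # (2.5, 2.5)
```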

Now we should try to get it into a form that we can solve. Multiplying out we get:

\(\begin{align*}

x_i h_0 + y_i h_1 + h_2 & = k_i u_i \\

x_i h_3 + y_i h_4 + h_5 & = k_i v_i \\

x_i h_6 + y_i h_7 + 1 & = k_i \\

\end{align*}

\)

We can get rid of \(k_i\) by substituting it from the third into the first two equations:

\(\begin{align*}

x_i h_0 + y_i h_1 + h_2 & = u_i x_i h_6 + u_i y_i h_7 + u_i \\

x_i h_3 + y_i h_4 + h_5 & = v_i x_i h_6 + v_i y_i h_7 + v_i \\

\end{align*}

\)

Remember we are trying to solve for \(h_i\) so we should try to separate them out:

\(\begin{array}{rcccl}

x_i h_0 + y_i h_1 + h_2 & & - u_i x_i h_6 - u_i y_i h_7 = u_i \\

& x_i h_3 + y_i h_4 + h_5 & - v_i x_i h_6 - v_i y_i h_7 = v_i \\

\end{array}

\)

Which in matrix notation is:

\(\begin{pmatrix}

x_i & y_i & 1 & 0 & 0 & 0 & -u_i x_i & -u_i y_i \\

0 & 0 & 0 & x_i & y_i & 1 & -v_i x_i & -v_i y_i \\

\end{pmatrix}

\begin{pmatrix}

h_0\\

h_1\\

h_2\\

h_3\\

h_4\\

h_5\\

h_6\\

h_7\\

\end{pmatrix}

=

\begin{pmatrix}

u_i\\

v_i\\

\end{pmatrix}

\)
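In code, each point correspondence contributes exactly these two rows. A minimal sketch in Python, where the function name `rows_for_point` is an assumption made for this example:

```python
def rows_for_point(x, y, u, v):
    """Return the two rows of A and the two entries of b for (x, y) -> (u, v).

    The eight columns multiply the unknowns h_0 .. h_7, in that order.
    """
    rows = [
        [x, y, 1, 0, 0, 0, -u * x, -u * y],
        [0, 0, 0, x, y, 1, -v * x, -v * y],
    ]
    rhs = [u, v]
    return rows, rhs


# Example: the source corner (2, 3) should land on (5, 7).
rows, rhs = rows_for_point(2, 3, 5, 7)
print(rows[0])  # [2, 3, 1, 0, 0, 0, -10, -15]
print(rows[1])  # [0, 0, 0, 2, 3, 1, -14, -21]
print(rhs)      # [5, 7]
```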

Since we have 4 of these mappings we can write them like this:

\(\begin{pmatrix}x_0 & y_0 & 1 & 0 & 0 & 0 & -u_0 x_0 & -u_0 y_0 \\

0 & 0 & 0 & x_0 & y_0 & 1 & -v_0 x_0 & -v_0 y_0 \\

x_1 & y_1 & 1 & 0 & 0 & 0 & -u_1 x_1 & -u_1 y_1 \\

0 & 0 & 0 & x_1 & y_1 & 1 & -v_1 x_1 & -v_1 y_1 \\

x_2 & y_2 & 1 & 0 & 0 & 0 & -u_2 x_2 & -u_2 y_2 \\

0 & 0 & 0 & x_2 & y_2 & 1 & -v_2 x_2 & -v_2 y_2 \\

x_3 & y_3 & 1 & 0 & 0 & 0 & -u_3 x_3 & -u_3 y_3 \\

0 & 0 & 0 & x_3 & y_3 & 1 & -v_3 x_3 & -v_3 y_3 \\

\end{pmatrix}

\begin{pmatrix}

h_0\\

h_1\\

h_2\\

h_3\\

h_4\\

h_5\\

h_6\\

h_7\\

\end{pmatrix}

=

\begin{pmatrix}

u_0 \\

v_0 \\

u_1 \\

v_1 \\

u_2 \\

v_2 \\

u_3 \\

v_3 \\

\end{pmatrix}

\)

At this point we are done because it is in \(Ah = b\) form, so we can just throw this at a matrix algebra library to solve for \(h\). It should spit back out \(h_0, \ldots, h_7\), which together with \(h_8 = 1\) from the normalization above lets us recover the transform we want:

\(H =

\begin{pmatrix}

h_0 & h_1 & h_2\\

h_3 & h_4 & h_5\\

h_6 & h_7 & h_8\\

\end{pmatrix}

\)
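Putting the pieces together, here is a self-contained Python sketch of the whole solve. It is not the original CoffeeScript: the function names are made up, and a plain Gaussian elimination with partial pivoting stands in for the matrix algebra library so that the example has no dependencies.

```python
def solve_linear_system(A, b):
    """Solve A h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every row below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    h = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * h[c] for c in range(r + 1, n))
        h[r] = (M[r][n] - known) / M[r][r]
    return h


def homography_from_corners(src, dst):
    """Build the 8x8 system from four (x, y) -> (u, v) pairs and solve it."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = solve_linear_system(A, b)
    # h_8 is fixed to 1 by the normalization in the derivation.
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, x, y):
    """Map (x, y) through H and divide out the homogeneous scale."""
    k = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / k
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / k
    return (u, v)


# Corners of a 100x100 element and an arbitrary convex target shape.
src = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst = [(10, 20), (90, 10), (120, 110), (5, 95)]
H = homography_from_corners(src, dst)
for (x, y), target in zip(src, dst):
    print((x, y), "->", project(H, x, y), "expected", target)
```

Each printed projection should agree with its target corner up to floating point error. In practice a library solver is the better choice; the hand-written elimination is only there to keep the sketch runnable on its own.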

One last wrinkle is that matrix3d actually takes in a 4 by 4 matrix rather than a 3 by 3. Since we don’t care about \(z\) values (because all our points are on the same plane, \(z=0\)) we can just make \(z\) map back to itself. Like so:

\(\begin{pmatrix}

h_0 & h_1 & 0 & h_2\\

h_3 & h_4 & 0 & h_5\\

0 & 0 & 1 & 0\\

h_6 & h_7 & 0 & h_8\\

\end{pmatrix}

\)

And that’s the final matrix you use for matrix3d. Remember to specify it in column-major order and also set the transform-origin to whatever you measured your points with respect to.
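The column-major ordering is easy to get wrong, so here is a short Python sketch of the conversion. The function name `matrix3d_css` and the number formatting are choices made for this example only.

```python
def matrix3d_css(H):
    """Format a 3x3 homography as a CSS matrix3d() value (column-major)."""
    (h0, h1, h2), (h3, h4, h5), (h6, h7, h8) = H
    # Embed the 3x3 matrix in a 4x4 one that leaves z untouched.
    M = [
        [h0, h1, 0, h2],
        [h3, h4, 0, h5],
        [0, 0, 1, 0],
        [h6, h7, 0, h8],
    ]
    # Column-major order: walk down each column in turn.
    values = [M[row][col] for col in range(4) for row in range(4)]
    return "matrix3d(" + ", ".join(format(v, ".10g") for v in values) + ")"


# A pure translation by (10, 20) as a sanity check.
print(matrix3d_css([[1, 0, 10], [0, 1, 20], [0, 0, 1]]))
# matrix3d(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 10, 20, 0, 1)
```

The translation ends up in the 13th and 14th slots, which is where matrix3d expects it. If the corner coordinates were measured from the element's top-left corner, the matching declaration would be `transform-origin: 0 0`.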

I didn’t know what to google when I first did this so I had to derive this by hand. I have recently been reading a book on computer vision and it turns out this is actually a basic problem in that field (see getPerspectiveTransform in OpenCV), and the technique for solving it is called direct linear transformation.

Notes:

[1] If you want to solve this rigorously without making the assumption that \(h_8 \neq 0\) you can still follow the same steps outlined in this post. You will just end up with a homogeneous system \(Ah = 0\) instead (where \(A\) is now an 8 by 9 matrix and \(h\) is a 9-vector). This can be solved by doing a singular value decomposition and then finding the singular vector corresponding to a singular value of zero. Then any scalar multiple of that singular vector will be a solution. The js library I was using for math didn’t support SVD very well though!
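For reference, this is what the two rows for one point look like in that case. Keeping \(h_8\) as an unknown, the third equation becomes \(x_i h_6 + y_i h_7 + h_8 = k_i\), and substituting it into the first two gives:

\(\begin{pmatrix}
x_i & y_i & 1 & 0 & 0 & 0 & -u_i x_i & -u_i y_i & -u_i \\
0 & 0 & 0 & x_i & y_i & 1 & -v_i x_i & -v_i y_i & -v_i \\
\end{pmatrix}
\begin{pmatrix}
h_0 & h_1 & h_2 & h_3 & h_4 & h_5 & h_6 & h_7 & h_8
\end{pmatrix}^T
=
\begin{pmatrix}
0 \\
0 \\
\end{pmatrix}
\)

Setting \(h_8 = 1\) and moving the last column to the right-hand side gives back the two rows used in the main derivation.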

EDIT (2016-11-09): As noted in the comments below, matrix3d used to be incorrect on Chrome under page zoom. I contributed a fix to chromium for it so it should be fine now.

The matrix you’re describing is called a “homography” matrix in computer vision. Generally, to describe the projection of a point (x,y,z,1) in the real 3D world onto an image point (u,v,1), you need a 3×4 operator, which rotates, scales, and translates the 3D point before normalizing the final coordinate to 1. But as you point out, if you merely need to project a planar object to the image plane, a 3×3 matrix suffices. That’s because the plane has a local 2D coordinate system, where points take on the coordinates (s,t,1). A linear 4×3 transform maps (s,t,1) to 3D locations (x,y,z,1), which are then mapped to (u,v,1) via a projection matrix. The homography matrix is the product of these two matrices, and performs these two transformations in one fell swoop.

That is ******* awesome!!!! I just love the way HTML, CSS & Javascript is going now.

Apologies for the cursing but I felt anything less would not convey the awesomeness of it all.

Damn, boy! You’re the goddamn boss! I have no idea how I ended up here but I’m glad I did. Thanks <3

exactly the same, you said my words

Wow! This really looks nice and gives the experience of the user a whole new dimension. Simple and yet so powerful. Thanks for sharing!

Got here from Codrops and I’m in awe. Damn that’s cool!

Well done and well explained!

I set the top left point to (0,0) and set the translation via the CSS “left” and “top” attributes. Then I only have to solve a 6×6 system instead of an 8×8 one.

It’s a pity that it works only for 100% zoom of a browser window

Great post and solution, here you can see a commercial application for it:

http://placeit.net/stages/interactive-image-ipad-landscape?background=3_vi&f_types=interactive_image

Enjoy

Great post! Saved a lot of time!

Are you a witch?

Awesome post! thank you!

Holy crap, thank you! This is the thoroughest (most thorough?) treatment of the matrix3d transform I’ve ever seen! You clearly didn’t sleep through linear algebra like I did.

First of all – really great work. But it nearly killed me – I had a really strange behavior – on my demo everything was ok at first, but suddenly everything was shifted and misaligned. So I checked the math behind it and whether I had messed up the code. After a long time of debugging, and already close to giving up, I found my mistake… I had accidentally zoomed in Google Chrome, and that messes up the transformation completely (give it a try). Surprisingly, IE does it correctly even when zoomed.

OMG… This is amazing thing….. Amazingly amazing…

you are awesome man. Keep bringing it on. Where is the subscribe box?

Hi,

I’m trying to do the same in C# using OpenGL to identify projection matrix starting from 4 points.

Is there anybody that has already done that kind of stuff?

I don’t know if someone has already done it but it’s not too hard to do yourself.

For example here I used the same math for three.js (a WebGL framework): http://codepen.io/anon/pen/gbXdXm. That demo was used for reporting a bug in three.js r70 so you should see 3 red boxes, where the last two are wrong due to bugs in their matrix multiplication (EDIT: updated for r83). For your case the tl;dr is to add an orthographic projection if you want to use the same math outlined in this post.

By the way, if your real problem is to solve for the position/rotation of an existing camera, then I would recommend using a solvePnP approach instead. Slightly touched upon in this blog post (using opencv/opengl):

http://franklinta.com/2014/09/30/6dof-positional-tracking-with-the-wiimote/