Quaternions: the unsung heroes
If you've ever played a video game or watched a smooth 3D animation, you've likely enjoyed the benefits of quaternions without realizing it. These mathematical entities make all those rotations in 3D space feel natural. Consider your character in a game—whether it's turning their head or changing direction mid-jump, all those fluid movements are powered by some interesting math.
Quaternions are a 4-dimensional number system whose elements are written as $a + b\hat{i} + c\hat{j} + d\hat{k}$, where $a, b, c, d$ are real numbers. They help you avoid problems like gimbal lock, where a system of rotating axes loses one of its degrees of freedom and can no longer rotate freely.
We'll explain more about gimbal lock later, but imagine playing a game where the camera misbehaves as soon as you tilt it beyond a certain angle: it either loses one axis of rotation or freezes up completely. Situations like these would ruin your experience. Quaternions prevent such issues and keep everything fluid and smooth, so your camera does not go kaput in intense situations.
This is just one example of how quaternions operate behind the scenes in our daily lives, mostly without anyone noticing.
If you're into math, quaternions are a fascinating topic. They combine numbers and rotations in a way that feels unfamiliar at first, yet they have real applications in game development, animation, aerospace engineering, and robotics.
Quaternions extend the idea of complex numbers to 4D space. They are not just a convenient tool for problems involving rotation, stability, and control; they also represent a deeper mathematical structure that connects algebra, geometry, and even physics. For instance:
Quaternion multiplication is non-commutative.
In most familiar number systems, the order of multiplication does not matter:
$$ab=ba$$
However, with quaternions $\hat{i}\hat{j}=\hat{k}$ but $\hat{j}\hat{i}=-\hat{k}$. This property is called non-commutativity, and it is what sets quaternions apart from the real and complex numbers.
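To see the non-commutativity concretely, here is a minimal sketch of quaternion multiplication (the Hamilton product) in plain Python. The tuple layout `(a, b, c, d)` and the helper name `qmul` are choices made for this illustration, not something fixed by the article.

```python
def qmul(p, q):
    """Hamilton product of two quaternions stored as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

print(qmul(i, j))  # (0, 0, 0, 1), which is k
print(qmul(j, i))  # (0, 0, 0, -1), which is -k
```

Swapping the two factors flips the sign of the result, exactly as the rule above says.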
So, what are quaternions?
Quaternions are popularly defined as four-dimensional numbers, or a four-dimensional extension of the complex numbers. But what does this actually mean? A dimension is essentially one of the independent values you need to uniquely specify something in a system. To illustrate, let's say you have a road full of potholes and you want to assign a sort of address to each pothole. You need only two dimensions: the horizontal and vertical distance from a reference point.
Another important thing to note when relating dimensions to real-life examples is that the dimension is the minimum number of values required to specify the system in question. In the pothole example, we could add plenty of redundant values, like the left half of its area, the top left sector and so on, but we only need two distinct values. This is similar to how we use multiple parameters to describe an address, like the house number, street, and area code, when we really only need the latitude and the longitude.
And an address must be unique, right? In other words, one address should only correspond to one location. You can't have two houses with the same address.
This is the basic idea. We can extend it to mathematical objects: the dimension is the number of linearly independent values necessary to specify a point. Linearly independent means you can't obtain one value by scaling and adding the others. For example, a line needs only $1$ value, the distance from a reference point, to specify each point on it, so it is one-dimensional (1D). The world we live in can be described as a three-dimensional (3D) space, as we need the length, width and height from a reference point to specify a location.
All right, now let's apply the idea of dimensions to numbers, using the informal definition from before: the minimum number of values needed to specify something. For a real number, all the information is contained in the one value we use to specify it, like $2$ or $\pi$ or $e$. A complex number, however, requires two real numbers to specify it, which is what makes it two-dimensional (2D). Look below:
$$ a + bi $$
where $a,b \in \mathbb{R}$ and $i$ is the imaginary unit, which satisfies $i^2 = -1$. Here $a$ is called the real part and $b$ the imaginary part of the complex number.
A complex number can in fact be represented as a point on a plane, called the Argand plane. The real part is plotted on the x-axis and the imaginary part on the y-axis.
Understanding the quaternion group
A group is a set equipped with an operation that obeys a few specific rules, and these rules let us describe symmetries. Analyzing the structure of a group tells us a lot about the object whose symmetries it describes, because it helps us characterize and classify that object. Studying groups can also help us make connections between different parts of math and science.
The basis elements of the quaternions together with their negatives, $\{1, -1, \hat{i}, -\hat{i}, \hat{j}, -\hat{j}, \hat{k}, -\hat{k}\}$, form a group called $\mathbb{Q}_{8}$, with multiplication as the group operation. But what are these actually? Are they just some random abstract objects? Well, they are more than that, and we have a trick up our sleeve to show you what they mean practically. Matrices! Yes, that's right: quaternions can be represented as matrices. They are $2 \times 2$ complex matrices, analogous to how complex numbers can be represented as $2 \times 2$ real matrices.
$$ q=\left(\begin{array}{cc}a+i d & -b-i c \\b-i c & a-i d\end{array}\right) $$
While switching to this matrix representation we have to make sure that the algebraic properties are retained; in particular, the rules for how $\hat{i}, \hat{j}$ and $\hat{k}$ multiply must be preserved. Note that $i$ without a hat is the ordinary complex imaginary unit, not the quaternion $\hat{i}$. Accordingly, the basis elements of the $\mathbb Q_8$ group are represented by the following matrices:
$$ 1=\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right), \quad \hat{i}=\left(\begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array}\right), \quad \hat{j}=\left(\begin{array}{cc} 0 & -i \\ -i & 0 \end{array}\right), \quad \hat{k}=\left(\begin{array}{cc} i & 0 \\ 0 & -i \end{array}\right) $$
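As a quick sanity check, the sketch below multiplies these four matrices in plain Python and confirms that they obey the quaternion rules $\hat{i}^2 = \hat{j}^2 = \hat{k}^2 = \hat{i}\hat{j}\hat{k} = -1$. The helper names (`matmul`, `neg`) and the nested-tuple storage are assumptions made for this illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


def neg(A):
    """Entry-wise negation of a matrix."""
    return tuple(tuple(-x for x in row) for row in A)


# The four basis matrices from the text (1j is Python's imaginary unit).
ONE = ((1, 0), (0, 1))
QI = ((0, -1), (1, 0))
QJ = ((0, -1j), (-1j, 0))
QK = ((1j, 0), (0, -1j))

print(matmul(QI, QI) == neg(ONE))              # i^2 = -1
print(matmul(QJ, QJ) == neg(ONE))              # j^2 = -1
print(matmul(QK, QK) == neg(ONE))              # k^2 = -1
print(matmul(matmul(QI, QJ), QK) == neg(ONE))  # ijk = -1
print(matmul(QI, QJ) == QK)                    # ij = k
print(matmul(QJ, QI) == neg(QK))               # ji = -k
```

Every line should print `True`, which is what "the algebra is preserved" means in practice.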
However, these matrices can also be structured differently to incorporate something that physics aficionados will definitely be familiar with: the Pauli matrices $\sigma_x, \sigma_y$ and $\sigma_z$, which are given by:
$$ \begin{aligned}& \sigma_1=\sigma_x=\left(\begin{array}{ll}0 & 1 \\1 & 0\end{array}\right) \\& \sigma_2=\sigma_y=\left(\begin{array}{cc}0 & -i \\i & 0\end{array}\right) \\& \sigma_3=\sigma_z=\left(\begin{array}{cc}1 & 0 \\0 & -1\end{array}\right)\end{aligned} $$
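In terms of these, the $\hat{i}, \hat{j}, \hat{k}$ matrices can alternatively be chosen as follows (a different assignment from the matrices listed earlier, but one that obeys the same multiplication rules):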
$$ \begin{aligned} & \hat{i}= i \sigma_z \\ & \hat{j} = -i \sigma_y \\ & \hat{k} = -i \sigma_x \end{aligned} $$
If you look carefully, the quaternion algebra is still preserved (remember the Pauli matrix multiplication rules, in particular $\sigma_z\sigma_y = -i\sigma_x$). For example: $\hat{i}\hat{j} = (i)(-i)\,\sigma_z\sigma_y = 1 \cdot (-i\sigma_x) = -i\sigma_x = \hat{k}$.
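The same kind of check works for the Pauli-based assignment. The following sketch (again plain Python, with hypothetical helper names `matmul` and `scale`) builds $\hat{i} = i\sigma_z$, $\hat{j} = -i\sigma_y$, $\hat{k} = -i\sigma_x$ and verifies the cyclic products.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


def scale(s, A):
    """Multiply every entry of a matrix by the scalar s."""
    return tuple(tuple(s * x for x in row) for row in A)


# Pauli matrices
SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))

# Quaternion units expressed through the Pauli matrices
QI = scale(1j, SZ)
QJ = scale(-1j, SY)
QK = scale(-1j, SX)

MINUS_ONE = ((-1, 0), (0, -1))

print(matmul(QI, QJ) == QK)  # ij = k
print(matmul(QJ, QK) == QI)  # jk = i
print(matmul(QK, QI) == QJ)  # ki = j
print(all(matmul(Q, Q) == MINUS_ONE for Q in (QI, QJ, QK)))  # squares are -1
```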
This is indeed a cool link between the mathematics of spin and quaternions!
Now that we know what these things are, let's verify that they form a group by checking that they obey the rules, i.e. the group axioms. But before that, let's motivate why the rules are what they are. We told you earlier that groups are related to symmetries. Thus, these axioms can be thought of as properties of symmetries:
- Closure: Symmetries leave the object unchanged, so doing one symmetry after the other obviously leaves the object unchanged too. This is what closure is all about: composing elements of a group always gives another element of the group. The group is like a box with the lid closed.
- Associativity: When combining three symmetries in a fixed order, it doesn't matter which neighboring pair you combine first. Imagine a cow looking to your left, which we can think of as a spear with a dot on its head pointing to your left. Flip the spear about the vertical axis and rotate it by 90 degrees clockwise, then rotate the result by a further 30 degrees clockwise. Now restart from the beginning, but this time flip it first and then do a single 120 degree clockwise rotation: you end up with the same thing.
- Identity: It's the element that does nothing when it combines with any other element. It's like that one teammate you have, someone who is just there but does nothing to change the outcome of anything. Obviously doing nothing is also a symmetry, so the group must contain an identity element.
- Inverses: Inverses ensure that every action can be undone, adding reversibility and completeness. Undoing a symmetry has to be a symmetry as well: if the original action left the object unchanged, then reversing it must leave the object unchanged too. Therefore, every element in a group has a unique undo button, its "ctrl + z": for any element $a$, there's another element $a^{-1}$ that, when combined with $a$, returns you to the identity.
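Because $\mathbb{Q}_8$ has only eight elements, all four axioms can be checked by brute force. The sketch below does exactly that in plain Python; the tuple representation `(a, b, c, d)` and the names `qmul`, `Q8` and `inverses` are assumptions of this illustration.

```python
from itertools import product


def qmul(p, q):
    """Hamilton product of two quaternions stored as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def neg(q):
    return tuple(-x for x in q)


basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q8 = set(basis) | {neg(b) for b in basis}  # {1, -1, i, -i, j, -j, k, -k}
identity = (1, 0, 0, 0)

# Closure: every product of two elements is again in Q8.
closure = all(qmul(x, y) in Q8 for x, y in product(Q8, repeat=2))

# Associativity: (xy)z == x(yz) for every triple.
associative = all(
    qmul(qmul(x, y), z) == qmul(x, qmul(y, z))
    for x, y, z in product(Q8, repeat=3)
)

# Identity: 1 leaves every element unchanged on both sides.
has_identity = all(qmul(identity, x) == x == qmul(x, identity) for x in Q8)

# Inverses: for every x, find the y with xy == yx == 1.
inverses = {
    x: next(y for y in Q8 if qmul(x, y) == identity == qmul(y, x))
    for x in Q8
}

print(closure, associative, has_identity, len(inverses) == 8)
```

For example, `inverses[(0, 1, 0, 0)]` comes out as `(0, -1, 0, 0)`: the inverse of $\hat{i}$ is $-\hat{i}$.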
Now that you guys have a concrete understanding of the group axioms, let us show you a cool way to visualize this group. This is done using a Cayley graph.
A Cayley graph is a graphical representation of a group where each node represents a group element, and each arrow represents multiplication by one of a chosen set of generating elements of the group.
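The arrows of such a graph can be listed with a few lines of code. This sketch assumes right-multiplication by the two generators $\hat{i}$ and $\hat{j}$ (one common choice, picked here for illustration) and walks outward from the identity; the names `qmul`, `name`, `GENERATORS` and `edges` are hypothetical.

```python
from collections import deque


def qmul(p, q):
    """Hamilton product of two quaternions stored as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def name(q):
    """Readable label such as '-k' for a signed basis element."""
    labels = ("1", "i", "j", "k")
    idx = next(n for n, x in enumerate(q) if x != 0)
    return ("-" if q[idx] < 0 else "") + labels[idx]


ONE = (1, 0, 0, 0)
GENERATORS = {"i": (0, 1, 0, 0), "j": (0, 0, 1, 0)}

# Breadth-first walk from the identity, following one arrow per generator.
edges = []
seen = {ONE}
queue = deque([ONE])
while queue:
    g = queue.popleft()
    for label, gen in GENERATORS.items():
        h = qmul(g, gen)  # arrow g --label--> g * gen
        edges.append((name(g), label, name(h)))
        if h not in seen:
            seen.add(h)
            queue.append(h)

for source, label, target in edges:
    print(f"{source} --{label}--> {target}")
```

The walk reaches all eight elements and produces sixteen arrows (two leaving every node), which is the data you would draw as the Cayley graph.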
Quaternions vs Matrices and Euler angles
When it comes to representing rotations in 3D space, Euler angles, rotation matrices, and quaternions each have advantages and disadvantages of their own.
Euler angles are the easiest to understand because they break rotations into three separate components: pitch (up/down), yaw (left/right), and roll (tilting). This simplicity makes Euler angles easy to work with and visualize, especially in applications like flight simulators or camera controls. However, Euler angles suffer from a major limitation: rotating about one axis changes the orientation of the other axes. What we mean is illustrated in the diagram below:
Let's say the red, blue, and green rings represent the different axes (x, y, and z). Imagine that the rings are connected, so rotating one ring also changes the position of the others. What happens when two of the rings align? You have effectively lost an axis of rotation. Look at the gif below to see what we mean:
Here, after the green and pink axes align, you can only rotate about two axes instead of three. Even if gimbal lock doesn't occur, the changing axes are a problem: for example, if we want to rotate our object about the x-axis, it may end up rotating about some other axis instead.
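Gimbal lock can also be seen numerically. The sketch below assumes one particular Euler convention (yaw about z, then pitch about y, then roll about x, composed as $R_z R_y R_x$); the article does not fix a convention, so treat this choice and the helper names as assumptions. At a pitch of 90 degrees, two different yaw/roll pairs give the same rotation, meaning one degree of freedom has been lost.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
        for r in range(3)
    ]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def euler_zyx(yaw, pitch, roll):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll) (assumed convention)."""
    return matmul3(rot_z(yaw), matmul3(rot_y(pitch), rot_x(roll)))


def close(A, B, tol=1e-9):
    return all(abs(A[r][c] - B[r][c]) < tol for r in range(3) for c in range(3))


deg = math.radians

# Pitch locked at 90 degrees: only the difference (yaw - roll) matters.
locked_a = euler_zyx(deg(30), deg(90), deg(10))
locked_b = euler_zyx(deg(50), deg(90), deg(30))

# Same yaw/roll pairs with zero pitch: the rotations are genuinely different.
free_a = euler_zyx(deg(30), deg(0), deg(10))
free_b = euler_zyx(deg(50), deg(0), deg(30))

print(close(locked_a, locked_b))  # True
print(close(free_a, free_b))      # False
```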
Rotation matrices, however, are a more reliable option for systems that need accurate and continuous rotations, since they do not suffer from gimbal lock. A rotation matrix is a $3 \times 3$ matrix that can represent any 3D rotation. When paired with other matrices, it can also handle scaling and translation in addition to rotations, which makes it useful for applications such as computer graphics and physics simulations. Rotation matrices do have certain drawbacks, though. To start, they are less compact than quaternions, which need only four parameters to represent a rotation whereas rotation matrices require nine. Multiplying two rotation matrices also takes a computer more arithmetic than multiplying two quaternions.
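As a rough worked count (a back-of-the-envelope sketch that assumes the naive multiplication algorithms and ignores hardware effects such as vectorization), composing two rotations costs:

$$ \begin{aligned} \text{two } 3 \times 3 \text{ matrices:} &\quad 9 \times 3 = 27 \text{ multiplications}, \quad 9 \times 2 = 18 \text{ additions} \\ \text{two quaternions:} &\quad 4 \times 4 = 16 \text{ multiplications}, \quad 4 \times 3 = 12 \text{ additions} \end{aligned} $$

Each of the nine matrix entries is a sum of three products, while each of the four quaternion components is a sum of four products.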
Rotations using quaternions
Now the question is: how do quaternions produce rotations? First of all, what is a rotation? A rotation is a transformation that obeys three rules, and if you think about them in terms of the rotations you are used to in daily life, they become quite intuitive.
- It must be a linear transformation
- Lengths and distances must be preserved: Imagine two arrows on a globe. Rotating the globe preserves the lengths of both arrows and the distance between them.
- Orientations must be preserved: Using the same example, notice that the first arrow is still on the left/right (however you imagined it) of the second.
The set of matrices that obey these conditions for Euclidean vectors in three dimensions forms a group called $SO(3)$. For a matrix $M$, the conditions are $MM^T = I$ and $\det(M) = 1$. These are just the mathematical statements of the second and third rules above.
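The two matrix conditions are easy to test numerically. Here is a minimal sketch in plain Python (the helper names `is_rotation`, `det3` and friends are made up for this example) that accepts a rotation about the z-axis and rejects a mirror reflection, which preserves lengths but flips orientation.

```python
import math


def transpose(M):
    return [[M[c][r] for c in range(3)] for r in range(3)]


def matmul3(A, B):
    return [
        [sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
        for r in range(3)
    ]


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_rotation(M, tol=1e-9):
    """Check M * M^T == I (lengths preserved) and det(M) == 1 (orientation preserved)."""
    P = matmul3(M, transpose(M))
    orthogonal = all(
        abs(P[r][c] - (1.0 if r == c else 0.0)) < tol
        for r in range(3)
        for c in range(3)
    )
    return orthogonal and abs(det3(M) - 1.0) < tol


theta = math.radians(40)
rotation_z = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

print(is_rotation(rotation_z))  # True
print(is_rotation(reflection))  # False: M M^T = I holds, but det = -1
```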
Okay, let's get back to quaternions. For them, rotations are described by $qvq^{-1}$. Here $q$ is a unit quaternion, i.e. a quaternion of unit length,
$$ |q| = \sqrt{a^2 + b^2 + c^2 + d^2} =1 $$
and $q^{-1}$ is its inverse given by,
$$ q^{-1} = \frac{a-b\hat{i}-c\hat{j}-d\hat{k}}{a^2+b^2+c^2+d^2} = \frac{q^{*}}{|q|^2} $$
Meanwhile, $v$ is what is called a pure quaternion: a quaternion with real part zero, i.e. $a = 0$. This is literally how we bring the 4D quaternions into our 3D world, because a pure quaternion can be thought of as a vector in $\mathbb{R}^3$ instead of $\mathbb{R}^4$, and conjugating it with a unit quaternion gives another pure quaternion, so we stay within that space.
If you noticed carefully, we still haven't shown you why this is a rotation; we just told you that it is. Here is why. The map $v \mapsto qvq^{-1}$ is linear in $v$. It preserves lengths because of the multiplicative absolute value property of quaternions ($|q_1q_2| = |q_1||q_2|$, so $|qvq^{-1}| = |q||v||q^{-1}| = |v|$ for a unit quaternion), and since it is linear, preserving lengths also means preserving distances (it is an isometry). Finally, orientations are preserved: in the matrix representation, unit quaternions have determinant 1 ($\det(q) = |q|^2 = 1$), and every unit quaternion can be reached continuously from $q = 1$, which leaves $v$ unchanged, so the transformation can never flip orientation. So there you have it: all the properties of a rotation are satisfied.
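To make the sandwich product $qvq^{-1}$ concrete, here is a minimal Python sketch. It uses the standard parametrisation of a unit quaternion by an angle $\theta$ and a unit axis $(n_x, n_y, n_z)$, namely $q = \cos(\theta/2) + \sin(\theta/2)(n_x\hat{i} + n_y\hat{j} + n_z\hat{k})$; that formula is not derived in this article, so treat it as an assumption of the example, along with the helper names.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qinv(q):
    """Inverse q* / |q|^2, as in the formula above."""
    a, b, c, d = q
    n2 = a * a + b * b + c * c + d * d
    return (a / n2, -b / n2, -c / n2, -d / n2)


def rotate(v, axis, angle):
    """Rotate the 3D vector v about a unit-length axis by angle (radians)."""
    half = angle / 2
    s = math.sin(half)
    q = (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)
    pure = (0.0, v[0], v[1], v[2])  # embed v as a pure quaternion
    _, x, y, z = qmul(qmul(q, pure), qinv(q))  # real part stays (numerically) zero
    return (x, y, z)


# Rotating the x-axis by 90 degrees about the z-axis should give the y-axis.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))  # approximately (0, 1, 0)
```

Up to floating-point rounding, the result is the y-axis, and the length of the input vector is unchanged.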
Quaternions resolve many of the problems that come with rotation matrices and Euler angles. They can describe any 3D rotation with just four real numbers and provide smooth rotations without the problem of gimbal lock. They are also well suited to real-time applications like video games, animation, and spacecraft navigation, since they are less prone to numerical drift (a drifting quaternion only needs to be renormalized to unit length) and cheaper to compose than matrices. They do have drawbacks: because they have four dimensions, they can be challenging to visualize and understand intuitively, and extracting useful angles such as pitch, yaw, and roll from a quaternion takes extra computation, which makes direct manipulation of these angles less convenient. Still, their efficiency and stability when handling complex rotations is why they are so widely used in practice.
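To give a feel for the extra computation involved, here is a sketch of one common way to recover yaw, pitch, and roll from a unit quaternion. It assumes the z-y-x (yaw, then pitch, then roll) convention and the component order $(a, b, c, d)$ used in this article; other conventions lead to different formulas, and the function name is hypothetical.

```python
import math


def quaternion_to_euler_zyx(q):
    """Return (yaw, pitch, roll) in radians for a unit quaternion (a, b, c, d).

    Assumes the rotation is composed as yaw about z, then pitch about y,
    then roll about x.
    """
    a, b, c, d = q
    roll = math.atan2(2 * (a * b + c * d), 1 - 2 * (b * b + c * c))
    # Clamp to [-1, 1] so rounding errors cannot push asin out of its domain.
    sin_pitch = max(-1.0, min(1.0, 2 * (a * c - d * b)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2 * (a * d + b * c), 1 - 2 * (c * c + d * d))
    return yaw, pitch, roll


# A 90 degree turn about the z-axis is a pure yaw.
half = math.pi / 4
q = (math.cos(half), 0.0, 0.0, math.sin(half))
yaw, pitch, roll = quaternion_to_euler_zyx(q)
print(math.degrees(yaw), math.degrees(pitch), math.degrees(roll))
```

Three inverse trigonometric calls are needed just to read the angles back, compared with simply storing them in the Euler representation.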
In summary
Quaternions are a cool and useful mathematical idea: they are 4-dimensional numbers with a rich group structure, and they describe rotations in a way that is useful in all the applications mentioned in this article. Their purpose, however, doesn't stop at describing 3-dimensional rotations. They offer a unified perspective in certain areas of physics, like quantum mechanics, relativity, and the standard model of particle physics. For instance, in special relativity quaternions can be used to provide a compact alternative form of space-time transformations.
Our hope is that by the end of this article you have developed an appreciation for the beautiful math of quaternions and continue to carry forward this appreciation to other areas of math as well.