Proving Universality

1. Introduction

What does it mean for a computer to be able to do everything that it could possibly do? This was a question tackled by Alan Turing before we even had a good idea of what a computer was, or how to build one.

To ask this question for our classical computers, and specifically for our standard digital computers, we need to strip away all the screens, speakers and fancy input devices. What we are left with is simply a machine that converts input bit strings into output bit strings. If a device can perform any such conversion, taking any arbitrary set of inputs and converting them to an arbitrarily chosen set of corresponding outputs, we call it universal.

Quantum computers similarly take input states and convert them into output states. We will therefore be able to define universality in a similar way. To be more precise, and to be able to prove when universality can and cannot be achieved, it is useful to use the matrix representation of our quantum gates. But first we'll need to brush up on a few techniques.

2. Fun With Matrices

2.1 Matrices as outer products

In previous sections we have calculated many inner products, such as $\langle0|0\rangle =1$. These combine a bra and a ket to give us a single number. We can also combine them in a way that gives us a matrix, simply by putting them in the opposite order. This is called an outer product, and works by standard matrix multiplication. For example

$$ |0\rangle\langle0|= \begin{pmatrix} 1 \\\\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 1&0 \\\\ 0&0 \end{pmatrix},\\\\ |0\rangle\langle1| = \begin{pmatrix} 1 \\\\ 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0&1 \\\\ 0&0 \end{pmatrix},\\\\ |1\rangle\langle0| = \begin{pmatrix} 0 \\\\ 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 0&0 \\\\ 1&0 \end{pmatrix},\\\\ |1\rangle\langle1| = \begin{pmatrix} 0 \\\\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0&0 \\\\ 0&1 \end{pmatrix}.\\\\ $$

This also means that we can write any matrix purely in terms of outer products. In the examples above, we constructed the four matrices that cover each of the single elements in a single-qubit matrix, so we can write any other single-qubit matrix in terms of them.

$$ M= \begin{pmatrix} m_{0,0}&m_{0,1} \\\\ m_{1,0}&m_{1,1} \end{pmatrix} = m_{0,0} |0\rangle\langle0|+ m_{0,1} |0\rangle\langle1|+ m_{1,0} |1\rangle\langle0|+ m_{1,1} |1\rangle\langle1| $$

This property also extends to matrices for any number of qubits, $n$. We simply use the outer products of the corresponding $n$-bit strings.

2.2 Unitary and Hermitian matrices

The Hermitian conjugate $M^\dagger$ of a matrix $M$ is the combination of the transpose (replace the bottom left element with the top right, and so on) and the complex conjugate of each element. Two families of matrices that are very important to quantum computing are defined by their relationship with the Hermitian conjugate. One is the family of unitary matrices, for which

$$ U U^\dagger = U^\dagger U = 1. $$

This means that the Hermitian conjugate of a unitary is its inverse: another unitary $U^\dagger$ with the power to undo the effects of $U$. All gates in quantum computing, with the exception of measurement and reset operations, can be represented by unitary matrices.

Another consequence of unitarity is that it preserves the inner product between two arbitrary states. Specifically, take two states $\left| \psi_0 \right\rangle$ and $\left| \psi_1 \right\rangle$. The inner product of these is $\left\langle \psi_0 | \psi_1 \right\rangle$. If we apply the same unitary $U$ to both, the inner product of the resulting states is exactly the same,

$$ \left( \left\langle \psi_0 \right| U^\dagger \right) \left( U \left| \psi_1 \right\rangle \right) = \left\langle \psi_0 |U^\dagger U| \psi_1 \right\rangle = \left\langle \psi_0 | \psi_1 \right\rangle. $$

This property provides us with a useful way of thinking about these gates. It means that for any set of states $\{ \left| \psi_j \right\rangle \}$ that provide an orthonormal basis for our system, the set of states $\{ \left| \phi_j \right\rangle = U \left| \psi_j \right\rangle \}$ will also be an orthonormal basis. The unitary can then be thought of as a rotation between these bases, and can be written accordingly as

$$ U = \sum_j \left| \phi_j \right\rangle \left\langle \psi_j \right|. $$

This is essentially the quantum version of the 'truth tables' that describe the action of classical Boolean gates.

The other important family of matrices are the Hermitian matrices. These are those that are unaffected by the Hermitian conjugate

$$ H = H^\dagger. $$

The matrices $X$, $Y$, $Z$ and $H$ are examples of Hermitian matrices that you've already seen (coincidentally, they are also all unitary since they are their own inverses).

All unitary matrices and Hermitian matrices have the property of being diagonalizable. This means that they can be written in the form

$$ M = \sum_j \lambda_j |h_j\rangle\langle h_j|, $$

where the $\lambda_j$ are the eigenvalues of the matrix and $|h_j\rangle$ are the corresponding eigenstates.

For unitaries, applying the $U U^\dagger=1$ condition in this diagonal form imples that $\lambda_j \lambda_j^* = 1$. The eigenvalues are therefore always complex numbers of magnitude 1, and so can be expressed $e^{ih_j}$ for some real value $h_j$. For Hermitian matrices the condition $H = H^\dagger$ implies $\lambda_j = \lambda_j^*$, and hence that all eigenvalues are real.

These two types of matrices therefore differ only in that one must have real numbers for eigenvalues, and the other must have complex exponentials of real numbers. This means that, for every unitary, we can define a corresponding Hermitian matrix. For this we simply reuse the same eigenstates, and use the $h_j$ from each $e^{ih_j}$ as the corresponding eigenvalue.

Similarly, for each Hermitian matrix there exists a unitary. We simply reuse the same eigenstates, and exponentiate the $h_j$ to create the eigenvalues $e^{ih_j}$. This can be expressed as

$$ U = e^{iH} $$

Here we have used the standard definition of how to exponentiate a matrix, which has exactly the properties we require: preserving the eigenstates and exponentiating the eigenvalues.

2.3 Pauli decomposition

As we saw above, it is possible to write matrices entirely in terms of outer products.

$$ M= \begin{pmatrix} m_{0,0}&m_{0,1} \\\\ m_{1,0}&m_{1,1} \end{pmatrix} = m_{0,0} |0\rangle\langle0|+ m_{0,1} |0\rangle\langle1|+ m_{1,0} |1\rangle\langle0|+ m_{1,1} |1\rangle\langle1| $$

Now we will see that it is also possible to write them completely in terms of Pauli operators. For this, the key thing to note is that

$$ \frac{1+Z}{2} = \frac{1}{2}\left[ \begin{pmatrix} 1&0 \\\\0&1 \end{pmatrix}+\begin{pmatrix} 1&0 \\\\0&-1 \end{pmatrix}\right] = |0\rangle\langle0|,\\\\\frac{1-Z}{2} = \frac{1}{2}\left[ \begin{pmatrix} 1&0 \\\\0&1 \end{pmatrix}-\begin{pmatrix} 1&0 \\\\0&-1 \end{pmatrix}\right] = |1\rangle\langle1| $$

This shows that $|0\rangle\langle0|$ and $|1\rangle\langle1|$ can be expressed using the identity matrix and $Z$. Now, using the property that $X|0\rangle = |1\rangle$, we can also produce

$$ |0\rangle\langle1| = |0\rangle\langle0|X = \frac{1}{2}(1+Z)~X = \frac{X+iY}{2},\\\\ |1\rangle\langle0| = X|0\rangle\langle0| = X~\frac{1}{2}(1+Z) = \frac{X-iY}{2}. $$

Since we have all the outer products, we can now use this to write the matrix in terms of Pauli matrices:

$$ M = \frac{m_{0,0}+m_{1,1}}{2}~1~+~\frac{m_{0,1}+m_{1,0}}{2}~X~+~i\frac{m_{0,1}-m_{1,0}}{2}~Y~+~\frac{m_{0,0}-m_{1,1}}{2}~Z. $$

This example was for a general single-qubit matrix, but the corresponding result is true also for matrices for any number of qubits. We simply start from the observation that

$$ \left(\frac{1+Z}{2}\right)\otimes\left(\frac{1+Z}{2}\right)\otimes\ldots\otimes\left(\frac{1+Z}{2}\right) = |00\ldots0\rangle\langle00\ldots0|, $$

and can then proceed in the same manner as above. In the end it can be shown that any matrix can be expressed in terms of tensor products of Pauli matrices:

$$ M = \sum_{P_{n-1},\ldots,P_0 \in \{1,X,Y,Z\}} C_{P_{n-1}\ldots,P_0}~~P_{n-1} \otimes P_{n-2}\otimes\ldots\otimes P_0. $$

For Hermitian matrices, note that the coefficients $C_{P_{n-1}\ldots,P_0}$ here will all be real.

3. Defining Universality

Just as each quantum gate can be represented by a unitary, so too can we describe an entire quantum computation by a (very large) unitary operation. The effect of this is to rotate the input state to the output state.

One possible special case of this is that the input and output states describe bit strings expressed in quantum form. The mapping of each input $x$ to its output $f(x)$ could be described by some (reversible) classical computation. Any such computation could therefore be expressed as a unitary.

$$ U = \sum_j \left| f(x) \right\rangle \left\langle x \right|. $$

If we were able to implement any possible unitary, it would therefore mean we could acheive universality in the sense of standard digital computers.

Another special case is that the input and output states could describe a physical system, and the computation we perform is to simulate the dynamics of that system. This is an important problem that is impractical for classical computers, but is a natural application of quantum computers. The time evolution of the system in this case corresponds to the unitary that we apply, and the associated Hermitian matrix is the Hamiltonian of the system. Achieving any unitary would therefore correspond to simulating any time evolution, and engineering the effects of any Hamiltonian.

Combining these insights we can define what it means for quantum computers to be universal. It is simply the ability to achieve any desired unitary on any arbitrary number of qubits. If we have this, we know that we can reproduce anything a digital computer can do, simulate any quantum system, and do everything else that is possible for a quantum computer.

4. Basic Gate Sets

Whether or not we can build any unitary from a set of basic gates depends greatly on what basic gates we have access to. For every possible realization of fault-tolerant quantum computing, there is a set of quantum operations that are most straightforward to realize. Often these consist of single- and two-qubit gates, most of which correspond to the set of so-called Clifford gates. This is a very important set of operations, which do a lot of the heavy-lifting in any quantum algorithm.

4.1 Clifford Gates

To understand Clifford gates, let's start with an example that you have already seen many times: the Hadamard.

$$ H = |+\rangle\langle0|~+~ |-\rangle\langle1| = |0\rangle\langle+|~+~ |1\rangle\langle-|. $$

This gate is expressed above using outer products, as described above. When expressed in this form, its famous effect becomes obvious: it takes $|0\rangle$, and rotates it to $|+\rangle$. More generally, we can say it rotates the basis states of the z measurement, $\{ |0\rangle,|1\rangle \}$, to the basis states of the x measurement, $\{ |+\rangle,|-\rangle \}$, and vice versa.

In this way, the effect of the Hadamard is to move information around a qubit. It swaps any information that would previously be accessed by an x measurement with that accessed by a z measurement.

The Hadamard can be combined with other gates to perform different operations, for example:

$$ H X H = Z,\\\\ H Z H = X. $$

By doing a Hadamard before and after an $X$, we cause the action it previously applied to the z basis states to be transferred to the x basis states instead. The combined effect is then identical to that of a $Z$. Similarly, we can create an $X$ from Hadamards and a $Z$.

Similar behavior can be seen for the $S$ gate and its Hermitian conjugate,

$$ S X S^{\dagger} = Y,\\\\ S Y S^{\dagger} = -X,\\\\ S Z S^{\dagger} = Z. $$

This has a similar effect to the Hadamard, except that it swaps $X$ and $Y$ instead of $X$ and $Z$. In combination with the Hadamard, we could then make a composite gate that shifts information between y and z.

This property of transforming Paulis into other Paulis is the defining feature of Clifford gates. Stated explicitly for the single-qubit case: if $U$ is a Clifford and $P$ is a Pauli, $U P U^{\dagger}$ will also be a Pauli. For Hermitian gates, like the Hadamard, we can simply use $U P U$.

Further examples of single-qubit Clifford gates are the Paulis themselves. These do not transform the Pauli they act on. Instead, they simply assign a phase of $-1$ to the two that they anticommute with. For example,

$$ Z X Z = -X,\\\\ Z Y Z = -Y,\\\\ Z Z Z= ~~~~Z. $$

You may have noticed that a similar phase also arose in the effect of the $S$ gate. By combining this with a Pauli, we could make a composite gate that would cancel this phase, and swap $X$ and $Y$ in a way more similar to the Hadamard's swap of $X$ and $Z$.

For multiple-qubit Clifford gates, the defining property is that they transform tensor products of Paulis to other tensor products of Paulis. For example, the most prominent two-qubit Clifford gate is the CNOT. The property of this that we will make use of in this chapter is

$$ { CX}_{j,k}~ (X \otimes 1)~{ CX}_{j,k} = X \otimes X. $$

This effectively 'copies' an $X$ from the control qubit over to the target.

The process of sandwiching a matrix between a unitary and its Hermitian conjugate is known as conjugation by that unitary. This process transforms the eigenstates of the matrix, but leaves the eigenvalues unchanged. The reason why conjugation by Cliffords can transform between Paulis is because all Paulis share the same set of eigenvalues.

4.2 Non-Clifford Gates

The Clifford gates are very important, but they are not powerful on their own. In order to do any quantum computation, we need gates that are not Cliffords. Three important examples are arbitrary rotations around the three axes of the qubit, $R_x(\theta)$, $R_y(\theta)$ and $R_z(\theta)$.

Let's focus on $R_x(\theta)$. As we saw above, any unitary can be expressed in an exponential form using a Hermitian matrix. For this gate, we find

$$ R_x(\theta) = e^{i \frac{\theta}{2} X}. $$

The last section also showed us that the unitary and its corresponding Hermitian matrix have the same eigenstates. In this section, we've seen that conjugation by a unitary transforms eigenstates and leaves eigenvalues unchanged. With this in mind, it can be shown that

$$ U R_x(\theta)U^\dagger = e^{i \frac{\theta}{2} ~U X U^\dagger}. $$

By conjugating this rotation by a Clifford, we can therefore transform it to the same rotation around another axis. So even if we didn't have a direct way to perform $R_y(\theta)$ and $R_z(\theta)$, we could do it with $R_x(\theta)$ combined with Clifford gates. This technique of boosting the power of non-Clifford gates by combining them with Clifford gates is one that we make great use of in quantum computing.

4.3 Expanding the Gate Set

As another example of combining $R_x(\theta)$ with Cliffords, let's conjugate it with a CNOT.

$$ CX_{j,k} ~(R_x(\theta) \otimes 1)~ CX_{j,k} = CX_{j,k} ~ e^{i \frac{\theta}{2} ~ (X\otimes 1)}~ CX_{j,k} = e^{i \frac{\theta}{2} ~CX_{j,k} ~ (X\otimes 1)~ CX_{j,k}} = e^{i \frac{\theta}{2} ~ X\otimes X} $$

This transforms our simple, single-qubit rotation into a much more powerful two-qubit gate. This is not just equivalent to performing the same rotation independently on both qubits. Instead, it is a gate capable of generating and manipulating entangled states.

We needn't stop there. We can use the same trick to extend the operation to any number of qubits. All that's needed is more conjugates by the CNOT to keep copying the $X$ over to new qubits.

Furthermore, we can use single-qubit Cliffords to transform the Pauli on different qubits. For example, in our two-qubit example we could conjugate by $S$ on the qubit on the right to turn the $X$ there into a $Y$:

$$ \left( I \otimes S \right) ~e^{i \frac{\theta}{2} ~ X\otimes X}~\left( I \otimes S^\dagger \right) = e^{i \frac{\theta}{2} ~ X\otimes Y}. $$

With these techniques, we can make complex entangling operations that act on any arbitrary number of qubits, of the form

$$ U = e^{i\frac{\theta}{2} ~ P_{n-1}\otimes P_{n-2}\otimes...\otimes P_0}, ~~~ P_j \in \{I,X,Y,Z\}. $$

This all goes to show that combining the single and two-qubit Clifford gates with rotations around the x axis gives us a powerful set of possibilities. What's left to demonstrate is that we can use them to do anything.

5. Proving Universality

As for classical computers, we will need to split this big job up into manageable chunks. We'll need to find a basic set of gates that will allow us to achieve this. As we'll see, the single- and two-qubit gates of the last section are sufficient for the task.

Suppose we wish to implement the unitary

$$ U = e^{i(aX + bZ)}, $$

but the only gates we have are $R_x(\theta) = e^{i \frac{\theta}{2} X}$ and $R_z(\theta) = e^{i \frac{\theta}{2} Z}$. The best way to solve this problem would be to use Euler angles. But let's instead consider a different method.

The Hermitian matrix in the exponential for $U$ is simply the sum of those for the $R_x(\theta)$ and $R_z(\theta)$ rotations. This suggests a naive approach to solving our problem: we could apply $R_z(2b) = e^{i bZ}$ followed by $R_x(2a) = e^{i a X}$. Unfortunately, because we are exponentiating matrices that do not commute, this approach will not work.

$$ e^{i a X} e^{i b Z} \neq e^{i(aX + bZ)} $$

However, we could use the following modified version:

$$ U = \lim_{n\rightarrow\infty} ~ \left(e^{iaX/n}e^{ibZ/n}\right)^n. $$

Here we split $U$ up into $n$ small slices. For each slice, it is a good approximation to say that

$$ e^{iaX/n}e^{ibZ/n} = e^{i(aX + bZ)/n} $$

The error in this approximation scales as $1/n^2$. When we combine the $n$ slices, we get an approximation of our target unitary whose error scales as $1/n$. So by simply increasing the number of slices, we can get as close to $U$ as we need. Other methods of creating the sequence are also possible to get even more accurate versions of our target unitary.

The power of this method is that it can be used in complex cases than just a single qubit. For example, consider the unitary

$$ U = e^{i(aX\otimes X\otimes X + bZ\otimes Z\otimes Z)}. $$

We know how to create the unitary $e^{i\frac{\theta}{2} X\otimes X\otimes X}$ from a single qubit $R_x(\theta)$ and two controlled-NOTs.

qc.cx(0,2)
qc.cx(0,1)
qc.rx(theta,0)
qc.cx(0,1)
qc.cx(0,2)

With a few Hadamards, we can do the same for $e^{i\frac{\theta}{2} Z\otimes Z\otimes Z}$.

qc.h(0)
qc.h(1)
qc.h(2)
qc.cx(0,2)
qc.cx(0,1)
qc.rx(theta,0)
qc.cx(0,1)
qc.cx(0,2)
qc.h(2)
qc.h(1)
qc.h(0)

This gives us the ability to reproduce a small slice of our new, three-qubit $U$:

$$ e^{iaX\otimes X\otimes X/n}e^{ibZ\otimes Z\otimes Z/n} = e^{i(aX\otimes X\otimes X + bZ\otimes Z\otimes Z)/n}. $$

As before, we can then combine the slices together to get an arbitrarily accurate approximation of $U$.

This method continues to work as we increase the number of qubits, and also the number of terms that need simulating. Care must be taken to ensure that the approximation remains accurate, but this can be done in ways that require reasonable resources. Adding extra terms to simulate, or increasing the desired accuracy, only require the complexity of the method to increase polynomially.

Methods of this form can reproduce any unitary $U = e^{iH}$ for which $H$ can be expressed as a sum of tensor products of Paulis. Since we have shown previously that all matrices can be expressed in this way, this is sufficient to show that we can reproduce all unitaries. Though other methods may be better in practice, the main concept to take away from this chapter is that there is certainly a way to reproduce all multi-qubit unitaries using only the basic operations found in Qiskit. Quantum universality can be achieved!

This gate set is not the only one that can acheive universality. For example it can be shown that just the Hadamard and Toffoli are sufficient for universality. Multiple other gates sets have also been considered and been proven universal, each motivated by different routes toward acheiving the gates fault-tolerantly.

Everything we have discussed in this book follows the circuit model of computation. However, the circuit model is not the only universal model of quantum computation. There exists other forms of quantum computing such as the adiabatic quantum computing or the measurement based quantum computer. The fact that they are universal means that it has been proven that there is a polynomial time and resource mapping from the circuit model to these models of computing. These often leverage other quantum mechanical properties also in order to perform their computation. It is important to be aware that while these other forms of quantum computation exist, the benefits of each are hardware related. Since a universal model of quantum computation can perform any quantum algorithm, we need only stick with the circuit model and can ignore other universal models for our discussion.

There are other forms of quantum computation that are not universal, but are applicable to specific applications. For example quantum annealing is very useful to optimization and sampling problems. Annealing is the process of heating a metal to a high temperature and then allowing it to cool down slowly. This causes molten metal to flow over the surface of the metal piece and redistribute itself. This process changes many properties of the metal in question. Quantum annealing is analogous to this in some sense. It involves encoding problems into an energy landscape of sorts and then letting a quantum state explore the landscape.While normal waves may get trapped in troughs which are lower than their surroundings (local minima), quantum effects increase the speed at which the quantum states find the true lowest point on the landscape (global minima).

import qiskit
qiskit.__qiskit_version__
{'qiskit-terra': '0.14.2',
 'qiskit-aer': '0.5.2',
 'qiskit-ignis': '0.3.3',
 'qiskit-ibmq-provider': '0.7.2',
 'qiskit-aqua': '0.7.3',
 'qiskit': '0.19.6'}