
Quantum Information for Engineering: An Introductory Theoretical Approach

Initial IPQ note – This text may be somewhat demanding if you do not have prior knowledge of information theory and computation. Nevertheless, I encourage you to read it anyway. Even if some points escape us, it should be interesting for anyone looking to gain a high-level understanding of the capabilities and limitations of quantum computing.

Additional IPQ note - This post is a translation. If you can read Spanish, I strongly recommend looking for the original version on this blog.

When we approach quantum computing, even with a technical background, we tend to think that moving from a classical approach to a quantum one -with the corresponding change in the rules of the game- will bring a kind of explosive growth in computational capacity, associated with an equally explosive growth in the ability to represent information.

However, if we turn to the most basic principles of quantum mechanics, we already know from authors such as Von Neumann (1932) that when we measure a superposed state $\ket{\psi}=\sum_ic_i\ket{a_i}$ it collapses onto one of the orthonormal basis states $\ket{a_k}$ with probability $|c_k|^2$.

In other words, every time we conduct a measurement our quantum house of cards collapses into a collection of states that is exactly the same as the one we have in classical computing. A qubit is reduced to a bit. So much for that.

It would seem that this principle equates what is feasible in quantum systems with what was already feasible in classical systems. We are seeing that, no matter how much superposition we have, from an information point of view it does not appear to give us that much. 

On the other hand, we do not want to give up so easily. If we were able to manipulate systems without establishing any measurement, perhaps this would open up new computational scenarios.

Can we claim that we are effectively working with more information if we know that we are limited by the very nature of measurement and its intrinsic physical meaning? Is there a difference between operating on information and processing, storing, and exchanging that information? Does the measurement process, with its wavefunction collapse, prevent us from manipulating data under all circumstances?

Several very interesting questions arise, which, as we will see later, invite us to look at the problem from different angles; but first we need some context.



Motivation for information theory

What we know as classical information theory was born in the middle of the last century with the work of Shannon (1948).

The aim was to provide a formal, engineering-driven framework that would make it possible to design information and telecommunications systems while dealing with some well-known challenges, such as noise in communication channels.

Shannon (1948) adopted concepts already present in other disciplines, such as entropy; but he expressed them in a noticeably different way, abandoning the idea of disorder and incorporating instead a measure of uncertainty or, viewed differently, of the limitation in determining the precise state of a system: \begin{equation} H=-\sum_i p_i\log p_i. \end{equation}

Notice that from the expression above, assuming equiprobable states, we can infer how “difficult” it is to represent our system or, in other words, how many bits we need in order to encode it.
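To make this concrete, here is a minimal sketch in Python (using numpy; the function name is just illustrative) that computes the entropy of a source and, through it, the average number of bits needed to encode it:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy H = -sum_i p_i log2 p_i, in bits (terms with p_i = 0 are dropped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Eight equiprobable states: H = log2(8) = 3, so three bits per symbol are needed.
print(shannon_entropy([1/8] * 8))    # 3.0

# A heavily biased source carries less uncertainty and needs fewer bits on average.
print(shannon_entropy([0.9, 0.1]))   # ~0.469
```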

Shannon (1948), going deeper into this concept, described the maximum capacity of a channel as $C=\max I(X;Y)$, where the mutual information between the input $X$ and the output $Y$ is expressed as the difference between the entropy of $X$ and the entropy of $X$ conditioned on $Y$: \begin{equation} I(X;Y)=H(X)-H(X|Y). \end{equation}

Finally, Shannon (1948) introduced the concept of channel codes as a way to guarantee lossless message transmission, even in noisy-channel scenarios, using redundancy so that the message can be reconstructed accurately at the destination.
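As a quick illustration of the capacity formula above, this sketch (Python with numpy; the binary symmetric channel and its flip probability are assumptions made for the example) computes $I(X;Y)$ and shows that it is maximized by the uniform input distribution:

```python
import numpy as np

def H(p):
    """Shannon entropy in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information_bsc(p1, eps):
    """I(X;Y) for a binary symmetric channel, using the equivalent form H(Y) - H(Y|X).
    p1  : probability that the input bit X is 1
    eps : probability that the channel flips the bit"""
    py1 = p1 * (1 - eps) + (1 - p1) * eps          # P(Y = 1)
    return H([py1, 1 - py1]) - H([eps, 1 - eps])   # H(Y|X) = H(eps) for this channel

# The capacity C = max I(X;Y) is reached at the uniform input p1 = 0.5, giving C = 1 - H(eps).
print(mutual_information_bsc(0.5, 0.1))   # ~0.531
print(mutual_information_bsc(0.3, 0.1))   # smaller, as expected
```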

The classical model falls short

In the second half of the 20th century, authors such as Landauer (1961) began to ask questions that ultimately pointed out the need for a new conceptual framework. 

The premise is irrefutable: if information is conditioned by the physical substrate that holds it, then when that substrate is quantum, information will be conditioned by a set of laws that is also quantum, not classical.

Moreover, after the publication of Shannon (1948) it became clear that there is a list of quantum-system features (superposition, entanglement, collapse upon measurement, ...) which in many cases are not even covered by classical theory.

In this context, highly notable works emerge, such as Schumacher (1995), which establishes the relationship between the number of qubits required to represent a quantum source and the entropy of Von Neumann (1927):

\begin{equation} n_{qubits}=S\bigg(\sum_z p_z \rho_z\bigg). \end{equation} Limits are also placed on the amount of accessible information via the quantity introduced by Holevo (1973): \begin{equation} \chi= S(\rho)-\sum_x p_x S(\rho_x), \end{equation} which tells us that the classical information accessible from quantum states is bounded by $\chi$ and, ultimately, by the entropy of Von Neumann (1927).

The culmination arrives in the late 1990s, when Shor (1997) and Grover (1997) publish the first algorithms that operate on a genuinely quantum computational model. As we will see later, these algorithms represent a revolution in how we treat problems with real industrial applications.

It becomes evident that the theoretical formalism of quantum information is already an engineering necessity, and the circle begun by Shannon (1948) is closed. We now have a complete theory.

Does quantum theory outperform classical theory in handling and exchanging information?

Once the context is established, we get to the core issue. 

It would seem that quantum mechanics, by introducing superposition and entanglement, provides a working framework that multiplies the possibilities for designing information and telecommunications systems. 

However, as we saw at the beginning of this post, we face a fundamental limitation due to wavefunction collapse and its mathematical translation: we can only measure states in our orthonormal basis. Moreover, as we saw in the previous section, there is a clear limit on the classical information accessible from quantum states. 

So what is going on here? Well, as we said earlier, we have to look at the problem from different angles.

On quantum processing

The superposition principle tells us that we can represent the state of a quantum system as a combination of orthonormal-basis states multiplied by their complex amplitudes, expressed by Von Neumann (1932) in the form: 

\begin{equation} \ket{\psi}= \sum_n c_n \ket{\phi_n}. \end{equation}

If we think in terms of qubits — since we are talking about computing — we should see it as follows: our classical system operates with binary digits (bits). These bits can take discrete values {0,1}. We know this well and we are comfortable with it.

Now we are going to replace it with our brand-new quantum system.

In a quantum system we will operate with qubits, and qubits are vectors. If we stick to a simple scenario with a single qubit to avoid tensor products, the basis states of $\mathbb{C}^2$ are: $$ \lvert 0 \rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \lvert 1 \rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. $$

which -following the rules of quantum mechanics- can be combined into superpositions, where $\alpha$ and $\beta$ are complex amplitudes: \begin{equation} \ket{\psi}=\alpha\ket{0}+\beta\ket{1}. \end{equation}
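A minimal numeric sketch (Python with numpy; the specific amplitudes are just an example) of a qubit represented as a normalized vector in $\mathbb{C}^2$:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)   # |0>
ket1 = np.array([0, 1], dtype=complex)   # |1>

# An arbitrary superposition alpha|0> + beta|1>, with |alpha|^2 + |beta|^2 = 1.
alpha, beta = 1 / np.sqrt(3), 1j * np.sqrt(2 / 3)
psi = alpha * ket0 + beta * ket1

print(np.abs(psi) ** 2)                        # Born-rule probabilities: [1/3, 2/3]
print(np.isclose(np.linalg.norm(psi), 1.0))    # True: the state is normalized
```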

From a computational point of view this is a total paradigm shift. We could theoretically operate with an infinite set of options.

However, we have already seen that there is a fundamental limitation that arises when measuring the state of our system. Whenever we do so, we will collapse into one of the basis states $\ket{0}$ or $\ket{1}$.

To what extent does this circumstance hold us back?

Well, if we could define algorithms that operate on a quantum system while preserving superposition, manipulating complex amplitudes and changing the state without needing to measure, we would have a computational tool that is clearly superior. 

All right -you may be thinking- but at some point we will have to measure. And that is correct: we will have to measure; but until that happens our quantum house of cards will remain intact. And we have already seen that it is a VERY large house of cards.

Quantum logic gates do exactly what we want. 

A quantum logic gate is, in essence, a unitary operator that allows us to reconfigure the state of a quantum system as many times as needed; but -and this is crucial- without destroying its superposition and entanglement unless we explicitly choose to. \begin{equation} \mathrm{Hadamard} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad \mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \end{equation}
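A small sketch (Python with numpy; variable names are illustrative) showing that these gates are unitary and that, applied without any measurement, they can create superposition and entanglement:

```python
import numpy as np

H = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)

# Unitarity: U^dagger U = I, so no information is lost and nothing collapses.
print(np.allclose(H.conj().T @ H, np.eye(2)))      # True

# H|0> = |+>, and CNOT applied to |+>|0> yields the entangled Bell state (|00> + |11>)/sqrt(2).
plus = H @ ket0
bell = CNOT @ np.kron(plus, ket0)
print(np.round(bell.real, 3))                      # [0.707 0.    0.    0.707]
```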

By successively applying gates of this kind, we will exploit the wave-like nature of our quantum system and force constructive interference on the amplitudes associated with solution states. Likewise, we can generate destructive interference on the amplitudes we are not interested in. 

When we perform a measurement -only upon completing the algorithm- the system will inevitably collapse into one of the states in the orthonormal basis; but by then we will be done. We know from Von Neumann (1932) that the probability of measuring each basic state depends on the squared modulus of its associated complex amplitude. 

This is the key.

Thanks to our algorithm and its interferences, we will face a scenario in which we will tend with much higher probability to collapse into states that are solutions to our problem. 
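A toy illustration of this idea (Python with numpy): applying a Hadamard gate twice makes the two paths leading to $\ket{1}$ interfere destructively, so the final measurement probabilities concentrate entirely on $\ket{0}$.

```python
import numpy as np

H = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)
ket0 = np.array([1, 0], dtype=complex)

# One Hadamard: an even superposition, so measuring yields 0 or 1 with probability 1/2 each.
after_one = H @ ket0
print(np.abs(after_one) ** 2)   # [0.5 0.5]

# A second Hadamard: the amplitude of |1> cancels out (destructive interference),
# and a measurement now yields |0> with certainty (Born rule: probabilities are |c|^2).
after_two = H @ after_one
print(np.abs(after_two) ** 2)   # [1. 0.]
```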

It looks very promising. It seems we have it.

What is the computational improvement? 

Let us use the examples from the previous section, specifically the factorization of very large integers.

  • The best classical algorithm, the General Number Field Sieve (Lenstra et al., 1993), achieves sub-exponential complexity $\mathcal{O}\!\left(\exp\!\left(c\,(\log N)^{1/3}(\log\log N)^{2/3}\right)\right)$.
  • However, the algorithm by Shor (1997) breaks all records and comes down to polynomial complexity $\mathcal{O}((\log N)^3)$, as the rough comparison below illustrates.
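A back-of-the-envelope comparison of the two scalings (Python; the constant factors and the value $c\approx 1.9$ are illustrative assumptions, so only the orders of magnitude matter):

```python
import numpy as np

def gnfs_cost(n_bits):
    """Sub-exponential GNFS scaling exp(c (ln N)^(1/3) (ln ln N)^(2/3)), with c ~ 1.9."""
    lnN = n_bits * np.log(2)
    return np.exp(1.9 * lnN ** (1 / 3) * np.log(lnN) ** (2 / 3))

def shor_cost(n_bits):
    """Polynomial Shor scaling (log N)^3."""
    return float(n_bits) ** 3

for n in (512, 1024, 2048):
    print(n, f"GNFS ~ 1e{np.log10(gnfs_cost(n)):.0f}", f"Shor ~ 1e{np.log10(shor_cost(n)):.0f}")
```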

Well, the improvement is substantial. In fact it is so substantial that, although we will not go into it in depth here, it has forced the entire cybersecurity industry to rethink its foundations. 

Confirmed. Yes, we DO have it.

On quantum storage and transport

In the previous section we saw that thanks to superposition the processing capacity of a quantum system is clearly superior compared with a classical system. Provided, of course, that appropriate algorithms exist. 

However, with respect to data storage, it does not seem we will be able to bypass the limitation imposed by the fact that, in order to make any stored information valuable, we must be able to read it. 

In other words, we are limited to interpreting and storing only what we can measure. In essence, and no matter how a quantum system may at some point be in superposition, we will always have to collapse that superposition into one of the orthonormal-basis states. 

There is no way to escape this circumstance, and the theoretical foundations indicate it as well. Although the entropy of Von Neumann (1927) may in principle be greater than the entropy of Shannon (1948), the Holevo (1973) bound forces accessible information to always be limited by Shannon (1948), as follows: \begin{equation} \chi= S(\rho)-\sum_x p_x S(\rho_x)\leq S(\rho), \qquad \chi\leq H(p_x), \end{equation}
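A short numeric check of the bound (Python with numpy): for an ensemble that sends the non-orthogonal pure states $\ket{0}$ and $\ket{+}$ with equal probability (an illustrative choice), the Holevo quantity stays below the one classical bit carried by the label.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Density matrices of the two pure states and of the average state rho.
rho_0 = np.outer(ket0, ket0.conj())
rho_plus = np.outer(ket_plus, ket_plus.conj())
rho = 0.5 * rho_0 + 0.5 * rho_plus

# Pure states have zero entropy, so chi reduces to S(rho) here.
chi = von_neumann_entropy(rho) - 0.5 * von_neumann_entropy(rho_0) - 0.5 * von_neumann_entropy(rho_plus)
print(chi)   # ~0.60 bits, below H(p_x) = 1 bit: we cannot extract a full bit per qubit sent
```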

So, is there no advantage at all?

Although, as we have just seen, quantum information theory does not provide net improvements in our ability to store and transmit information, it does provide substantial advantages at the cryptographic level.

We can see in proposals such as Bennett and Brassard (1984) that, in this particular case, the collapse induced by measuring a state works in our favor.

Let us imagine that we want to exchange information and that we want to know, conclusively, whether the channel has been compromised or not.

We could prepare qubits in different orthonormal bases, exchange them without revealing which basis we used, and compare results at the destination. 

States in the Hadamard basis (axis \(X\)): $$ \lvert + \rangle = \frac{1}{\sqrt{2}}\bigl(\lvert 0 \rangle + \lvert 1 \rangle\bigr), \qquad \lvert - \rangle = \frac{1}{\sqrt{2}}\bigl(\lvert 0 \rangle - \lvert 1 \rangle\bigr). $$ States with imaginary phase (related to axis \(Y\)): $$ \lvert i+ \rangle = \frac{1}{\sqrt{2}}\bigl(\lvert 0 \rangle + i \lvert 1 \rangle\bigr), \qquad \lvert i- \rangle = \frac{1}{\sqrt{2}}\bigl(\lvert 0 \rangle - i \lvert 1 \rangle\bigr). $$

If a malicious third party managed to intercept the communication, since they did not know the preparation basis, they would inadvertently perform measurements incompatible with the original state. 

Without intending to, this third party would have introduced a statistically detectable disturbance into the system. 
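A toy Monte Carlo sketch of this intercept-and-resend scenario (Python with numpy; the two-basis encoding mirrors the states above, and all names are illustrative). On the positions where sender and receiver used the same basis, the eavesdropper introduces an error rate of roughly 25%, which is exactly the statistical signature we are looking for.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Alice encodes a random bit in a random basis (0 = Z basis {|0>,|1>}, 1 = X basis {|+>,|->}).
alice_bits = rng.integers(0, 2, N)
alice_bases = rng.integers(0, 2, N)

# Eve intercepts each qubit, measures in a random basis, and resends what she obtained.
eve_bases = rng.integers(0, 2, N)
eve_bits = np.where(eve_bases == alice_bases, alice_bits, rng.integers(0, 2, N))

# Bob measures in his own random basis; a basis mismatch with the incoming state gives a random outcome.
bob_bases = rng.integers(0, 2, N)
bob_bits = np.where(bob_bases == eve_bases, eve_bits, rng.integers(0, 2, N))

# Sifting: keep only the positions where Alice and Bob happened to use the same basis.
sifted = alice_bases == bob_bases
print(np.mean(alice_bits[sifted] != bob_bits[sifted]))   # ~0.25, far above the 0 expected without Eve
```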

The power of this mechanism lies in the fact that it follows the natural behavior of quantum mechanics itself. There is no algorithm involved, nor any key that can be attacked by brute force. 

Unless the laws of quantum mechanics change, the communication channel will be infallible at detecting intrusions. Forever.

Is classical theory contained within quantum theory?

In the previous sections we studied what makes quantum information theory different from a systems-design point of view. We verified that when it comes to processing capacity there is a substantial improvement, but with regard to interpretation, storage, and transport we do not improve on the classical figures.

We might therefore think that classical information theory is contained within quantum information theory, with the latter being a generalization and the former a particular case.

Generalizing classical theory

Quantum systems are built with a layer of uncertainty, modeled as complex amplitudes, applied on top of an orthonormal basis of states. If we remove these amplitudes and all states are diagonal in the same basis, superposition disappears from our system and, in practice, we return to the fundamental definitions of ordinary entropy and channel capacity. Let us see it (a short numeric check follows the list).

  1. Diagonality. When $\rho$ is diagonal, the state behaves exactly like a classical random variable with probabilities $p_i$: \begin{equation} \rho=\sum_i p_i\ket{i}\bra{i}, \end{equation}
  2. Entropy. If $\rho$ is diagonal with eigenvalues $p_i$, then: \begin{equation} S(\rho)=-\sum_i p_i \log p_i=H(p_i), \end{equation}
  3. Trace distance. According to Helstrom (1976), if two states $\rho$ and $\sigma$ are diagonal in the same basis with probabilities $p_i$ and $q_i$, then their trace distance reduces to the classical form: \begin{equation} D_{tr}(\rho,\sigma)=\frac{1}{2}||\rho-\sigma||_1, \qquad D_{tr}=\frac{1}{2}\sum_i |p_i-q_i|, \end{equation}
  4. Fidelity. Applying Uhlmann (1976) and Bhattacharyya (1943), we know that the quantum fidelity between two states $\rho$ and $\sigma$, if both are diagonal in the same basis, reduces to the classical formulation: \begin{equation} F(\rho, \sigma)=\bigg(\mathrm{Tr}\sqrt{\sqrt{\rho}\sigma\sqrt{\rho}}\bigg)^2, \qquad F=\bigg(\sum_i\sqrt{p_iq_i}\bigg)^2, \end{equation}
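A quick numeric check of points 2 to 4 (Python with numpy; the probability vectors are arbitrary examples). Note that `np.sqrt` acts elementwise, which coincides with the matrix square root only because the matrices here are diagonal:

```python
import numpy as np

def S(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
rho, sigma = np.diag(p), np.diag(q)   # two states diagonal in the same basis

# 2. Entropy: S(rho) equals Shannon's H(p).
print(np.isclose(S(rho), -np.sum(p * np.log2(p))))                              # True

# 3. Trace distance: (1/2)||rho - sigma||_1 equals (1/2) * sum |p_i - q_i|.
evals = np.linalg.eigvalsh(rho - sigma)
print(np.isclose(0.5 * np.sum(np.abs(evals)), 0.5 * np.sum(np.abs(p - q))))     # True

# 4. Fidelity: (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2 equals (sum sqrt(p_i q_i))^2.
F_quantum = np.trace(np.sqrt(np.sqrt(rho) @ sigma @ np.sqrt(rho))) ** 2
print(np.isclose(F_quantum, np.sum(np.sqrt(p * q)) ** 2))                       # True
```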

A generalization with restrictions

Is it therefore correct to conclude that quantum information theory is a generalization of classical theory?

Well, not really.

While, as we discussed in the previous section, classical theory seems like a particular form of quantum theory, there are some properties of the quantum formalism that are inherently more restrictive, in such a way that they prevent operations that are perfectly feasible in the classical world. 

On top of that, the physical nature of quantum systems forces us to incorporate concepts that do not exist in classical systems, such as coherence, understood here as a measure of a quantum system’s ability to preserve superposition and entanglement.

  1. No-cloning principle. Unlike classical bits, according to the theorem of Wootters and Zurek (1982), arbitrary qubit states are impossible to clone. There is no unitary operator $U$ of the following form that works for every state $\ket{\psi}$: \begin{equation} U\ket{\psi}_A\ket{e}_B=\ket{\psi}_A\ket{\psi}_B, \qquad \mathcal{H}_A=\mathcal{H}_B=\mathcal{H}, \end{equation}
  2. Indistinguishability principle. According to Helstrom (1976), only orthogonal states can be distinguished with certainty by a measurement; for two equiprobable states, the best single-shot success probability is bounded by the trace distance (a numeric illustration follows this list). \begin{equation}P_{\mathrm{max}} = \frac{1}{2}\left(1 + D_{tr}(\rho,\sigma)\right),\end{equation}
  3. Coherence principle. This concept, which does not even exist in classical theory, establishes, according to Schumacher and Nielsen (1996), the measure of the coherent information of a channel in the form (where $\mathcal{E}$ represents the communication channel itself and $S(\rho, \mathcal{E})$ is the entropy exchange): \begin{equation} I(\rho, \mathcal{E})=S(\mathcal{E}(\rho))-S(\rho, \mathcal{E}), \end{equation}
  4. Measurement as wavefunction collapse. This is another concept that does not exist in classical theory, and it is one of the distinctive elements of quantum theory, as we have seen throughout this post. Delving into the dual nature of particles (corpuscular and wave-like), according to Schrödinger (1926), the evolution of an energy eigenstate with energy $E$ is described as: \begin{equation} |\psi(t)\rangle = e^{-i\omega t}\,|\psi(0)\rangle, \qquad \omega = \frac{E}{\hbar}, \end{equation} This unitary evolution is not preserved indefinitely: in a quantum system, the act of measuring fundamentally alters the state, forcing it to collapse onto one of the orthonormal-basis states.
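As an illustration of point 2 above, a short sketch (Python with numpy; the choice of $\ket{0}$ versus $\ket{+}$ is just an example) computes the trace distance between two non-orthogonal states and the resulting Helstrom limit on how well they can be told apart in a single measurement:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

rho = np.outer(ket0, ket0.conj())
sigma = np.outer(ket_plus, ket_plus.conj())

# Trace distance D_tr = (1/2)||rho - sigma||_1, via the eigenvalues of rho - sigma.
evals = np.linalg.eigvalsh(rho - sigma)
D_tr = 0.5 * np.sum(np.abs(evals))

# Helstrom bound for two equiprobable states: the best single-shot guess succeeds with P_max.
P_max = 0.5 * (1 + D_tr)
print(D_tr, P_max)   # ~0.707 and ~0.854: non-orthogonal states cannot be distinguished with certainty
```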

Final IPQ note – This article is an adaptation of an academic work produced in the context of the course Quantum Information, within the Master’s degree in Quantum Computing that I am currently studying at UNIR.

References

Bennett, C. H. and Brassard, G. (1984). Quantum cryptography: Public key distribution and coin tossing. In Proceedings of IEEE International Conference on Computers, Systems and Signal Processing, pp. 175–179, Bangalore, India.

Bhattacharyya, A. (1943). On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin of the Calcutta Mathematical Society, 35:99–109.

Grover, L. K. (1997). Quantum mechanics helps in searching for a needle in a haystack. Physical Review Letters, 79(2):325–328.

Helstrom, C. W. (1976). Quantum Detection and Estimation Theory. Academic Press.

Holevo, A. S. (1973). Bounds for the quantity of information transmitted by a quantum communication channel. Problems of Information Transmission, 9(3):177–183.

Landauer, R. (1961). Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5(3):183–191.

Lenstra, A. K., Lenstra, H. W., Manasse, M. S., and Pila, J. (1993). The number field sieve. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pp. 564–572.

Schrödinger, E. (1926). Quantisierung als Eigenwertproblem. Annalen der Physik, 79(4), 361–376.

Schumacher, B. (1995). Quantum coding. Physical Review A, 51(4).

Schumacher, B. and Nielsen, M. A. (1996). Quantum data processing and error correction. Physical Review A, 54(4).

Shannon, C. E. (1948a). A mathematical theory of communication. Bell System Technical Journal, 27(3):379–423.

Shannon, C. E. (1948b). A mathematical theory of communication, part ii. Bell System Technical Journal, 27(4):623–656.

Shor, P. W. (1997). Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484–1509.

Uhlmann, A. (1976). The “transition probability” in the state space of a *-algebra. Reports on Mathematical Physics, 9(2):273–279.

Von Neumann, J. (1927). Thermodynamik quantenmechanischer Gesamtheiten. Göttinger Nachrichten, 1:273–291.

Von Neumann, J. (1932). Mathematische Grundlagen der Quantenmechanik, volume 38 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer, Berlin.

Wootters, W. K. and Zurek, W. H. (1982). A single quantum cannot be cloned. Nature, 299:802–803.
