Hybrid quantum-classical Neural Networks with PyTorch and Qiskit

Hybrid quantum-classical Neural Networks with PyTorch and Qiskit

Machine learning (ML) has established itself as a successful interdisciplinary field which seeks to mathematically extract generalizable information from data. Throwing in quantum computing gives rise to interesting areas of research which seek to leverage the principles of quantum mechanics to augment machine learning or vice-versa. Whether you're aiming to enhance classical ML algorithms by outsourcing difficult calculations to a quantum computer or optimise quantum algorithms using classical ML architectures - both fall under the diverse umbrella of quantum machine learning (QML).

In this chapter, we explore how a classical neaural network can be partially quantized to create a hybrid quantum-classical neural network. We will code up a simple example which integrates qiskit with a state-of-the-art open source software package - PyTorch. The purpose of this example is to demonstrate the ease of integrating Qiskit with existing ML tools and to encourage ML practitioners to explore what is possible with quantum computing.

How does it work?

Fig.1 Illustrates the framework we will construct in this chapter. Ultimately, we will create a hybrid quantum-classical neural network that seeks to classify hand drawn digits. Note that the edges shown in this image are all directed downward; however, the directionality is not visually indicated.


The background presented here on classical neural networks is included to establish relevant ideas and shared terminology; however, it is still extremely high-level. If you'd like to dive one step deeper into classical neural networks, see the well made video series by youtuber 3Blue1Brown. Alternatively, if you are already familiar with classical networks, you can skip to the next section.

Neurons and Weights

A neural network is ultimately just an elaborate function that is built by composing smaller building blocks called neurons. A neuron is typically a simple, easy-to-compute, and nonlinear function that maps one or more inputs to a single real number. The single output of a neuron is typically copied and fed as input into other neurons. Graphically, we represent neurons as nodes in a graph and we draw directed edges between nodes to indicate how the output of one neuron will be used as input to other neurons. It's also important to note that each edge in our graph is often associated with a scalar-value called a weight. The idea here is that each of the inputs to a neuron will be multiplied by a different scalar before being collected and processed into a single value. The objective when training a neural network consists primarily of choosing our weights such that the network behaves in a particular way.

Feed Forward Neural Networks

It is also worth noting that the particular type of neural network we will concern ourselves with is called a feed-forward neural network (FFNN). This means that as data flows through our neural network, it will never return to a neuron it has already visited. Equivalently, you could say that the graph which describes our neural network is a directed acyclic graph (DAG). Furthermore, we will stipulate that neurons within the same layer of our neural network will not have edges between them.

IO Structure of Layers

The input to a neural network is a classical (real-valued) vector. Each component of the input vector is multiplied by a different weight and fed into a layer of neurons according to the graph structure of the network. After each neuron in the layer has been evaluated, the results are collected into a new vector where the i'th component records the output of the i'th neuron. This new vector can then treated as input for a new layer, and so on. We will use the standard term hidden layer to describe all but the first and last layers of our network.

So how does quantum enter the picture?

To create a quantum-classical neural network, one can implement a hidden layer for our neural network using a parameterized quantum circuit. By "parameterized quantum circuit", we mean a quantum circuit where the rotation angles for each gate are specified by the components of a classical input vector. The outputs from our neural network's previous layer will be collected and used as the inputs for our parameterized circuit. The measurement statistics of our quantum circuit can then be collected and used as inputs for the following layer. A simple example is depicted below:

Here, $\sigma$ is a nonlinear function and $h_i$ is the value of neuron $i$ at each hidden layer. $R(h_i)$ represents any rotation gate about an angle equal to $h_i$ and $y$ is the final prediction value generated from the hybrid network.

What about backpropagation?

If you're familiar with classical ML, you may immediately be wondering how do we calculate gradients when quantum circuits are involved? This would be necessary to enlist powerful optimisation techniques such as gradient descent. It gets a bit technical, but in short, we can view a quantum circuit as a black box and the gradient of this black box with respect to its parameters can be calculated as follows:

where $\theta$ represents the parameters of the quantum circuit and $s$ is a macroscopic shift. The gradient is then simply the difference between our quantum circuit evaluated at $\theta+s$ and $\theta - s$. Thus, we can systematically differentiate our quantum circuit as part of a larger backpropogation routine. This closed form rule for calculating the gradient of quantum circuit parameters is known as the parameter shift rule.

Let's code!


First, we import some handy packages that we will need, including Qiskit and PyTorch.

import numpy as np
import torch
from torch.autograd import Function
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import datasets, transforms
from qiskit import QuantumRegister, QuantumCircuit, ClassicalRegister, execute
from qiskit.circuit import Parameter
from qiskit import Aer
from matplotlib import pyplot as plt
%matplotlib inline

Tensors to lists

Next we create an additional function that converts a tensor to a list in Python. This is needed to connect Qiskit and PyTorch objects. In particular, we will use this function to convert tensors produced by PyTorch to a list, such that they can be fed into quantum circuits in Qiskit.

def to_numbers(tensor_list):
    num_list = []
    for tensor in tensor_list:
        num_list += [tensor.item()]
    return num_list

Create a "quantum class" with Qiskit

We can conveniently put our Qiskit quantum functions into a class. First, we specify how many trainable quantum parameters and how many shots we wish to use in our quantum circuit. In this example, we will keep it simple and use a 1-qubit circuit with one trainable quantum parameter $\theta$. We hard code the circuit for simplicity and use a $RY-$rotation by the angle $\theta$ to train the output of our circuit. The circuit looks like this:

In order to measure the output in the $z-$basis, we create a Python function to obtain the $\sigma_z$ expectation. Lastly, we create a "bind" function to convert our parameter to a list and run the circuit on the Aer simulator. We will see later how this all ties into the hybrid neural network.

class QiskitCircuit():
    # Specify initial parameters and the quantum circuit
    def __init__(self,shots):
        self.theta = Parameter('Theta')
        self.shots = shots
        def create_circuit():
            qr = QuantumRegister(1,'q')
            cr = ClassicalRegister(1,'c')
            ckt = QuantumCircuit(qr,cr)
            return ckt
        self.circuit = create_circuit()
    def N_qubit_expectation_Z(self, counts, shots, nr_qubits):
        expects = np.zeros(nr_qubits)
        for key in counts.keys():
            perc = counts[key]/shots
            check = np.array([(float(key[i])-1/2)*2*perc for i in range(nr_qubits)])
            expects += check   
        return expects    
    def bind(self, parameters):
        [self.theta] = to_numbers(parameters)[2][0]._params = to_numbers(parameters)
    def run(self, i):
        backend = Aer.get_backend('qasm_simulator')
        job_sim = execute(self.circuit,backend,shots=self.shots)
        result_sim = job_sim.result()
        counts = result_sim.get_counts(self.circuit)
        return self.N_qubit_expectation_Z(counts,self.shots,1)

Create a "quantum-classical class" with PyTorch

Now that our quantum circuit is defined, we can create the functions needed for backpropagation using PyTorch. The forward and backward passes contain elements from our Qiskit class. The backward pass directly computes the analytical gradients using the finite difference formula we introduced above.

class TorchCircuit(Function):    

    def forward(ctx, i):
        if not hasattr(ctx, 'QiskitCirc'):
            ctx.QiskitCirc = QiskitCircuit(shots=100)
        exp_value =[0])
        result = torch.tensor([exp_value]) # store the result as a torch tensor
        ctx.save_for_backward(result, i)
        return result
    def backward(ctx, grad_output):
        s = np.pi/2
        forward_tensor, i = ctx.saved_tensors  
        # Obtain paramaters 
        input_numbers = to_numbers(i[0])
        gradient = []
        for k in range(len(input_numbers)):
            input_plus_s = input_numbers
            input_plus_s[k] = input_numbers[k] + s  # Shift up by s
            exp_value_plus =[0]
            result_plus_s = torch.tensor([exp_value_plus])
            input_minus_s = input_numbers
            input_minus_s[k] = input_numbers[k] - s # Shift down by s
            exp_value_minus =[0]
            result_minus_s = torch.tensor([exp_value_minus])

            gradient_result = (result_plus_s - result_minus_s)

        result = torch.tensor([gradient])
        return result.float() * grad_output.float()

Putting this all together We will create a simple hybrid neural network to classify images of two types of digits (0 or 1) from the MNIST dataset. We first load MNIST and filter for pictures containing 0's and 1's. These will serve as inputs for our neural network to classify.

Data loading and preprocessing

transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor()]) # transform images to tensors/vectors
mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

labels = mnist_trainset.targets # get the labels for the data
labels = labels.numpy()

idx1 = np.where(labels == 0) # filter on zeros
idx2 = np.where(labels == 1) # filter on ones

# Specify number of datapoints per class (i.e. there will be n pictures of 1 and n pictures of 0 in the training set)

# concatenate the data indices
idx = np.concatenate((idx1[0][0:n],idx2[0][0:n])) 

# create the filtered dataset for our training set
mnist_trainset.targets = labels[idx] =[idx]

train_loader =, batch_size=1, shuffle=True)

The data will consist of images belonging to two classes: 0 and 1. An example image from both classes looks like this:

So far, we have loaded the data and coded a class that creates our quantum circuit which contains 1 trainable parameter. This quantum parameter will be inserted into a classical neural network along with the other classical parameters to form the hybrid neural network. We also created backward and forward pass functions that allow us to do backpropagation and optimise our neural network. Lastly, we need to specify our neural network architecture such that we can begin to train our parameters using optimisation techniques provided by PyTorch.

Creating the hybrid neural network

We can use a neat PyTorch pipeline to create a neural network architecture. The network will need to be compatible in terms of its dimensionality when we insert the quantum layer (i.e. our quantum circuit). Since our quantum in this example contains 1 parameter, we must ensure the network condenses neurons down to size 1. We create a network consisting of 3 hidden layers with 320, 50 and 1 neurons respectively. The value of the last neuron is fed as the parameter $\theta$ into our quantum circuit. The circuit measurement then serves as the final prediction for 0 or 1 as provided by a $\sigma_z$ measurement. The measurement outcomes are -1 which implies a predicted label of 0 and 1 which implies a predicted label of 1.

qc = TorchCircuit.apply 

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.h1 = nn.Linear(320, 50)
        self.h2 = nn.Linear(50, 1)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.h1(x))
        x = F.dropout(x,
        x = self.h2(x)
        x = qc(x)
        x = (x+1)/2  # Normalise the inputs to 1 or 0
        x =, 1-x), -1)
        return x

Training the network

We now have all the ingredients to train our hybrid network! We can specify any PyTorch optimiser, learning rate and cost/loss function in order to train over multiple epochs. In this instance, we use the Adam optimiser, a learning rate of 0.001 and the negative log-likelihood loss function.

network = Net()
optimizer = optim.Adam(network.parameters(), lr=0.001)

epochs = 30
loss_list = []
for epoch in range(epochs):
    total_loss = []
    target_list = []
    for batch_idx, (data, target) in enumerate(train_loader):
        output = network(data)
        loss = F.nll_loss(output, target)

# Normalise the loss between 0 and 1
for i in range(len(loss_list)):
    loss_list[i] += 1
# Plot the loss per epoch
plt.title("Hybrid NN Training Convergence")
plt.xlabel('Training Iterations')
Text(0, 0.5, 'Loss')

What now?

While it is totally possible to create hybrid neural networks, does this actually have any benefit? In fact, the classical layers of this network train perfectly fine (in fact, better) without the quantum layer. Furthermore, you may have noticed that the quantum layer we trained here generates no entanglement, and will therefore continue to be classically simulable as we scale up this particular architecture. This means that if you hope to achieve a quantum advantage using hybrid nerural networks, you'll need to start by extending this code to include a more sophisticated quantum layer.

The point of this exercise was to get you thinking about integrating techniques from ML and quantum computing in order to investigate if there is indeed some element of interest - and thanks to PyTorch and Qiskit, this becomes a little bit easier.