Note

This page was generated from docs/tutorials/09_Protein_Folding.ipynb.

Protein Folding

Introduction

The structure and function of many natural and human-engineered proteins are still poorly understood. As a result, our understanding of processes connected with protein folding, such as those encountered in Alzheimer’s disease, vaccine development, and crop improvement research, has remained limited.

Unfolded polypeptides have a very large number of degrees of freedom and thus an enormous number of potential conformations. For example, a chain with \(100\) amino acids has on the order of \(10^{47}\) conformations. In reality, however, many proteins fold to their native structure within seconds. This discrepancy is known as Levinthal’s paradox [1].
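As a rough illustration of this count, the sketch below assumes three accessible conformations per residue, which reproduces the order of magnitude quoted above. Both that value and the sampling rate are illustrative assumptions, not figures taken from this tutorial.

```python
# Illustrative assumptions (not taken from the tutorial):
STATES_PER_RESIDUE = 3        # conformations available to each residue
CHAIN_LENGTH = 100            # number of amino acids in the chain
SAMPLES_PER_SECOND = 1e13     # hypothetical rate of an exhaustive search
SECONDS_PER_YEAR = 3600 * 24 * 365

# Each residue chooses its conformation independently, so the counts multiply.
conformations = STATES_PER_RESIDUE**CHAIN_LENGTH
print(f"conformations: {conformations:.2e}")  # 5.15e+47

# Time an exhaustive search would need at the assumed sampling rate.
years = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR
print(f"exhaustive search: about {years:.1e} years")
```

Even with a generous sampling rate the exhaustive search takes far longer than the seconds in which real proteins fold, which is the content of the paradox.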

The exponential growth of the number of potential conformations with chain length makes the problem intractable for classical computers. In the quantum framework, our resource-efficient algorithm scales linearly with the number of amino acids \(N\).

The goal of this work is to determine the minimum energy conformation of a protein. Starting from a random configuration, the protein’s structure is optimized to lower the energy. This can be achieved by encoding the protein folding problem into a qubit operator and ensuring that all physical constraints are satisfied.

For the problem encoding we use:

  • Configuration qubits: qubits that are used to describe the configurations and the relative position of the different beads

  • Interaction qubits: qubits that encode interactions between the different amino acids

For our case we use a tetrahedral lattice (diamond-shaped lattice), where the movement is encoded through the configuration qubits (see image below).

[Image: tetrahedral (diamond) lattice used for the turn encoding]

The Hamiltonian of the system for a set of qubits \(\mathbf{q}=\{\mathbf{q}_{cf}, \mathbf{q}_{in}\}\) is

\[H(\mathbf{q}) = H_{gc}(\mathbf{q}_{cf}) + H_{ch}(\mathbf{q}_{cf}) + H_{in}(\mathbf{q}_{cf}, \mathbf{q}_{in})\]

where

  • \(H_{gc}\) is the geometrical constraint term (governing the growth of the primary sequence of amino acids without bifurcations)

  • \(H_{ch}\) is the chirality constraint (enforcing the right stereochemistry for the system)

  • \(H_{in}\) is the interaction energy term of the system. In our case we consider only nearest-neighbor interactions.

Further details about the model and the encoding of the problem can be found in [2].

[1]:
from qiskit_nature.problems.sampling.protein_folding.interactions.random_interaction import (
    RandomInteraction,
)
from qiskit_nature.problems.sampling.protein_folding.interactions.miyazawa_jernigan_interaction import (
    MiyazawaJerniganInteraction,
)
from qiskit_nature.problems.sampling.protein_folding.peptide.peptide import Peptide
from qiskit_nature.problems.sampling.protein_folding.protein_folding_problem import (
    ProteinFoldingProblem,
)

from qiskit_nature.problems.sampling.protein_folding.penalty_parameters import PenaltyParameters

from qiskit.utils import algorithm_globals, QuantumInstance

algorithm_globals.random_seed = 23

Protein Main Chain

The protein consists of a main chain that is a linear chain of amino acids. For the naming of the different residues we use the one-letter code as defined in Ref. [3]. Further details about the naming and the types of amino acids can also be found in [4].

For this particular case we demonstrate the generation of the qubit operator for a neuropeptide whose main chain consists of 7 amino acids with letter codes APRLRFY (see also [2]).

[2]:
main_chain = "APRLRFY"

Side Chains

Beyond the main chain of the protein, there may be amino acids attached to the residues of the main chain. Our model allows side chains of maximum length one. Longer side chains would require the introduction of additional penalty terms, which are still under development. In this example we do not consider any side chains, in order to keep the real structure of the neuropeptide.

[3]:
# one (empty) side-chain entry per residue of the main chain
side_chains = [""] * len(main_chain)

Interaction between Amino Acids

For the description of inter-residue contacts in proteins we use knowledge-based (statistical) potentials derived using the quasi-chemical approximation. The potentials used here were introduced by Miyazawa, S. and Jernigan, R. L. in [5].

Beyond this model we also allow for randomly generated contact maps (interactions). One can also introduce a custom interaction map that favors certain configurations of the protein (e.g. alpha helix, beta sheet, etc.).

[4]:
random_interaction = RandomInteraction()
mj_interaction = MiyazawaJerniganInteraction()

Physical Constraints

To ensure that all physical constraints are respected we introduce penalty functions. The different penalty terms used are:

  • penalty_chiral: A penalty parameter used to impose the right chirality.

  • penalty_back: A penalty parameter used to penalize turns along the same axis. This term is used to eliminate sequences where the same axis is chosen twice in a row. In this way we do not allow for a chain to fold back into itself.

  • penalty_1: A penalty parameter used to penalize local overlap between beads within a nearest neighbor contact.

[5]:
penalty_back = 10
penalty_chiral = 10
penalty_1 = 10

penalty_terms = PenaltyParameters(penalty_chiral, penalty_back, penalty_1)
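To make the role of penalty_back concrete, here is a minimal sketch of a classical penalty that charges a fixed cost every time the same turn is chosen twice in a row. The function back_penalty is a hypothetical helper written for illustration; it is not how the library builds its penalty operator, which acts on qubits rather than on an explicit list of turns.

```python
from typing import Sequence


def back_penalty(turns: Sequence[int], penalty_back: float = 10) -> float:
    """Toy penalty: add `penalty_back` for every pair of equal consecutive turns."""
    # Compare each turn with its successor and count the repeats.
    repeats = sum(1 for current, following in zip(turns, turns[1:]) if current == following)
    return penalty_back * repeats


print(back_penalty([1, 0, 3, 2, 0, 3]))  # no repeated turn, so no penalty
print(back_penalty([1, 1, 0, 0]))        # two repeated turns
```

With a penalty that is large compared to the interaction energies, any conformation that folds back along the same axis ends up with a higher energy than the valid ones, so the minimization avoids it.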

Peptide Definition

Based on the main chain and possible side chains we define the peptide object that includes all the structural information of the modeled system.

[6]:
peptide = Peptide(main_chain, side_chains)

Protein Folding Problem

Based on the peptide, the interaction (contact map), and the penalty terms defined above, we set up the protein folding problem, which returns the qubit operator.

[7]:
protein_folding_problem = ProteinFoldingProblem(peptide, mj_interaction, penalty_terms)
qubit_op = protein_folding_problem.qubit_op()
[8]:
print(qubit_op)
1613.5895000000003 * IIIIIIIII
+ 487.5 * IIIIIIZII
- 192.5 * IIIIIIIZZ
+ 192.5 * IIIIIIZZZ
- 195.0 * IIIIZIZII
- 195.0 * IIIIIZIZI
- 195.0 * IIIIZZZZI
- 95.0 * IIZIZIIII
- 95.0 * IIIZIZIII
- 95.0 * IIZZZZIII
+ 295.0 * IIIIIIZZI
- 497.5 * IIIIZIIII
- 300.0 * IIIIZZIII
+ 195.0 * IIIIIIIIZ
+ 197.5 * IIIIIZIIZ
- 197.5 * IIIIZZIIZ
- 904.2875 * IZIIIIIII
- 295.0 * IZIIIIZII
- 197.5 * IZIIIIZZI
+ 302.5 * IZIIZIIII
+ 202.5 * IZIIZZIII
+ 100.0 * IZIIZIZII
+ 100.0 * IZIIIZIZI
+ 100.0 * IZIIZZZZI
- 200.0 * IZIIIIIIZ
+ 97.5 * IZIIIIIZZ
- 97.5 * IZIIIIZZZ
- 100.0 * IZIIIZIIZ
+ 100.0 * IZIIZZIIZ
+ 100.0 * IIIIIIIZI
- 100.0 * IIIIIZIII
+ 2.5 * IZIIIIIZI
- 2.5 * IZIIIZIII
+ 192.5 * IIZIIIIII
+ 95.0 * IIZZIIIII
+ 97.5 * IIZIIIZII
+ 97.5 * IIIZIIIZI
+ 97.5 * IIZZIIZZI
- 97.5 * IIIZIIIIZ
+ 97.5 * IIZZIIIIZ
+ 7.5 * IZZIIIIII
+ 5.0 * IZZZIIIII
+ 2.5 * IZZIIIZII
+ 2.5 * IZIZIIIZI
+ 2.5 * IZZZIIZZI
- 2.5 * IZZIZIIII
- 2.5 * IZIZIZIII
- 2.5 * IZZZZZIII
- 2.5 * IZIZIIIIZ
+ 2.5 * IZZZIIIIZ
+ 105.0 * IIIZIIIII
- 701.802 * ZIIIIIIII
- 195.0 * ZIIIIIZII
- 102.5 * ZIIIIIIZI
- 97.5 * ZIIIIIZZI
+ 195.0 * ZIIIZIIII
+ 102.5 * ZIIIIZIII
+ 97.5 * ZIIIZZIII
- 200.0 * ZIZIIIIII
- 105.0 * ZIIZIIIII
- 100.0 * ZIZZIIIII
+ 97.5 * ZIIIZIZII
- 100.0 * ZIZIIIZII
+ 97.5 * ZIIIIZIZI
- 100.0 * ZIIZIIIZI
+ 97.5 * ZIIIZZZZI
- 100.0 * ZIZZIIZZI
+ 100.0 * ZIZIZIIII
+ 100.0 * ZIIZIZIII
+ 100.0 * ZIZZZZIII
+ 97.5 * ZIIIIIIZZ
- 97.5 * ZIIIIIZZZ
- 97.5 * ZIIIIZIIZ
+ 97.5 * ZIIIZZIIZ
+ 100.0 * ZIIZIIIIZ
- 100.0 * ZIZZIIIIZ
+ 5.0 * ZIIIIIIIZ
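Every term printed above is a product of I and Z operators only, so each computational basis state is an eigenstate and its energy can be evaluated classically. The sketch below shows this for a toy three-qubit operator with invented coefficients. It assumes that the Pauli label and the bitstring list the qubits in the same order, and diagonal_energy is a hypothetical helper, not part of the library.

```python
from typing import List, Tuple


def diagonal_energy(bitstring: str, terms: List[Tuple[float, str]]) -> float:
    """Energy of a basis state for an operator built only from I and Z."""
    energy = 0.0
    for coeff, label in terms:
        sign = 1
        for pauli, bit in zip(label, bitstring):
            # Z has eigenvalue +1 on |0> and -1 on |1>; I always contributes +1.
            if pauli == "Z" and bit == "1":
                sign = -sign
        energy += coeff * sign
    return energy


# Toy operator; the coefficients are invented for illustration.
toy_terms = [(1.0, "III"), (2.0, "IIZ"), (-0.5, "ZZI")]

for bits in ("000", "001", "110", "100"):
    print(bits, diagonal_energy(bits, toy_terms))
```

Finding the folded structure therefore amounts to finding the bitstring with the lowest such energy.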

Using VQE with CVaR expectation value for the solution of the problem

The problem we are tackling now incorporates all the physical constraints and has a diagonal Hamiltonian. In this particular case we are targeting the single bitstring that gives the minimum energy (corresponding to the folded structure of the protein). Thus, we can use the Variational Quantum Eigensolver (VQE) with Conditional Value at Risk (CVaR) expectation values to solve the problem and find the minimum-energy configuration [6]. We follow the same approach as in Ref. [2], but here we use COBYLA for the classical optimization part. One can also use the standard VQE or QAOA algorithm to solve the problem, though, as discussed in Ref. [2], CVaR is more suitable.
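The idea behind the CVaR objective can be sketched classically: instead of averaging the energies of all measured bitstrings, only the lowest fraction \(\alpha\) of them is averaged. The function below is a simplified illustration with equally weighted shots and invented energies; it is not the library's CVaRExpectation implementation.

```python
import math
from typing import Sequence


def cvar(energies: Sequence[float], alpha: float) -> float:
    """Average of the lowest `alpha` fraction of the sampled energies."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(energies)
    # Keep at least one sample so the average is always defined.
    keep = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:keep]) / keep


# Invented single-shot energies, for illustration only.
shots = [4.0, -1.0, 3.0, 0.5, 2.0, 7.0, -1.0, 5.0, 6.0, 1.0]

print(cvar(shots, 1.0))  # alpha = 1 recovers the ordinary mean
print(cvar(shots, 0.2))  # mean of the two lowest energies: -1.0
```

Because the target is a single low-energy bitstring, an objective that rewards the best samples rather than the average one matches the goal of the optimization more closely.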

[9]:
from qiskit.circuit.library import RealAmplitudes
from qiskit.algorithms.optimizers import COBYLA
from qiskit.algorithms import VQE
from qiskit.opflow import PauliExpectation, CVaRExpectation
from qiskit import Aer

# set classical optimizer
optimizer = COBYLA(maxiter=50)

# set variational ansatz
ansatz = RealAmplitudes(reps=1)

# set the backend
backend_name = "aer_simulator"
backend = QuantumInstance(
    Aer.get_backend(backend_name),
    shots=8192,
    seed_transpiler=algorithm_globals.random_seed,
    seed_simulator=algorithm_globals.random_seed,
)

counts = []
values = []


def store_intermediate_result(eval_count, parameters, mean, std):
    counts.append(eval_count)
    values.append(mean)


# initialize CVaR_alpha objective with alpha = 0.1
cvar_exp = CVaRExpectation(0.1, PauliExpectation())

# initialize VQE using CVaR
vqe = VQE(
    expectation=cvar_exp,
    optimizer=optimizer,
    ansatz=ansatz,
    quantum_instance=backend,
    callback=store_intermediate_result,
)

raw_result = vqe.compute_minimum_eigenvalue(qubit_op)
print(raw_result)
{   'aux_operator_eigenvalues': None,
    'cost_function_evals': 50,
    'eigenstate': {   '000000000': 0.015625,
                      '000000001': 0.046875,
                      '000000010': 0.03314563036811941,
                      '000000011': 0.06346905003621844,
                      '000000101': 0.04941058844013093,
                      '000000110': 0.015625,
                      '000000111': 0.02209708691207961,
                      '000001001': 0.029231698334171417,
                      '000001010': 0.019136638615493577,
                      '000001011': 0.04555431167847891,
                      '000001101': 0.029231698334171417,
                      '000010001': 0.015625,
                      '000010011': 0.027063293868263706,
                      '000010101': 0.02209708691207961,
                      '000010111': 0.011048543456039806,
                      '000011000': 0.02209708691207961,
                      '000011001': 0.06051536478449089,
                      '000011010': 0.03983608994994363,
                      '000011011': 0.09631896879639025,
                      '000011100': 0.011048543456039806,
                      '000011101': 0.06810779599282303,
                      '000011110': 0.02209708691207961,
                      '000011111': 0.019136638615493577,
                      '000100000': 0.015625,
                      '000100001': 0.03125,
                      '000100010': 0.019136638615493577,
                      '000100011': 0.05063078670631141,
                      '000100101': 0.03983608994994363,
                      '000100110': 0.015625,
                      '000100111': 0.015625,
                      '000101001': 0.03314563036811941,
                      '000101010': 0.011048543456039806,
                      '000101011': 0.036643873123620545,
                      '000101101': 0.019136638615493577,
                      '000110000': 0.011048543456039806,
                      '000110010': 0.011048543456039806,
                      '000110011': 0.019136638615493577,
                      '000110101': 0.029231698334171417,
                      '000111000': 0.015625,
                      '000111001': 0.04555431167847891,
                      '000111010': 0.024705294220065465,
                      '000111011': 0.09110862335695782,
                      '000111101': 0.05412658773652741,
                      '000111110': 0.015625,
                      '000111111': 0.011048543456039806,
                      '001000000': 0.024705294220065465,
                      '001000001': 0.04941058844013093,
                      '001000010': 0.036643873123620545,
                      '001000011': 0.09820160226544168,
                      '001000101': 0.061515686515717274,
                      '001000110': 0.024705294220065465,
                      '001000111': 0.02209708691207961,
                      '001001001': 0.011048543456039806,
                      '001001010': 0.011048543456039806,
                      '001001011': 0.04133986423538423,
                      '001001101': 0.019136638615493577,
                      '001010001': 0.019136638615493577,
                      '001010010': 0.019136638615493577,
                      '001010011': 0.034938562148434216,
                      '001010101': 0.034938562148434216,
                      '001011001': 0.019136638615493577,
                      '001011011': 0.034938562148434216,
                      '001011101': 0.027063293868263706,
                      '001100000': 0.04419417382415922,
                      '001100001': 0.13799626353637262,
                      '001100010': 0.10065702130254005,
                      '001100011': 0.27085837223999554,
                      '001100100': 0.019136638615493577,
                      '001100101': 0.15428451295415233,
                      '001100110': 0.05412658773652741,
                      '001100111': 0.06442352540027595,
                      '001101000': 0.015625,
                      '001101001': 0.07574499777213015,
                      '001101010': 0.03983608994994363,
                      '001101011': 0.11267347735824966,
                      '001101100': 0.015625,
                      '001101101': 0.0855816496101822,
                      '001101111': 0.015625,
                      '001110001': 0.011048543456039806,
                      '001110011': 0.029231698334171417,
                      '001110101': 0.02209708691207961,
                      '001111000': 0.038273277230987154,
                      '001111001': 0.10825317547305482,
                      '001111010': 0.05740991584648074,
                      '001111011': 0.18087245160609727,
                      '001111100': 0.015625,
                      '001111101': 0.11744762795603834,
                      '001111110': 0.024705294220065465,
                      '001111111': 0.019136638615493577,
                      '010001011': 0.011048543456039806,
                      '010001101': 0.011048543456039806,
                      '010010000': 0.011048543456039806,
                      '010010001': 0.015625,
                      '010010010': 0.02209708691207961,
                      '010010011': 0.029231698334171417,
                      '010010101': 0.011048543456039806,
                      '010010111': 0.011048543456039806,
                      '010011000': 0.011048543456039806,
                      '010011001': 0.036643873123620545,
                      '010011010': 0.03125,
                      '010011011': 0.05298695299316616,
                      '010011101': 0.038273277230987154,
                      '010100000': 0.03314563036811941,
                      '010100001': 0.07967217989988726,
                      '010100010': 0.046875,
                      '010100011': 0.12979099785809492,
                      '010100100': 0.011048543456039806,
                      '010100101': 0.07493486755176124,
                      '010100110': 0.024705294220065465,
                      '010100111': 0.036643873123620545,
                      '010101001': 0.04133986423538423,
                      '010101010': 0.024705294220065465,
                      '010101011': 0.06536406457297465,
                      '010101101': 0.04419417382415922,
                      '010101110': 0.011048543456039806,
                      '010110001': 0.015625,
                      '010110011': 0.03125,
                      '010110110': 0.011048543456039806,
                      '010111000': 0.024705294220065465,
                      '010111001': 0.07328774624724109,
                      '010111010': 0.03983608994994363,
                      '010111011': 0.10768788087570486,
                      '010111101': 0.08193819126329309,
                      '010111110': 0.011048543456039806,
                      '010111111': 0.019136638615493577,
                      '011000001': 0.03983608994994363,
                      '011000011': 0.05524271728019903,
                      '011000101': 0.036643873123620545,
                      '011001001': 0.024705294220065465,
                      '011001010': 0.011048543456039806,
                      '011001011': 0.03125,
                      '011001101': 0.019136638615493577,
                      '011011001': 0.024705294220065465,
                      '011011010': 0.011048543456039806,
                      '011011011': 0.029231698334171417,
                      '011011101': 0.015625,
                      '011100001': 0.024705294220065465,
                      '011100010': 0.019136638615493577,
                      '011100011': 0.03983608994994363,
                      '011100101': 0.034938562148434216,
                      '011101011': 0.019136638615493577,
                      '011101101': 0.02209708691207961,
                      '011111011': 0.015625,
                      '011111101': 0.011048543456039806,
                      '100000000': 0.015625,
                      '100000001': 0.04941058844013093,
                      '100000010': 0.03314563036811941,
                      '100000011': 0.08267972847076846,
                      '100000101': 0.05524271728019903,
                      '100000110': 0.011048543456039806,
                      '100000111': 0.02209708691207961,
                      '100001000': 0.015625,
                      '100001001': 0.027063293868263706,
                      '100001010': 0.019136638615493577,
                      '100001011': 0.05182226234930312,
                      '100001101': 0.036643873123620545,
                      '100010000': 0.011048543456039806,
                      '100010001': 0.024705294220065465,
                      '100010011': 0.034938562148434216,
                      '100010101': 0.024705294220065465,
                      '100011000': 0.024705294220065465,
                      '100011001': 0.071602745233685,
                      '100011010': 0.048159484398195125,
                      '100011011': 0.106548294507702,
                      '100011101': 0.07890238233095373,
                      '100011110': 0.024705294220065465,
                      '100011111': 0.015625,
                      '100100000': 0.015625,
                      '100100001': 0.03314563036811941,
                      '100100010': 0.02209708691207961,
                      '100100011': 0.07328774624724109,
                      '100100101': 0.048159484398195125,
                      '100100110': 0.015625,
                      '100100111': 0.011048543456039806,
                      '100101001': 0.027063293868263706,
                      '100101010': 0.015625,
                      '100101011': 0.03125,
                      '100101101': 0.036643873123620545,
                      '100101110': 0.011048543456039806,
                      '100110000': 0.011048543456039806,
                      '100110001': 0.02209708691207961,
                      '100110011': 0.03983608994994363,
                      '100110101': 0.027063293868263706,
                      '100110110': 0.011048543456039806,
                      '100110111': 0.02209708691207961,
                      '100111000': 0.029231698334171417,
                      '100111001': 0.06346905003621844,
                      '100111010': 0.04133986423538423,
                      '100111011': 0.10881553341550093,
                      '100111101': 0.0625,
                      '100111110': 0.019136638615493577,
                      '100111111': 0.011048543456039806,
                      '101000000': 0.02209708691207961,
                      '101000001': 0.06810779599282303,
                      '101000010': 0.04133986423538423,
                      '101000011': 0.12645635981436443,
                      '101000101': 0.07733980419227864,
                      '101000110': 0.015625,
                      '101000111': 0.027063293868263706,
                      '101001001': 0.029231698334171417,
                      '101001010': 0.011048543456039806,
                      '101001011': 0.03314563036811941,
                      '101001101': 0.019136638615493577,
                      '101010000': 0.011048543456039806,
                      '101010001': 0.024705294220065465,
                      '101010010': 0.02209708691207961,
                      '101010011': 0.04941058844013093,
                      '101010101': 0.015625,
                      '101010111': 0.015625,
                      '101011000': 0.015625,
                      '101011001': 0.027063293868263706,
                      '101011011': 0.03125,
                      '101011101': 0.019136638615493577,
                      '101100000': 0.06810779599282303,
                      '101100001': 0.1690102160373745,
                      '101100010': 0.129319885603491,
                      '101100011': 0.3125,
                      '101100100': 0.011048543456039806,
                      '101100101': 0.16424840657522374,
                      '101100110': 0.05740991584648074,
                      '101100111': 0.06810779599282303,
                      '101101000': 0.02209708691207961,
                      '101101001': 0.10126157341262282,
                      '101101010': 0.05298695299316616,
                      '101101011': 0.13887803777055607,
                      '101101100': 0.011048543456039806,
                      '101101101': 0.10423175050098699,
                      '101101110': 0.027063293868263706,
                      '101101111': 0.019136638615493577,
                      '101110000': 0.011048543456039806,
                      '101110001': 0.011048543456039806,
                      '101110011': 0.03125,
                      '101110101': 0.011048543456039806,
                      '101111000': 0.05298695299316616,
                      '101111001': 0.13212136347881065,
                      '101111010': 0.07245014449606019,
                      '101111011': 0.21791133876533364,
                      '101111100': 0.011048543456039806,
                      '101111101': 0.15428451295415233,
                      '101111110': 0.04133986423538423,
                      '101111111': 0.03125,
                      '110001011': 0.011048543456039806,
                      '110010001': 0.02209708691207961,
                      '110010010': 0.015625,
                      '110010011': 0.038273277230987154,
                      '110010101': 0.02209708691207961,
                      '110010111': 0.011048543456039806,
                      '110011000': 0.019136638615493577,
                      '110011001': 0.04555431167847891,
                      '110011010': 0.027063293868263706,
                      '110011011': 0.06629126073623882,
                      '110011101': 0.03983608994994363,
                      '110011110': 0.015625,
                      '110100000': 0.03314563036811941,
                      '110100001': 0.08341467384399462,
                      '110100010': 0.05412658773652741,
                      '110100011': 0.15027643798180737,
                      '110100101': 0.0943987966687076,
                      '110100110': 0.027063293868263706,
                      '110100111': 0.036643873123620545,
                      '110101000': 0.015625,
                      '110101001': 0.0427908248050911,
                      '110101010': 0.024705294220065465,
                      '110101011': 0.07733980419227864,
                      '110101101': 0.0625,
                      '110101110': 0.015625,
                      '110110001': 0.02209708691207961,
                      '110110010': 0.011048543456039806,
                      '110110011': 0.03125,
                      '110110101': 0.024705294220065465,
                      '110111000': 0.02209708691207961,
                      '110111001': 0.08769509500251425,
                      '110111010': 0.04133986423538423,
                      '110111011': 0.1316585902058806,
                      '110111101': 0.09631896879639025,
                      '110111110': 0.019136638615493577,
                      '110111111': 0.011048543456039806,
                      '111000001': 0.04555431167847891,
                      '111000010': 0.02209708691207961,
                      '111000011': 0.071602745233685,
                      '111000100': 0.011048543456039806,
                      '111000101': 0.03983608994994363,
                      '111000110': 0.011048543456039806,
                      '111000111': 0.019136638615493577,
                      '111001001': 0.019136638615493577,
                      '111001011': 0.03125,
                      '111001101': 0.024705294220065465,
                      '111011001': 0.019136638615493577,
                      '111011010': 0.02209708691207961,
                      '111011011': 0.03314563036811941,
                      '111011101': 0.019136638615493577,
                      '111100001': 0.03125,
                      '111100010': 0.029231698334171417,
                      '111100011': 0.061515686515717274,
                      '111100101': 0.03125,
                      '111100110': 0.011048543456039806,
                      '111100111': 0.011048543456039806,
                      '111101001': 0.015625,
                      '111101011': 0.019136638615493577,
                      '111101101': 0.015625,
                      '111111001': 0.011048543456039806,
                      '111111011': 0.02209708691207961,
                      '111111101': 0.015625},
    'eigenvalue': (-1.3980178222655593+0j),
    'optimal_parameters': {   ParameterVectorElement(θ[13]): 1.1013642144225715,
                              ParameterVectorElement(θ[14]): -0.0313552145205504,
                              ParameterVectorElement(θ[15]): 0.7705700955144723,
                              ParameterVectorElement(θ[16]): 0.18505081170538862,
                              ParameterVectorElement(θ[17]): 0.5496347144255032,
                              ParameterVectorElement(θ[6]): -0.9943385087310626,
                              ParameterVectorElement(θ[7]): -0.048782814078807774,
                              ParameterVectorElement(θ[8]): 1.190948991529499,
                              ParameterVectorElement(θ[9]): 0.31372849859857715,
                              ParameterVectorElement(θ[10]): -1.0964143534287234,
                              ParameterVectorElement(θ[11]): 0.2516332046098113,
                              ParameterVectorElement(θ[12]): -1.4702306518130106,
                              ParameterVectorElement(θ[2]): -0.1383064560367305,
                              ParameterVectorElement(θ[3]): -0.853941001354921,
                              ParameterVectorElement(θ[5]): 3.922465032736834,
                              ParameterVectorElement(θ[4]): 0.5595204307110316,
                              ParameterVectorElement(θ[0]): 2.555308432670651,
                              ParameterVectorElement(θ[1]): 1.0035801367262132},
    'optimal_point': array([ 2.55530843,  1.00358014, -0.13830646, -0.853941  ,  0.55952043,
        3.92246503, -0.99433851, -0.04878281,  1.19094899,  0.3137285 ,
       -1.09641435,  0.2516332 , -1.47023065,  1.10136421, -0.03135521,
        0.7705701 ,  0.18505081,  0.54963471]),
    'optimal_value': -1.3980178222655593,
    'optimizer_evals': None,
    'optimizer_time': 31.43800950050354}
[10]:
import matplotlib.pyplot as plt

fig = plt.figure()

plt.plot(counts, values)
plt.ylabel("Conformation Energy")
plt.xlabel("VQE Iterations")

fig.add_axes([0.44, 0.51, 0.44, 0.32])

plt.plot(counts[40:], values[40:])
plt.ylabel("Conformation Energy")
plt.xlabel("VQE Iterations")
plt.show()
../_images/tutorials_09_Protein_Folding_30_0.png

Visualizing the answer

In order to reduce computational costs, we have reduced the problem’s qubit operator to the minimum number of qubits needed to represent the shape of the protein. In order to decode the answer we need to understand how this has been done.

  • The shape of the protein is encoded by a sequence of turns, each taking a value in \(\{0,1,2,3\}\). Each turn represents a different direction in the lattice.

  • For a main chain of \(N_{aminoacids}\) beads in a lattice, we need \(N_{aminoacids}-1\) turns in order to represent its shape. However, the orientation of the protein is not relevant to its energy. Therefore the first two turns of the shape can be set to \([1,0]\) without loss of generality.

  • If the second bead does not have any side chain, we can also set the \(6^{th}\) qubit to \([1]\) without breaking symmetry.

  • Since the length of the secondary chains is always limited to \(1\), we only need one turn to describe the shape of each such chain.

The total number of qubits needed to represent the shape of the protein is \(2(N_{aminoacids}-3)\) if there is a secondary chain coming out of the second bead, or \(2(N_{aminoacids}-3) - 1\) otherwise. All the other qubits remain unused during the optimization process. See:

[11]:
result = protein_folding_problem.interpret(raw_result=raw_result)
print(
    "The bitstring representing the shape of the protein during optimization is: ",
    result.turn_sequence,
)
print("The expanded expression is:", result.get_result_binary_vector())
The bitstring representing the shape of the protein during optimization is:  101100011
The expanded expression is: 1______0_____________________________________________________________________________________________________________________________110001_1____
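The qubit count quoted before this cell can be checked with a few lines of arithmetic. The helper shape_qubits is hypothetical and simply evaluates the two expressions given in the text.

```python
def shape_qubits(n_aminoacids: int, side_chain_on_second_bead: bool) -> int:
    """Number of qubits describing the shape, following the formula in the text."""
    # Two qubits per turn, with the first two turns fixed by symmetry.
    base = 2 * (n_aminoacids - 3)
    # Without a side chain on the second bead, one more qubit can be fixed.
    return base if side_chain_on_second_bead else base - 1


# The neuropeptide APRLRFY has 7 amino acids and no side chains.
print(shape_qubits(7, side_chain_on_second_bead=False))  # 7
```

For APRLRFY this gives seven shape qubits, which matches the seven characters grouped at the right-hand end of the expanded expression printed above.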

Now that we know which qubits encode which information, we can decode the bitstring into the explicit turns that form the shape of the protein.

[12]:
print(f"The folded protein's main sequence of turns is: {result.protein_shape_decoder.main_turns}")
print(f"and the side turn sequences are: {result.protein_shape_decoder.side_turns}")
The folded protein's main sequence of turns is: [1, 0, 3, 2, 0, 3]
and the side turn sequences are: [None, None, None, None, None, None, None]
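The decoding step can be sketched by hand for this example. The sketch makes several assumptions that are inferred from the printed output rather than taken from the library: the last seven characters of the bitstring are the shape qubits, they are read from right to left, each turn is a two-bit binary number, and the fixed values described earlier (first two turns \([1,0]\), sixth qubit \(1\)) are re-inserted before decoding. The function decode_main_turns is a hypothetical helper, not the library's decoder.

```python
from typing import List


def decode_main_turns(shape_bits: str) -> List[int]:
    """Rebuild the main-chain turns from the reduced shape bits (toy decoder)."""
    # Assumption: the rightmost character corresponds to the first free qubit.
    free = shape_bits[::-1]
    # Re-insert the fixed bits: turns [1, 0] as "01" + "00", then the free
    # fifth qubit, then the sixth qubit fixed to "1", then the remaining bits.
    full = "0100" + free[0] + "1" + free[1:]
    # Every pair of bits is one turn in {0, 1, 2, 3}.
    return [int(full[i : i + 2], 2) for i in range(0, len(full), 2)]


turn_sequence = "101100011"       # bitstring printed above
shape_bits = turn_sequence[-7:]   # assumption: last seven bits describe the shape
print(decode_main_turns(shape_bits))  # [1, 0, 3, 2, 0, 3]
```

Under these assumptions the result agrees with the main sequence of turns reported by the decoder above. For proteins with side chains the layout of the bitstring differs, so this toy decoder does not apply there.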

From this sequence of turns we can get the Cartesian coordinates of each of the amino acids of the protein.

[13]:
print(result.protein_shape_file_gen.get_xyz_data())
[['A' '0.0' '0.0' '0.0']
 ['P' '0.5773502691896258' '0.5773502691896258' '-0.5773502691896258']
 ['R' '1.1547005383792517' '0.0' '-1.1547005383792517']
 ['L' '1.7320508075688776' '-0.5773502691896258' '-0.5773502691896258']
 ['R' '2.3094010767585034' '0.0' '0.0']
 ['F' '1.7320508075688776' '0.5773502691896258' '0.5773502691896258']
 ['Y' '1.154700538379252' '1.1547005383792517' '0.0']]
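The step from turns to coordinates can be reproduced with a short sketch. The four bond directions below are inferred from the coordinates printed above, and the alternating sign reflects the two sublattices of the tetrahedral lattice. Both are assumptions about the convention, and turns_to_xyz is a hypothetical helper rather than the library routine.

```python
import numpy as np

# Assumed unit bond vectors for turns 0..3 (inferred from the printed output).
DIRECTIONS = np.array(
    [[-1, 1, 1], [1, 1, -1], [-1, -1, -1], [1, -1, 1]]
) / np.sqrt(3)


def turns_to_xyz(turns):
    """Place the first bead at the origin and walk along the lattice."""
    # Consecutive beads sit on alternating sublattices, so the sign of the
    # bond vector flips at every step.
    steps = [(-1) ** i * DIRECTIONS[turn] for i, turn in enumerate(turns)]
    return np.vstack([np.zeros(3), np.cumsum(steps, axis=0)])


coords = turns_to_xyz([1, 0, 3, 2, 0, 3])
for residue, xyz in zip("APRLRFY", coords):
    print(residue, np.round(xyz, 4))
```

For the turn sequence found above, this walk ends at approximately (1.1547, 1.1547, 0.0) for the final residue Y, in agreement with the last row of the printed table.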

And finally, we can also plot the structure of the protein in 3D. Note that when rendered with the proper backend this plot can be interactively rotated.

[14]:
fig = result.get_figure(title="Protein Structure", ticks=False, grid=True)
fig.get_axes()[0].view_init(10, 70)
../_images/tutorials_09_Protein_Folding_39_0.png

And here is an example with side chains.

[15]:
peptide = Peptide("APRLR", ["", "", "F", "Y", ""])
protein_folding_problem = ProteinFoldingProblem(peptide, mj_interaction, penalty_terms)
qubit_op = protein_folding_problem.qubit_op()
raw_result = vqe.compute_minimum_eigenvalue(qubit_op)
result_2 = protein_folding_problem.interpret(raw_result=raw_result)
[16]:
fig = result_2.get_figure(title="Protein Structure", ticks=False, grid=True)
fig.get_axes()[0].view_init(10, 60)
../_images/tutorials_09_Protein_Folding_42_0.png

References

[1] https://en.wikipedia.org/wiki/Levinthal%27s_paradox

[2] A. Robert, P. Barkoutsos, S. Woerner and I. Tavernelli, Resource-efficient quantum algorithm for protein folding, npj Quantum Information, 2021, https://doi.org/10.1038/s41534-021-00368-4

[3] IUPAC–IUB Commission on Biochemical Nomenclature (1972). “A one-letter notation for amino acid sequences”. Pure and Applied Chemistry. 31 (4): 641–645. doi:10.1351/pac197231040639. PMID 5080161.

[4] https://en.wikipedia.org/wiki/Amino_acid

[5] S. Miyazawa and R. L. Jernigan, Residue–Residue Potentials with a Favorable Contact Pair Term and an Unfavorable High Packing Density Term for Simulation and Threading, J. Mol. Biol. 256, 623–644, 1996, Table 3, https://doi.org/10.1006/jmbi.1996.0114

[6] P. Barkoutsos, G. Nannicini, A. Robert, I. Tavernelli, S. Woerner, Improving Variational Quantum Optimization using CVaR, Quantum 4, 256, 2020, https://doi.org/10.22331/q-2020-04-20-256

[17]:
import qiskit.tools.jupyter

%qiskit_version_table
%qiskit_copyright

Version Information

Qiskit Software      Version
qiskit-terra         0.21.0
qiskit-aer           0.10.4
qiskit-nature        0.4.3

System information
Python version       3.8.13
Python compiler      GCC 9.4.0
Python build         default, Jun 20 2022 14:28:56
OS                   Linux
CPUs                 2
Memory (Gb)          6.783603668212891

Fri Jul 15 17:11:57 2022 UTC

This code is a part of Qiskit

© Copyright IBM 2017, 2022.

This code is licensed under the Apache License, Version 2.0. You may
obtain a copy of this license in the LICENSE.txt file in the root directory
of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.

Any modifications or derivative works of this code must retain this
copyright notice, and modified files need to carry a notice indicating
that they have been altered from the originals.
