{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quantum Kernel Machine Learning\n", "\n", "The general task of machine learning is to find and study patterns in data. For many datasets, the data points are better understood in a higher-dimensional feature space, reached through a kernel function:\n", "$k(\\vec{x}_i, \\vec{x}_j) = \\langle f(\\vec{x}_i), f(\\vec{x}_j) \\rangle$\n", "where $k$ is the kernel function, $\\vec{x}_i, \\vec{x}_j$ are $n$-dimensional inputs, $f$ is a map from $n$-dimensional to $m$-dimensional space, and $\\langle a,b \\rangle$ denotes the inner product. For a finite dataset, the kernel function can be represented as a matrix:\n", "$K_{ij} = k(\\vec{x}_i,\\vec{x}_j)$.\n", "\n", "In quantum kernel machine learning, a quantum feature map $\\phi(\\vec{x})$ is used to map a classical feature vector $\\vec{x}$ to a quantum Hilbert space, $| \\phi(\\vec{x})\\rangle \\langle \\phi(\\vec{x})|$, such that $K_{ij} = \\left| \\langle \\phi(\\vec{x}_j)| \\phi(\\vec{x}_i) \\rangle \\right|^{2}$. 
See [_Supervised learning with quantum enhanced feature spaces_](https://arxiv.org/pdf/1804.11326.pdf) for more details.\n", "\n", "In this notebook, we use `qiskit` to calculate a kernel matrix using a quantum feature map, then use this kernel matrix in `scikit-learn` classification and clustering algorithms.\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "from sklearn.svm import SVC\n", "from sklearn.cluster import SpectralClustering\n", "from sklearn.metrics import normalized_mutual_info_score\n", "\n", "from qiskit import BasicAer\n", "from qiskit.circuit.library import ZZFeatureMap\n", "from qiskit.utils import QuantumInstance, algorithm_globals\n", "from qiskit_machine_learning.algorithms import QSVC\n", "from qiskit_machine_learning.kernels import QuantumKernel\n", "from qiskit_machine_learning.datasets import ad_hoc_data\n", "\n", "seed = 12345\n", "algorithm_globals.random_seed = seed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Classification\n", "\n", "For our classification example, we will use the _ad hoc dataset_ as described in [_Supervised learning with quantum enhanced feature spaces_](https://arxiv.org/pdf/1804.11326.pdf), and the `scikit-learn` [support vector machine](https://scikit-learn.org/stable/modules/svm.html) classification (`svc`) algorithm. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "adhoc_dimension = 2\n", "train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(\n", " training_size=20,\n", " test_size=5,\n", " n=adhoc_dimension,\n", " gap=0.3,\n", " plot_data=False,\n", " one_hot=False,\n", " include_sample_total=True,\n", ")\n", "\n", "plt.figure(figsize=(5, 5))\n", "plt.ylim(0, 2 * np.pi)\n", "plt.xlim(0, 2 * np.pi)\n", "plt.imshow(\n", " np.asmatrix(adhoc_total).T,\n", " interpolation=\"nearest\",\n", " origin=\"lower\",\n", " cmap=\"RdBu\",\n", " extent=[0, 2 * np.pi, 0, 2 * np.pi],\n", ")\n", "\n", "plt.scatter(\n", " train_features[np.where(train_labels[:] == 0), 0],\n", " train_features[np.where(train_labels[:] == 0), 1],\n", " marker=\"s\",\n", " facecolors=\"w\",\n", " edgecolors=\"b\",\n", " label=\"A train\",\n", ")\n", "plt.scatter(\n", " train_features[np.where(train_labels[:] == 1), 0],\n", " train_features[np.where(train_labels[:] == 1), 1],\n", " marker=\"o\",\n", " facecolors=\"w\",\n", " edgecolors=\"r\",\n", " label=\"B train\",\n", ")\n", "plt.scatter(\n", " test_features[np.where(test_labels[:] == 0), 0],\n", " test_features[np.where(test_labels[:] == 0), 1],\n", " marker=\"s\",\n", " facecolors=\"b\",\n", " edgecolors=\"w\",\n", " label=\"A test\",\n", ")\n", "plt.scatter(\n", " test_features[np.where(test_labels[:] == 1), 0],\n", " test_features[np.where(test_labels[:] == 1), 1],\n", " marker=\"o\",\n", " facecolors=\"r\",\n", " edgecolors=\"w\",\n", " label=\"B test\",\n", ")\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=\"upper left\", borderaxespad=0.0)\n", "plt.title(\"Ad hoc dataset for classification\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With our training and testing datasets ready, we set up the `QuantumKernel` class to calculate a kernel matrix using the 
[ZZFeatureMap](https://qiskit.org/documentation/stubs/qiskit.circuit.library.ZZFeatureMap.html), and the `BasicAer` `qasm_simulator` using 1024 shots." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "adhoc_feature_map = ZZFeatureMap(feature_dimension=adhoc_dimension, reps=2, entanglement=\"linear\")\n", "\n", "adhoc_backend = QuantumInstance(\n", " BasicAer.get_backend(\"qasm_simulator\"), shots=1024, seed_simulator=seed, seed_transpiler=seed\n", ")\n", "\n", "adhoc_kernel = QuantumKernel(feature_map=adhoc_feature_map, quantum_instance=adhoc_backend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `scikit-learn` `svc` algorithm allows us to define a [custom kernel](https://scikit-learn.org/stable/modules/svm.html#custom-kernels) in two ways: by providing the kernel as a callable function or by precomputing the kernel matrix. We can do either of these using the `QuantumKernel` class in `qiskit`.\n", "\n", "The following code gives the kernel as a callable function:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Callable kernel classification test score: 1.0\n" ] } ], "source": [ "adhoc_svc = SVC(kernel=adhoc_kernel.evaluate)\n", "adhoc_svc.fit(train_features, train_labels)\n", "adhoc_score = adhoc_svc.score(test_features, test_labels)\n", "\n", "print(f\"Callable kernel classification test score: {adhoc_score}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following code precomputes and plots the training and testing kernel matrices before providing them to the `scikit-learn` `svc` algorithm:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Precomputed kernel classification test score: 1.0\n" ] } ], "source": [ "adhoc_matrix_train = adhoc_kernel.evaluate(x_vec=train_features)\n", "adhoc_matrix_test = adhoc_kernel.evaluate(x_vec=test_features, y_vec=train_features)\n", "\n", "fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n", "axs[0].imshow(\n", " np.asmatrix(adhoc_matrix_train), interpolation=\"nearest\", origin=\"upper\", cmap=\"Blues\"\n", ")\n", "axs[0].set_title(\"Ad hoc training kernel matrix\")\n", "axs[1].imshow(np.asmatrix(adhoc_matrix_test), interpolation=\"nearest\", origin=\"upper\", cmap=\"Reds\")\n", "axs[1].set_title(\"Ad hoc testing kernel matrix\")\n", "plt.show()\n", "\n", "adhoc_svc = SVC(kernel=\"precomputed\")\n", "adhoc_svc.fit(adhoc_matrix_train, train_labels)\n", "adhoc_score = adhoc_svc.score(adhoc_matrix_test, test_labels)\n", "\n", "print(f\"Precomputed kernel classification test score: {adhoc_score}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`qiskit-machine-learning` also provides the `QSVC` class, which extends the `scikit-learn` `SVC` class. It can be used as follows:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "QSVC classification test score: 1.0\n" ] } ], "source": [ "qsvc = QSVC(quantum_kernel=adhoc_kernel)\n", "qsvc.fit(train_features, train_labels)\n", "qsvc_score = qsvc.score(test_features, test_labels)\n", "\n", "print(f\"QSVC classification test score: {qsvc_score}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clustering\n", "\n", "For our clustering example, we will again use the _ad hoc dataset_ as described in [_Supervised learning with quantum enhanced feature spaces_](https://arxiv.org/pdf/1804.11326.pdf), and the `scikit-learn` `spectral` clustering algorithm.\n", "\n", "We will regenerate the dataset with a 
larger gap between the two classes, and as clustering is an unsupervised machine learning task, we don't need a test sample." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "adhoc_dimension = 2\n", "train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(\n", " training_size=25,\n", " test_size=0,\n", " n=adhoc_dimension,\n", " gap=0.6,\n", " plot_data=False,\n", " one_hot=False,\n", " include_sample_total=True,\n", ")\n", "\n", "plt.figure(figsize=(5, 5))\n", "plt.ylim(0, 2 * np.pi)\n", "plt.xlim(0, 2 * np.pi)\n", "plt.imshow(\n", " np.asmatrix(adhoc_total).T,\n", " interpolation=\"nearest\",\n", " origin=\"lower\",\n", " cmap=\"RdBu\",\n", " extent=[0, 2 * np.pi, 0, 2 * np.pi],\n", ")\n", "plt.scatter(\n", " train_features[np.where(train_labels[:] == 0), 0],\n", " train_features[np.where(train_labels[:] == 0), 1],\n", " marker=\"s\",\n", " facecolors=\"w\",\n", " edgecolors=\"b\",\n", " label=\"A\",\n", ")\n", "plt.scatter(\n", " train_features[np.where(train_labels[:] == 1), 0],\n", " train_features[np.where(train_labels[:] == 1), 1],\n", " marker=\"o\",\n", " facecolors=\"w\",\n", " edgecolors=\"r\",\n", " label=\"B\",\n", ")\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=\"upper left\", borderaxespad=0.0)\n", "plt.title(\"Ad hoc dataset for clustering\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We again set up the `QuantumKernel` class to calculate a kernel matrix using the [ZZFeatureMap](https://qiskit.org/documentation/stubs/qiskit.circuit.library.ZZFeatureMap.html), and the BasicAer `qasm_simulator` using 1024 shots." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "adhoc_feature_map = ZZFeatureMap(feature_dimension=adhoc_dimension, reps=2, entanglement=\"linear\")\n", "\n", "adhoc_backend = QuantumInstance(\n", " BasicAer.get_backend(\"qasm_simulator\"), shots=1024, seed_simulator=seed, seed_transpiler=seed\n", ")\n", "\n", "adhoc_kernel = QuantumKernel(feature_map=adhoc_feature_map, quantum_instance=adhoc_backend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `scikit-learn` spectral clustering algorithm allows us to define a custom kernel in two ways: by providing the kernel as a callable function or by precomputing the kernel matrix. With the `QuantumKernel` class in `qiskit`, only the latter is possible.\n", "\n", "The following code precomputes and plots the kernel matrix, passes it to the `scikit-learn` spectral clustering algorithm, and scores the resulting labels using normalized mutual information, which is possible because we know the class labels a priori." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Clustering score: 0.7287008798015754\n" ] } ], "source": [ "adhoc_matrix = adhoc_kernel.evaluate(x_vec=train_features)\n", "\n", "plt.figure(figsize=(5, 5))\n", "plt.imshow(np.asmatrix(adhoc_matrix), interpolation=\"nearest\", origin=\"upper\", cmap=\"Greens\")\n", "plt.title(\"Ad hoc clustering kernel matrix\")\n", "plt.show()\n", "\n", "adhoc_spectral = SpectralClustering(2, affinity=\"precomputed\")\n", "cluster_labels = adhoc_spectral.fit_predict(adhoc_matrix)\n", "cluster_score = normalized_mutual_info_score(cluster_labels, train_labels)\n", "\n", "print(f\"Clustering score: {cluster_score}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other `scikit-learn` algorithms can also use a precomputed kernel matrix. Here are a few:\n", "\n", "- [Agglomerative clustering](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html)\n", "- [Support vector regression](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html)\n", "- [Ridge regression](https://scikit-learn.org/stable/modules/generated/sklearn.kernel_ridge.KernelRidge.html)\n", "- [Gaussian process regression](https://scikit-learn.org/stable/modules/gaussian_process.html)\n", "- [Principal component analysis](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.KernelPCA.html)\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

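<h3>Version Information</h3>
<table>
<tr><th>Qiskit Software</th><th>Version</th></tr>
<tr><td>qiskit-terra</td><td>0.19.0</td></tr>
<tr><td>qiskit-aer</td><td>0.8.2</td></tr>
<tr><td>qiskit-ignis</td><td>0.6.0</td></tr>
<tr><td>qiskit-aqua</td><td>0.9.2</td></tr>
<tr><td>qiskit</td><td>0.27.0</td></tr>
<tr><td>qiskit-machine-learning</td><td>0.3.0</td></tr>
<tr><th>System information</th></tr>
<tr><td>Python</td><td>3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]</td></tr>
<tr><td>OS</td><td>Windows</td></tr>
<tr><td>CPUs</td><td>4</td></tr>
<tr><td>Memory (Gb)</td><td>31.837730407714844</td></tr>
<tr><td colspan='2'>Fri Dec 03 15:08:20 2021 GMT Standard Time</td></tr>
</table>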
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

This code is a part of Qiskit

© Copyright IBM 2017, 2021.

This code is licensed under the Apache License, Version 2.0. You may
obtain a copy of this license in the LICENSE.txt file in the root directory
of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.

Any modifications or derivative works of this code must retain this
copyright notice, and modified files need to carry a notice indicating
that they have been altered from the originals.

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import qiskit.tools.jupyter\n", "\n", "%qiskit_version_table\n", "%qiskit_copyright" ] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" } }, "nbformat": 4, "nbformat_minor": 2 }