{ "cells": [ { "cell_type": "markdown", "id": "opened-florist", "metadata": {}, "source": [ "# Pegasos Quantum Support Vector Classifier\n", "\n", "Another SVM-based algorithm also benefits from the quantum kernel method. Here, we introduce an implementation of another classification algorithm, which is an alternative to the `QSVC` available in Qiskit Machine Learning and shown in the [\"Quantum Kernel Machine Learning\"](./03_quantum_kernel.ipynb) tutorial. This classifier implements the Pegasos algorithm from the paper \"Pegasos: Primal Estimated sub-GrAdient SOlver for SVM\" by Shalev-Shwartz et al., see: https://home.ttic.edu/~nati/Publications/PegasosMPB.pdf.\n", "\n", "This algorithm is an alternative to the dual optimization from the `scikit-learn` package, benefits from the kernel trick, and yields a training complexity that is independent of the size of the training set. Thus, the `PegasosQSVC` is expected to train faster than `QSVC` for sufficiently large training sets.\n", "\n", "The algorithm can be used as a direct replacement for `QSVC` with some hyperparameter tuning." ] }, { "cell_type": "markdown", "id": "thirty-painting", "metadata": {}, "source": [ "Let's generate some data:" ] }, { "cell_type": "code", "execution_count": 1, "id": "impressed-laser", "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import make_blobs\n", "\n", "# example dataset\n", "features, labels = make_blobs(n_samples=20, n_features=2, centers=2, random_state=3, shuffle=True)" ] }, { "cell_type": "markdown", "id": "moderate-yugoslavia", "metadata": {}, "source": [ "We pre-process the data to ensure compatibility with the rotation encoding and split it into training and test datasets." 
] }, { "cell_type": "code", "execution_count": 2, "id": "adolescent-composer", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import MinMaxScaler\n", "\n", "features = MinMaxScaler(feature_range=(0, np.pi)).fit_transform(features)\n", "\n", "train_features, test_features, train_labels, test_labels = train_test_split(\n", " features, labels, train_size=15, shuffle=False\n", ")" ] }, { "cell_type": "markdown", "id": "central-poverty", "metadata": {}, "source": [ "We have two features in the dataset, so we set the number of qubits to the number of features in the dataset.\n", "\n", "Then we set $\\tau$ to the number of steps performed during the training procedure. Note that there is no early stopping criterion in the algorithm; it always iterates over all $\\tau$ steps.\n", "\n", "The last one is the hyperparameter $C$, a positive regularization parameter. The strength of the regularization is inversely proportional to $C$. A smaller $C$ induces smaller weights, which generally helps prevent overfitting. However, due to the nature of this algorithm, some of the computation steps become trivial for larger $C$. Thus, a larger $C$ improves the performance of the algorithm drastically. If the data is linearly separable in feature space, $C$ should be chosen to be large. If the separation is not perfect, $C$ should be chosen smaller to prevent overfitting." 
] }, { "cell_type": "code", "execution_count": 3, "id": "dying-dispatch", "metadata": {}, "outputs": [], "source": [ "# number of qubits is equal to the number of features\n", "num_qubits = 2\n", "\n", "# number of steps performed during the training procedure\n", "tau = 100\n", "\n", "# regularization parameter\n", "C = 1000" ] }, { "cell_type": "markdown", "id": "improving-wilderness", "metadata": {}, "source": [ "The algorithm will run using:\n", "\n", "- The default fidelity instantiated in `FidelityQuantumKernel`\n", "- A quantum kernel created from `ZFeatureMap`" ] }, { "cell_type": "code", "execution_count": 4, "id": "automated-allergy", "metadata": {}, "outputs": [], "source": [ "from qiskit.circuit.library import ZFeatureMap\n", "from qiskit_algorithms.utils import algorithm_globals\n", "\n", "from qiskit_machine_learning.kernels import FidelityQuantumKernel\n", "\n", "algorithm_globals.random_seed = 12345\n", "\n", "feature_map = ZFeatureMap(feature_dimension=num_qubits, reps=1)\n", "\n", "qkernel = FidelityQuantumKernel(feature_map=feature_map)" ] }, { "cell_type": "markdown", "id": "attractive-stationery", "metadata": {}, "source": [ "The `PegasosQSVC` implementation is compatible with the `scikit-learn` interfaces and follows the standard way of training a model. In the constructor we pass the parameters of the algorithm, in this case the regularization hyperparameter $C$ and the number of steps.\n", "\n", "Then we pass training features and labels to the `fit` method, which trains a model and returns a fitted classifier.\n", "\n", "Afterwards, we score our model using test features and labels." 
] }, { "cell_type": "code", "execution_count": 5, "id": "representative-thumb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PegasosQSVC classification test score: 1.0\n" ] } ], "source": [ "from qiskit_machine_learning.algorithms import PegasosQSVC\n", "\n", "pegasos_qsvc = PegasosQSVC(quantum_kernel=qkernel, C=C, num_steps=tau)\n", "\n", "# training\n", "pegasos_qsvc.fit(train_features, train_labels)\n", "\n", "# testing\n", "pegasos_score = pegasos_qsvc.score(test_features, test_labels)\n", "print(f\"PegasosQSVC classification test score: {pegasos_score}\")" ] }, { "cell_type": "markdown", "id": "sustainable-empire", "metadata": {}, "source": [ "For visualization purposes, we create a mesh grid with a predefined step that spans the minimum and maximum values we used in `MinMaxScaler`. We also add some margin to the grid for better representation of the training and test samples." ] }, { "cell_type": "code", "execution_count": 6, "id": "judicial-pottery", "metadata": {}, "outputs": [], "source": [ "grid_step = 0.2\n", "margin = 0.2\n", "grid_x, grid_y = np.meshgrid(\n", " np.arange(-margin, np.pi + margin, grid_step), np.arange(-margin, np.pi + margin, grid_step)\n", ")" ] }, { "cell_type": "markdown", "id": "marine-constitution", "metadata": {}, "source": [ "We convert the grid to a shape compatible with the model; the shape should be `(n_samples, n_features)`.\n", "Then for each grid point we predict a label. In our case, the predicted labels are used for coloring the grid." ] }, { "cell_type": "code", "execution_count": 7, "id": "competitive-outdoors", "metadata": {}, "outputs": [], "source": [ "meshgrid_features = np.column_stack((grid_x.ravel(), grid_y.ravel()))\n", "meshgrid_colors = pegasos_qsvc.predict(meshgrid_features)" ] }, { "cell_type": "markdown", "id": "former-constraint", "metadata": {}, "source": [ "Finally, we plot our grid according to the labels/colors we obtained from the model. 
We also plot training and test samples." ] }, { "cell_type": "code", "execution_count": 8, "id": "monetary-knife", "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.figure(figsize=(5, 5))\n", "meshgrid_colors = meshgrid_colors.reshape(grid_x.shape)\n", "plt.pcolormesh(grid_x, grid_y, meshgrid_colors, cmap=\"RdBu\", shading=\"auto\")\n", "\n", "plt.scatter(\n", " train_features[:, 0][train_labels == 0],\n", " train_features[:, 1][train_labels == 0],\n", " marker=\"s\",\n", " facecolors=\"w\",\n", " edgecolors=\"r\",\n", " label=\"A train\",\n", ")\n", "plt.scatter(\n", " train_features[:, 0][train_labels == 1],\n", " train_features[:, 1][train_labels == 1],\n", " marker=\"o\",\n", " facecolors=\"w\",\n", " edgecolors=\"b\",\n", " label=\"B train\",\n", ")\n", "\n", "plt.scatter(\n", " test_features[:, 0][test_labels == 0],\n", " test_features[:, 1][test_labels == 0],\n", " marker=\"s\",\n", " facecolors=\"r\",\n", " edgecolors=\"r\",\n", " label=\"A test\",\n", ")\n", "plt.scatter(\n", " test_features[:, 0][test_labels == 1],\n", " test_features[:, 1][test_labels == 1],\n", " marker=\"o\",\n", " facecolors=\"b\",\n", " edgecolors=\"b\",\n", " label=\"B test\",\n", ")\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=\"upper left\", borderaxespad=0.0)\n", "plt.title(\"Pegasos Classification\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 9, "id": "imperial-promise", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

Version Information

Qiskit Software | Version
qiskit-terra | 0.22.0
qiskit-aer | 0.11.0
qiskit-ignis | 0.7.0
qiskit | 0.33.0
qiskit-machine-learning | 0.5.0
System information
Python version | 3.7.9
Python compiler | MSC v.1916 64 bit (AMD64)
Python build | default, Aug 31 2020 17:10:11
OS | Windows
CPUs | 4
Memory (Gb) | 31.837730407714844
Thu Oct 13 10:42:49 2022 GMT Daylight Time
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

This code is a part of Qiskit

© Copyright IBM 2017, 2022.

This code is licensed under the Apache License, Version 2.0. You may
obtain a copy of this license in the LICENSE.txt file in the root directory
of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.

Any modifications or derivative works of this code must retain this
copyright notice, and modified files need to carry a notice indicating
that they have been altered from the originals.

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import qiskit.tools.jupyter\n", "\n", "%qiskit_version_table\n", "%qiskit_copyright" ] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" } }, "nbformat": 4, "nbformat_minor": 5 }