{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quantum Classifier\n",
    "\n",
    "<em> Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. </em>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "In this tutorial we discuss how a quantum classifier works and how to use a quantum neural network (QNN) to carry out a **binary classification** task. Representative early work on this approach includes Quantum Circuit Learning [(QCL)](https://arxiv.org/abs/1803.00745) by Mitarai et al. (2018) [1], Farhi & Neven (2018) [2], and [Circuit-Centric Quantum Classifiers](https://arxiv.org/abs/1804.00633) by Schuld et al. (2018) [3]. Here we take the first of these, the QCL framework, applied to supervised learning as our example. Typically, we first encode the classical data into quantum data and then train the parameters of the quantum neural network to obtain an optimal classifier."
   ]
  },
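  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Background\n",
    "\n",
    "In supervised learning, the input is a dataset of $N$ labeled data points, $D = \\{(x^k,y^k)\\}_{k=1}^{N}$, where $x^k\\in \\mathbb{R}^{m}$ is a data point and $y^k \\in\\{0,1\\}$ is its class label. **Classification is essentially a decision process: deciding which label a given data point belongs to.** In the quantum classifier framework, the classifier $\\mathcal{F}$ is realized as the combination of a quantum neural network (parameterized quantum circuit) with parameters $\\theta$, a measurement of the quantum system, and classical post-processing. A good classifier $\\mathcal{F}_\\theta$ should map each data point in the dataset to its label as accurately as possible, $\\mathcal{F}_\\theta(x^k) \\rightarrow y^k$. We therefore take the cumulative distance between the predicted labels $\\tilde{y}^{k} = \\mathcal{F}_\\theta(x^k)$ and the true labels $y^k$ as the loss function $\\mathcal{L}(\\theta)$ to be minimized. For binary classification, we can choose the quadratic loss\n",
    "\n",
    "$$\n",
    "\\mathcal{L}(\\theta) = \\frac{1}{N}\\sum_{k=1}^N |\\tilde{y}^{k}-y^k|^2. \\tag{1}\n",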
    "$$\n",
    "\n"
   ]
  },
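  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small worked illustration of Eq. (1), take hypothetical values that are not from this tutorial's dataset: $N = 2$ data points with true labels $y^1 = 1$, $y^2 = 0$ and predicted labels $\\tilde{y}^1 = 0.8$, $\\tilde{y}^2 = 0.3$. The loss is then\n",
    "\n",
    "$$\n",
    "\\mathcal{L}(\\theta) = \\frac{1}{2}\\big(|0.8-1|^2 + |0.3-0|^2\\big) = \\frac{1}{2}(0.04 + 0.09) = 0.065.\n",
    "$$\n",
    "\n",
    "A classifier that predicted both labels exactly would give $\\mathcal{L}(\\theta) = 0$."
   ]
  },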
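  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Pipeline\n",
    "\n",
    "Here we give a workflow for implementing a quantum classifier under the Quantum Circuit Learning (QCL) framework.\n",
    "\n",
    "1. Encode the classical data $x^k$ into quantum data $\\lvert \\psi_{\\rm in}\\rangle^k$. This tutorial uses angle encoding; see [Encoding classical data into quantum states](./DataEncoding_CN.ipynb) for the details of each encoding. Readers can also try other encodings, such as amplitude encoding, to see how the choice of encoding affects the classifier's learning efficiency.\n",
    "2. Construct a parameterized quantum circuit, corresponding to a unitary transformation $U(\\theta)$.\n",
    "3. Pass each quantum data point $\\lvert\\psi_{\\rm in}\\rangle^k$ through the parameterized quantum circuit $U(\\theta)$ to obtain the output state $\\lvert \\psi_{\\rm out}\\rangle^k = U(\\theta)\\lvert \\psi_{\\rm in} \\rangle^k$.\n",
    "4. For each output state $\\lvert \\psi_{\\rm out}\\rangle^k$, obtain the predicted label $\\tilde{y}^{k}$ through measurement and classical post-processing.\n",
    "5. Repeat the steps above to obtain the predicted labels of all data points in the dataset, and compute the loss function $\\mathcal{L}(\\theta)$.\n",
    "6. Use an optimization method such as gradient descent to adjust the parameters $\\theta$ and minimize the loss function. Record the optimal parameters $\\theta^*$ once optimization finishes; at that point we have learned the optimal classifier $\\mathcal{F}_{\\theta^*}$.\n",
    "\n",
    "<img src=\"./figures/qclassifier-fig-pipeline-cn.png\" width=\"700px\" /> \n",
    "<center> Figure 1: Flow chart of quantum classifier training </center>"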
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Paddle Quantum Implementation\n",
    "\n",
    "First, we import the required packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-02T09:15:03.419838Z",
     "start_time": "2021-03-02T09:15:03.413324Z"
    }
   },
   "outputs": [],
   "source": [
    "# Import numpy, paddle, and paddle_quantum\n",
    "import numpy as np\n",
    "import paddle\n",
    "import paddle_quantum\n",
    "\n",
    "# Build quantum circuits\n",
    "from paddle_quantum.ansatz import Circuit\n",
    "\n",
    "# Helper functions\n",
    "from numpy import pi as PI\n",
    "from paddle import matmul, transpose, reshape  # paddle matrix multiplication, transpose, and reshape\n",
    "from paddle_quantum.qinfo import pauli_str_to_matrix  # matrix of a Pauli string on N qubits\n",
    "from paddle_quantum.linalg import dagger  # conjugate transpose\n",
    "\n",
    "# Plotting and timing\n",
    "from matplotlib import pyplot as plt\n",
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Parameters used in the classification problem:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dataset parameters\n",
    "Ntrain = 200        # size of the training set\n",
    "Ntest = 100         # size of the test set\n",
    "boundary_gap = 0.5  # width of the gap around the decision boundary\n",
    "seed_data = 2       # fixed random seed for the data\n",
    "# Training parameters\n",
    "N = 4               # number of qubits\n",
    "DEPTH = 1           # circuit depth\n",
    "BATCH = 20          # batch size during training\n",
    "EPOCH = int(200 * BATCH / Ntrain)\n",
    "                    # number of training epochs, chosen so that the total number of iterations EPOCH * (Ntrain / BATCH) is about 200\n",
    "LR = 0.01           # learning rate\n",
    "seed_paras = 19     # random seed used to initialize the parameters"
   ]
  },
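  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick arithmetic check of the settings above (assuming one parameter update per batch), the number of epochs and the total number of iterations work out to\n",
    "\n",
    "$$\n",
    "\\text{EPOCH} = \\left\\lfloor \\frac{200 \\times 20}{200} \\right\\rfloor = 20, \\qquad \\frac{\\text{Ntrain}}{\\text{BATCH}} = \\frac{200}{20} = 10, \\qquad 20 \\times 10 = 200 \\ \\text{iterations}.\n",
    "$$"
   ]
  },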
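  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dataset Generation\n",
    "\n",
    "An unavoidable question in supervised learning is which dataset to use. In this tutorial we follow the method described in [1] to generate a simple binary dataset $\\{(x^{k}, y^{k})\\}$ with a circular decision boundary, where each data point $x^{k}\\in \\mathbb{R}^{2}$ and each label $y^{k} \\in \\{0,1\\}$.\n",
    "\n",
    "<img src=\"./figures/qclassifier-fig-data-cn.png\" width=\"400px\" /> \n",
    "<center> Figure 2: The generated dataset and the corresponding decision boundary </center>\n",
    "\n",
    "The generation procedure and its visualization are given in the code below:"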
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dataset generation function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-02T09:15:04.631031Z",
     "start_time": "2021-03-02T09:15:04.617301Z"
    }
   },
   "outputs": [],
   "source": [
    "# Generator for a binary classification dataset with a circular decision boundary\n",
    "def circle_data_point_generator(Ntrain, Ntest, boundary_gap, seed_data):\n",
    "    \"\"\"\n",
    "    :param Ntrain: size of the training set\n",
    "    :param Ntest: size of the test set\n",
    "    :param boundary_gap: takes values in (0, 0.5], the width of the gap between the two classes\n",
    "    :param seed_data: random seed\n",
    "    :return: four arrays: training x, training y, test x, test y\n",
    "    \"\"\"\n",
    "    # Generate Ntrain + Ntest samples in total; x is a 2-D data point and y is its label\n",
    "    # The first Ntrain samples form the training set and the remaining Ntest the test set\n",
    "    train_x, train_y = [], []\n",
    "    num_samples, seed_para = 0, 0\n",
    "    while num_samples < Ntrain + Ntest:\n",
    "        np.random.seed((seed_data + 10) * 1000 + seed_para + num_samples)\n",
    "        data_point = np.random.rand(2) * 2 - 1  # random 2-D vector with entries in [-1, 1)\n",
    "\n",
    "        # Label 0 if the norm of the data point is less than 0.7 - boundary_gap / 2\n",
    "        if np.linalg.norm(data_point) < 0.7 - boundary_gap / 2:\n",
    "            train_x.append(data_point)\n",
    "            train_y.append(0.)\n",
    "            num_samples += 1\n",
    "\n",
    "        # Label 1 if the norm of the data point is greater than 0.7 + boundary_gap / 2\n",
    "        elif np.linalg.norm(data_point) > 0.7 + boundary_gap / 2:\n",
    "            train_x.append(data_point)\n",
    "            train_y.append(1.)\n",
    "            num_samples += 1\n",
    "        else:\n",
    "            seed_para += 1\n",
    "\n",
    "    train_x = np.array(train_x).astype(\"float64\")\n",
    "    train_y = np.array([train_y]).astype(\"float64\").T\n",
    "\n",
    "    print(\"训练集的维度大小 x {} 和 y {}\".format(np.shape(train_x[0:Ntrain]), np.shape(train_y[0:Ntrain])))\n",
    "    print(\"测试集的维度大小 x {} 和 y {}\".format(np.shape(train_x[Ntrain:]), np.shape(train_y[Ntrain:])), \"\\n\")\n",
    "\n",
    "    return train_x[0:Ntrain], train_y[0:Ntrain], train_x[Ntrain:], train_y[Ntrain:]"
   ]
  },
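  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the labeling rule concrete: for the value `boundary_gap = 0.5` used in this tutorial, the two thresholds in `circle_data_point_generator` evaluate to\n",
    "\n",
    "$$\n",
    "0.7 - \\frac{0.5}{2} = 0.45, \\qquad 0.7 + \\frac{0.5}{2} = 0.95,\n",
    "$$\n",
    "\n",
    "so a point $x$ gets label 0 if $\\|x\\| < 0.45$, label 1 if $\\|x\\| > 0.95$, and is rejected and resampled otherwise. For example, the point $(1, 0)$ has norm $1$ and receives label 1, while the point $(0.5, 0.5)$ (chosen here only for illustration) has norm $\\approx 0.707$ and is discarded."
   ]
  },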
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dataset visualization function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualize the generated dataset\n",
    "def data_point_plot(data, label):\n",
    "    \"\"\"\n",
    "    :param data: shape [M, 2], representing M 2-D data points\n",
    "    :param label: labels, each taking the value 0 or 1\n",
    "    :return: None; plots the data points (label 0 in red, label 1 in blue)\n",
    "    \"\"\"\n",
    "    dim_samples, dim_useless = np.shape(data)\n",
    "    plt.figure(1)\n",
    "    for i in range(dim_samples):\n",
    "        if label[i] == 0:\n",
    "            plt.plot(data[i][0], data[i][1], color=\"r\", marker=\"o\")\n",
    "        elif label[i] == 1:\n",
    "            plt.plot(data[i][0], data[i][1], color=\"b\", marker=\"o\")\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial uses a training set of 200 points and a test set of 100 points, with a decision-boundary gap of 0.5, to train and test the quantum neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-02T09:15:06.422981Z",
     "start_time": "2021-03-02T09:15:05.043595Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "训练集的维度大小 x (200, 2) 和 y (200, 1)\n",
      "测试集的维度大小 x (100, 2) 和 y (100, 1) \n",
      "\n",
      "训练集 200 个数据点的可视化:\n"
     ]
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "测试集 100 个数据点的可视化:\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAD4CAYAAADhNOGaAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAgsklEQVR4nO3df+xdd33f8efLDpEwRJA4TnB+fZ1WGVqoSkq+Mj9atSBIl7iiBmloSb9QT2PysimoTGu3ZJFQtMoSo6MTVJTKUISJvyJiApqoGEJgZSlDQL6J8sNpauJYjmPsJd84CJoRLU383h/nXHx8fX+c+73n93k9pKt77/lxz+eee+55f87n11FEYGZm/bWu7gSYmVm9HAjMzHrOgcDMrOccCMzMes6BwMys586qOwFrcf7558eWLVvqToaZWavcf//9z0bEpuHprQwEW7ZsYWVlpe5kmJm1iqQnR0130ZCZWc85EJiZ9ZwDgZlZzzkQmJn1nAOBmVnPFRIIJH1O0jOS9o+ZL0mflHRQ0sOS3pSZd62kA+m8m4tIj5nNZ3kZtmyBdeuS5+XlulNkZSrqiuDzwLUT5l8HXJE+dgKfBpC0HvhUOv9K4AZJVxaUJjNbg+Vl2LkTnnwSIpLnnTsdDLqskEAQEfcCz01YZDvwhUh8H3itpM3AVuBgRByKiBeBO9JlW8s5KWu7W2+Fn//89Gk//3ky3bqpqjqCi4GnMu+PptPGTT+DpJ2SViStrK6ulpbQeTgnZV1w5Mhs0639qgoEGjEtJkw/c2LE7ohYjIjFTZvO6CHdCG3JSfmqpV5N3/+XXTbb9Lar4/do2jFQ1RATR4FLM+8vAY4BZ4+Z3kptyEkNrloGAWtw1QKwtFRfuvqi6ft/eRmef/7M6Rs2wK5d1aenbHX8Ho08BiKikAewBdg/Zt7vAF8nuQJ4C/DDdPpZwCHgcpKg8BDwhmnbuvrqq6OJFhYikkKh0x/r10fs3Vt36hLj0riwUOx29u5NPlNKnpvy/etW1f5fi717IzZsODNtGzd29/fL+3sUcTwPPmPU9qo6BoCVGHWOHjVx1gfwReA48I8kuf8PAjcCN6bzRdI66AngEWAxs+424EfpvFvzbK+IQFDGiWrcHwmS6U34M0mj0ycVt41R+6Ep379uVez/tWpykJrXuP97nt+jiON50rmhymOg1EBQ9WPeQFDmiWrv3uQKoKl/qCr+7F0+ocyryfumyUFqHpP+73l+jyJ+s0lXAp25Iqj6MW8gKPvP2OQ/VBW59SZ//7o1+WqpyUFqHpO+V57fo4jjedxnVH0MOBBklH2iavofquzy+6Z//7o1tf6kiiBVx3ef9n+flqayrwiqPAYcCDLKPlE1OddXhb5//zYr80Rd13Ex7/+9rDqCOv4TDgQZXc35NEnfv7+dqa4rxaJO5EW1GqrzPzEuECiZ1y6Li4sx760ql5eTjl5HjiQdZbZtg337Tr3ftasZ7brNumLduuQ0PEyCkyfL3fbw/72v/29J90fE4vD03g5DvbQEhw8nB+CuXbBnT7FDQ1TRc7BpvRPNJqmzx3L2/374cD+DwCS9DQRZRQ8NUcWYQx7XyNpm166kh3LWcI9lZ25qMqq8qOmPonsWF92KaK1lobOUIbpljrXRpGO8KRWqXYbrCMbbsiXJUQ9bWEguI2e1lrLQ4fFHIMkt7d49+jK2zvJWszIU/T+0M7mOYII8l6yzWEtZ6KzFU30bIdK6rw2DNnaVAwFJjnv37iTnISXP43LieawlsMz6Jyg6eJnVzZmb+jgQpIpsVbCWwDLrn6Do4GVWt7IyN66AzmFUxUHTH00dhnoerigzK77Tlf9Xp2NMZbGvCBrCOXyz4tv713nXwDZdibjVkJl1Vl2t62ZtBVgVtxoys96pqwK6LfcvHygkEEi6VtIBSQcl3Txi/h9JejB97Jf0sqTz0nmHJT2SznM238wKU1frurY1hZ07EEhaT3Ib
yuuAK4EbJF2ZXSYi/iQiroqIq4BbgP8VEc9lFnlHOv+MSxYzs7Wqq+6tbU1hi7gi2AocjIhDEfEicAewfcLyN5Dc47jRyqzoaVMlklnb1THgXNv6+RQRCC4Gnsq8P5pOO4OkDcC1wJczkwP4pqT7Je0ctxFJOyWtSFpZXV0tINnjlTmgmweLM+u+trUCnLvVkKT3Af8sIv51+v4DwNaI+NCIZf8F8P6IeHdm2kURcUzSBcA9wIci4t5J2yy71VCZY554PBUzq0uZrYaOApdm3l8CHBuz7PUMFQtFxLH0+RngqyRFTbUYFNmMOlFDMRU9batEMrPuKyIQ3AdcIelySWeTnOzvGl5I0muA3wLuzEx7laRzBq+B3wb2F5CmmWWLbMYpoqKnbZVIZtZ9cweCiHgJuAm4G3gM+FJEPCrpRkk3ZhZ9L/DNiPi/mWkXAt+V9BDwQ+BrEfGNedO0FqPa/WYVVdHTtkokM6tPVQ1LziriQyJiH7BvaNpfDL3/PPD5oWmHgDcWkYZ5TSqaWVgo9h6nr3zlqaCzcSN84hPNrUQys3oM904eNCyB4s8X7lmcGlc0M6jELWLHD37YEydOTXvhhfk/18y6p8reyQ4EqSqKbNrW7dzM6lNlwxIHglQV7X7dYsjM8qqyYYkDQUbZPRDdYsjM8qqyYYkDQYXcYsjM8qqyd3IhrYYsn8EPeOutSXHQZZcV2xrJzLplaama84OvCCpWxwBYZlaNtg4o6SsCM7MCVNnuv2i+IjAzK0Cbm4c7EJiZFaDNzcMdCMzMCnDeebNNbxIHAjOznnMgMDMrwHPPzTa9SRwIzMwKkHfkgCY2MXUgMDMrQJ6RA5p6z3IHghk1MZqbWf3yDAnR1CamhQQCSddKOiDpoKSbR8x/u6SfSnowfXwk77pN0tRobmbNMG3kgKY2MZ07EEhaD3wKuA64ErhB0pUjFv3biLgqffyXGddthKZGczNrh6aOQFzEFcFW4GBEHIqIF4E7gO0VrFu5pkZzM2uHpo5AXEQguBh4KvP+aDpt2FslPSTp65LeMOO6SNopaUXSyurqagHJnl1To7mZtUOVQ0vPoohAoBHTYuj9A8BCRLwR+DPgr2ZYN5kYsTsiFiNicdOmTWtN61yaGs3NrD2aOAJxEYHgKHBp5v0lwLHsAhHxs4h4Pn29D3iFpPPzrNskTY3mZmbzKCIQ3AdcIelySWcD1wN3ZReQ9DpJSl9vTbd7Is+6TdPEaG4N5bbG1hJz348gIl6SdBNwN7Ae+FxEPCrpxnT+XwD/HPi3kl4CXgCuj4gARq47b5rMatfmwemtd5Scj9tlcXExVlZW6k6G2XhbtiQn/2ELC8mlpFkNJN0fEYvD092z2KwMbmtsLeJAYFYGtzW2FnEgMCuD2xpbizgQmJXBbY2tRRwIzMoyrq2xm5Vaw8zdfNTMZuBmpdZAviIwm6bIHLyHsLUGciAwm6Tom1BMa1bqYiOrgQOB2SRF5+AnNSv1nY+sJg4E1l5V5J6L7hg2qVmpi42sJg4E1k5V5Z6L7hg2qVmpeyNbTRwIrJ2qyj2X0TFsXLNS90a2mjgQWDtVlXsuq2PYqGIt90a2mjgQWDtVmXsu+iYU44q1wL2RrRYOBNYOwznobdvam3ueVKzlOx9ZRlWtiR0IrPlG5aD37IEdO07lnjduhFe+Ej7wgea3v3elsOVQZWviQgKBpGslHZB0UNLNI+YvSXo4fXxP0hsz8w5LekTSg5J8txk707gc9L59Sa759tvhhRfgxIl2tL8vuljLndA6qdLWxBEx14PkFpNPAL8EnA08BFw5tMzbgHPT19cBP8jMOwycP8s2r7766rAekSKSU/zpDymZv7Awev7CQp2pHm/v3ogNG05P64YNyfRZPmPwvYf3z6yfZY007bBfC2AlRpxTi7gi2AocjIhDEfEicAewfSjYfC8ifpK+/T5wSQHbLZQzVQ02LQfdtqKWeVsiZcsMIDk/ZLkTWidU2R6iiEBw
MfBU5v3RdNo4HwS+nnkfwDcl3S9p57iVJO2UtCJpZXV1da4ED3PP/oab1qyyje3v56kUHlVmMGzWIOicUONU2pp41GXCLA/gfcBnM+8/APzZmGXfATwGbMxMuyh9voCkWOk3p22z6KKhtpUs9NKgKERKnrNFH0UUtVSV1iKMKzNY68E7av9BxMaNLmKqWdGHEmOKhooIBG8F7s68vwW4ZcRyv0pSl/BPJnzWbcAfTttm0YGgjLI4q1jZJ99Z0lF2UBqXc1nr9iZ9nusbalfkoV1mIDgLOARczqnK4jcMLXMZcBB429D0VwHnZF5/D7h22jZ9RWCNVcXBNCrYDHIzazlTTLvC8B+hNkXnK8YFgrnrCCLiJeAm4O602OdLEfGopBsl3Zgu9hFgI/DnQ81ELwS+K+kh4IfA1yLiG/OmaVbu2W+FqaLielRl8+23J+eJtXRCm1aX0tRK9x6oqgmpkiDRLouLi7GyUmyXg+XlZOceOZL8L3btcqdOW4MtW0615slaWEhO0k00fPvMYU1Oe8etW3dmozBI4v/Jk7N/nqT7I2LxjO2sJXFd5J79Vog2Xl4OrjA2bjxzXtPT3nFVNYhzIDAbKKIJZVmjlZZtaQmefRb27m1f2jusqnyFi4bMYHTxyIYNPhFa7Yosth5XNORAYAbtLNs3m5HrCMwmadswFWYFciCwfhvUC4y7Mm7yMBVZHiLC5nBW3Qkwq820ZpNtaTEz/D2ydzxz/Ybl4CsC669Jg7e1qcVMpQPXWxf5isD6a1z5v9SuCmLXb9icfEVg/dXG4atH6cr3sNo4EFh/tbEX8Ch5vocrk20CBwLrrzJ6Addxwp32PXznJZvCHcrMitLU3snuLGcpdygzK1tTW++4MtmmcCAwK0pTT7iuTLYpHAjMitLUE24dleKunG6VQgKBpGslHZB0UNLNI+ZL0ifT+Q9LelPedc1ao6mtkKoeGtuV060zd2WxpPXAj4BrgKPAfcANEfF3mWW2AR8CtgFvBj4REW/Os+4oriy2xvKt7lw53WDjKouL6Fm8FTgYEYfSDd0BbAeyJ/PtwBfSmyd/X9JrJW0GtuRY16w9lpb6d+If1tS6EhuriKKhi4GnMu+PptPyLJNnXQAk7ZS0ImlldXV17kSbtUbbytubWldiYxURCDRi2nB507hl8qybTIzYHRGLEbG4adOmGZNo1lJtLG9val2JjVVEIDgKXJp5fwlwLOcyeda1vmtbjrhITe2bMElb79vcY0XUEdwHXCHpcuDHwPXA7w0tcxdwU1oH8GbgpxFxXNJqjnWtz/o+1n5by9tdV9Iqc18RRMRLwE3A3cBjwJci4lFJN0q6MV1sH3AIOAh8Bvh3k9adN03WIW3MERfJ5e1WAY81ZM22bt3o20hKcPJk9empWlPHL7JW8lhD1k59zxG7vN0q4EBgzTaqBcorXgHPP9+fyuOlpaQj1smTybODgBXMgcCabThHvHFj8nziRHuaU5rlVFcDOQcCa75sjvjVr4YXXzx9flMqj/vczNXmVmeXEQcCa5cmNqdcXobzz4f3v79dHb9m5UBXqjobyDkQWLs0rfJ4kI07ceLMeU25UilCG3s4t8yocfomTS+SAwHO6LRK04YvGJWNy2p6x6+8+t6fowLr1882vUi9DwTO6LTMuOaUUE80n3ai70oz1yYWyXXMyy/PNr1IvQ8Ezui00HBzSqgvmk860XdpoLWmFcl10MLCbNOL1PtA4IxOB9QZzUcVVUHSzLVLHb+aViTXQdu2JRe5WVXt4t4HAmd0OqDOaD6qqGrvXnj22e4EAXAP55ItL8OePaePpiLBjh3V7OLejzXkoVw6wLdGtJar6hD2WENjOKPTAS62sJaru4i694EAPJRL6zmaW8vVXUTtQGDd4GhuLVb3Re1cgUDSeZLukfR4+nzuiGUulfQ3kh6T9KikP8jMu03SjyU9mD62zZMes0q5J6IVpO6L2rkqiyV9DHguIj4q6Wbg3Ij4T0PL
bAY2R8QDks4B7gfeExF/J+k24PmI+G+zbNc3prHauZWBtVBZlcXbgT3p6z3Ae4YXiIjjEfFA+vofSG5JefGc2zWrl3siWofMGwgujIjjkJzwgQsmLSxpC/BrwA8yk2+S9LCkz40qWjJrpLqbeZgVaGogkPQtSftHPLbPsiFJrwa+DHw4In6WTv408MvAVcBx4OMT1t8paUXSyurq6iybNite3c08zAo0NRBExLsi4ldGPO4Enk7rAAZ1Ac+M+gxJryAJAssR8ZXMZz8dES9HxEngM8DWCenYHRGLEbG4adOm2b6lWdHqbuZhVqB5i4buAnakr3cAdw4vIEnAXwKPRcSfDs3bnHn7XmD/nOkxq0bdzTys86pslDZvq6GNwJeAy4AjwPsi4jlJFwGfjYhtkn4D+FvgEeBkuup/joh9km4nKRYK4DDwbwZ1DpO41ZCZdVlZjdLGtRrq/VhD0ywvJw1BjhxJin937XKmz8zKVdbYQx5raA1805qOcMcva5mqG6U5EEzgpuId4GhuLVR1ozQHggncVLwDHM2thapulOZAMIGbijdcniKfLkVzF3H1RtWN0hwIJnBT8QbLW+TTlWjuIq7eqXJAXQeCCdxUvMHyFvl0JZq7iMtK5Oaj1k7r1p1+g9cBKclCZXWhDfAs39dsjHHNR8+qIzFmc7vsstENrUcV+Swtte/EP2yW72s2IxcNWTt1pcgnr759X6uUA4G1U98qcPr2fa1SriMoQBeKoM2s+zzExJCimmS7VZ+ZtV0vA0GRJ2+36jOztutlICjy5N2ljqtm1k+9DARFnry70nHVzPqrl4GgyJO3W/WZWdvNFQgknSfpHkmPp8/njlnusKRHJD0oaWXW9YtW5MnbrfrMrO3mvSK4Gfh2RFwBfDt9P847IuKqoaZLs6xfmKJP3lUODmVmVrR571l8AHh7RBxPb0T/nYh4/YjlDgOLEfHsWtYf1rR+BGZmbVBWP4ILBzebT58vGLNcAN+UdL+knWtY38zMSjJ10DlJ3wJeN2LWLI0tfz0ijkm6ALhH0t9HxL0zrE8aQHYCXOYmOWZmhZkaCCLiXePmSXpa0uZM0c4zYz7jWPr8jKSvAluBe4Fc66fr7gZ2Q1I0NC3dZmaWz7xFQ3cBO9LXO4A7hxeQ9CpJ5wxeA78N7M+7vpmZlWveQPBR4BpJjwPXpO+RdJGkfekyFwLflfQQ8EPgaxHxjUnrm5lZdeYKBBFxIiLeGRFXpM/PpdOPRcS29PWhiHhj+nhDROyatn7b5BnAzvcdN7Om8h3K5jQYwG4wdtFgADs41Z8gzzJmZnXpzRATZeXI8wxg5xFKzazJenFFUGaOPM8Adh6h1MyarBdXBGXmyPMMYOcRSs26qwv1f70IBGXmyPMMYLdtWzKm0aRlzKx9unKHwl4EgjJz5NMGsFtehj17koNkQIIdO1xRbNZ2Xan/68XN64frCCDJkVcxXPSWLUkuYdjCQjJSqZm117p1p2fyBqRkNOKm6fXN6+u8Z4Aris26qyv1f70IBFDfPQO6cqCY2ZlmvcnVPBXLZVZK9yYQ1MW3sjTrrllKG+apWC67UroXdQR1W15OKo+OHEmuBHbtckWxWd/MU19YVF3juDoCBwIzswrMU7FcVKV0ryuLu6QLnVfM+mie+sKy6xodCFqkK51XzPponvrCsusaHQhapCudV8z6aJ5m7GU3gXcdQYu0rfOKmTVLKXUEks6TdI+kx9Pnc0cs83pJD2YeP5P04XTebZJ+nJm3bZ70dJ37JJhZGeYtGroZ+HZEXAF8O31/mog4EBFXRcRVwNXAz4GvZhb574P5EbFveH07xX0SzKwM8waC7cCe9PUe4D1Tln8n8EREjGgRa9PUOVSGmXXXvDemuTAijgNExHFJF0xZ/nrgi0PTbpL0+8AK8B8i4idzpqnTlpZ84jezYk29IpD0LUn7Rzy2z7IhSWcDvwv8j8zkTwO/DFwFHAc+PmH9nZJWJK2srq7Osmkz
M5tgaiCIiHdFxK+MeNwJPC1pM0D6/MyEj7oOeCAins589tMR8XJEnAQ+A2ydkI7dEbEYEYubNm3K+/3MzNasLx04560juAvYkb7eAdw5YdkbGCoWGgSR1HuB/XOmx8wst0kn+j514Jw3EHwUuEbS48A16XskXSTpFy2AJG1I539laP2PSXpE0sPAO4B/P2d6bIq+5HDMppl2ou9TB053KJtiMHLok0/C+vXw8stJa502jiBa553azJpm2oieXezA6UHn1iCbY4AkCEB7LxH7lMMxm2bS3QOXl5NAMEpZHTjrvFr3FcEE43IMA22773AXczhmazXu/71xI7zwwpmZJijvCrqqq3VfEazBtPsKt+2+wx6iwuxUzvvJJ5NMUNag5/6oILB+fXlBYMeOeq/WHQgmmHaCbNsJ1ENUWN8NF/dGnAoGg576zz03et2TJ2cPAtOKewbpGRQ7D6sssxkRrXtcffXVUYW9eyM2bIhIDpfTHxs2JPPbZu/eiIWFCCl5buN3MFurhYXR/+eFhdmWyWPU+WP4vDFuW2vd5jTASow4p9Z+Ul/Lo6pAEHHqxAkR69ef+nHqOoH6RG62dtLoE650apk8J/A88gSUcekpK7PpQNABRR2gdXIgszrlze0XcZzmCTrj0rN+fTn/DQeCDijqknWg6pNyFwKZtVuVx2Ce/2vV/wkHgg7Ik8PIq46TctGBzGwtqsoA5f2PVZkhGxcI3I+gRab1hKzrs/JyPwbrm8HIBEeOJK0M6x6RwP0IOqDI5p+TelWWxf0YrG+WlpKM1cmTyXNTh3JxIGiRIu9QVsdJeVQgk2Cb71RtVisHgpYpKoeR5+qi6LFPlpaSHpTZ3pwRsGdP+8ZtajuPQmunGVVx0PRHXyuLizapkqqsyuSmVBj3uRlrF1tv9fn3nAVuNVS+Lh2MZZ2wi2z5tFZdPBHOoinBuCh9/z1nMS4QuGioIF27m1FZlclNqDDu+3Dcs/62TS9G6uLvWfk+HxUd8j6A9wGPAieBxQnLXQscAA4CN2emnwfcAzyePp+bZ7tNvCLoWi6rrO/ThNxbE65K6jTLb9uE32uarv2eZe5zyigaAv4p8HrgO+MCAbAeeAL4JeBs4CHgynTexwaBAbgZ+K95ttvEQOCDcbbPrrMIrWtBe1az/LZt2FdtSOMsyvw+pQSCX3zI5EDwVuDuzPtbgFvS1weAzenrzcCBPNtrYiDo2sEYUf8JuyxtyOWWLe9v24YMTtd+zzL3+bhAUEUdwcXAU5n3R9NpABdGxHGA9PmCCtJTii6O9d+WzjCzKrI/Rlvl/W2bUKczTdd+zzr2+dRAIOlbkvaPeGzPuQ2NmBazJRMk7ZS0ImlldXV11tVL17WDseu6GuSK1pYMTpd+zzr2+VnTFoiId825jaPApZn3lwDH0tdPS9ocEcclbQaemZCO3cBuSMYamjNNpVhaavcBaDZscDw3abycrqtjnxcy6Jyk7wB/GBFnjAQn6SzgR8A7gR8D9wG/FxGPSvoT4EREfFTSzcB5EfEfp22vr4POmZnNo5RB5yS9V9JRkgrhr0m6O51+kaR9ABHxEnATcDfwGPCliHg0/YiPAtdIehy4Jn1vZmYV8jDUZmY94WGozcxsJAcCM7OecyAwM+u5VtYRSFoFRtxoMZfzgWcLTE5RnK7ZNTVtTtdsmpouaG7a1pquhYjYNDyxlYFgHpJWRlWW1M3pml1T0+Z0zaap6YLmpq3odLloyMys5xwIzMx6ro+BYHfdCRjD6ZpdU9PmdM2mqemC5qat0HT1ro7AzMxO18crAjMzy3AgMDPruU4GAknvk/SopJOSxjaxknStpAOSDqajnw6mnyfpHkmPp8/nFpSuqZ8r6fWSHsw8fibpw+m82yT9ODNvW1XpSpc7LOmRdNsrs65fRrokXSrpbyQ9lv7mf5CZV+j+Gne8ZOZL0ifT+Q9LelPedeeVI21LaZoelvQ9SW/MzBv5u1aUrrdL+mnmN/pI3nVL
TtcfZdK0X9LLks5L55W5vz4n6RlJ+8fML+cYG3XbsrY/qOleyjnSNdPnpmn8PySdQABuIxnuu+j9lStdwGHg/Hm/V5HpIrnF6ZvS1+eQDHk++B0L21+TjpfMMtuAr5PcjOktwA/yrltB2t4GnJu+vm6Qtkm/a0Xpejvw12tZt8x0DS3/buB/lr2/0s/+TeBNwP4x80s5xjp5RRARj0XEgSmLbQUORsShiHgRuAMY3HVtO7Anfb0HeE9BSZv1c98JPBERa+1Fnde837e2/RURxyPigfT1P5AMdX7x8HIFmHS8ZNP7hUh8H3itkhsu5Vm31LRFxPci4ifp2++T3CCqbPN87zL32ayffQPwxYK2PVFE3As8N2GRUo6xTgaCnOq4l/Ksn3s9Zx6AN6WXhJ8rqghmhnQF8E1J90vauYb1y0oXAJK2AL8G/CAzuaj9Nel4mbZMnnXnMevnf5AkVzkw7netKl1vlfSQpK9LesOM65aZLiRtAK4FvpyZXNb+yqOUY2zqrSqbStK3gNeNmHVrRNyZ5yNGTJu7Le2kdM34OWcDvwvckpn8aeCPSdL5x8DHgX9VYbp+PSKOSboAuEfS36c5mDUrcH+9muTP+uGI+Fk6ec37a9QmRkwbPl7GLVPKsZZju2cuKL2DJBD8RmZy4b/rDOl6gKTo8/m0DuevgCtyrltmugbeDfzviMjm0svaX3mUcoy1NhBEQ+6lPEu6JM3yudcBD0TE05nP/sVrSZ8B/rrKdEXEsfT5GUlfJbkcvZea95ekV5AEgeWI+Erms9e8v0aYdLxMW+bsHOvOI0/akPSrwGeB6yLixGD6hN+19HRlgjYRsU/Sn0s6P8+6ZaYr44yr8hL3Vx6lHGN9Lhq6D7hC0uVp7vt64K503l3AjvT1DiDPFUYes3zuGeWS6clw4L3AyJYFZaRL0qsknTN4Dfx2Zvu17S9JAv4SeCwi/nRoXpH7a9Lxkk3v76ctO94C/DQt0sqz7jymfr6ky4CvAB+IiB9lpk/6XatI1+vS3xBJW0nOSSfyrFtmutL0vAb4LTLHXcn7K49yjrEyar7rfpD86Y8C/w94Grg7nX4RsC+z3DaSViZPkBQpDaZvBL4NPJ4+n1dQukZ+7oh0bSD5M7xmaP3bgUeAh9MfeXNV6SJpjfBQ+ni0KfuLpIgj0n3yYPrYVsb+GnW8ADcCN6avBXwqnf8ImRZr4461Ao/5aWn7LPCTzD5amfa7VpSum9LtPkRSif22KvbZtHSl7/8lcMfQemXvry8Cx4F/JDmHfbCKY8xDTJiZ9Vyfi4bMzAwHAjOz3nMgMDPrOQcCM7OecyAwM+s5BwIzs55zIDAz67n/D8qKOkqmMigOAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      " 读者不妨自己调节数据集的参数设置来生成属于自己的数据集吧!\n"
     ]
    }
   ],
   "source": [
    "# Generate the dataset\n",
    "train_x, train_y, test_x, test_y = circle_data_point_generator(Ntrain, Ntest, boundary_gap, seed_data)\n",
    "\n",
    "# Visualize the training and test sets\n",
    "print(\"训练集 {} 个数据点的可视化:\".format(Ntrain))\n",
    "data_point_plot(train_x, train_y)\n",
    "print(\"测试集 {} 个数据点的可视化:\".format(Ntest))\n",
    "data_point_plot(test_x, test_y)\n",
    "print(\"\\n 读者不妨自己调节数据集的参数设置来生成属于自己的数据集吧!\")"
   ]
  },
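  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Preprocessing\n",
    "\n",
    "Unlike classical machine learning, a quantum classifier needs an extra preprocessing step in practice: the classical data must be converted into quantum information before it can run on a quantum computer. Here we use angle encoding to obtain the quantum data.\n",
    "\n",
    "First we determine the number of qubits to use. Because our data $\\{x^{k} = (x^{k}_0, x^{k}_1)\\}$ is two-dimensional, the encoding in Mitarai et al. (2018) [1] requires at least 2 qubits. We then prepare a set of initial quantum states $|00\\rangle$, encode each classical data point $x^{k}$ into a quantum gate $U(x^{k})$, and apply it to the initial state. This yields the quantum states $|\\psi_{\\rm in}\\rangle^k = U(x^{k})|00\\rangle$, which completes the encoding from classical information to quantum information.\n",
    "\n",
    "Given $m$ qubits to encode a two-dimensional classical data point (here $m$ counts qubits, not the dimension of $x^k$), the angle-encoding gate is constructed as\n",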
    "\n",
    "$$\n",
    "U(x^{k}) = \\otimes_{j=0}^{m-1} R_j^z\\big[\\arccos(x^{k}_{j \\, \\text{mod} \\, 2}\\cdot x^{k}_{j \\, \\text{mod} \\, 2})\\big] R_j^y\\big[\\arcsin(x^{k}_{j \\, \\text{mod} \\, 2}) \\big],\\tag{2}\n",
    "$$\n",
    "\n",
    "**注意** :这种表示下,我们将第一个量子比特编号为 $j = 0$。更多编码方式见 [Robust data encodings for quantum classifiers](https://arxiv.org/pdf/2003.01695.pdf)。读者也可以直接使用量桨中提供的[编码方式](./DataEncoding_CN.ipynb)。这里我们也欢迎读者自己创新尝试全新的编码方式。\n",
    "\n",
    "由于这种编码的方式看着比较复杂,我们不妨来举一个简单的例子。假设我们给定一个数据点 $x = (x_0, x_1)= (1,0)$, 显然这个数据点的标签应该为 1,对应上图**蓝色**的点。同时数据点对应的2比特量子门 $U(x)$ 是\n",
    "\n",
    "$$\n",
    "U(x) = \n",
    "\\bigg( R_0^z\\big[\\arccos(x_{0}\\cdot x_{0})\\big] R_0^y\\big[\\arcsin(x_{0}) \\big]  \\bigg)\n",
    "\\otimes \n",
    "\\bigg( R_1^z\\big[\\arccos(x_{1}\\cdot x_{1})\\big] R_1^y\\big[\\arcsin(x_{1}) \\big] \\bigg),\\tag{3}\n",
    "$$\n",
    "\n",
    "\n",
    "把具体的数值带入我们就能得到:\n",
    "$$\n",
    "U(x) = \n",
    "\\bigg( R_0^z\\big[0\\big] R_0^y\\big[\\pi/2 \\big]  \\bigg)\n",
    "\\otimes \n",
    "\\bigg( R_1^z\\big[\\pi/2\\big] R_1^y\\big[0 \\big] \\bigg),\n",
    "\\tag{4}\n",
    "$$\n",
    "\n",
    "以下是常用的旋转门的矩阵形式:\n",
    "\n",
    "\n",
    "$$\n",
    "R_x(\\theta) :=\n",
    "\\begin{bmatrix}\n",
    "\\cos \\frac{\\theta}{2} &-i\\sin \\frac{\\theta}{2} \\\\\n",
    "-i\\sin \\frac{\\theta}{2} &\\cos \\frac{\\theta}{2}\n",
    "\\end{bmatrix}\n",
    ",\\quad\n",
    "R_y(\\theta) :=\n",
    "\\begin{bmatrix}\n",
    "\\cos \\frac{\\theta}{2} &-\\sin \\frac{\\theta}{2} \\\\\n",
    "\\sin \\frac{\\theta}{2} &\\cos \\frac{\\theta}{2}\n",
    "\\end{bmatrix}\n",
    ",\\quad\n",
    "R_z(\\theta) :=\n",
    "\\begin{bmatrix}\n",
    "e^{-i\\frac{\\theta}{2}} & 0 \\\\\n",
    "0 & e^{i\\frac{\\theta}{2}}\n",
    "\\end{bmatrix}.\n",
    "\\tag{5}\n",
    "$$\n",
    "\n",
    "那么这个两比特量子门 $U(x)$ 的矩阵形式可以写为:\n",
    "\n",
    "$$\n",
    "U(x) = \n",
    "\\bigg(\n",
    "\\begin{bmatrix}\n",
    "1 & 0 \\\\ \n",
    "0 & 1\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "\\cos \\frac{\\pi}{4} &-\\sin \\frac{\\pi}{4} \\\\ \n",
    "\\sin \\frac{\\pi}{4} &\\cos \\frac{\\pi}{4} \n",
    "\\end{bmatrix}\n",
    "\\bigg)\n",
    "\\otimes \n",
    "\\bigg(\n",
    "\\begin{bmatrix}\n",
    "e^{-i\\frac{\\pi}{4}} & 0 \\\\ \n",
    "0 & e^{i\\frac{\\pi}{4}}\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "1 &0 \\\\ \n",
    "0 &1\n",
    "\\end{bmatrix}\n",
    "\\bigg)\\, .\\tag{6}\n",
    "$$\n",
    "\n",
    "化简后我们作用在零初始化的 $|00\\rangle$ 量子态上可以得到编码后的量子态 $|\\psi_{\\rm in}\\rangle$,\n",
    "\n",
    "$$\n",
    "|\\psi_{\\rm in}\\rangle =\n",
    "U(x)|00\\rangle = \\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "1-i &0 &-1+i &0 \\\\ \n",
    "0 &1+i &0  &-1-i \\\\\n",
    "1-i &0 &1-i  &0 \\\\\n",
    "0 &1+i &0  &1+i \n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "1 \\\\\n",
    "0 \\\\\n",
    "0 \\\\\n",
    "0\n",
    "\\end{bmatrix}\n",
    "= \\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "1-i \\\\\n",
    "0 \\\\\n",
    "1-i \\\\\n",
    "0\n",
    "\\end{bmatrix}.\\tag{7}\n",
    "$$\n",
    "\n",
    "接着我们来看看代码上怎么实现这种编码方式。需要注意的是:代码中使用了一个张量积来表述\n",
    "\n",
    "$$\n",
    "(U_1 |0\\rangle)\\otimes (U_2 |0\\rangle) = (U_1 \\otimes U_2) |0\\rangle\\otimes|0\\rangle\n",
    "= (U_1 \\otimes U_2) |00\\rangle.\\tag{8}\n",
    "$$"
   ]
  },
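  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a cross-check of Eq. (7), the same state can be obtained qubit by qubit using the identity in Eq. (8), which is also how the code below builds it. For $x = (1, 0)$, the two single-qubit states are\n",
    "\n",
    "$$\n",
    "R_z(0)R_y(\\pi/2)|0\\rangle = \\frac{1}{\\sqrt{2}}\\begin{bmatrix} 1 \\\\ 1 \\end{bmatrix}, \\qquad\n",
    "R_z(\\pi/2)R_y(0)|0\\rangle = e^{-i\\frac{\\pi}{4}}\\begin{bmatrix} 1 \\\\ 0 \\end{bmatrix} = \\frac{1-i}{\\sqrt{2}}\\begin{bmatrix} 1 \\\\ 0 \\end{bmatrix},\n",
    "$$\n",
    "\n",
    "and their tensor product is\n",
    "\n",
    "$$\n",
    "\\frac{1}{\\sqrt{2}}\\begin{bmatrix} 1 \\\\ 1 \\end{bmatrix} \\otimes \\frac{1-i}{\\sqrt{2}}\\begin{bmatrix} 1 \\\\ 0 \\end{bmatrix}\n",
    "= \\frac{1}{2}\\begin{bmatrix} 1-i \\\\ 0 \\\\ 1-i \\\\ 0 \\end{bmatrix},\n",
    "$$\n",
    "\n",
    "in agreement with Eq. (7)."
   ]
  },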
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-02T09:15:06.589265Z",
     "start_time": "2021-03-02T09:15:06.452691Z"
    }
   },
   "outputs": [],
   "source": [
    "# Build the matrices for a rotation by angle theta about the Y axis and about the Z axis\n",
    "def Ry(theta):\n",
    "    \"\"\"\n",
    "    :param theta: rotation angle\n",
    "    :return: Y rotation matrix\n",
    "    \"\"\"\n",
    "    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],\n",
    "                     [np.sin(theta / 2), np.cos(theta / 2)]])\n",
    "\n",
    "def Rz(theta):\n",
    "    \"\"\"\n",
    "    :param theta: rotation angle\n",
    "    :return: Z rotation matrix\n",
    "    \"\"\"\n",
    "    return np.array([[np.cos(theta / 2) - np.sin(theta / 2) * 1j, 0],\n",
    "                     [0, np.cos(theta / 2) + np.sin(theta / 2) * 1j]])\n",
    "\n",
    "# Classical -> quantum data encoder\n",
    "def datapoints_transform_to_state(data, n_qubits):\n",
    "    \"\"\"\n",
    "    :param data: numpy array of shape [-1, 2]\n",
    "    :param n_qubits: number of qubits used to encode the data\n",
    "    :return: array of shape [-1, 1, 2 ^ n_qubits]\n",
    "            The -1 means the first dimension can have any size. In this tutorial it corresponds to BATCH, over which the squared error in Eq. (1) is averaged\n",
    "    \"\"\"\n",
    "    dim1, dim2 = data.shape\n",
    "    res = []\n",
    "    for sam in range(dim1):\n",
    "        res_state = 1.\n",
    "        zero_state = np.array([[1, 0]])\n",
    "        # Angle encoding\n",
    "        for i in range(n_qubits):\n",
    "            # Apply Rz(arccos(x0^2)) Ry(arcsin(x0)) to even-indexed qubits\n",
    "            if i % 2 == 0:\n",
    "                state_tmp=np.dot(zero_state, Ry(np.arcsin(data[sam][0])).T)\n",
    "                state_tmp=np.dot(state_tmp, Rz(np.arccos(data[sam][0] ** 2)).T)\n",
    "                res_state=np.kron(res_state, state_tmp)\n",
    "            # Apply Rz(arccos(x1^2)) Ry(arcsin(x1)) to odd-indexed qubits\n",
    "            elif i % 2 == 1:\n",
    "                state_tmp=np.dot(zero_state, Ry(np.arcsin(data[sam][1])).T)\n",
    "                state_tmp=np.dot(state_tmp, Rz(np.arccos(data[sam][1] ** 2)).T)\n",
    "                res_state=np.kron(res_state, state_tmp)\n",
    "        res.append(res_state)\n",
    "    res = np.array(res, dtype=paddle_quantum.get_dtype())\n",
    "\n",
    "    return res"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Testing the quantum data obtained with angle encoding:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "As a test, we input the classical data point from above:\n",
      "(x_0, x_1) = (1, 0)\n",
      "The encoded 2-qubit quantum state is:\n",
      "[[[0.5-0.5j 0. +0.j  0.5-0.5j 0. +0.j ]]]\n"
     ]
    }
   ],
   "source": [
    "print(\"As a test, we input the classical data point from above:\")\n",
    "print(\"(x_0, x_1) = (1, 0)\")\n",
    "print(\"The encoded 2-qubit quantum state is:\")\n",
    "print(datapoints_transform_to_state(np.array([[1, 0]]), n_qubits=2))"
   ]
  },
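  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Building the Quantum Neural Network\n",
    "\n",
    "Having encoded the classical data into quantum data, we can now feed these quantum states into the quantum computer. Before doing so, we need to design the structure of the quantum neural network.\n",
    "\n",
    "<img src=\"./figures/qclassifier-fig-circuit.png\" width=\"600px\" /> \n",
    "<center> Figure 3: Circuit structure of the parameterized quantum neural network </center>\n",
    "\n",
    "For convenience, we refer to the parameterized quantum neural network above as $U(\\boldsymbol{\\theta})$. It is the key component of our classifier and needs a sufficiently expressive structure to fit the decision boundary. As with classical neural networks, the design of a quantum neural network is not unique; the one shown here is just an example, and readers are encouraged to design their own. Take again the data point $x = (x_0, x_1)= (1,0)$ from before. After encoding, we have the quantum state $|\\psi_{\\rm in}\\rangle$,\n",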
    "\n",
    "$$\n",
    "|\\psi_{\\rm in}\\rangle =\n",
    "\\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "1-i \\\\\n",
    "0 \\\\\n",
    "1-i \\\\\n",
    "0\n",
    "\\end{bmatrix},\\tag{9}\n",
    "$$\n",
    "\n",
    "接着我们把这个量子态输入进我们的量子神经网络,也就是把一个酉矩阵乘以一个向量。得到处理过后的量子态 $|\\psi_{\\rm out}\\rangle$\n",
    "\n",
    "$$\n",
    "|\\psi_{\\rm out}\\rangle = U(\\boldsymbol{\\theta})|\\psi_{\\rm in}\\rangle,\\tag{10}\n",
    "$$\n",
    "\n",
    "如果我们把所有的参数 $\\theta$ 都设置为 $\\theta = \\pi$, 那么我们就可以写出具体的矩阵了:\n",
    "\n",
    "$$\n",
    "|\\psi_{\\rm out}\\rangle = \n",
    "U(\\boldsymbol{\\theta} =\\pi)|\\psi_{\\rm in}\\rangle =\n",
    "\\begin{bmatrix}\n",
    "0  &0 &-1 &0 \\\\ \n",
    "-1 &0 &0  &0 \\\\\n",
    "0  &1 &0  &0 \\\\\n",
    "0  &0 &0  &1 \n",
    "\\end{bmatrix}\n",
    "\\cdot\n",
    "\\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "1-i \\\\\n",
    "0 \\\\\n",
    "1-i \\\\\n",
    "0\n",
    "\\end{bmatrix}\n",
    "= \\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "-1+i \\\\\n",
    "-1+i \\\\\n",
    "0 \\\\\n",
    "0\n",
    "\\end{bmatrix}.\\tag{11}\n",
    "$$"
   ]
  },
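  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Measurement\n",
    "\n",
    "After the quantum neural network $U(\\theta)$, we obtain the quantum state $\\lvert \\psi_{\\rm out}\\rangle^k = U(\\theta)\\lvert \\psi_{\\rm in} \\rangle^k$. To get the label of this state, we must extract classical information through measurement. We then use this post-processed classical information to compute the loss function $\\mathcal{L}(\\boldsymbol{\\theta})$, and finally use gradient descent to keep updating the QNN parameters $\\boldsymbol{\\theta}$ and minimize the loss.\n",
    "\n",
    "\n",
    "The measurement we use here is the expectation value of the Pauli $Z$ operator on the first qubit. Specifically,\n",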
    "\n",
    "$$\n",
    "\\langle Z \\rangle = \n",
    "\\langle \\psi_{\\rm out} |Z\\otimes I\\cdots \\otimes I| \\psi_{\\rm out}\\rangle,\\tag{12}\n",
    "$$\n",
    "\n",
    "复习一下,泡利 $Z$ 算符的矩阵形式为:\n",
    "\n",
    "$$\n",
    "Z := \\begin{bmatrix} 1 &0 \\\\ 0 &-1 \\end{bmatrix},\\tag{13}\n",
    "$$\n",
    "\n",
    "继续我们前面的 2 量子比特的例子,测量过后我们得到的期望值就是:\n",
    "$$\n",
    "\\langle Z \\rangle = \n",
    "\\langle \\psi_{\\rm out} |Z\\otimes I| \\psi_{\\rm out}\\rangle = \n",
    "\\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "-1-i \\quad\n",
    "-1-i \\quad\n",
    "0   \\quad\n",
    "0\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "1  &0 &0  &0 \\\\ \n",
    "0  &1 &0  &0 \\\\\n",
    "0  &0 &-1 &0 \\\\\n",
    "0  &0 &0  &-1 \n",
    "\\end{bmatrix}\n",
    "\\cdot\n",
    "\\frac{1}{2}\n",
    "\\begin{bmatrix}\n",
    "-1+i \\\\\n",
    "-1+i \\\\\n",
    "0 \\\\\n",
    "0\n",
    "\\end{bmatrix}\n",
    "= 1,\\tag{14}\n",
    "$$\n",
    "\n",
    "好奇的读者或许会问,这个测量结果好像就是我们原来的标签 1 ,这是不是意味着我们已经成功的分类这个数据点了?其实并不然,因为 $\\langle Z \\rangle$ 的取值范围通常在 $[-1,1]$之间。 为了对应我们的标签范围 $y^{k} \\in \\{0,1\\}$, 我们还需要将区间上下限映射上。这个映射最简单的做法就是让\n",
    "\n",
    "$$\n",
    "\\tilde{y}^{k} = \\frac{\\langle Z \\rangle}{2} + \\frac{1}{2} + bias \\quad \\in [0, 1].\\tag{15}\n",
    "$$\n",
    "\n",
    "其中加入偏置(bias)是机器学习中的一个小技巧,目的就是为了让决策边界不受制于原点或者一些超平面。一般我们默认偏置初始化为0,并且优化器在迭代过程中会类似于参数 $\\theta$ 一样不断更新偏置确保 $\\tilde{y}^{k} \\in [0, 1]$。当然读者也可以选择其他复杂的映射(激活函数)比如说 sigmoid 函数。映射过后我们就可以把 $\\tilde{y}^{k}$ 看作是我们估计出的标签(label)了。如果 $\\tilde{y}^{k}< 0.5$ 就对应标签 0,如果 $\\tilde{y}^{k}> 0.5$  就对应标签 1。 我们稍微复习一下整个流程,\n",
    "\n",
    "\n",
    "$$\n",
    "x^{k} \\rightarrow |\\psi_{\\rm in}\\rangle^{k} \\rightarrow U(\\boldsymbol{\\theta})|\\psi_{\\rm in}\\rangle^{k} \\rightarrow\n",
    "|\\psi_{\\rm out}\\rangle^{k} \\rightarrow ^{k}\\langle \\psi_{\\rm out} |Z\\otimes I\\cdots \\otimes I| \\psi_{\\rm out} \\rangle^{k}\n",
    "\\rightarrow \\langle Z \\rangle  \\rightarrow \\tilde{y}^{k}.\\tag{16}\n",
    "$$\n",
    "\n"
   ]
  },
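  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the mapping in Eq. (15) concrete, here are two worked values with the bias set to 0. The second expectation value is a hypothetical number chosen only for illustration. For the running example, $\\langle Z \\rangle = 1$ gives\n",
    "\n",
    "$$\n",
    "\\tilde{y} = \\frac{1}{2} + \\frac{1}{2} + 0 = 1 > 0.5 \\quad \\Rightarrow \\quad \\text{label } 1,\n",
    "$$\n",
    "\n",
    "whereas $\\langle Z \\rangle = -0.2$ would give\n",
    "\n",
    "$$\n",
    "\\tilde{y} = \\frac{-0.2}{2} + \\frac{1}{2} + 0 = 0.4 < 0.5 \\quad \\Rightarrow \\quad \\text{label } 0.\n",
    "$$"
   ]
  },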
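  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loss Function\n",
    "\n",
    "The loss function in Eq. (1) requires measuring all Ntrain data points in every iteration. In practice, we instead split the training set into \"Ntrain/BATCH\" groups, each containing BATCH data points.\n",
    "\n",
    "For the $i$-th group, the corresponding loss function is\n",
    "$$\n",
    "\\mathcal{L}_{i} = \\sum_{k=1}^{BATCH} \\frac{1}{BATCH} |y^{i,k} - \\tilde{y}^{i,k}|^2,\\tag{17}\n",
    "$$\n",
    "and each group is trained on EPOCH times.\n",
    "\n",
    "When \"BATCH = Ntrain\", there is only one group of data points, and Eq. (17) reduces to Eq. (1).\n"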
   ]
  },
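  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the bookkeeping concrete, here is the arithmetic for the settings used later in this tutorial ($\\text{Ntrain} = 200$, $\\text{BATCH} = 20$, $\\text{EPOCH} = 20$):\n",
    "\n",
    "$$\n",
    "\\frac{\\rm Ntrain}{\\rm BATCH} = \\frac{200}{20} = 10 \\ \\text{groups per epoch},\n",
    "\\qquad\n",
    "{\\rm EPOCH} \\times \\frac{\\rm Ntrain}{\\rm BATCH} = 20 \\times 10 = 200 \\ \\text{parameter updates}.\n",
    "$$\n",
    "\n",
    "Each update therefore evaluates Eq. (17) on 20 data points instead of all 200."
   ]
  },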
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-03-02T09:15:07.667491Z",
     "start_time": "2021-03-02T09:15:07.661325Z"
    }
   },
   "outputs": [],
   "source": [
    "# Generate the Pauli Z operator acting only on the first qubit;\n",
    "# the identity acts on all remaining qubits\n",
    "def Observable(n):\n",
    "    r\"\"\"\n",
    "    :param n: number of qubits\n",
    "    :return: local observable: Z \\otimes I \\otimes ...\\otimes I\n",
    "    \"\"\"\n",
    "    Ob = pauli_str_to_matrix([[1.0, 'z0']], n)\n",
    "\n",
    "    return Ob"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Build the whole optimization pipeline\n",
    "class Opt_Classifier(paddle_quantum.gate.Gate):\n",
    "    \"\"\"\n",
    "    Network used to train the model\n",
    "    \"\"\"\n",
    "    def __init__(self, n, depth, seed_paras=1):\n",
    "        # Build the initial circuit from n and depth\n",
    "        # (seed_paras is kept in the signature but is not used here)\n",
    "        super(Opt_Classifier, self).__init__()\n",
    "        self.n = n\n",
    "        self.depth = depth\n",
    "        # Initialize the bias\n",
    "        self.bias = self.create_parameter(\n",
    "            shape=[1],\n",
    "            default_initializer=paddle.nn.initializer.Normal(std=0.01),\n",
    "            dtype='float32',\n",
    "            is_bias=False)\n",
    "        \n",
    "        self.circuit = Circuit(n)\n",
    "        # First build a layer of general single-qubit rotations\n",
    "        for i in range(n):\n",
    "            self.circuit.rz(qubits_idx=i)\n",
    "            self.circuit.ry(qubits_idx=i)\n",
    "            self.circuit.rz(qubits_idx=i)\n",
    "\n",
    "        # Add `depth` layers, each made of an entangling layer followed by Ry rotations\n",
    "        for _ in range(depth):\n",
    "            # Entangling layer: a ring of CNOT gates\n",
    "            for i in range(n-1):\n",
    "                self.circuit.cnot(qubits_idx=[i, i + 1])\n",
    "            self.circuit.cnot(qubits_idx=[n-1, 0])\n",
    "            # Apply an Ry rotation to every qubit\n",
    "            for i in range(n):\n",
    "                self.circuit.ry(qubits_idx=i)\n",
    "\n",
    "    # Forward pass: computes the loss and the classification accuracy\n",
    "    def forward(self, state_in, label):\n",
    "        \"\"\"\n",
    "        Args:\n",
    "            state_in: input quantum states, shape [-1, 1, 2^n] (here [BATCH, 1, 2^n])\n",
    "            label: labels of the input states, shape [-1, 1]\n",
    "        Loss function:\n",
    "            L = 1/BATCH * sum_k ((<Z>_k + 1)/2 + bias - label_k)^2\n",
    "        \"\"\"\n",
    "        # Convert NumPy arrays to tensors\n",
    "        Ob = paddle.to_tensor(Observable(self.n))\n",
    "        label_pp = reshape(paddle.to_tensor(label), [-1, 1])\n",
    "\n",
    "        # Unitary matrix of the circuit for the current parameters theta (randomly initialized before training)\n",
    "        Utheta = self.circuit.unitary_matrix()\n",
    "\n",
    "        # Utheta is learned, so we can use row-vector operations for speed without affecting training\n",
    "        state_out = matmul(state_in, Utheta)  # shape [-1, 1, 2 ** n]; the first dimension is BATCH in this tutorial\n",
    "\n",
    "        # Expectation value <Z> of the Pauli Z observable, shape [-1, 1, 1]\n",
    "        E_Z = matmul(matmul(state_out, Ob), transpose(paddle.conj(state_out), perm=[0, 2, 1]))\n",
    "\n",
    "        # Map <Z> to the estimated label\n",
    "        state_predict = paddle.real(E_Z)[:, 0] * 0.5 + 0.5 + self.bias\n",
    "        # Average the squared differences between predictions and true labels over the batch to get L_i\n",
    "        loss = paddle.mean((state_predict - label_pp) ** 2)\n",
    "\n",
    "        # Classification accuracy: a prediction is correct if it is within 0.5 of its label\n",
    "        is_correct = (paddle.abs(state_predict - label_pp) < 0.5).nonzero().shape[0]\n",
    "        acc = is_correct / label.shape[0]\n",
    "\n",
    "        return loss, acc, state_predict.numpy(), self.circuit"
   ]
  },
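  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy returned by `forward` counts a prediction as correct when $|\\tilde{y}^{k} - y^{k}| < 0.5$, which for labels in $\\{0,1\\}$ amounts to thresholding $\\tilde{y}^{k}$ at 0.5. As a sanity check, consider a batch of four points with made-up predictions $\\tilde{y} = (0.62, 0.41, 0.55, 0.08)$ and labels $y = (1, 0, 0, 0)$ (these numbers are purely illustrative and do not come from a training run):\n",
    "\n",
    "$$\n",
    "|\\tilde{y} - y| = (0.38,\\ 0.41,\\ 0.55,\\ 0.08)\n",
    "\\;\\Rightarrow\\; \\text{acc} = \\frac{3}{4} = 0.75,\n",
    "\\qquad\n",
    "\\mathcal{L}_i = \\frac{0.38^2 + 0.41^2 + 0.55^2 + 0.08^2}{4} = \\frac{0.6214}{4} \\approx 0.155.\n",
    "$$\n",
    "\n",
    "Only the third point is misclassified, because its distance to the label exceeds 0.5."
   ]
  },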
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training\n",
    "\n",
    "With all of the above defined, let us look at the actual training process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot the decision regions of the trained classifier as a heatmap\n",
    "def heatmap_plot(classifier, N):\n",
    "    # Generate the grid of data points x_y_\n",
    "    Num_points = 30\n",
    "    x_y_ = []\n",
    "    for row_y in np.linspace(0.9, -0.9, Num_points):\n",
    "        row = []\n",
    "        for row_x in np.linspace(-0.9, 0.9, Num_points):\n",
    "            row.append([row_x, row_y])\n",
    "        x_y_.append(row)\n",
    "    x_y_ = np.array(x_y_).reshape(-1, 2).astype(\"float64\")\n",
    "\n",
    "    # Compute the predictions: heat_data\n",
    "    input_state_test = paddle.to_tensor(\n",
    "        datapoints_transform_to_state(x_y_, N))\n",
    "    # The label argument is only a placeholder; the returned loss and accuracy are ignored\n",
    "    loss_useless, acc_useless, state_predict, cir = classifier(state_in=input_state_test, label=x_y_[:, 0])\n",
    "    heat_data = state_predict.reshape(Num_points, Num_points)\n",
    "\n",
    "    # Draw the figure\n",
    "    fig = plt.figure(1)\n",
    "    ax = fig.add_subplot(111)\n",
    "    x_label = np.linspace(-0.9, 0.9, 3)\n",
    "    y_label = np.linspace(0.9, -0.9, 3)\n",
    "    ax.set_xticks([0, Num_points // 2, Num_points - 1])\n",
    "    ax.set_xticklabels(x_label)\n",
    "    ax.set_yticks([0, Num_points // 2, Num_points - 1])\n",
    "    ax.set_yticklabels(y_label)\n",
    "    im = ax.imshow(heat_data, cmap=plt.cm.RdBu)\n",
    "    plt.colorbar(im)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We train the model iteratively with the Adam optimizer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def QClassifier(Ntrain, Ntest, gap, N, DEPTH, EPOCH, LR, BATCH, seed_paras, seed_data):\n",
    "    \"\"\"\n",
    "    Quantum binary classifier\n",
    "    Args:\n",
    "        Ntrain,        # size of the training set\n",
    "        Ntest,         # size of the test set\n",
    "        gap,           # width of the decision boundary\n",
    "        N,             # number of qubits\n",
    "        DEPTH,         # circuit depth\n",
    "        BATCH,         # batch size used in training\n",
    "        EPOCH,         # number of training epochs\n",
    "        LR,            # learning rate\n",
    "        seed_paras,    # random seed used to initialize the parameters\n",
    "        seed_data,     # random seed used to generate the dataset\n",
    "    \"\"\"\n",
    "    # Generate the training and test sets\n",
    "    train_x, train_y, test_x, test_y = circle_data_point_generator(Ntrain=Ntrain, Ntest=Ntest, boundary_gap=gap, seed_data=seed_data)\n",
    "    # Read the number of training samples\n",
    "    N_train = train_x.shape[0]\n",
    "\n",
    "    paddle.seed(seed_paras)\n",
    "    # Lists that record the iteration index and the test accuracy\n",
    "    summary_iter, summary_test_acc = [], []\n",
    "\n",
    "    # We usually use the Adam optimizer to get relatively good convergence;\n",
    "    # you can of course switch to SGD or RMSprop\n",
    "    myLayer = Opt_Classifier(n=N, depth=DEPTH)  # the initial quantum circuit\n",
    "    opt = paddle.optimizer.Adam(learning_rate=LR, parameters=myLayer.parameters())\n",
    "\n",
    "    # Optimization loop\n",
    "    # The training set is split into Ntrain/BATCH groups; the circuit obtained after training on one group\n",
    "    # is the starting circuit for the next group, so cir always holds the most recent circuit\n",
    "    i = 0  # total number of iterations so far\n",
    "    for ep in range(EPOCH):\n",
    "        # Split the training set into groups and train on each group\n",
    "        for itr in range(N_train // BATCH):\n",
    "            i += 1  # count the total number of iterations\n",
    "            # Encode the classical data as quantum states |psi>: a batch of BATCH states of dimension 2 ** N\n",
    "            input_state = paddle.to_tensor(datapoints_transform_to_state(train_x[itr * BATCH:(itr + 1) * BATCH], N))\n",
    "\n",
    "            # Forward pass to compute the loss for the current circuit\n",
    "            loss, train_acc, state_predict_useless, cir \\\n",
    "                = myLayer(state_in=input_state, label=train_y[itr * BATCH:(itr + 1) * BATCH])\n",
    "            # Show how the performance changes during training\n",
    "            if i % 30 == 5:\n",
    "                # Compute the accuracy on the test set, test_acc\n",
    "                input_state_test = paddle.to_tensor(datapoints_transform_to_state(test_x, N))\n",
    "                loss_useless, test_acc, state_predict_useless, t_cir \\\n",
    "                    = myLayer(state_in=input_state_test,label=test_y)\n",
    "                print(\"epoch:\", ep, \"iter:\", itr,\n",
    "                      \"loss: %.4f\" % loss.numpy(),\n",
    "                      \"train acc: %.4f\" % train_acc,\n",
    "                      \"test acc: %.4f\" % test_acc)\n",
    "                # Record the total iteration count and the test accuracy\n",
    "                summary_iter.append(i)\n",
    "                summary_test_acc.append(test_acc) \n",
    "\n",
    "            # Backpropagate and minimize the loss\n",
    "            loss.backward()\n",
    "            opt.minimize(loss)\n",
    "            opt.clear_grad()\n",
    "\n",
    "    # Print the trained circuit\n",
    "    print(\"训练后的电路:\")\n",
    "    print(cir)\n",
    "    # Plot the decision boundary as a heatmap\n",
    "    heatmap_plot(myLayer, N=N)\n",
    "\n",
    "    return summary_test_acc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "训练集的维度大小 x (200, 2) 和 y (200, 1)\n",
      "测试集的维度大小 x (100, 2) 和 y (100, 1) \n",
      "\n",
      "epoch: 0 iter: 4 loss: 0.2750 train acc: 0.7000 test acc: 0.6700\n",
      "epoch: 3 iter: 4 loss: 0.2471 train acc: 0.2500 test acc: 0.5500\n",
      "epoch: 6 iter: 4 loss: 0.1976 train acc: 0.8000 test acc: 0.9200\n",
      "epoch: 9 iter: 4 loss: 0.1639 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 12 iter: 4 loss: 0.1441 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 15 iter: 4 loss: 0.1337 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 18 iter: 4 loss: 0.1287 train acc: 1.0000 test acc: 1.0000\n",
      "训练后的电路:\n",
      "--Rz(3.490)----Ry(5.436)----Rz(3.281)----*--------------x----Ry(0.098)--\n",
      "                                         |              |               \n",
      "--Rz(1.499)----Ry(2.579)----Rz(3.496)----x----*---------|----Ry(1.282)--\n",
      "                                              |         |               \n",
      "--Rz(5.956)----Ry(3.158)----Rz(3.949)---------x----*----|----Ry(1.418)--\n",
      "                                                   |    |               \n",
      "--Rz(1.604)----Ry(0.722)----Rz(5.037)--------------x----*----Ry(2.437)--\n",
      "                                                                        \n"
     ]
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "主程序段总共运行了 7.0908989906311035 秒\n"
     ]
    }
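   ],
   "source": [
    "def main():\n",
    "    \"\"\"\n",
    "    Main function\n",
    "    \"\"\"\n",
    "    time_start = time.time()\n",
    "    # Defined locally so that the EPOCH expression below does not depend on global variables\n",
    "    Ntrain, BATCH = 200, 20\n",
    "    acc = QClassifier(\n",
    "        Ntrain = Ntrain,     # size of the training set\n",
    "        Ntest = 100,         # size of the test set\n",
    "        gap = 0.5,           # width of the decision boundary\n",
    "        N = 4,               # number of qubits\n",
    "        DEPTH = 1,           # circuit depth\n",
    "        BATCH = BATCH,       # batch size used in training\n",
    "        EPOCH = int(200 * BATCH / Ntrain),          \n",
    "                             # number of epochs, chosen so that the total number of iterations EPOCH * (Ntrain / BATCH) is about 200\n",
    "        LR = 0.01,           # learning rate\n",
    "        seed_paras = 19,     # random seed used to initialize the parameters\n",
    "        seed_data = 2,       # random seed used to generate the dataset\n",
    "    )\n",
    "    \n",
    "    time_span = time.time() - time_start\n",
    "    print('主程序段总共运行了', time_span, '秒')\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    main()"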
   ]
  },
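  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The printed log can be checked by hand. With 200 training points and a batch size of 20 there are 10 iterations per epoch, and `QClassifier` prints whenever the running counter satisfies `i % 30 == 5`. Writing the counter in terms of the epoch and the in-epoch iteration index gives\n",
    "\n",
    "$$\n",
    "i = 10\\,\\text{ep} + \\text{itr} + 1, \\qquad i \\in \\{5, 35, 65, 95, 125, 155, 185\\} \\;\\Rightarrow\\; \\text{ep} \\in \\{0, 3, 6, 9, 12, 15, 18\\}, \\ \\text{itr} = 4.\n",
    "$$\n",
    "\n",
    "These are exactly the seven `epoch` / `iter` pairs that appear in the output above; the next value, $i = 215$, would exceed the 200 iterations that are run."
   ]
  },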
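  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The printed results show that, as the optimization proceeds, the classifier reaches $100\\%$ accuracy on both the test set and the printed training batches."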
   ]
  },
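  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benchmarking different encoding methods\n",
    "\n",
    "The encoding method used in supervised learning has a strong influence on the classification result [4]. Paddle Quantum integrates common encoding methods, including amplitude encoding, angle encoding and IQP encoding. Users can encode simple classification data (data that needs no dimensionality reduction) with an instance of the built-in ``SimpleDataset`` class, and image data with an instance of the built-in ``VisionDataset`` class. In both cases the encoding is done by calling the ``encode`` method of the instance."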
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'numpy.ndarray'>\n",
      "(100, 4)\n"
     ]
    }
   ],
   "source": [
    "# Study the encodings with the circle dataset constructed earlier\n",
    "from paddle_quantum.dataset import *\n",
    "\n",
    "# Encode the two-dimensional data with two qubits\n",
    "quantum_train_x = SimpleDataset(2).encode(train_x, 'angle_encoding', 2)\n",
    "quantum_test_x = SimpleDataset(2).encode(test_x, 'angle_encoding', 2)\n",
    "\n",
    "print(type(quantum_test_x)) # ndarray\n",
    "print(quantum_test_x.shape) # (100, 4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we simplify the classifier above; all subsequent classification tasks use this simplified version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Simplified classifier\n",
    "def QClassifier2(quantum_train_x, train_y,quantum_test_x,test_y, N, DEPTH, EPOCH, LR, BATCH):\n",
    "    \"\"\"\n",
    "    Quantum binary classifier\n",
    "    Args:\n",
    "        quantum_train_x     # training features\n",
    "        train_y             # training labels\n",
    "        quantum_test_x      # test features\n",
    "        test_y              # test labels\n",
    "        N                   # number of qubits\n",
    "        DEPTH               # depth of the classifier circuit\n",
    "        EPOCH               # number of epochs\n",
    "        LR                  # learning rate\n",
    "        BATCH               # batch size\n",
    "    \"\"\"\n",
    "    Ntrain = len(quantum_train_x)\n",
    "    \n",
    "    paddle.seed(1)\n",
    "\n",
    "    net = Opt_Classifier(n=N, depth=DEPTH)\n",
    "\n",
    "    # List of test accuracies\n",
    "    summary_iter, summary_test_acc = [], []\n",
    "\n",
    "    # We use Adam here, but SGD or RMSprop would also work\n",
    "    opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())\n",
    "\n",
    "    # Optimization loop\n",
    "    for ep in range(EPOCH):\n",
    "        for itr in range(Ntrain // BATCH):\n",
    "            # Load the data of the current batch\n",
    "            input_state = quantum_train_x[itr * BATCH:(itr + 1) * BATCH]  # of type paddle.tensor\n",
    "            input_state = reshape(input_state, [-1, 1, 2 ** N])\n",
    "            label = train_y[itr * BATCH:(itr + 1) * BATCH]\n",
    "            test_input_state = reshape(quantum_test_x, [-1, 1, 2 ** N])\n",
    "\n",
    "            loss, train_acc, state_predict_useless, cir = net(state_in=input_state, label=label)\n",
    "\n",
    "            if itr % 5 == 0:\n",
    "                # Compute the test accuracy\n",
    "                loss_useless, test_acc, state_predict_useless, t_cir = net(state_in=test_input_state, label=test_y)\n",
    "                print(\"epoch:\", ep, \"iter:\", itr,\n",
    "                      \"loss: %.4f\" % loss.numpy(),\n",
    "                      \"train acc: %.4f\" % train_acc,\n",
    "                      \"test acc: %.4f\" % test_acc)\n",
    "                summary_test_acc.append(test_acc)\n",
    "\n",
    "            loss.backward()\n",
    "            opt.minimize(loss)\n",
    "            opt.clear_grad()\n",
    "\n",
    "    return summary_test_acc"
   ]
  },
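  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now encode the circle data generated above with different encoding methods. Here we use five of them: amplitude encoding, angle encoding, Pauli rotation encoding, IQP encoding and complex entangled encoding. We then plot the test accuracy curves for comparison."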
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Encoding method: amplitude_encoding\n",
      "epoch: 0 iter: 0 loss: 0.3005 train acc: 0.6000 test acc: 0.4600\n",
      "epoch: 0 iter: 5 loss: 0.2908 train acc: 0.3000 test acc: 0.5000\n",
      "epoch: 0 iter: 10 loss: 0.2313 train acc: 0.8000 test acc: 0.6200\n",
      "epoch: 0 iter: 15 loss: 0.2181 train acc: 0.7000 test acc: 0.7000\n",
      "Encoding method: angle_encoding\n",
      "epoch: 0 iter: 0 loss: 0.4141 train acc: 0.4000 test acc: 0.3700\n",
      "epoch: 0 iter: 5 loss: 0.2942 train acc: 0.6000 test acc: 0.6700\n",
      "epoch: 0 iter: 10 loss: 0.1952 train acc: 0.6000 test acc: 0.6700\n",
      "epoch: 0 iter: 15 loss: 0.2389 train acc: 0.6000 test acc: 0.6000\n",
      "Encoding method: pauli_rotation_encoding\n",
      "epoch: 0 iter: 0 loss: 0.1985 train acc: 0.7000 test acc: 0.7400\n",
      "epoch: 0 iter: 5 loss: 0.2303 train acc: 0.6000 test acc: 0.6900\n",
      "epoch: 0 iter: 10 loss: 0.1970 train acc: 0.6000 test acc: 0.7200\n",
      "epoch: 0 iter: 15 loss: 0.2120 train acc: 0.7000 test acc: 0.7000\n",
      "Encoding method: IQP_encoding\n",
      "epoch: 0 iter: 0 loss: 0.2962 train acc: 0.5000 test acc: 0.4500\n",
      "epoch: 0 iter: 5 loss: 0.2074 train acc: 0.7000 test acc: 0.7000\n",
      "epoch: 0 iter: 10 loss: 0.2463 train acc: 0.6000 test acc: 0.6500\n",
      "epoch: 0 iter: 15 loss: 0.2090 train acc: 0.9000 test acc: 0.5800\n",
      "Encoding method: complex_entangled_encoding\n",
      "epoch: 0 iter: 0 loss: 0.2500 train acc: 0.6000 test acc: 0.6800\n",
      "epoch: 0 iter: 5 loss: 0.2571 train acc: 0.5000 test acc: 0.6800\n",
      "epoch: 0 iter: 10 loss: 0.2661 train acc: 0.7000 test acc: 0.6700\n",
      "epoch: 0 iter: 15 loss: 0.1916 train acc: 0.8000 test acc: 0.7200\n"
     ]
    }
   ],
   "source": [
    "# Test different encoding methods\n",
    "encoding_list = ['amplitude_encoding', 'angle_encoding', 'pauli_rotation_encoding', 'IQP_encoding', 'complex_entangled_encoding']\n",
    "num_qubit = 2 # careful: using 1 qubit may raise an error, because the circuit contains CNOT gates\n",
    "dimension = 2\n",
    "acc_list = []\n",
    "\n",
    "for i in range(len(encoding_list)):\n",
    "    encoding = encoding_list[i]\n",
    "    print(\"Encoding method:\", encoding)\n",
    "    # Encode the data with the SimpleDataset class; the data dimension is 2 and so is the number of qubits\n",
    "    quantum_train_x= SimpleDataset(dimension).encode(train_x, encoding, num_qubit)\n",
    "    quantum_test_x= SimpleDataset(dimension).encode(test_x, encoding, num_qubit)\n",
    "    quantum_train_x = paddle.to_tensor(quantum_train_x)\n",
    "    quantum_test_x = paddle.to_tensor(quantum_test_x)\n",
    "\n",
    "    acc = QClassifier2(\n",
    "            quantum_train_x, # training features\n",
    "            train_y,         # training labels\n",
    "            quantum_test_x,  # test features\n",
    "            test_y,          # test labels\n",
    "            N = num_qubit,   # number of qubits\n",
    "            DEPTH = 1,       # depth of the classifier circuit\n",
    "            EPOCH = 1,       # number of epochs\n",
    "            LR = 0.1,        # learning rate\n",
    "            BATCH = 10,      # batch size\n",
    "          )\n",
    "    acc_list.append(acc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the test accuracy curves of the five encoding methods\n",
    "# QClassifier2 records the test accuracy every 5 iterations, so the x-axis advances in steps of 5\n",
    "x=[5*i for i in range(len(acc_list[0]))]\n",
    "for i in range(len(encoding_list)):\n",
    "    plt.plot(x,acc_list[i])\n",
    "plt.legend(encoding_list)\n",
    "plt.title(\"Benchmarking different encoding methods\")\n",
    "plt.xlabel(\"Iteration\")\n",
    "plt.ylabel(\"Test accuracy\")\n",
    "plt.show()"
   ]
  },
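  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quantum classification with the built-in MNIST and Iris datasets\n",
    "\n",
    "Paddle Quantum provides encoded versions of commonly used classification datasets; the `paddle_quantum.dataset` module returns either the encoding circuits or the encoded quantum states. Four datasets are currently included: MNIST, FashionMNIST, Iris and BreastCancer. Below we show how to use these built-in datasets to quickly set up quantum supervised learning.\n",
    "\n",
    "We start with the Iris dataset. It contains three classes with 50 samples each. Each sample has only four features, which makes the dataset simple and easy to encode."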
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch: 0 iter: 0 loss: 0.3372 train acc: 0.5000 test acc: 0.5000\n",
      "epoch: 0 iter: 5 loss: 0.2687 train acc: 0.2500 test acc: 0.5500\n",
      "epoch: 0 iter: 10 loss: 0.0781 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 0 iter: 15 loss: 0.0786 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 1 iter: 0 loss: 0.0903 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 1 iter: 5 loss: 0.1020 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 1 iter: 10 loss: 0.0553 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 1 iter: 15 loss: 0.0559 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 2 iter: 0 loss: 0.0770 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 2 iter: 5 loss: 0.0879 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 2 iter: 10 loss: 0.0438 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 2 iter: 15 loss: 0.0538 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 3 iter: 0 loss: 0.0768 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 3 iter: 5 loss: 0.0887 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 3 iter: 10 loss: 0.0417 train acc: 1.0000 test acc: 1.0000\n",
      "epoch: 3 iter: 15 loss: 0.0511 train acc: 1.0000 test acc: 1.0000\n"
     ]
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Binary classification on the Iris dataset\n",
    "\n",
    "test_rate=0.2\n",
    "num_qubit=4\n",
    "\n",
    "# Get the quantum states of the Iris dataset\n",
    "iris = Iris(encoding='angle_encoding', num_qubits=num_qubit, test_rate=test_rate,classes=[0, 1], return_state=True)\n",
    "\n",
    "quantum_train_x, train_y = iris.train_x, iris.train_y\n",
    "quantum_test_x, test_y = iris.test_x, iris.test_y\n",
    "testing_data_num = len(test_y)\n",
    "training_data_num = len(train_y)\n",
    "\n",
    "acc = QClassifier2(\n",
    "        quantum_train_x, # training features\n",
    "        train_y,         # training labels\n",
    "        quantum_test_x,  # test features\n",
    "        test_y,          # test labels\n",
    "        N = num_qubit,   # number of qubits\n",
    "        DEPTH = 1,       # depth of the classifier circuit\n",
    "        EPOCH = 4,       # number of epochs\n",
    "        LR = 0.1,        # learning rate\n",
    "        BATCH = 4,       # batch size\n",
    "      )\n",
    "plt.plot(acc)\n",
    "plt.title(\"Classify Iris 0&1 using angle encoding\")\n",
    "plt.xlabel(\"Iteration\")\n",
    "plt.ylabel(\"Testing accuracy\")\n",
    "plt.show()"
   ]
  },
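  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The second example is the MNIST dataset of handwritten digits, which has ten classes 0-9 (roughly 6000 training samples and 1000 test samples per class). All images are $28\\times28$ grayscale images, so they must be reduced to the target dimension ``target_dimension`` with either ``resize`` or ``PCA``."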
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch: 0 iter: 0 loss: 0.2345 train acc: 0.5750 test acc: 0.5350\n",
      "epoch: 0 iter: 5 loss: 0.2322 train acc: 0.6500 test acc: 0.5800\n",
      "epoch: 0 iter: 10 loss: 0.2423 train acc: 0.6250 test acc: 0.5550\n",
      "epoch: 1 iter: 0 loss: 0.1909 train acc: 0.8000 test acc: 0.6900\n",
      "epoch: 1 iter: 5 loss: 0.1938 train acc: 0.7250 test acc: 0.6450\n",
      "epoch: 1 iter: 10 loss: 0.2055 train acc: 0.6750 test acc: 0.7250\n",
      "epoch: 2 iter: 0 loss: 0.1855 train acc: 0.8000 test acc: 0.7400\n",
      "epoch: 2 iter: 5 loss: 0.1627 train acc: 0.8000 test acc: 0.7650\n",
      "epoch: 2 iter: 10 loss: 0.1684 train acc: 0.8250 test acc: 0.7900\n",
      "epoch: 3 iter: 0 loss: 0.1676 train acc: 0.8250 test acc: 0.7750\n",
      "epoch: 3 iter: 5 loss: 0.1387 train acc: 0.8500 test acc: 0.7500\n",
      "epoch: 3 iter: 10 loss: 0.1679 train acc: 0.8500 test acc: 0.7950\n",
      "epoch: 4 iter: 0 loss: 0.1584 train acc: 0.7250 test acc: 0.8050\n",
      "epoch: 4 iter: 5 loss: 0.1408 train acc: 0.8500 test acc: 0.8150\n",
      "epoch: 4 iter: 10 loss: 0.1603 train acc: 0.8500 test acc: 0.8100\n"
     ]
    },
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Classification with MNIST\n",
    "\n",
    "# Main parameters\n",
    "training_data_num = 500\n",
    "testing_data_num = 200\n",
    "num_qubit = 4\n",
    "\n",
    "# Select the classes 3 and 6, resample MNIST from 28*28 to 4*4, then encode with amplitude encoding\n",
    "train_dataset = MNIST(mode='train', encoding='amplitude_encoding', num_qubits=num_qubit, classes=[3,6],\n",
    "                      data_num=training_data_num,need_cropping=True,\n",
    "                      downscaling_method='resize', target_dimension=16, return_state=True)\n",
    "\n",
    "val_dataset = MNIST(mode='test', encoding='amplitude_encoding', num_qubits=num_qubit, classes=[3,6],\n",
    "                    data_num=testing_data_num,need_cropping=True,\n",
    "                    downscaling_method='resize', target_dimension=16,return_state=True)\n",
    "\n",
    "quantum_train_x, train_y = train_dataset.quantum_image_states, train_dataset.labels\n",
    "quantum_test_x, test_y = val_dataset.quantum_image_states, val_dataset.labels\n",
    "\n",
    "acc = QClassifier2(\n",
    "        quantum_train_x, # training features\n",
    "        train_y,         # training labels\n",
    "        quantum_test_x,  # test features\n",
    "        test_y,          # test labels\n",
    "        N = num_qubit,   # number of qubits\n",
    "        DEPTH = 3,       # depth of the classifier circuit\n",
    "        EPOCH = 5,       # number of epochs\n",
    "        LR = 0.1,        # learning rate\n",
    "        BATCH = 40,      # batch size\n",
    "      )\n",
    "plt.plot(acc)\n",
    "plt.title(\"Classify MNIST 3&6 using amplitude encoding\")\n",
    "plt.xlabel(\"Iteration\")\n",
    "plt.ylabel(\"Testing accuracy\")\n",
    "plt.show()"
   ]
  },
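  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The choice of 4 qubits together with ``target_dimension=16`` in the MNIST example is not a coincidence. Assuming the standard definition of amplitude encoding (the exact preprocessing inside the ``MNIST`` class is not shown in this tutorial), a resized image flattened to a vector $x \\in \\mathbb{R}^{16}$ is stored in the amplitudes of a normalized state:\n",
    "\n",
    "$$\n",
    "|\\psi_{\\rm in}\\rangle = \\frac{1}{\\|x\\|_2} \\sum_{j=0}^{15} x_j\\, |j\\rangle,\n",
    "\\qquad\n",
    "2^{4} = 16 .\n",
    "$$\n",
    "\n",
    "A state of $n$ qubits has $2^n$ amplitudes, so 4 qubits hold exactly the 16 pixel values of a $4\\times4$ image."
   ]
  },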
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## References\n",
    "\n",
    "[1] Mitarai, Kosuke, et al. Quantum circuit learning. [Physical Review A 98.3 (2018): 032309.](https://arxiv.org/abs/1803.00745)\n",
    "\n",
    "[2] Farhi, Edward, and Hartmut Neven. Classification with quantum neural networks on near term processors. [arXiv preprint arXiv:1802.06002 (2018).](https://arxiv.org/abs/1802.06002)\n",
    "\n",
    "[3] Schuld, Maria, et al. Circuit-centric quantum classifiers. [Physical Review A 101.3 (2020): 032308.](https://arxiv.org/abs/1804.00633)\n",
    "\n",
    "[4] Schuld, Maria. Supervised quantum machine learning models are kernel methods. [arXiv preprint arXiv:2101.11020 (2021).](https://arxiv.org/pdf/2101.11020)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}