{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quantum Autoencoder\n",
    "\n",
    "<em> Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. </em>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "This tutorial will show how to train a quantum autoencoder to compress and reconstruct a given quantum state (mixed state) [1].\n",
    "\n",
    "### Theory\n",
    "\n",
    "The form of the quantum autoencoder is very similar to the classical autoencoder, which is composed of an encoder $E$ and a decoder $D$. For the input quantum state $\\rho_{in}$ of the $N$ qubit system (here we use the density operator representation of quantum mechanics to describe the mixed state), first use the encoder $E = U(\\theta)$ to encode information into some of the qubits in the system. This part of qubits is denoted by **system $A$**. After measuring and discarding the remaining qubits (this part is denoted by **system $B$**), we get the compressed quantum state $\\rho_{encode}$! The dimension of the compressed quantum state is the same as the dimension of the quantum system $A$. Suppose we need $N_A$ qubits to describe the system $A$, then the dimension of the encoded quantum state $\\rho_{encode}$ is $2^{N_A}\\times 2^{N_A}$. Note that the mathematical operation corresponding to the measure-and-discard operation in this step is partial trace. The reader can intuitively treat it as the inverse operation of the tensor product $\\otimes$.\n",
    "\n",
    "Let us look at a specific example. Given a quantum state $\\rho_A$ of $N_A$ qubits and another quantum state $\\rho_B$ of $N_B$ qubits, the quantum state of the entire quantum system composed of subsystems $A$ and $B$ is $\\rho_{AB} = \\rho_A \\otimes \\rho_B$, which is a state of $N = N_A + N_B$ qubits. Now we let the entire quantum system evolve under the action of the unitary matrix $U$ for some time to get a new quantum state $\\tilde{\\rho_{AB}} = U\\rho_{AB}U^\\dagger$. So if we only want to get the new quantum state $\\tilde{\\rho_A}$ of quantum subsystem A at this time, what should we do? We simply measure the quantum subsystem $B$ and then discard it. This step of the operation is completed by partial trace $\\tilde{\\rho_A} = \\text{Tr}_B (\\tilde{\\rho_{AB}})$. With Paddle Quantum, we can call the built-in function `partial_trace(rho_AB, 2**N_A, 2**N_B, 2)` to complete this operation. **Note:** The last parameter is 2, which means that we want to discard quantum system $B$.\n",
    "\n",
    "![QA-fig-encoder_pipeline](./figures/QA-fig-encoder_pipeline.png)\n",
    "\n",
    "After discussing the encoding process, let us take a look at how decoding is done. To decode the quantum state $\\rho_{encode}$, we need to introduce an ancillary system $C$ with the same dimension as the system $B$ and take its initial state as the $|0\\dots0\\rangle$ state. Then use the decoder $D = U^\\dagger(\\theta)$ to act on the entire quantum system $A+C$ to decode the compressed information in system A. We hope that the final quantum state $\\rho_{out}$ and $\\rho_{in}$ are as similar as possible and use Uhlmann-Josza fidelity $F$ to measure the similarity between them.\n",
    "\n",
    "$$\n",
    "F(\\rho_{in}, \\rho_{out}) = \\left(\\operatorname{tr} \\sqrt{\\sqrt{\\rho_{in}} \\rho_{out} \\sqrt{\\rho_{in}}} \\right)^{2}.\n",
    "\\tag{1}\n",
    "$$\n",
    "\n",
    "Finally, by optimizing the encoder's parameters, we can improve the fidelity of $\\rho_{in}$ and $\\rho_{out}$ as much as possible."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Paddle Quantum Implementation\n",
    "\n",
    "Next, we will use a simple example to show the workflow of the quantum autoencoder. Here we first import the necessary packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-04-30T08:57:25.375761Z",
     "start_time": "2021-04-30T08:57:25.355799Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>pre { white-space: pre !important; }</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.core.display import HTML\n",
    "display(HTML(\"<style>pre { white-space: pre !important; }</style>\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-04-30T08:57:29.266507Z",
     "start_time": "2021-04-30T08:57:25.894969Z"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import paddle\n",
    "from paddle import matmul, trace, kron, real\n",
    "import paddle_quantum as pq\n",
    "from paddle_quantum.ansatz.circuit import Circuit\n",
    "from paddle_quantum.qinfo import state_fidelity, partial_trace\n",
    "from paddle_quantum.linalg import dagger, haar_unitary\n",
    "from paddle_quantum.state.common import to_state"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generating the initial state\n",
    "\n",
    "Let us consider the quantum state $\\rho_{in}$ of $N = 3$ qubits. We first encode the information into the two qubits below (system $A$) through the encoder then measure and discard the first qubit (system $B$). Secondly, we introduce another qubit (the new reference system $C$) in state $|0\\rangle$ to replace the discarded qubit $B$. Finally, through the decoder, the compressed information in A is restored to $\\rho_{out}$. Here, we assume that the initial state is a mixed state and the spectrum of $\\rho_{in}$ is $\\lambda_i \\in \\{0.4, 0.2, 0.2, 0.1, 0.1, 0, 0, 0\\}$, and then generate the initial state $\\rho_{in}$ by applying a random unitary transformation.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-04-30T08:57:29.283394Z",
     "start_time": "2021-04-30T08:57:29.270542Z"
    }
   },
   "outputs": [],
   "source": [
    "N_A = 2                          # Number of qubits in system A\n",
    "N_B = 1                          # Number of qubits in system B\n",
    "N = N_A + N_B                    # Total number of qubits\n",
    "SEED = 15                        # Set random seed\n",
    "paddle.seed(SEED)\n",
    "pq.set_dtype('complex64')        # set data type\n",
    "\n",
    "V = haar_unitary(N).numpy()                              # Generate a random unitary matrix\n",
    "D = np.diag([0.4, 0.2, 0.2, 0.1, 0.1, 0, 0, 0])          # Set the spectrum of the target state rho\n",
    "V_H = V.conj().T                                         # Apply Hermitian transpose\n",
    "rho_in = to_state((V @ D @ V_H).astype('complex64'))     # Generate input state rho_in\n",
    "rho_C = to_state(np.diag([1,0]).astype('complex64'))     # Generate ancilla state rho_C"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Building a quantum neural network\n",
    "\n",
    "Here, we use quantum neural networks (QNN) as encoders and decoders. Suppose system A has $N_A$ qubits, both system $B$ and $C$ have $N_B$ qubits, and the depth of the QNN is $D$. Encoder $E$ acts on the total system composed of systems A and B, and decoder $D$ acts on the total system composed of $A$ and $C$. In this example, $N_{A} = 2$ and $N_{B} = 1$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-04-30T08:57:29.301622Z",
     "start_time": "2021-04-30T08:57:29.288113Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The initialized circuit:\n",
      "--Ry(2.974)----Rz(3.296)----*---------x----Ry(4.201)----Rz(4.559)----*---------x----Ry(5.254)----Rz(4.834)----*---------x----Ry(3.263)----Rz(1.664)----*---------x----Ry(5.240)----Rz(1.166)----*---------x----Ry(5.038)----Rz(0.564)----*---------x--\n",
      "                            |         |                              |         |                              |         |                              |         |                              |         |                              |         |  \n",
      "--Ry(2.407)----Rz(3.514)----x----*----|----Ry(6.279)----Rz(4.675)----x----*----|----Ry(4.986)----Rz(5.080)----x----*----|----Ry(2.845)----Rz(2.662)----x----*----|----Ry(0.015)----Rz(0.052)----x----*----|----Ry(4.341)----Rz(5.329)----x----*----|--\n",
      "                                 |    |                                   |    |                                   |    |                                   |    |                                   |    |                                   |    |  \n",
      "--Ry(3.866)----Rz(3.272)---------x----*----Ry(2.219)----Rz(2.298)---------x----*----Ry(6.060)----Rz(0.431)---------x----*----Ry(3.197)----Rz(1.673)---------x----*----Ry(2.324)----Rz(0.037)---------x----*----Ry(4.892)----Rz(1.856)---------x----*--\n",
      "                                                                                                                                                                                                                                                      \n"
     ]
    }
   ],
   "source": [
    "# Set circuit depth\n",
    "cir_depth = 6\n",
    "\n",
    "# Use Circuit class to build the encoder E\n",
    "cir_Encoder = Circuit(N)\n",
    "for _ in range(cir_depth):\n",
    "    cir_Encoder.ry('full')\n",
    "    cir_Encoder.rz('full')\n",
    "    cir_Encoder.cnot('cycle')\n",
    "print(\"The initialized circuit:\") \n",
    "print(cir_Encoder)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Configuring the training model: loss function\n",
    "\n",
    "Here, we define the loss function to be\n",
    "\n",
    "$$\n",
    "Loss = 1-\\langle 0...0|\\rho_{trash}|0...0\\rangle,\n",
    "\\tag{2}\n",
    "$$\n",
    "\n",
    "where $\\rho_{trash}$ is the quantum state of the system $B$ discarded after encoding. Then we train the QNN through PaddlePaddle to minimize the loss function. If the loss function reaches 0, the input state and output state will be exactly the same state. This means that we have achieved compression and decompression perfectly, in which case the fidelity of the initial and final states is $F(\\rho_{in}, \\rho_{out}) = 1$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-04-30T08:57:52.962725Z",
     "start_time": "2021-04-30T08:57:42.476996Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "iter: 10 loss: 0.1700 fid: 0.8226\n",
      "iter: 20 loss: 0.1329 fid: 0.8605\n",
      "iter: 30 loss: 0.1138 fid: 0.8793\n",
      "iter: 40 loss: 0.1075 fid: 0.8856\n",
      "iter: 50 loss: 0.1024 fid: 0.8891\n",
      "iter: 60 loss: 0.1010 fid: 0.8917\n",
      "iter: 70 loss: 0.1003 fid: 0.8927\n",
      "iter: 80 loss: 0.1001 fid: 0.8932\n",
      "iter: 90 loss: 0.1000 fid: 0.8936\n",
      "iter: 100 loss: 0.1000 fid: 0.8939\n",
      "\n",
      "The trained circuit:\n",
      "--Ry(3.715)----Rz(3.480)----*---------x----Ry(5.783)----Rz(3.863)----*---------x----Ry(4.643)----Rz(5.174)----*---------x----Ry(3.187)----Rz(1.627)----*---------x----Ry(5.195)----Rz(1.282)----*---------x----Ry(5.990)----Rz(0.919)----*---------x--\n",
      "                            |         |                              |         |                              |         |                              |         |                              |         |                              |         |  \n",
      "--Ry(2.889)----Rz(3.898)----x----*----|----Ry(7.024)----Rz(5.530)----x----*----|----Ry(4.595)----Rz(4.244)----x----*----|----Ry(3.576)----Rz(2.981)----x----*----|----Ry(-1.39)----Rz(0.356)----x----*----|----Ry(3.557)----Rz(4.126)----x----*----|--\n",
      "                                 |    |                                   |    |                                   |    |                                   |    |                                   |    |                                   |    |  \n",
      "--Ry(3.593)----Rz(3.298)---------x----*----Ry(0.958)----Rz(2.486)---------x----*----Ry(6.317)----Rz(0.314)---------x----*----Ry(3.837)----Rz(2.775)---------x----*----Ry(3.415)----Rz(-0.07)---------x----*----Ry(4.543)----Rz(3.359)---------x----*--\n",
      "                                                                                                                                                                                                                                                      \n"
     ]
    }
   ],
   "source": [
    "# Set hyper-parameters\n",
    "LR = 0.2       # Set the learning rate\n",
    "ITR = 100      # Set the number of iterations\n",
    "\n",
    "class NET(paddle.nn.Layer):\n",
    "    def __init__(self, cir, rho_in, rho_C, dtype='float32'):\n",
    "        super(NET, self).__init__()\n",
    "        # load the circuit of the encoder E\n",
    "        self.cir = cir\n",
    "        # load the input state rho_in and the ancilla state rho_C\n",
    "        self.rho_in = rho_in\n",
    "        self.rho_C = rho_C\n",
    "        # set trainable parameters\n",
    "        self.theta = cir.parameters()\n",
    "    \n",
    "    # Define loss function and forward propagation mechanism\n",
    "    def forward(self):\n",
    "    \n",
    "        # Generate the matrices of the encoder E and decoder D\n",
    "        E = self.cir.unitary_matrix()\n",
    "        E_dagger = dagger(E)\n",
    "        D = E_dagger\n",
    "        D_dagger = E\n",
    "\n",
    "        # Encode the quantum state rho_in\n",
    "        rho_BA = to_state(matmul(matmul(E, self.rho_in.data), E_dagger))\n",
    "        \n",
    "        # Take partial_trace() to get rho_encode and rho_trash\n",
    "        rho_encode = to_state(partial_trace(rho_BA, 2 ** N_B, 2 ** N_A, 1))\n",
    "        rho_trash = to_state(partial_trace(rho_BA, 2 ** N_B, 2 ** N_A, 2))\n",
    "\n",
    "        # Decode and get the quantum state rho_out\n",
    "        rho_CA = to_state(kron(self.rho_C.data, rho_encode.data))\n",
    "        rho_out = to_state(matmul(matmul(D, rho_CA.data), D_dagger))\n",
    "        \n",
    "        # Calculate the loss function with rho_trash\n",
    "        zero_Hamiltonian = paddle.to_tensor(np.diag([1,0]).astype('complex64'))\n",
    "        loss = 1 - real(trace(matmul(zero_Hamiltonian, rho_trash.data)))\n",
    "\n",
    "        return loss, rho_out\n",
    "\n",
    "# Generate network\n",
    "net = NET(cir_Encoder, rho_in, rho_C)\n",
    "# Generally speaking, we use Adam optimizer to get relatively good convergence\n",
    "# Of course, it can be changed to SGD or RMS prop\n",
    "opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())\n",
    "\n",
    "# Optimization loops\n",
    "for itr in range(1, ITR + 1):\n",
    "    # Forward propagation for calculating loss function\n",
    "    loss, rho_out = net()\n",
    "    # Use back propagation to minimize the loss function\n",
    "    loss.backward()\n",
    "    opt.minimize(loss)\n",
    "    opt.clear_grad()\n",
    "    # Calculate and print fidelity\n",
    "    fid = state_fidelity(rho_in, rho_out)\n",
    "    if itr % 10 == 0:\n",
    "        print('iter:', itr, 'loss:', '%.4f' % loss, 'fid:', '%.4f' % np.square(fid.item()))\n",
    "    if itr == ITR:\n",
    "        print(\"\\nThe trained circuit:\") \n",
    "        print(cir_Encoder)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the dimension of system A is denoted by $d_A$, it is easy to prove that the maximum fidelity can be achieved by quantum autoencoder is the sum of $d_A$ largest eigenvalues ​​of $\\rho_{in}$. In our case $d_A = 4$ and the maximum fidelity is\n",
    "\n",
    "$$\n",
    "F_{\\text{max}}(\\rho_{in}, \\rho_{out})  = \\sum_{j=1}^{d_A} \\lambda_j(\\rho_{in})= 0.4 + 0.2 + 0.2 + 0.1 = 0.9.\n",
    "\\tag{3}\n",
    "$$\n",
    "\n",
    "After 100 iterations, the fidelity achieved by the quantum autoencoder we trained reaches above 0.89, which is very close to the optimal value."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_______\n",
    "\n",
    "## References\n",
    "\n",
    "[1] Romero, J., Olson, J. P. & Aspuru-Guzik, A. Quantum autoencoders for efficient compression of quantum data. [Quantum Sci. Technol. 2, 045001 (2017).](https://iopscience.iop.org/article/10.1088/2058-9565/aa8072)\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}