introduction_en.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction of quantum portfolio optimization\n",
    "*Copyright (c) 2022 Institute for Quantum Computing, Baidu Inc. All Rights Reserved.*\n",
    "\n",
    "If you are an active investment manager who wants to invest $K$ dollars to $N$ projects, each with its return and risk, your goal is to find an optimal way to invest in the projects, taking into account the market impact and transaction costs.\n",
    "\n",
    "To make the modeling easy to formulate, two assumptions are made to constrain the problem：\n",
    "    1.Each asset is invested with an equal amount of money；\n",
    "    2.Budget is a multiple of each investment amount and must be fully spent.\n",
    "\n",
    "In the theory of portfolio optimization, the overall risk of a portfolio is related to the covariance between assets, which is proportional to the correlation coefficients of any two assets. The smaller the correlation coefficients, the smaller the covariance, and then the smaller the overall risk of the portfolio. Here we use the mean-variance approach to model this problem:\n",
    "$$\n",
    "\\omega=\\max _{x \\in\\{0,1\\}^n} \\mu^T x-q x^T S x \\quad \\text { subject to: } 1^T x=B,\n",
    "$$\n",
    "where each symbol has the following meaning:\n",
    "- $x \\in \\{0, 1\\}^{n}$ denotes the vector of binary decision variables, which indicate which each assets is picked ($x_i$=1) or not ($x_i=0$)\n",
    "- $\\mu \\in \\mathbb{R}^n$ defines the expected returns for the assets\n",
    "- $S \\in \\mathbb{R}^{n \\times n}$ represents the covariances between the assets\n",
    "- $q > 0$ represents the risk factor of investment decision making\n",
    "- $B$ denotes the budget, i.e. the number of assets to be selected out of $N$\n",
    "\n",
    "Let us illustrate on the meaning of this equation. $\\mu^T x$ describes the expected benefit of the investment plan represented by $x$. $x^T S x$ describes the correlation between the projects, which, after producting with the risk coefficient $q$, represents the risk incorporated in the investment plan. The restriction $1^T x=B$ requires the number of invested projects equals to our total budget. Therefore, $\\omega$ represents the largest benefit we could get theoretically.\n",
    "\n",
    "In order to find the optimal investment plan more easily, we can define the loss function\n",
    "$$\n",
    "C_x=q \\sum_i \\sum_j S_{j i} x_i x_j-\\sum_i x_i \\mu_i+A\\left(B-\\sum_i x_i\\right)^2,\n",
    "$$\n",
    "where the restriction condition enters the function with the form of Lagrange multiplier. Therefore, our task becomes finding the investment plan $x$ that minimizes the loss $C_x$."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quantum encoding and solution\n",
    "\n",
    "We now need to transform the cost function $C_x$ into a Hamiltonian to realize the encoding of the portfolio optimization problem. One just needs to do the following transformation:\n",
    "$\n",
    "x_i \\mapsto \\frac{I-Z_i}{2},\n",
    "$\n",
    "where $Z_i=I \\otimes I \\otimes \\ldots \\otimes Z \\otimes \\ldots \\otimes I$, i.e., $Z_{i}$ is the Pauli operator acting solely on the $i$-th qubit. Thus using the above mapping, we can transform the cost function $C_x$ into a Hamiltonian $H_C$ for the system of $n$ qubits, the ground state of which represents the solution of the portfolio optimization problem. In order to find the ground state, we use the idea of variational quantum algorithms. We implement a parametric quantum circuit, and use it to generate a trial state $|\\theta^* \\rangle$. We use the quantum circuit to measure the expectation value of the Hamiltonian on this state. Then, classical gradient descent algorithm is implemented to adjust the parameters of the parametric circuit, where the expectation value evolves towards the ground state energy. After some iterations, we arrive at the optimal value\n",
    "$$\n",
    "|\\theta^* \\rangle  = \\arg\\min_\\theta L(\\vec{\\theta})=\\arg\\min_\\theta \\left\\langle\\vec{\\theta}\\left|H_C\\right| \\vec{\\theta}\\right\\rangle.\n",
    "$$\n",
    "\n",
    "Finally, we read out the probability distribution from the measurement result (i.e. decoding the quantum problem to give information about the original bit string)\n",
    "$\n",
    "p(z)=\\left|\\left\\langle z \\mid \\vec{\\theta}^*\\right\\rangle\\right|^2.\n",
    "$\n",
    "In the case of quantum parameterized circuits with sufficient expressiveness, the greater the probability of a certain bit string, the greater the probability that it corresponds to an optimal solution to the portfolio optimization problem."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## User's guide\n",
    "### Configuration file and input parameters\n",
    "We provide a configuration file with previously chosen parameter. The user just needs to change the parameters in the `config.toml` file, and run `python qpo.py --config config.toml --logger qpo_log.log` in the terminal, to solve the portfolio optimization problem.\n",
    "### Output\n",
    "The results will be output to the `qpo_log.log` file. First of all, the process of optimization will be documented in the log. Users can see the evolution of loss function as the looping times increases. \n",
    "### Parameters\n",
    "- `stock`, default is `'demo'`, i.e., using the stock data we provide in the demo file. Users can switch to `'random'` or `'custom'` to generate random stock data or use custom stock data. If user chooses to generate data randomly, the parameters `start_time` and `endtime` can be altered to specify the start and end date of the stock data. If user chooses to use custom data, he or she can store the information of the stock in a csv file, and write in the configuration file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "custom_data_path = 'file_name.csv'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Online demonstration\n",
    "Here, we provide an online demonstration version. Firstly, we define the parameters in the configuration file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "config_toml = r\"\"\"\n",
    "# # The configuration file of quantum portfolio optimization problem.\n",
    "# Use demo stock data\n",
    "stock = 'demo' \n",
    "demo_data_path = 'demo_stock.csv'\n",
    "# Number of investable projects\n",
    "num_asset = 7\n",
    "# Risk of decision making\n",
    "risk_weight = 0.5\n",
    "# Budget\n",
    "budget = 0\n",
    "# Penalty\n",
    "penalty = 0\n",
    "# The depth of the quantum circuit\n",
    "circuit_depth = 2\n",
    "# Number of loop cycles used in the optimization\n",
    "iterations = 600\n",
    "# Learning rate of gradient descent\n",
    "learning_rate = 0.2\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The finance module in PaddleQuantum realizes the numerical simulation of the quantum portfolio optimization problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 600/600 [01:04<00:00,  9.24it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "******************* The optimal investment plan is: [2, 5, 6]  *******************\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'\n",
    "import pandas as pd\n",
    "\n",
    "import toml\n",
    "from paddle_quantum.finance.qpo import portfolio_combination_optimization\n",
    "from paddle_quantum.finance import DataSimulator\n",
    "\n",
    "config = toml.loads(config_toml)\n",
    "demo_data_path = config[\"demo_data_path\"]\n",
    "num_asset = config[\"num_asset\"]\n",
    "risk_weight = config[\"risk_weight\"]\n",
    "budget = config[\"budget\"]\n",
    "penalty = config[\"penalty\"]\n",
    "circuit_depth = config[\"circuit_depth\"]\n",
    "iterations = config[\"iterations\"]\n",
    "learning_rate = config[\"learning_rate\"]\n",
    "\n",
    "stocks_name = [(\"STOCK%s\" % i) for i in range(num_asset)]\n",
    "source_data = pd.read_csv(demo_data_path)\n",
    "processed_data = [source_data['closePrice'+str(i)].tolist() for i in range(num_asset)]\n",
    "data = DataSimulator(stocks_name)\n",
    "data.set_data(processed_data)\n",
    "\n",
    "invest = portfolio_combination_optimization(num_asset, data, iterations, learning_rate, risk_weight, budget,\n",
    "                                       penalty, circuit=circuit_depth)\n",
    "print(f\"******************* The optimal investment plan is: {invest}  *******************\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Note\n",
    "If the number of investable projects is small (`num_asset`$< 12$), we can diagonalize the Hamiltonian exactly, and compare the real minimum loss value with that found by the optimization process. If the difference is large, the optimization result may be unreliable, and re-choosing of the training parameters might be necessary. Finally, we output the optimal investment plan."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "```\n",
    "@article{ORUS2019100028,\n",
    "title = {Quantum computing for finance: Overview and prospects},\n",
    "journal = {Reviews in Physics},\n",
    "volume = {4},\n",
    "pages = {100028},\n",
    "year = {2019},\n",
    "issn = {2405-4283},\n",
    "doi = {https://doi.org/10.1016/j.revip.2019.100028},\n",
    "url = {https://www.sciencedirect.com/science/article/pii/S2405428318300571},\n",
    "author = {Román Orús and Samuel Mugel and Enrique Lizaso}\n",
    "}\n",
    "\n",
    "@ARTICLE{2020arXiv200614510E,\n",
    "       author = {{Egger}, Daniel J. and {Gambella}, Claudio and {Marecek}, Jakub and {McFaddin}, Scott and {Mevissen}, Martin and {Raymond}, Rudy and {Simonetto}, Andrea and {Woerner}, Stefan and {Yndurain}, Elena},\n",
    "        title = \"{Quantum Computing for Finance: State of the Art and Future Prospects}\",\n",
    "      journal = {arXiv e-prints},\n",
    "     keywords = {Quantum Physics, Quantitative Finance - Statistical Finance},\n",
    "         year = 2020,\n",
    "        month = jun,\n",
    "          eid = {arXiv:2006.14510},\n",
    "        pages = {arXiv:2006.14510},\n",
    "archivePrefix = {arXiv},\n",
    "       eprint = {2006.14510},\n",
    " primaryClass = {quant-ph},\n",
    "       adsurl = {https://ui.adsabs.harvard.edu/abs/2020arXiv200614510E},\n",
    "      adsnote = {Provided by the SAO/NASA Astrophysics Data System}\n",
    "}\n",
    "\n",
    "@article{10.2307/2975974,\n",
    " ISSN = {00221082, 15406261},\n",
    " URL = {http://www.jstor.org/stable/2975974},\n",
    " author = {Harry Markowitz},\n",
    " journal = {The Journal of Finance},\n",
    " number = {1},\n",
    " pages = {77--91},\n",
    " publisher = {[American Finance Association, Wiley]},\n",
    " title = {Portfolio Selection},\n",
    " urldate = {2022-12-07},\n",
    " volume = {7},\n",
    " year = {1952}\n",
    "}\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pq-dev",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.15 (default, Nov 10 2022, 13:17:42) \n[Clang 14.0.6 ]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "5fea01cac43c34394d065c23bb8c1e536fdb97a765a18633fd0c4eb359001810"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}