{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 量子金融应用:投资组合优化\n", "\n", " Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 概览\n", "\n", "当前量子计算应用到金融问题上的解决方案通常可分为三类量子算法,即量子模拟,量子优化以及量子机器学习 [1,2]。许多的金融问题本质上是一个组合优化问题,解决这些问题的算法通常具有较高的时间复杂度,实现难度较大。得益于量子计算强大的计算性能,未来有望通过量子算法解决这些复杂问题。\n", "\n", "量桨的 Quantum Finance 模块主要讨论的是量子优化部分的内容,即如何通过一些量子算法解决实际金融应用中的优化问题。本文主要介绍如何使用量子算法求解主动投资管理中投资组合优化问题。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 投资组合优化问题\n", "\n", "投资组合是金融投资的集合,比如股票、债权、现金等。投资组合优化是许多主动型投资管理者需要面对的问题,它需要从业者应用相关数学理论方法,根据目标收益和风险对一笔资金进行投资,以期在收益一定的情况下风险最小化或者是风险一定的情况下投资收益最大化。\n", "\n", "对投资组合优化的一个具体描述如下:假如你是一位资产管理人,想要将数额为 $K$ 的资金一次性投入到 $N$ 个可投资的项目中,各项目都有自己的投资回报率和风险,你的目标就是在考虑到市场影响和交易费用的的基础上找到一个最佳的投资组合空间,使得该笔资产以最优的投资方案实施。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 编码投资组合优化问题\n", "\n", "为了将投资组合优化问题转化成一个参数化量子电路(parameterized quantum circuits, PQC)可解的问题,我们首先需要编码该问题的哈密顿量。\n", "为了方便建模实现,需要做两点假设对问题加以限制:\n", "* 每个项目都是等额投资的,\n", "* 给定的预算是投资一个项目金额的整数倍,且必须全部花完。\n", "\n", "在该计算模型中我们把对项目的投资金额单位化,即如果投资预算为 $3$,那么就要投 $3$ 个项目。因为实际的投资中,预算是有限的,而可投资的项目是很多的,所以在设定参数时也要注意可投资项目数是要大于预算的。\n", "\n", "在投资组合的基本理论中,投资组合的总体风险与项目间的协方差有关,而协方差与任意两项目的相关系数成正比。相关系数越小,其协方差就越小,投资组合的总体风险也就越小 [3]。\n", "在这里我们采用均值方差组合优化的方法,给出该问题的建模方程:\n", "\n", "$$\n", "\\omega = \\max _{x \\in\\{0,1\\}^{n}} \\mu^{T} x - q x^{T} S x \\quad\\quad \\tag{1}\n", "\\text { subject to: } \\mathbb{1}^{T} x=B,\n", "$$\n", "\n", "该式子中各符号代表的含义如下:\n", "* $x\\in {\\{0,1\\}}^n$ 表示一个向量,其中每一个元素均为二进制变量,即如果资产 $i$ 被投资了,则 $x_i = 1$,如果没有被选择,则 $x_i = 0$ \n", "* $\\mu \\in \\mathbb{R}^n$ 表示投资每个项目的预期回报率\n", "* $S \\in \\mathbb{R}^{n \\times n}$ 表示各投资项目回报率之间的协方差矩阵\n", "* $q > 0$ 表示做出该投资决定的风险系数\n", "* $\\mathbb{1}$ 表示 $n$ 维值全为 $1$ 向量\n", "* $B$ 代表投资预算,即我们可以投资的项目数\n", "\n", "\n", "根据模型方程,可以给出损失函数:\n", "\n", "$$\n", "C_x = q \\sum_i \\sum_j S_{ji}x_ix_j - \\sum_{i}x_i \\mu_i + A \\left(B - \\sum_i x_i\\right)^2, \\tag{2}\n", "$$\n", "\n", "其中,$S_{ij}$ 表示协方差矩阵 $S$ 的内部元素。\n", "\n", "由于要对损失函数做梯度下降优化,所以在定义时就根据模型的方程做了一定修改:其中第一项为风险项,表示该笔投资的风险;第二项为投资收益项;$A$ 为惩罚参数,通常设置为一个较大的数字,该项限定一笔资金预算 $B$ 必须均匀的投入到不同的投资项目中。\n", "\n", "现在我们需要将损失函数转为一个哈密顿量,从而完成投资组合优化问题的编码。每一个二进制变量可以取 $0$ 和 $1$ 两个值,分别对应量子态 $|0\\rangle$ 和 $|1\\rangle$。每个二进制变量都对应一个量子比特,所以我们需要 $n$ 个量子比特来解决投资组合优化问题。\n", "因为我们的变量 $x_i$ 的值为 $0$ 和 $1$,所以我们要构造一个本征值和它对应的哈密顿量。泡利 $Z$ 的本征值为 $\\pm 1$,于是我们构造的哈密顿量为 $\\frac{I-Z}{2}$, 对应的本征值即为 $0$ 和 $1$。我们现在将二进制变量映射到泡利 $Z$ 矩阵上,从而使 $C_x$ 转化成哈密顿矩阵:\n", "\n", "$$\n", "x_{i} \\mapsto \\frac{I-Z_{i,}}{2}, \\tag{3}\n", "$$\n", "\n", "这里 $Z_{i} = I \\otimes I \\otimes \\ldots \\otimes Z \\otimes \\ldots \\otimes I$,也就是说 $Z$ 作用在第 $i$ 个量子比特上。通过这个映射,如果一个编号为 $i$ 的量子比特的量子态为 $|1\\rangle$,那么对应的二进制变量的取值为 $x_{i} |1\\rangle = \\frac{I-Z_{i}}{2} |1\\rangle = 1|1\\rangle $,也就是说该项目是我们要投资的。同样地,对于量子态为 $|0\\rangle$的量子比特 $i$,它所对应的二进制变量的取值为 $x_{i}|0\\rangle = \\frac{I-Z_{i}}{2} |0\\rangle = 0 |0\\rangle $。\n", "\n", "我们用上述映射将 $C_x$ 转化成量子比特数为 $n$ 的系统的哈密顿矩阵 $H_C$,从而实现了投资组合优化问题的量子化。这个哈密顿矩阵 $H_C$ 的基态即为投资组合优化问题的最优解。在接下来的部分,我们将展示如何用参数化量子电路找到这个矩阵的基态,也就是对应最小本征值的本征态。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Paddle Quantum 实现\n", "\n", "要在量桨上实现用参数化量子电路解决量子金融中的投资组合优化问题,首先要做的便是加载需要用到的包。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-05-17T08:00:15.901429Z", "start_time": "2021-05-17T08:00:12.708945Z" } }, "outputs": [], "source": [ "#加载需要的包\n", "import numpy as np\n", "import pandas as pd\n", "import datetime\n", "\n", "#加载飞桨,量桨相关的模块\n", "import paddle\n", "import paddle_quantum\n", "from paddle_quantum.ansatz import Circuit\n", "from paddle_quantum.finance import DataSimulator, portfolio_optimization_hamiltonian" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 准备实验数据\n", "在本问题中,我们选定的投资项目为股票。对于实验测试要用的数据,提供了两种选择:\n", "* 第一种方法是根据设定的条件,随机生成实验数据。\n", "\n", "如果采用这种方法准备数据,用户在初始化数据时,就需要给出可投资股票的名字列表,交易数据的开始日期和结束日期。" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "num_assets = 7 # 可投资的项目数量\n", "stocks = [(\"STOCK%s\" % i) for i in range(num_assets)] \n", "data = DataSimulator(stocks=stocks, start=datetime.datetime(2016, 1, 1), end=datetime.datetime(2016, 1, 30)) \n", "data.randomly_generate() # 随机生成实验数据" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2021-05-17T08:00:16.212260Z", "start_time": "2021-05-17T08:00:15.918792Z" } }, "source": [ "* 第二种方法是用户可以选择读取本地收集到的真实数据集用于实验。考虑到文件中包含的股票数可能会很多,用户可以指定用于该实验的股票数量,即上面初始化的 `num_assets`。\n", "\n", "我们收集了 $12$ 支股票 $35$ 个交易日的收盘价格存放到 `realStockData_12.csv` 文件中,在这里我们只选择读取前 $7$ 个股票的信息。\n", "\n", "在本教程中,我们选择读取真实数据作为实验数据。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[16.87, 17.18, 17.07, 17.15, 16.66, 16.79, 16.69, 16.99, 16.76, 16.52, 16.33, 16.39, 16.45, 16.0, 16.09, 15.54, 13.99, 14.6, 14.63, 14.77, 14.62, 14.5, 14.79, 14.77, 14.65, 15.03, 15.37, 15.2, 15.24, 15.59, 15.58, 15.23, 15.04, 14.99, 15.11, 14.5], [32.56, 32.05, 31.51, 31.76, 31.68, 32.2, 31.46, 31.68, 31.39, 30.49, 30.53, 30.46, 29.87, 29.21, 30.11, 28.98, 26.63, 27.62, 27.64, 27.9, 27.5, 28.67, 29.08, 29.08, 29.95, 30.8, 30.42, 29.7, 29.65, 29.85, 29.25, 28.9, 29.33, 30.11, 29.67, 29.59], [5.4, 5.48, 5.46, 5.49, 5.39, 5.47, 5.46, 5.53, 5.5, 5.47, 5.39, 5.35, 5.37, 5.24, 5.26, 5.08, 4.57, 4.44, 4.5, 4.56, 4.52, 4.59, 4.66, 4.67, 4.66, 4.72, 4.84, 4.81, 4.84, 4.88, 4.89, 4.82, 4.74, 4.84, 4.79, 4.63], [3.71, 3.75, 3.73, 3.79, 3.72, 3.77, 3.76, 3.74, 3.78, 3.71, 3.61, 3.58, 3.61, 3.53, 3.5, 3.42, 3.08, 2.95, 3.04, 3.05, 3.05, 3.13, 3.12, 3.14, 3.11, 3.07, 3.23, 3.3, 3.31, 3.3, 3.33, 3.31, 3.22, 3.31, 3.25, 3.12], [5.72, 5.75, 5.74, 5.81, 5.69, 5.79, 5.77, 5.8, 5.89, 5.78, 5.7, 5.69, 5.75, 5.7, 5.71, 5.54, 4.99, 4.89, 4.94, 5.08, 5.39, 5.35, 5.23, 5.26, 5.19, 5.18, 5.31, 5.33, 5.31, 5.38, 5.39, 5.41, 5.28, 5.3, 5.38, 5.12], [7.62, 7.56, 7.68, 7.75, 7.79, 7.84, 7.82, 7.8, 7.92, 7.96, 7.93, 7.87, 7.86, 7.82, 7.9, 7.7, 6.93, 6.91, 7.18, 7.31, 7.35, 7.53, 7.47, 7.48, 7.35, 7.33, 7.46, 7.47, 7.39, 7.47, 7.48, 8.06, 8.02, 8.01, 8.11, 7.87], [3.7, 3.7, 3.68, 3.7, 3.63, 3.66, 3.63, 3.63, 3.66, 3.63, 3.6, 3.59, 3.63, 3.6, 3.61, 3.54, 3.19, 3.27, 3.27, 3.31, 3.3, 3.32, 3.33, 3.38, 3.36, 3.34, 3.39, 3.39, 3.37, 3.42, 3.43, 3.37, 3.32, 3.36, 3.37, 3.3]]\n" ] } ], "source": [ "df = pd.read_csv('realStockData_12.csv')\n", "dt = []\n", "for i in range(num_assets): \n", " mylist = df['closePrice'+str(i)].tolist()\n", " dt.append(mylist) \n", "print(dt) # 输出从文件中读取的七个股票在35个交易日中的收盘价格\n", "\n", "data.set_data(dt) # 指定实验数据为用户读取的数据" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 编码哈密顿量\n", "\n", "这里我们将式(2)中的二进制变量用式(3)替换,从而构建哈密顿量 $H_C$。\n", "\n", "在编码哈密顿量的过程中,首先需要计算各股票回报率之间的协方差矩阵 $S$。量桨平台的 finance 模块有支持计算该协方差矩阵的函数,用户可以直接调用。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "s = data.get_asset_return_covariance_matrix()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "第二个是需要计算出各个股票的平均投资回报率向量 $\\mu$。同样的,量桨也提供有计算各股票平均投资回报率的函数。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "mu = data.get_asset_return_mean_vector()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "下面根据设定和计算出来的参数来构建哈密顿量,这里我们设置惩罚参数为可投资的股票数量。" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "q = 0.5 # 风险系数\n", "budget = num_assets // 2 # 资金预算\n", "penalty = num_assets # 惩罚参数 \n", "hamiltonian = portfolio_optimization_hamiltonian(penalty, mu, s, q, budget)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 计算损失函数\n", "\n", "我们使用 $U_3(\\vec{\\theta})$ 和 $\\text{CNOT}$ 门构造的参数化量子电路,通过调用量桨内置的 [`complex_entangled_layer()`](https://qml.baidu.com/api/paddle_quantum.ansatz.circuit.html#Circuit.complex_entangled_layer) 构造实现。该电路会返回一个输出态 $|\\vec{\\theta}\\rangle$,根据该参数便可以计算投资组合优化问题在经典-量子混合模型下损失的函数:\n", "\n", "$$\n", "L(\\vec{\\theta}) = \\langle\\vec{\\theta}|H_C|\\vec{\\theta}\\rangle.\n", "\\tag{4}\n", "$$\n", "\n", "之后我们利用经典的优化算法寻找最优参数 $\\vec{\\theta}^*$。下面的代码给出了通过量桨和飞桨搭建网络的过程。" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "class PONet(paddle.nn.Layer):\n", "\n", " def __init__(self, num_qubits, p, dtype=\"float64\"):\n", " super(PONet, self).__init__()\n", "\n", " self.depth = p\n", " self.num_qubits = num_qubits\n", " self.cir = Circuit(self.num_qubits)\n", " self.cir.complex_entangled_layer(depth=self.depth)\n", "\n", "\n", " def forward(self):\n", " \"\"\"\n", " 前向传播\n", " \"\"\"\n", " state = self.cir(init_state)\n", " loss = loss_func(state)\n", "\n", " return loss, self.cir" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 训练量子神经网络\n", "\n", "定义好了量子神经网络后,我们使用梯度下降的方法来更新其中的参数,使得式(4)的期望值最小。" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "SEED = 1000 # 随机数种子\n", "p = 2 # 量子电路的层数\n", "ITR = 600 # 迭代次数\n", "LR = 0.4 # 梯度下降优化速率 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "使用飞桨,优化上面定义的网络。" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "循环数: 50 损失: 0.0399189\n", "循环数: 100 损失: 0.0098760\n", "循环数: 150 损失: 0.0085572\n", "循环数: 200 损失: 0.0074596\n", "循环数: 250 损失: 0.0066504\n", "循环数: 300 损失: 0.0061929\n", "循环数: 350 损失: 0.0059874\n", "循环数: 400 损失: 0.0059097\n", "循环数: 450 损失: 0.0058763\n", "循环数: 500 损失: 0.0058761\n", "循环数: 550 损失: 0.0058756\n", "循环数: 600 损失: 0.0058689\n" ] } ], "source": [ "# 比特数量\n", "num_qubits = len(mu)\n", "# 固定随机数种子\n", "paddle.seed(SEED)\n", "# 定义量子神经网络\n", "net = PONet(num_qubits, p)\n", "# 定义初始态\n", "init_state = paddle_quantum.state.zero_state(num_qubits)\n", "# 定义损失函数\n", "loss_func = paddle_quantum.loss.ExpecVal(hamiltonian)\n", "# 使用 Adam 优化器\n", "opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())\n", "\n", "# 梯度下降优循环\n", "for itr in range(1, ITR + 1):\n", " # 运行上面定义的网络\n", " loss, cir = net()\n", " #计算梯度并优化\n", " loss.backward()\n", " opt.minimize(loss)\n", " opt.clear_grad()\n", " if itr % 50 == 0:\n", " print(\"循环数:\", itr,\" 损失:\", \"%.7f\"% loss.numpy())\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 理论最小损失值\n", "\n", "理论 $C_x$ 的最小值对应的是我们所构建的哈密顿量的最小特征值。所以我们希望参数化电路优化的损失函数的值接近理论最小值。对于小一点的 ``num_assets``,我们可以根据以下代码进行验证。" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "理论最小损失值: 0.0058722496\n", "实际最小损失值: 0.0058689117431640625\n" ] } ], "source": [ "H_C_matrix = hamiltonian.construct_h_matrix()\n", "print(\"理论最小损失值:\", np.linalg.eigvalsh(H_C_matrix)[0]) \n", "print(\"实际最小损失值:\", float(loss.numpy()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在这个例子中,上面参数化电路优化出来的的最小损失和理论最小损失是非常接近的,这代表着我们之后给出的投资方案是最优的。如果两个值不太吻合,可以通过改变随机种子 `SEED`,量子电路的层数 `p`,迭代次数 `ITR` 和梯度下降优化速率 `LR` 等参数重新计算。 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 解码量子答案\n", "\n", "当调用优化器求得损失函数的最小值以及相对应的一组参数 $\\vec{\\theta}^*$后,为了进一步求得投资组合优化问题的近似解,需要从电路输出的量子态 $|\\vec{\\theta}^*\\rangle$ 中解码出经典优化问题的答案。物理上,解码量子态需要对量子态进行测量,然后统计测量结果的概率分布:\n", "\n", "$$\n", "p(z) = |\\langle z|\\vec{\\theta}^*\\rangle|^2.\n", "\\tag{5}\n", "$$\n", "\n", "在量子参数化电路表达能力足够的情况下,某个比特串出现的概率越大,意味着其是投资组合优化问题最优解的可能性越大。\n", "\n", "量桨提供了查看参数化量子电路输出状态的测量结果概率分布的函数。\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "利用哈密顿量找到的解的比特串形式: 0100110\n" ] } ], "source": [ "# 模拟重复测量电路输出态 2048 次\n", "final_state = cir(init_state)\n", "prob_measure = final_state.measure(shots=2048)\n", "investment = max(prob_measure, key=prob_measure.get)\n", "print(\"利用哈密顿量找到的解的比特串形式:\",investment)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们的测量结果是表示投资组合优化问题答案的比特串:字符串中该位置为 $1$,表示该笔资产被选定投资。如上面的结果 `0100110` 就表示在可选的 $7$ 支可投资的项目中,选择了第二、第五、第六三支股票。同时,字符串中 $1$ 的数量应该和预算数相同。如果最后的情况不是这样,读者依然可以通过调整参数化量子电路的参数值或参数化量子电路的结构来获得更好的训练效果。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 结语\n", "\n", "本教程中,投资组合优化问题的最优解是在均值-方差组合优化方法基础上通过变分量子本征求解器(Variational Quantum Eigensolver, VQE)近似得到的。在给定投资预算和可投资项目信息以及投资风险的基础上,通过计算投资项目的回报率以及各投资项目回报率之间的协方差矩阵,应用参数化量子电路寻找最优的投资组合。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_______\n", "\n", "## 参考文献\n", "\n", "[1] Orus, Roman, Samuel Mugel, and Enrique Lizaso. \"Quantum computing for finance: Overview and prospects.\" [Reviews in Physics 4 (2019): 100028.](https://arxiv.org/abs/1807.03890)\n", "\n", "[2] Egger, Daniel J., et al. \"Quantum computing for Finance: state of the art and future prospects.\" [IEEE Transactions on Quantum Engineering (2020).](https://arxiv.org/abs/2006.14510)\n", "\n", "[3] Markowitz, H.M. (March 1952). \"Portfolio Selection\". [The Journal of Finance. 7 (1): 77–91. doi:10.2307/2975974. JSTOR 2975974.](https://www.jstor.org/stable/2975974)" ] } ], "metadata": { "interpreter": { "hash": "3b61f83e8397e1c9fcea57a3d9915794102e67724879b24295f8014f41a14d85" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }