{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quantum Fisher Information\n", "\n", " Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "In this tutorial, we briefly introduce the concepts of the classical and quantum Fisher information, along with their applications in quantum machine learning, and show how to compute them with Paddle Quantum." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Background\n", "\n", "The quantum Fisher information (QFI) originates from the field of quantum sensing and has become a versatile tool for studying parameterized quantum systems [[1]](https://arxiv.org/abs/2103.15191), such as characterizing overparameterization [[2]](https://arxiv.org/abs/2102.01659) and performing quantum natural gradient descent [[3]](https://arxiv.org/abs/1909.02108). The QFI is a quantum analogue of the classical Fisher information (CFI). The CFI characterizes the sensitivity of a parameterized **probability distribution** to parameter changes, while the QFI characterizes the sensitivity of a parameterized **quantum state** to parameter changes.\n", "\n", "In a traditional introduction, the CFI appears as a quantity in parameter estimation in mathematical statistics, which can be complicated and confusing for beginners. This tutorial instead introduces the CFI from a geometric point of view, which is not only helpful for intuitive understanding, but also makes it easier to see the relationship between the CFI and the QFI." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Classical Fisher information\n", "\n", "Let's consider the classical Fisher information first. Suppose we now have a parameterized probability distribution $p(\\boldsymbol{x};\\boldsymbol{\\theta})$. 
A natural question arises:\n", "\n", "- How much does a small change in the parameters change the probability distribution?\n", "\n", "Since the question sounds like a perturbation problem, a natural idea is to perform something like a Taylor expansion. But before expanding, we need to know which function to expand, i.e., we first need to quantify the change in the probability distribution. More formally, we need to define a distance measure between any two probability distributions, denoted by $d(p(\\boldsymbol{x};\\boldsymbol{\\theta}),p(\\boldsymbol{x};\\boldsymbol{\\theta}'))$, or $d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')$ for short.\n", "\n", "Generally, a valid distance measure should be non-negative and equal to zero if and only if the two points are identical, i.e.\n", "\n", "$$\n", "\\begin{aligned}\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')\\geq 0,\\\\\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}')=0~\\Leftrightarrow~\\boldsymbol{\\theta}=\\boldsymbol{\\theta}'.\n", "\\end{aligned}\n", "\\tag{1}\n", "$$\n", "\n", "Considering the expansion of a small distance $d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})$, the conditions above lead to\n", "\n", "$$\n", "\\begin{aligned}\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta})=0~\\Rightarrow~\\text{the zero order}=0,\\\\\n", "&d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\\geq 0~\\Rightarrow~\\boldsymbol{\\delta}=0~\\text{is a minimum}\n", "~\\Rightarrow~\\text{the first order}=0.\n", "\\end{aligned}\n", "\\tag{2}\n", "$$\n", "\n", "Thus, the second order is the lowest order that does not vanish in the expansion. 
So the expansion can be written as\n", "\n", "$$\n", "\\begin{aligned}\n", "d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\n", "=\\frac{1}{2}\\sum_{ij}\\delta_iM_{ij}\\delta_j+O(\\|\\boldsymbol{\\delta}\\|^3) \n", "=\\frac{1}{2} \\boldsymbol{\\delta}^T M \\boldsymbol{\\delta} + O(\\|\\boldsymbol{\\delta}\\|^3),\n", "\\end{aligned}\n", "\\tag{3}\n", "$$\n", "\n", "where\n", "\n", "$$\n", "M_{ij}(\\boldsymbol{\\theta})=\\left.\\frac{\\partial^2}{\\partial\\delta_i\\partial\\delta_j}d(\\boldsymbol{\\theta},\\boldsymbol{\\theta}+\\boldsymbol{\\delta})\\right|_{\\boldsymbol{\\delta}=0},\n", "\\tag{4}\n", "$$\n", "\n", "is exactly the Hessian matrix of the distance expansion, which is called the [metric](http://en.wikipedia.org/wiki/Metric_tensor) of the manifold in the context of differential geometry. The brief derivation above tells us that we can approximate a small distance as a quadratic form in the parameter change, as shown in Fig. 1, and the coefficient matrix of the quadratic form is exactly the Hessian matrix from the distance expansion, up to a $1/2$ factor.\n", "\n", "![feature map](./figures/FIM-fig-Sphere-metric.png \"Figure 1. Approximate a small distance on the 2-sphere as a quadratic form\")\n", "
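To make the expansion in Eq. (3) concrete, here is a minimal NumPy sketch (independent of Paddle Quantum) for a single-parameter example. It takes the KL divergence between two Bernoulli distributions as the distance measure, which is the standard choice leading to the CFI, and checks that for a small shift $\delta$ it matches the quadratic form $\frac{1}{2}\delta^2 F(\theta)$, where $F(\theta)=1/(\theta(1-\theta))$ is the well-known closed-form Fisher information of a Bernoulli distribution. The function names here are illustrative, not part of any library API.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def bernoulli_fisher(theta):
    """Closed-form classical Fisher information of Bernoulli(theta)."""
    return 1.0 / (theta * (1.0 - theta))

theta, delta = 0.3, 1e-3

# Exact distance versus its quadratic approximation from Eq. (3)
exact = kl_divergence(theta, theta + delta)
quadratic = 0.5 * delta**2 * bernoulli_fisher(theta)
print(exact, quadratic)  # the two agree up to O(delta^3)
```

Note that the KL divergence is not symmetric, so it is not a distance in the strict mathematical sense, but it satisfies the conditions in Eq. (1) and therefore admits the expansion in Eq. (3); its Hessian is exactly the classical Fisher information.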