{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "在PyTorch中,一切神经网络的核心是`autograd`模块。\n", "\n", "`autograd`模块为张量的所有操作提供自动微分。这是一个由运行定义的框架,这意味着你的反向传播是由代码的运行的方式定义的,并且每一次迭代都可能不同。\n", "\n", "让我们通过一些示例以更简单的方式看待这一点。\n", "\n", "# Tensor\n", "`torch.Tensor`是这个包的核心类。如果将`.requires_grad`属性设置为`True`,这将开始追踪它上的所有操作。当计算结束,你可以调用`.backward()`方法,自动计算所有梯度。该张量的梯度将累加到`.grad`属性中。 \n", "\n", "为了停止一个张量上的历史记录跟踪,你可以调用`.detach()`方法将它从计算历史中分离,并阻止将来计算中的跟踪。\n", "\n", "为了防止跟踪历史记录(和使用内存),你还可以将代码块包装在`with torch.no_grad():`中。这在评估模型时特别有用,因为模型可能存在带`requires_grad=True`的可训练参数,但事实上我们不需要梯度。\n", "\n", "还有一个类对于`autograd`实现非常重要——`Function`。\n", "\n", "`Tensor`和`Function`相互连接并建立一个无环图,该图对完整的计算历史进行编码。每个张量都有一个`.grad_fn`属性,该属性引用一个创建这个张量的`Function`(用户创建的张量除外,他们的`.grad_fn`属性为None)。\n", "\n", "如果你想计算导数,你可以在张量上调用`.backward()`。如果`Tensor`为标量(即,它包含一个元素数据),则无需为`.backward()`指定参数,但是如果它有更多元素,则需要指定gradient 参数为匹配形状的张量。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import torch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "创建一个张量,并设置`requires_grad=True`来跟踪计算。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[1., 1.],\n", " [1., 1.]], requires_grad=True)\n" ] } ], "source": [ "x = torch.ones(2, 2, requires_grad=True)\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "执行一项操作:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[3., 3.],\n", " [3., 3.]], grad_fn=)\n" ] } ], "source": [ "y = x + 2\n", "print(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`y`是由相加操作得来的张量,所以它有`grad_fn`属性。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "print(y.grad_fn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们在`y`上做更多的操作:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[27., 27.],\n", " [27., 27.]], grad_fn=) tensor(27., grad_fn=)\n" ] } ], "source": [ "z = y * y * 3\n", "out = z.mean()\n", "\n", "print(z, out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`.requires_grad_( ... 
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Gradients\n", "Let's backprop now." ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "out.backward()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Print the gradients d(out)/dx. Since $out = \\frac{1}{4}\\sum_i z_i$ and $z_i = 3(x_i + 2)^2$, we have $\\frac{\\partial\\, out}{\\partial x_i} = \\frac{3}{2}(x_i + 2)$, which equals $4.5$ at $x_i = 1$, so every entry of the gradient should be 4.5." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[4.5000, 4.5000],\n", "        [4.5000, 4.5000]])\n" ] } ], "source": [ "print(x.grad)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's look at an example of a vector-Jacobian product:" ] },
{ "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
 "tensor([-0.1796, -0.6436, -1.6582], grad_fn=<MulBackward0>)\n",
 "tensor([-0.3592, -1.2871, -3.3164], grad_fn=<MulBackward0>)\n",
 "tensor([-0.7184, -2.5743, -6.6329], grad_fn=<MulBackward0>)\n",
 "tensor([ -1.4368, -5.1485, -13.2657], grad_fn=<MulBackward0>)\n",
 "tensor([ -2.8737, -10.2970, -26.5314], grad_fn=<MulBackward0>)\n",
 "tensor([ -5.7474, -20.5940, -53.0628], grad_fn=<MulBackward0>)\n",
 "tensor([ -11.4948, -41.1880, -106.1257], grad_fn=<MulBackward0>)\n",
 "tensor([ -22.9896, -82.3761, -212.2513], grad_fn=<MulBackward0>)\n",
 "tensor([ -45.9791, -164.7521, -424.5026], grad_fn=<MulBackward0>)\n",
 "tensor([ -91.9583, -329.5043, -849.0052], grad_fn=<MulBackward0>)\n",
 "tensor([ -183.9165, -659.0085, -1698.0105], grad_fn=<MulBackward0>)\n"
] } ], "source": [ "x = torch.randn(3, requires_grad=True)\n", "\n", "y = x * 2\n", "print(y)\n", "while y.data.norm() < 1000:\n", "    y = y * 2\n", "    print(y)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now `y` is no longer a scalar. `torch.autograd` cannot compute the full Jacobian matrix directly, but if we only want the vector-Jacobian product, we simply pass the vector to `backward` as an argument:" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([2.0480e+02, 2.0480e+03, 2.0480e-01])\n" ] } ], "source": [ "v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)\n", "y.backward(v)\n", "\n", "print(x.grad)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "You can also stop autograd from tracking history on tensors with `.requires_grad=True` by wrapping the code block in `with torch.no_grad():`." ] },
{ "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "False\n" ] } ], "source": [ "print(x.requires_grad)\n", "print((x ** 2).requires_grad)\n", "\n", "with torch.no_grad():\n", "    print((x ** 2).requires_grad)" ] }
], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }