
PyTorch Tutorial 6.5: Custom Layers



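One factor behind the success of deep learning is the availability of a wide range of layers that can be composed in creative ways to design architectures suited to a wide variety of tasks. For instance, researchers have invented layers specifically for handling images, text, looping over sequential data, and performing dynamic programming. Sooner or later, you will encounter or invent a layer that does not yet exist in the deep learning framework. In these cases, you must build a custom layer. In this section, we show you how.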

PyTorch:
import torch
from torch import nn
from torch.nn import functional as F
from d2l import torch as d2l

MXNet:
from mxnet import np, npx
from mxnet.gluon import nn
from d2l import mxnet as d2l

npx.set_np()

JAX:
import jax
from flax import linen as nn
from jax import numpy as jnp
from d2l import jax as d2l

No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

TensorFlow:
import tensorflow as tf
from d2l import tensorflow as d2l

6.5.1. Layers without Parameters

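First, we construct a custom layer that has no parameters of its own. This should look familiar if you recall our introduction to modules in Section 6.1. The following CenteredLayer class simply subtracts the mean from its input. To build it, we only need to inherit from the base layer class and implement the forward propagation function.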

PyTorch:
class CenteredLayer(nn.Module):
  def __init__(self):
    super().__init__()

  def forward(self, X):
    return X - X.mean()

MXNet:
class CenteredLayer(nn.Block):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)

  def forward(self, X):
    return X - X.mean()

JAX:
class CenteredLayer(nn.Module):
  def __call__(self, X):
    return X - X.mean()

TensorFlow:
class CenteredLayer(tf.keras.Model):
  def __init__(self):
    super().__init__()

  def call(self, X):
    return X - tf.reduce_mean(X)

							 

Let's verify that our layer works as intended by feeding some data through it.

PyTorch:
layer = CenteredLayer()
layer(torch.tensor([1.0, 2, 3, 4, 5]))

tensor([-2., -1., 0., 1., 2.])

MXNet:
layer = CenteredLayer()
layer(np.array([1.0, 2, 3, 4, 5]))

array([-2., -1., 0., 1., 2.])

JAX:
layer = CenteredLayer()
layer(jnp.array([1.0, 2, 3, 4, 5]))

Array([-2., -1., 0., 1., 2.], dtype=float32)

TensorFlow:
layer = CenteredLayer()
layer(tf.constant([1.0, 2, 3, 4, 5]))

<tf.Tensor: shape=(5,), dtype=float32, numpy=array([-2., -1., 0., 1., 2.], dtype=float32)>

						
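One detail worth seeing on a batch: in PyTorch, `X.mean()` without a `dim` argument reduces over every element, so the layer centers the whole tensor rather than each column. The following is a minimal sketch of that behavior; the 2×2 input values are chosen here purely for illustration and do not come from the original text.

```python
import torch
from torch import nn


class CenteredLayer(nn.Module):
    """Parameter-free layer that subtracts the mean of all input elements."""

    def forward(self, X):
        # X.mean() with no dim argument reduces over every element of X.
        return X - X.mean()


layer = CenteredLayer()
X = torch.tensor([[1.0, 2.0], [3.0, 6.0]])  # illustrative values
Y = layer(X)

# The global mean is (1 + 2 + 3 + 6) / 4 = 3, so Y equals X - 3.
print(Y)
# The overall mean of Y is zero, but the per-column means need not be.
print(Y.mean(), Y.mean(dim=0))
```

If per-feature centering were wanted instead, one option would be `X - X.mean(dim=0, keepdim=True)`, which is a different layer from the one defined in this section.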

We can now incorporate our layer as a component in constructing more complex models.

PyTorch:
net = nn.Sequential(nn.LazyLinear(128), CenteredLayer())

MXNet:
net = nn.Sequential()
net.add(nn.Dense(128), CenteredLayer())
net.initialize()

JAX:
net = nn.Sequential([nn.Dense(128), CenteredLayer()])

TensorFlow:
net = tf.keras.Sequential([tf.keras.layers.Dense(128), CenteredLayer()])

							 

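As an extra sanity check, we can send random data through the network and check that the mean is in fact 0. Because we are working with floating-point numbers, finite precision (rounding error) may still leave a very small nonzero value.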

PyTorch:
Y = net(torch.rand(4, 8))
Y.mean()

tensor(0., grad_fn=<MeanBackward0>)

MXNet:
Y = net(np.random.rand(4, 8))
Y.mean()

array(3.783498e-10)

JAX: Here we use the init_with_output method, which returns both the output of the network and the parameters. In this case we only need the output.

Y, _ = net.init_with_output(d2l.get_key(), jax.random.uniform(d2l.get_key(),
                                                              (4, 8)))
Y.mean()

Array(5.5879354e-09, dtype=float32)

TensorFlow:
Y = net(tf.random.uniform((4, 8)))
tf.reduce_mean(Y)

<tf.Tensor: shape=(), dtype=float32, numpy=1.8626451e-09>

						
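Because the mean may come out as a tiny nonzero number, an automated check is more robust when it compares against zero with a tolerance instead of testing for exact equality. The sketch below shows one way to do that in PyTorch; the seed and the tolerance `atol=1e-6` are assumptions picked for illustration, not values from the original text.

```python
import torch
from torch import nn


class CenteredLayer(nn.Module):
    def forward(self, X):
        return X - X.mean()


torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable
net = nn.Sequential(nn.LazyLinear(128), CenteredLayer())
Y = net(torch.rand(4, 8))

# Compare against zero with an absolute tolerance rather than exact equality.
# atol=1e-6 is an assumed threshold, loose enough for float32 rounding error.
mean_is_zero = torch.isclose(Y.mean(), torch.tensor(0.0), atol=1e-6)
print(mean_is_zero.item())
```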

6.5.2. Layers with Parameters

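Now that we know how to define simple layers, let's move on to defining layers with parameters that can be adjusted through training. We can use built-in functions to create parameters, which provide some basic housekeeping functionality. In particular, they govern access, initialization, sharing, saving, and loading of model parameters. This way, among other benefits, we do not need to write custom serialization routines for every custom layer.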
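To make the housekeeping concrete, here is a minimal PyTorch sketch. `ScaleLayer` is a hypothetical layer invented for this illustration (it is not part of the original text); it shows that wrapping a tensor in `nn.Parameter` is enough for the module to expose it by name and include it in the state dict used for saving and loading.

```python
import torch
from torch import nn


class ScaleLayer(nn.Module):  # hypothetical layer, for illustration only
    def __init__(self, units):
        super().__init__()
        # Wrapping a tensor in nn.Parameter registers it with the module.
        self.scale = nn.Parameter(torch.ones(units))

    def forward(self, X):
        return X * self.scale


layer = ScaleLayer(3)
# Registered parameters are discoverable by name ...
print([name for name, _ in layer.named_parameters()])
# ... and are included in the state dict used for saving and loading.
state = layer.state_dict()

clone = ScaleLayer(3)
with torch.no_grad():
    clone.scale.fill_(5.0)    # make the clone differ before loading
clone.load_state_dict(state)  # restores scale to the saved values
print(torch.equal(clone.scale, layer.scale))
```

No serialization code was written for `ScaleLayer` itself; the base class handles it because the tensor was registered as a parameter.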

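Now let's implement our own version of the fully connected layer. Recall that this layer requires two parameters, one to represent the weight and the other for the bias. In this implementation, we bake in the ReLU activation as a default. The layer takes two input arguments, in_units and units, which denote the number of inputs and outputs, respectively.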

PyTorch:
class MyLinear(nn.Module):
  def __init__(self, in_units, units):
    super().__init__()
    self.weight = nn.Parameter(torch.randn(in_units, units))
    self.bias = nn.Parameter(torch.randn(units,))

  def forward(self, X):
    # Use the parameters directly (not .data) so autograd can track them
    linear = torch.matmul(X, self.weight) + self.bias
    return F.relu(linear)

							 

Next, we instantiate the MyLinear class and access its model parameters.

linear = MyLinear(5, 3)
linear.weight

Parameter containing:
tensor([[-1.2894e+00,  6.5869e-01, -1.3933e+00],
        [ 7.2590e-01,  7.1593e-01,  1.8115e-03],
        [-1.5900e+00,  4.1654e-01, -1.3358e+00],
        [ 2.2732e-02, -2.1329e+00,  1.8811e+00],
        [-1.0993e+00,  2.9763e-01, -1.4413e+00]], requires_grad=True)

						
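Beyond inspecting the weights, it can help to confirm that such a layer is actually trainable. The sketch below is a minimal check under one stated assumption: the forward pass uses the parameters themselves rather than their `.data` attribute, since `.data` bypasses autograd and would leave the gradients unset. `MyLinearSketch` is a hypothetical name used so the example is self-contained.

```python
import torch
from torch import nn
from torch.nn import functional as F


class MyLinearSketch(nn.Module):  # hypothetical name, mirrors MyLinear
    def __init__(self, in_units, units):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_units, units))
        self.bias = nn.Parameter(torch.randn(units))

    def forward(self, X):
        # Use the parameters themselves so autograd can track them.
        return F.relu(torch.matmul(X, self.weight) + self.bias)


linear = MyLinearSketch(5, 3)
out = linear(torch.rand(2, 5))
print(out.shape)  # a batch of 2 inputs mapped to 3 outputs each

out.sum().backward()
# After backward(), each parameter holds a gradient of matching shape.
print(linear.weight.grad.shape, linear.bias.grad.shape)
```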

MXNet:
class MyDense(nn.Block):
  def __init__(self, units, in_units, **kwargs):
    super().__init__(**kwargs)
    self.weight = self.params.get('weight', shape=(in_units, units))
    self.bias = self.params.get('bias', shape=(units,))

  def forward(self, x):
    linear = np.