torch.nn.init.uniform_(tensor, a=0, b=1)
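Fills the input tensor with values sampled from the uniform distribution $U(a, b)$.
Parameters:
tensor: an n-dimensional torch.Tensor
a: the lower bound of the uniform distribution
b: the upper bound of the uniform distribution
Example: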
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.uniform_(w)
"""
tensor([[0.2116, 0.3085, 0.5448, 0.6113, 0.7697],
        [0.8300, 0.2938, 0.4597, 0.4698, 0.0624],
        [0.5034, 0.1166, 0.3133, 0.3615, 0.3757]])
"""
The uniform distribution in detail:
If $x$ follows a uniform distribution, written $x \sim U(a,b)$, its probability density function (which describes how likely each value of the random variable is) is
$f(x)=\begin{cases}\frac{1}{b-a}, & a<x<b \\ 0, & \text{otherwise}\end{cases}$
Its expectation and variance are
$\begin{array}{c}E(x)=\int_{-\infty}^{\infty} x f(x)\,dx=\frac{1}{2}(a+b) \\ D(x)=E\left(x^{2}\right)-[E(x)]^{2}=\frac{(b-a)^{2}}{12}\end{array}$
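The expectation and variance formulas above can be checked empirically. The following is a minimal sketch, not part of the original post: the bounds `a = -2`, `b = 4`, the tensor size, and the fixed seed are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

a, b = -2.0, 4.0                  # illustrative bounds (assumption)
w = torch.empty(1000, 1000)       # one million samples
nn.init.uniform_(w, a=a, b=b)

expected_mean = (a + b) / 2       # E(x) = (a + b) / 2
expected_var = (b - a) ** 2 / 12  # D(x) = (b - a)^2 / 12

sample_mean = w.mean().item()
sample_var = w.var().item()
print(f"mean: sample={sample_mean:.4f}, theory={expected_mean:.4f}")
print(f"var:  sample={sample_var:.4f}, theory={expected_var:.4f}")
```

With this many samples, the sample statistics should agree with the theoretical values to about two decimal places.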
torch.nn.init.normal_(tensor, mean=0.0, std=1.0)
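Fills the input tensor with values drawn from the normal distribution $N(\text{mean}, \text{std}^{2})$ with the given mean and standard deviation.
Parameters:
tensor: an n-dimensional torch.Tensor
mean: the mean of the normal distribution
std: the standard deviation of the normal distribution
Example: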
import torch

w = torch.empty(3, 5)
torch.nn.init.normal_(w, mean=0, std=1)
"""
tensor([[-1.3903,  0.4045,  0.3048,  0.7537, -0.5189],
        [-0.7672,  0.1891, -0.2226,  0.2913,  0.1295],
        [ 1.4719, -0.3049,  0.3144, -1.0047, -0.5424]])
"""
The normal distribution in detail:
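If the random variable $x$ follows a normal distribution, written $x \sim N\left(\mu, \sigma^{2}\right)$, its probability density function is
$f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp \left(-\frac{(x-\mu)^{2}}{2 \sigma^{2}}\right)$
Some notable probabilities for the normal distribution:
$P(\mu-\sigma<x<\mu+\sigma) \approx 68.27\%$
$P(\mu-2\sigma<x<\mu+2\sigma) \approx 95.45\%$
$P(\mu-3\sigma<x<\mu+3\sigma) \approx 99.73\%$
The normal distribution with $\mu=0$ and $\sigma=1$ is the standard normal distribution.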
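For a normal distribution, roughly 68.3%, 95.4%, and 99.7% of samples fall within one, two, and three standard deviations of the mean. The sketch below checks this on a tensor filled by `normal_`. The values `mean = 1.0`, `std = 2.0`, the tensor size, and the seed are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

mean, std = 1.0, 2.0           # illustrative values (assumption)
w = torch.empty(1000, 1000)    # one million samples
nn.init.normal_(w, mean=mean, std=std)

# Fraction of samples lying within k standard deviations of the mean.
fractions = {}
for k in (1, 2, 3):
    fractions[k] = ((w - mean).abs() < k * std).float().mean().item()
    print(f"within {k} std: {fractions[k]:.4f}")
```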
torch.nn.init.xavier_uniform_(tensor, gain=1.0)
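Also known as Glorot initialization. Following the method described by Glorot, X. & Bengio, Y. (2010) in Understanding the difficulty of training deep feedforward neural networks, the input tensor is filled with values sampled from the uniform distribution $U(-a, a)$, where $a$ is given by the formula below (fan_in and fan_out are the numbers of input and output units of the layer):
$a=\text{gain} \times \sqrt{\frac{6}{\text{fan\_in}+\text{fan\_out}}}$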
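To make the bound concrete, the sketch below computes $a$ by hand for a 3×5 weight and checks that every sampled value lies inside $[-a, a]$. It assumes the usual convention for a 2-D weight of shape (out_features, in_features), so fan_in = 5 and fan_out = 3. The seed is arbitrary.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(3, 5)                   # shape (fan_out, fan_in) for a 2-D weight
fan_out, fan_in = w.shape
gain = nn.init.calculate_gain('relu')   # recommended gain for ReLU, sqrt(2)

# Bound from the formula: a = gain * sqrt(6 / (fan_in + fan_out))
a = gain * math.sqrt(6.0 / (fan_in + fan_out))

nn.init.xavier_uniform_(w, gain=gain)
max_abs = w.abs().max().item()
print(f"a = {a:.4f}, largest |w| = {max_abs:.4f}")
```

Here $a = \sqrt{2} \times \sqrt{6/8} = \sqrt{1.5} \approx 1.2247$, so no entry of `w` should exceed that magnitude.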
Example:
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
"""
tensor([[ 0.7695, -0.7687, -0.2561, -0.5307,  0.5195],
        [-0.6187,  0.4913,  0.3037, -0.6374,  0.9725],
        [-0.2658, -0.4051, -1.1006, -1.1264, -0.1310]])
"""
torch.nn.init.xavier_normal_(tensor, gain=1.0)
Also known as Glorot initialization. Following the method described by Glorot, X. & Bengio, Y. (2010) in Understanding the difficulty of training deep feedforward neural networks, the input tensor is filled with values sampled from the normal distribution $N\left(0, \text{std}^{2}\right)$, where std is given by:
$\text{std}=\text{gain} \times \sqrt{\frac{2}{\text{fan\_in}+\text{fan\_out}}}$
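The std formula can be compared against the sample standard deviation of an initialized tensor. This is a rough sketch: the 400×600 shape is an arbitrary choice, large enough for the estimate to be stable, and the seed is arbitrary too.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(400, 600)    # shape (fan_out, fan_in) for a 2-D weight
fan_out, fan_in = w.shape
gain = 1.0

# std = gain * sqrt(2 / (fan_in + fan_out))
expected_std = gain * math.sqrt(2.0 / (fan_in + fan_out))

nn.init.xavier_normal_(w, gain=gain)
sample_std = w.std().item()
print(f"std: sample={sample_std:.5f}, formula={expected_std:.5f}")
```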
Parameters:
tensor: an n-dimensional torch.Tensor
gain: an optional scaling factor
Example:
import torch

w = torch.arange(10).view(2, -1).type(torch.float32)
torch.nn.init.xavier_normal_(w)
"""
tensor([[-0.3139, -0.3557,  0.1285, -0.9556,  0.3255],
        [-0.6212,  0.3405, -0.4150, -1.3227, -0.0069]])
"""
torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
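Also known as He initialization. Following the method described by He, K. et al. (2015) in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, the input tensor is filled with values sampled from the uniform distribution $U(-\text{bound}, \text{bound})$, where bound is given by the formula below (fan_mode is fan_in or fan_out, depending on the mode argument):
$\text{bound}=\text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$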
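The sketch below evaluates the bound for both modes on a 3×5 weight and checks that the sampled values respect it. It assumes the 2-D convention fan_in = number of columns and fan_out = number of rows. The dictionary `bounds` is just a helper for this illustration.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

gain = nn.init.calculate_gain('relu')   # sqrt(2) for ReLU

bounds = {}
for mode in ('fan_in', 'fan_out'):
    w = torch.empty(3, 5)               # fan_in = 5, fan_out = 3
    fan = w.shape[1] if mode == 'fan_in' else w.shape[0]
    # bound = gain * sqrt(3 / fan_mode)
    bounds[mode] = gain * math.sqrt(3.0 / fan)
    nn.init.kaiming_uniform_(w, mode=mode, nonlinearity='relu')
    within_bound = w.abs().max().item() <= bounds[mode] + 1e-6
    print(f"{mode}: bound={bounds[mode]:.4f}, all values within bound: {within_bound}")
```

With fan_in the bound is $\sqrt{2}\times\sqrt{3/5}\approx 1.0954$, and with fan_out it is $\sqrt{2}\times\sqrt{3/3}\approx 1.4142$.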
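Parameters:
tensor: an n-dimensional torch.Tensor
a: the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
mode: 'fan_in' (default) or 'fan_out'; 'fan_in' preserves the magnitude of the variance of the weights in the forward pass, 'fan_out' preserves it in the backward pass
nonlinearity: the name of the non-linear function; recommended for use only with 'relu' or 'leaky_relu' (default)
Example:
import torch

w = torch.empty(3, 5)
torch.nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
"""
tensor([[-0.4362, -0.8177, -0.7034,  0.7306, -0.6457],
        [-0.5749, -0.6480, -0.8016, -0.1434,  0.0785],
        [ 1.0369, -0.0676,  0.7430, -0.2484, -0.0895]])
"""
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')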
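Also known as He initialization. Following the method described by He, K. et al. (2015) in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, the input tensor is filled with values sampled from the normal distribution $N\left(0, \text{std}^{2}\right)$, where std is given by:
$\text{std}=\frac{\text{gain}}{\sqrt{\text{fan\_mode}}}$
Parameters:
tensor: an n-dimensional torch.Tensor
a: the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
mode: 'fan_in' (default) or 'fan_out'
nonlinearity: the name of the non-linear function; recommended for use only with 'relu' or 'leaky_relu' (default)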
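For `kaiming_normal_`, the PyTorch documentation gives std = gain / sqrt(fan_mode). The sketch below compares that value with the sample standard deviation for a ReLU layer. The 256×512 shape and the seed are illustrative assumptions.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(256, 512)               # shape (fan_out, fan_in); fan_in = 512
fan_in = w.shape[1]
gain = nn.init.calculate_gain('relu')   # sqrt(2) for ReLU

# std = gain / sqrt(fan_mode), here with mode='fan_in'
expected_std = gain / math.sqrt(fan_in)

nn.init.kaiming_normal_(w, mode='fan_in', nonlinearity='relu')
sample_std = w.std().item()
print(f"std: sample={sample_std:.5f}, formula={expected_std:.5f}")
```

Here the formula gives $\sqrt{2/512} = 0.0625$.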
torch.nn.init.orthogonal_(tensor, gain=1)
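Fills the input tensor with a (semi-)orthogonal matrix, as described in Saxe, A. et al. (2013), Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. The input tensor must have at least 2 dimensions; for tensors with more than 2 dimensions, the trailing dimensions are flattened.
Orthogonal initialization decorrelates the rows (or columns) of the weight matrix, such as flattened convolution kernels, which can make it easier for the model to learn effective parameters.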
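What "(semi-)orthogonal" means can be checked directly: for a wide matrix the rows are orthonormal, and for a tall matrix the columns are. A minimal sketch with the default gain of 1 follows. The names `wide` and `tall` and the tolerance are choices made for this illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Wide matrix (3 rows, 5 columns): rows are orthonormal, so W @ W^T = I_3.
wide = torch.empty(3, 5)
nn.init.orthogonal_(wide)
rows_orthonormal = torch.allclose(wide @ wide.T, torch.eye(3), atol=1e-5)

# Tall matrix (5 rows, 3 columns): columns are orthonormal, so W^T @ W = I_3.
tall = torch.empty(5, 3)
nn.init.orthogonal_(tall)
cols_orthonormal = torch.allclose(tall.T @ tall, torch.eye(3), atol=1e-5)

print(rows_orthonormal, cols_orthonormal)
```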
Parameters:
tensor: an n-dimensional torch.Tensor, where n >= 2
gain: an optional scaling factor
Example:
import torch

w = torch.empty(3, 5)
torch.nn.init.orthogonal_(w)
"""
tensor([[ 0.7395, -0.1503,  0.4474,  0.4321, -0.2090],
        [-0.2625,  0.0112,  0.6515, -0.4770, -0.5282],
        [ 0.4554,  0.6548,  0.0970, -0.4851,  0.3453]])
"""
torch.nn.init.sparse_(tensor, sparsity, std=0.01)
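Fills the 2-D input tensor as a sparse matrix, where the non-zero elements are drawn from the normal distribution $N\left(0, \text{std}^{2}\right)$, which is $N\left(0, 0.01^{2}\right)$ with the default std=0.01. See Martens, J. (2010), Deep learning via Hessian-free optimization.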
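In `sparse_`, the `sparsity` argument is the fraction of elements in each column that are set to zero. The sketch below counts the zeros per column to illustrate this. The 8×4 shape and `sparsity=0.25` are illustrative assumptions, chosen so that exactly 2 entries per column are zeroed.

```python
import torch
from torch import nn

torch.manual_seed(0)

w = torch.empty(8, 4)
nn.init.sparse_(w, sparsity=0.25, std=0.01)

# Each column should contain 0.25 * 8 = 2 zeros; the remaining entries are
# drawn from N(0, 0.01^2) and are nonzero with overwhelming probability.
zeros_per_column = (w == 0).sum(dim=0).tolist()
print(zeros_per_column)
```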
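Parameters:
tensor: a 2-dimensional torch.Tensor
sparsity: the fraction of elements in each column to be set to zero
std: the standard deviation of the normal distribution used to generate the non-zero values
Example:
import torch

w = torch.empty(3, 5)
torch.nn.init.sparse_(w, sparsity=0.1)
"""
tensor([[-0.0026,  0.0000,  0.0100,  0.0046,  0.0048],
        [ 0.0106, -0.0046,  0.0000,  0.0000,  0.0000],
        [ 0.0000, -0.0005,  0.0150, -0.0097, -0.0100]])
"""
torch.nn.init.constant_(tensor, val)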
Fills the input tensor with the constant value val.
Example:
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.constant_(w, 1.2)
"""
tensor([[1.2000, 1.2000, 1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000, 1.2000, 1.2000],
        [1.2000, 1.2000, 1.2000, 1.2000, 1.2000]])
"""
torch.nn.init.eye_(tensor)
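Initializes a 2-D tensor as the identity matrix. For a non-square tensor, ones are placed on the main diagonal and all other entries are zero.
Example:
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.eye_(w)
"""
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])
"""
torch.nn.init.zeros_(tensor)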
Fills the input tensor with zeros.
Example:
import torch
from torch import nn

w = torch.empty(3, 5)
nn.init.zeros_(w)
"""
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
"""
Example:
# `model` is assumed to be an nn.Sequential defined earlier; its structure is printed below.
print('module-----------')
print(model)
print('setup-----------')
# Apply Xavier uniform initialization to the weights of every Linear layer.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('relu'))
"""
module-----------
Sequential(
  (flatten): FlattenLayer()
  (linear1): Linear(in_features=784, out_features=512, bias=True)
  (activation): ReLU()
  (linear2): Linear(in_features=512, out_features=256, bias=True)
  (linear3): Linear(in_features=256, out_features=10, bias=True)
)
setup-----------
"""
Example:
# Initialize every parameter of the model (weights and biases alike) from U(0, 1).
for param in model.parameters():
    nn.init.uniform_(param)
Example:
def weights_init(m):
    # Choose the initialization by the module's class name.
    classname = m.__class__.__name__
    if classname.find('Conv2d') != -1 or classname.find('Linear') != -1:
        nn.init.xavier_normal_(m.weight)
        if m.bias is not None:  # layers created with bias=False have no bias tensor
            nn.init.constant_(m.bias, 0.0)

# apply() recursively visits every submodule of the network and calls the given function on each of them.
model.apply(weights_init)
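For reference, here is a self-contained variant of the same pattern that can be run as is. The small network, the 28×28 input size, and the name `init_weights` are hypothetical choices for illustration; the type check uses `isinstance` instead of matching on the class name.

```python
import torch
from torch import nn


def init_weights(m):
    """Xavier-normal weights and zero biases for Conv2d and Linear layers."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


# Hypothetical toy model: 1x28x28 input -> 4 feature maps of 26x26 -> 10 outputs.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 26 * 26, 10),
)

# apply() calls init_weights on every submodule and on the model itself.
model.apply(init_weights)

# Confirm that the biases of all Conv2d and Linear layers are now zero.
biases_are_zero = all(
    bool((m.bias == 0).all())
    for m in model.modules()
    if isinstance(m, (nn.Conv2d, nn.Linear))
)
out = model(torch.randn(2, 1, 28, 28))
print(biases_are_zero, tuple(out.shape))
```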
Author: Learner-. When reposting, please credit the original link: https://www.cnblogs.com/BlairGrowing/p/15981694.html