Activation functions

flax.nnx.celu(x, alpha=1.0)

Continuously-differentiable exponential linear unit activation.

Computes the element-wise function:

\[\begin{split}\mathrm{celu}(x) = \begin{cases} x, & x > 0\\ \alpha \left(\exp(\frac{x}{\alpha}) - 1\right), & x \le 0 \end{cases}\end{split}\]

For more information, see Continuously Differentiable Exponential Linear Units.

Parameters

x – input array
alpha – array or scalar (default: 1.0)

Returns

An array.
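
As an illustration of the CELU formula above, here is a minimal scalar sketch in pure Python. The helper name `celu_ref` is hypothetical and this is not the library's implementation; it only mirrors the two cases of the definition.

```python
import math


def celu_ref(x, alpha=1.0):
    """Scalar reference for the CELU formula (hypothetical helper)."""
    if x > 0:
        return x  # positive inputs pass through unchanged
    # Non-positive inputs follow alpha * (exp(x / alpha) - 1).
    return alpha * (math.exp(x / alpha) - 1.0)


for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, celu_ref(value), celu_ref(value, alpha=2.0))
```

Because the exponent is x / alpha rather than x, the slope at zero from the left is 1 for any alpha, which matches the slope from the right.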

flax.nnx.elu(x, alpha=1.0)

Exponential linear unit activation function.

Computes the element-wise function:

\[\begin{split}\mathrm{elu}(x) = \begin{cases} x, & x > 0\\ \alpha \left(\exp(x) - 1\right), & x \le 0 \end{cases}\end{split}\]

Parameters

x – input array
alpha – scalar or array of alpha values (default: 1.0)

Returns

An array.

See also

selu()

flax.nnx.gelu(x, approximate=True)

Gaussian error linear unit activation function.

If approximate=False, computes the element-wise function:

\[\mathrm{gelu}(x) = \frac{x}{2} \left(1 + \mathrm{erf} \left( \frac{x}{\sqrt{2}} \right) \right)\]

If approximate=True, uses the approximate formulation of GELU:

\[\mathrm{gelu}(x) = \frac{x}{2} \left(1 + \mathrm{tanh} \left( \sqrt{\frac{2}{\pi}} \left(x + 0.044715 x^3 \right) \right) \right)\]

For more information, see Gaussian Error Linear Units (GELUs), section 2.

Parameters

x – input array
approximate – whether to use the approximate or exact formulation.
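
To compare the two GELU formulations above, the following scalar sketch evaluates both with Python's `math` module. The names `gelu_exact` and `gelu_tanh` are hypothetical helpers for illustration, not part of the API.

```python
import math


def gelu_exact(x):
    """Exact form: x/2 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Tanh approximation with the 0.044715 cubic coefficient."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Print both values and their absolute difference at a few points.
for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    exact, approx = gelu_exact(value), gelu_tanh(value)
    print(value, exact, approx, abs(exact - approx))
```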

flax.nnx.glu(x, axis=-1)

Gated linear unit activation function.

Computes the function:

\[\mathrm{glu}(x) = x\left[\ldots, 0:\frac{n}{2}, \ldots\right] \cdot \mathrm{sigmoid} \left( x\left[\ldots, \frac{n}{2}:n, \ldots\right] \right)\]

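where the array is split into two along axis. The size of the axis dimension must be divisible by two.

Parameters

x – input array
axis – the axis along which the split should be computed (default: -1)

Returns

An array.

See also

sigmoid()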
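
The split-and-gate behaviour described above can be sketched for a one-dimensional list in pure Python. `glu_ref` and `sigmoid_ref` are hypothetical names, and the sketch covers a single axis only.

```python
import math


def sigmoid_ref(v):
    return 1.0 / (1.0 + math.exp(-v))


def glu_ref(xs):
    """GLU over a 1-D list: the first half is gated by sigmoid of the second half."""
    n = len(xs)
    if n % 2 != 0:
        raise ValueError("the length along the split axis must be divisible by 2")
    half = n // 2
    return [a * sigmoid_ref(b) for a, b in zip(xs[:half], xs[half:])]


# The gates are sigmoid(0) = 0.5, so each value is halved.
print(glu_ref([1.0, 2.0, 0.0, 0.0]))  # [0.5, 1.0]
```

Note that the output has half as many elements along the split axis as the input.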

flax.nnx.hard_sigmoid(x)

Hard sigmoid activation function.

Computes the element-wise function:

\[\mathrm{hard\_sigmoid}(x) = \frac{\mathrm{relu6}(x + 3)}{6}\]

Parameters: x – input array
Returns: An array.

See also

relu6()

flax.nnx.hard_silu(x)

Hard SiLU (swish) activation function.

Computes the element-wise function:

\[\mathrm{hard\_silu}(x) = x \cdot \mathrm{hard\_sigmoid}(x)\]

Both hard_silu() and hard_swish() are aliases for the same function.

Parameters: x – input array
Returns: An array.

See also

hard_sigmoid()
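
The hard sigmoid and hard SiLU formulas compose directly, as this scalar sketch shows. It assumes the usual definition relu6(x) = min(max(x, 0), 6), which this page references but does not spell out; all three helper names are hypothetical.

```python
def relu6_ref(x):
    # Assumed definition: clamp x to the interval [0, 6].
    return min(max(x, 0.0), 6.0)


def hard_sigmoid_ref(x):
    return relu6_ref(x + 3.0) / 6.0


def hard_silu_ref(x):
    return x * hard_sigmoid_ref(x)


for value in (-4.0, -3.0, 0.0, 1.0, 3.0, 4.0):
    print(value, hard_sigmoid_ref(value), hard_silu_ref(value))
```

In this sketch the hard sigmoid saturates at 0 for x <= -3 and at 1 for x >= 3, so the hard SiLU is zero below -3 and the identity above 3.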

flax.nnx.hard_swish(x)

Hard SiLU (swish) activation function.

Computes the element-wise function:

\[\mathrm{hard\_silu}(x) = x \cdot \mathrm{hard\_sigmoid}(x)\]

Both hard_silu() and hard_swish() are aliases for the same function.

Parameters: x – input array
Returns: An array.

See also

hard_sigmoid()

flax.nnx.hard_tanh(x)

Hard \(\mathrm{tanh}\) activation function.

Computes the element-wise function:

\[\begin{split}\mathrm{hard\_tanh}(x) = \begin{cases} -1, & x < -1\\ x, & -1 \le x \le 1\\ 1, & 1 < x \end{cases}\end{split}\]

Parameters: x – input array
Returns: An array.

flax.nnx.leaky_relu(x, negative_slope=0.01)

Leaky rectified linear unit activation function.

Computes the element-wise function:

\[\begin{split}\mathrm{leaky\_relu}(x) = \begin{cases} x, & x \ge 0\\ \alpha x, & x < 0 \end{cases}\end{split}\]

where \(\alpha\) = negative_slope.

Parameters

x – input array
negative_slope – array or scalar specifying the negative slope (default: 0.01)

Returns

An array.

See also

relu()

flax.nnx.log_sigmoid(x)

Log-sigmoid activation function.

Computes the element-wise function:

\[\mathrm{log\_sigmoid}(x) = \log(\mathrm{sigmoid}(x)) = -\log(1 + e^{-x})\]

Parameters: x – input array
Returns: An array.

See also

sigmoid()
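
Evaluating -log(1 + e^{-x}) literally overflows for large negative x. The sketch below uses an algebraically equivalent rearrangement, min(x, 0) - log1p(exp(-|x|)), which stays finite. This rearrangement is an illustrative choice, not a statement about how the library computes the function, and `log_sigmoid_ref` is a hypothetical name.

```python
import math


def log_sigmoid_ref(x):
    """Overflow-safe scalar log-sigmoid."""
    # exp(-|x|) is always in (0, 1], so it cannot overflow.
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


for value in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    print(value, log_sigmoid_ref(value))
```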

flax.nnx.log_softmax(x, axis=-1, where=None, initial=_UNSPECIFIED)

Log-Softmax function.

Computes the logarithm of the softmax function, which rescales elements to the range \([-\infty, 0)\).

\[\mathrm{log\_softmax}(x)_i = \log \left( \frac{\exp(x_i)}{\sum_j \exp(x_j)} \right)\]

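Parameters

x – input array
axis – the axis or axes along which the log_softmax should be computed. Either an integer or a tuple of integers.
where – elements to include in the log_softmax.

Returns

An array.

Note

If any input values are +inf, the result will be all NaN: this reflects the fact that inf / inf is not well-defined in the context of floating-point math.

See also

softmax()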
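
A pure-Python sketch of the log-softmax formula over a flat list follows. It subtracts the maximum before exponentiating, a common stabilisation that leaves the result unchanged; this is an assumption of the sketch, and `log_softmax_ref` is a hypothetical helper covering a single axis only.

```python
import math


def log_softmax_ref(xs):
    """log(exp(x_i) / sum_j exp(x_j)) for a 1-D list of finite floats."""
    m = max(xs)
    # log(sum_j exp(x_j)) = m + log(sum_j exp(x_j - m))
    log_total = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_total for x in xs]


out = log_softmax_ref([1.0, 2.0, 3.0])
print(out)
print(sum(math.exp(v) for v in out))  # exponentiated outputs sum to about 1
```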

flax.nnx.logsumexp(a, axis=None, b=None, keepdims=False, return_sign=False, where=None)

Log-sum-exp reduction.

JAX implementation of scipy.special.logsumexp().

\[\mathrm{logsumexp}(a) = \mathrm{log} \sum_j b \cdot \mathrm{exp}(a_{ij})\]

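where the \(j\) indices range over one or more dimensions to be reduced.

Parameters

a – the input array
axis – the axis or axes over which to reduce. May be either None, an integer, or a tuple of integers.
b – scaling factors for \(\mathrm{exp}(a)\). Must be broadcastable to the shape of a.
keepdims – if True, the reduced axes are kept in the output as dimensions of size 1.
return_sign – if True, the output is a (result, sign) pair, where sign is the sign of the sum and result contains the logarithm of its absolute value. If False, only result is returned, and it contains NaN values if the sum is negative.
where – elements to include in the reduction.

Returns

Either an array result or a pair of arrays (result, sign), depending on the value of the return_sign argument.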
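
The roles of b and return_sign can be illustrated with a flat-list sketch in pure Python. `logsumexp_ref` is a hypothetical helper that reduces over the whole list; it does not model axis, keepdims, or where, and the max-shift is an assumption of the sketch rather than a description of the library's code.

```python
import math


def logsumexp_ref(a, b=None, return_sign=False):
    """log(sum_j b_j * exp(a_j)) over a 1-D list of finite floats."""
    weights = [1.0] * len(a) if b is None else list(b)
    m = max(a)
    total = sum(w * math.exp(x - m) for x, w in zip(a, weights))
    magnitude = abs(total)
    log_abs = m + math.log(magnitude) if magnitude > 0 else -math.inf
    if return_sign:
        sign = float((total > 0) - (total < 0))
        return log_abs, sign
    # Without the sign, a negative sum has no real logarithm.
    return math.nan if total < 0 else log_abs


print(logsumexp_ref([1000.0, 1000.0]))                          # about 1000 + log(2)
print(logsumexp_ref([0.0, 0.0], b=[1.0, -2.0], return_sign=True))  # (0.0, -1.0)
```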

flax.nnx.one_hot(x, num_classes, *, dtype=<class 'jax.numpy.float64'>, axis=-1)

One-hot encodes the given indices.

Each index in the input x is encoded as a vector of zeros of length num_classes with the element at index set to one:

>>> jax.nn.one_hot(jnp.array([0, 1, 2]), 3)
Array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]], dtype=float32)

Indices outside the range [0, num_classes) will be encoded as zeros:

>>> jax.nn.one_hot(jnp.array([-1, 3]), 3)
Array([[0., 0., 0.],
       [0., 0., 0.]], dtype=float32)

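Parameters

x – a tensor of indices.
num_classes – number of classes in the one-hot dimension.
dtype – optional, a float dtype for the returned values (default jnp.float_).
axis – the axis or axes along which the function should be computed.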
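
The encoding rule, including the all-zero rows for out-of-range indices, can be mirrored with nested lists in pure Python. `one_hot_ref` is a hypothetical helper for a 1-D list of integer indices; it does not model dtype or axis.

```python
def one_hot_ref(indices, num_classes):
    """Return one row per index; out-of-range indices give an all-zero row."""
    return [
        [1.0 if column == index else 0.0 for column in range(num_classes)]
        for index in indices
    ]


print(one_hot_ref([0, 1, 2], 3))  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(one_hot_ref([-1, 3], 3))    # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```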

flax.nnx.relu(x)

Rectified linear unit activation function.

Computes the element-wise function:

\[\mathrm{relu}(x) = \max(x, 0)\]

except under differentiation, where we take:

\[\nabla \mathrm{relu}(0) = 0\]

For more information, see Numerical influence of ReLU’(0) on backpropagation.

Parameters: x – input array
Returns: An array.

Examples

>>> jax.nn.relu(jax.numpy.array([-2., -1., -0.5, 0, 0.5, 1., 2.]))
Array([0. , 0. , 0. , 0. , 0.5, 1. , 2. ], dtype=float32)

See also

relu6()

flax.nnx.selu(x)

Scaled exponential linear unit activation.

Computes the element-wise function:

\[\begin{split}\mathrm{selu}(x) = \lambda \begin{cases} x, & x > 0\\ \alpha e^x - \alpha, & x \le 0 \end{cases}\end{split}\]

where \(\lambda = 1.0507009873554804934193349852946\) and \(\alpha = 1.6732632423543772848170429916717\).

For more information, see Self-Normalizing Neural Networks.

Parameters: x – input array
Returns: An array.

See also

elu()
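
Since \(\alpha e^x - \alpha = \alpha (e^x - 1)\), the SELU formula is the ELU formula with a fixed alpha, scaled by lambda. The scalar sketch below makes that relationship explicit; `elu_ref` and `selu_ref` are hypothetical helpers, and the constants are copied from the definition above.

```python
import math

SELU_LAMBDA = 1.0507009873554804934193349852946
SELU_ALPHA = 1.6732632423543772848170429916717


def elu_ref(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu_ref(x):
    # selu(x) = lambda * elu(x, alpha) with the fixed constants above.
    return SELU_LAMBDA * elu_ref(x, SELU_ALPHA)


for value in (-50.0, -1.0, 0.0, 1.0):
    print(value, selu_ref(value))
```

For very negative inputs the output approaches -lambda * alpha, roughly -1.758.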

flax.nnx.sigmoid(x)

Sigmoid activation function.

Computes the element-wise function:

\[\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}\]

Parameters: x – input array
Returns: An array.

See also

log_sigmoid()

flax.nnx.silu(x)

SiLU (also known as swish) activation function.

Computes the element-wise function:

\[\mathrm{silu}(x) = x \cdot \mathrm{sigmoid}(x) = \frac{x}{1 + e^{-x}}\]

swish() and silu() are both aliases for the same function.

Parameters: x – input array
Returns: An array.

See also

sigmoid()

flax.nnx.soft_sign(x)

Soft-sign activation function.

Computes the element-wise function:

\[\mathrm{soft\_sign}(x) = \frac{x}{|x| + 1}\]

Parameters: x – input array

flax.nnx.softmax(x, axis=-1, where=None, initial=_UNSPECIFIED)

Softmax function.

Computes the function which rescales elements to the range \([0, 1]\) such that the elements along axis sum to \(1\).

\[\mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\]

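Parameters

x – input array
axis – the axis or axes along which the softmax should be computed. The softmax output summed across these dimensions should sum to \(1\). Either an integer or a tuple of integers.
where – elements to include in the softmax.

Returns

An array.

Note

If any input values are +inf, the result will be all NaN: this reflects the fact that inf / inf is not well-defined in the context of floating-point math.

See also

log_softmax()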
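
The following flat-list sketch evaluates the softmax formula in pure Python and reproduces the all-NaN behaviour from the note. `softmax_ref` is a hypothetical helper for a single axis; it subtracts the maximum first, which is an assumption of the sketch. In this version the NaN arises from inf - inf in the shift rather than from inf / inf.

```python
import math


def softmax_ref(xs):
    """exp(x_i) / sum_j exp(x_j) for a 1-D list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax_ref([0.0, 0.0]))       # [0.5, 0.5]
print(softmax_ref([1.0, 2.0, 3.0]))  # entries sum to about 1
print(softmax_ref([1.0, math.inf]))  # every entry is nan
```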

flax.nnx.softplus(x)

Softplus activation function.

Computes the element-wise function:

\[\mathrm{softplus}(x) = \log(1 + e^x)\]

Parameters: x – input array
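
Computing log(1 + e^x) literally overflows for large x. One equivalent rearrangement is max(x, 0) + log1p(exp(-|x|)), sketched here for scalars. This is an illustrative choice rather than a description of the library's implementation, and `softplus_ref` is a hypothetical name.

```python
import math


def softplus_ref(x):
    """Overflow-safe scalar softplus."""
    # exp(-|x|) is always in (0, 1], so it cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


for value in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(value, softplus_ref(value))
```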

flax.nnx.standardize(x, axis=-1, mean=None, variance=None, epsilon=1e-05, where=None)

Standardizes an array by subtracting mean and dividing by \(\sqrt{\mathrm{variance}}\).
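
A flat-list sketch of standardization in pure Python is given below. It assumes that, when mean and variance are not supplied, they are computed from the data, and that epsilon is added to the variance inside the square root; the text above states neither, so treat both as assumptions. `standardize_ref` is a hypothetical helper.

```python
import math


def standardize_ref(xs, epsilon=1e-5):
    """(x - mean) / sqrt(variance + epsilon) over a 1-D list."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / n  # population variance
    scale = math.sqrt(variance + epsilon)  # assumed placement of epsilon
    return [(x - mean) / scale for x in xs]


print(standardize_ref([1.0, 2.0, 3.0, 4.0]))
```

With this placement, epsilon keeps the division finite when every element is equal and the variance is zero.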

flax.nnx.swish(x)

SiLU (also known as swish) activation function.

Computes the element-wise function:

\[\mathrm{silu}(x) = x \cdot \mathrm{sigmoid}(x) = \frac{x}{1 + e^{-x}}\]

swish() and silu() are both aliases for the same function.

Parameters: x – input array
Returns: An array.

See also

sigmoid()

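flax.nnx.tanh(x, /)

Compute hyperbolic tangent element-wise.

LAX-backend implementation of numpy.tanh().

Original docstring below.

Equivalent to np.sinh(x)/np.cosh(x) or -1j * np.tan(1j*x).

Parameters: x (array_like) – Input array.
Returns: y – The corresponding hyperbolic tangent values. This is a scalar if x is a scalar.
Return type: ndarray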
