
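Layer activation functions

Usage of activations

Activations can either be used through an Activation layer, or through the activation argument supported by all forward layers: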

model.add(layers.Dense(64, activation=activations.relu))

This is equivalent to:

from keras import layers
from keras import activations

model.add(layers.Dense(64))
model.add(layers.Activation(activations.relu))

All built-in activations may also be passed via their string identifier:

model.add(layers.Dense(64, activation='relu'))

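Available activations

`celu` function

keras.activations.celu(x, alpha=1.0)

Continuously Differentiable Exponential Linear Unit.

The CeLU activation function is defined as

celu(x) = alpha * (exp(x / alpha) - 1) for x < 0, celu(x) = x for x >= 0,

where alpha is a scaling parameter that controls the shape of the activation.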
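To make the role of alpha concrete, here is a minimal plain-Python sketch of the scalar formula above. It uses only the standard library and is not the Keras implementation; the standalone `celu` helper is written purely for illustration.

```python
import math


def celu(x, alpha=1.0):
    """Scalar CeLU: identity for x >= 0, scaled exponential for x < 0."""
    if x >= 0:
        return x
    return alpha * (math.exp(x / alpha) - 1.0)


# Compare a few alpha values on the same inputs.
for alpha in (0.5, 1.0, 2.0):
    values = [round(celu(v, alpha), 4) for v in (-10.0, -1.0, 0.0, 2.0)]
    print(f"alpha={alpha}: {values}")
```

As x becomes very negative the output approaches -alpha, so alpha sets the floor of the negative branch.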

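Arguments

x: Input tensor.
alpha: The α value for the CeLU formulation. Defaults to 1.0.

Reference

Barron, J. T., 2017

`elu` function

keras.activations.elu(x, alpha=1.0)

Exponential Linear Unit.

The exponential linear unit (ELU) with alpha > 0 is defined as

x if x > 0
alpha * (exp(x) - 1) if x <= 0

ELUs have negative values, which pushes the mean of the activations closer to zero.

Mean activations that are closer to zero enable faster learning because they bring the gradient closer to the natural gradient. ELUs saturate to a negative value as the argument becomes more negative. Saturation means a small derivative, which decreases the variation and the information propagated to the next layer.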
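The saturation described above can be checked numerically with a small plain-Python sketch. The scalar `elu` and `elu_grad` helpers are illustrative stand-ins, not the Keras implementation; `elu_grad` is the derivative worked out by hand from the definition.

```python
import math


def elu(x, alpha=1.0):
    """Scalar ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def elu_grad(x, alpha=1.0):
    """Derivative of the scalar ELU with respect to x."""
    return 1.0 if x > 0 else alpha * math.exp(x)


# As x decreases, the output flattens toward -alpha and the slope toward 0.
for x in (2.0, 0.0, -1.0, -5.0, -20.0):
    print(f"x={x:6.1f}  elu={elu(x): .6f}  grad={elu_grad(x):.6f}")
```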

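Arguments

x: Input tensor.
alpha: A float, the scale of the negative branch in the definition above. Defaults to 1.0.

Reference

Clevert et al., 2016

`exponential` function

keras.activations.exponential(x)

Exponential activation function.

Arguments

x: Input tensor.

`gelu` function

keras.activations.gelu(x, approximate=False)

Gaussian error linear unit (GELU) activation function.

The Gaussian error linear unit (GELU) is defined as

gelu(x) = x * P(X <= x), where X ~ N(0, 1), i.e. gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).

GELU weights inputs by their value, rather than gating them by their sign as ReLU does.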
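The following plain-Python sketch evaluates the exact erf form given above next to the widely used tanh approximation. Treating that tanh form as what `approximate=True` selects is an assumption here, and both helper names are made up for this example.

```python
import math


def gelu_exact(x):
    """Exact GELU via the Gaussian CDF: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Common tanh-based approximation of GELU (assumed, not taken from Keras)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Negative inputs are shrunk toward zero rather than cut off as in ReLU.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:5.1f}  exact={gelu_exact(x): .6f}  tanh={gelu_tanh(x): .6f}")
```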

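Arguments

x: Input tensor.
approximate: A bool, whether to enable approximation.

Reference

Hendrycks et al., 2016

`glu` function

keras.activations.glu(x, axis=-1)

Gated Linear Unit (GLU) activation function.

The GLU activation function is defined as

glu(x) = a * sigmoid(b),

where x is split into two equal parts a and b along the given axis.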
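A one-dimensional sketch of the split-and-gate step may help. It works on a plain Python list instead of a tensor, so there is no axis argument; the `glu` and `sigmoid` helpers are illustrative and not the Keras implementation.

```python
import math


def sigmoid(v):
    """Numerically stable scalar sigmoid."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def glu(x):
    """Split x into halves a and b, then return a * sigmoid(b) elementwise."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even to split into two halves")
    half = len(x) // 2
    a, b = x[:half], x[half:]
    return [a_i * sigmoid(b_i) for a_i, b_i in zip(a, b)]


print(glu([1.0, 2.0, 0.0, 0.0]))  # [0.5, 1.0]
```

The output has half as many elements as the input along the split axis, because the second half is consumed as the gate.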

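Arguments

x: Input tensor.
axis: The axis along which to split the input tensor. Defaults to -1.

Reference

Dauphin et al., 2017

`hard_shrink` function

keras.activations.hard_shrink(x, threshold=0.5)

Hard Shrink activation function.

It is defined as

hard_shrink(x) = x if |x| > threshold, hard_shrink(x) = 0 otherwise.

Arguments

x: Input tensor.
threshold: Threshold value. Defaults to 0.5.

`hard_sigmoid` function

keras.activations.hard_sigmoid(x)

Hard sigmoid activation function.

The hard sigmoid activation is defined as

0 if x <= -3
1 if x >= 3
(x/6) + 0.5 if -3 < x < 3

It is a faster, piecewise linear approximation of the sigmoid activation.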
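To see how closely the three pieces track the smooth curve, the sketch below evaluates both functions side by side. It is plain Python with illustrative helper names, not the Keras implementation.

```python
import math


def hard_sigmoid(x):
    """Piecewise linear approximation of sigmoid, as defined above."""
    if x <= -3:
        return 0.0
    if x >= 3:
        return 1.0
    return x / 6.0 + 0.5


def sigmoid(x):
    """Reference sigmoid for comparison (fine for moderate x)."""
    return 1.0 / (1.0 + math.exp(-x))


for x in (-4.0, -3.0, -1.5, 0.0, 1.5, 3.0, 4.0):
    print(f"x={x:5.1f}  hard={hard_sigmoid(x):.4f}  sigmoid={sigmoid(x):.4f}")
```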

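Arguments

x: Input tensor.

Reference

Wikipedia "Hard sigmoid"

`hard_silu` function

keras.activations.hard_silu(x)

Hard SiLU activation function, also known as Hard Swish.

It is defined as

0 if x < -3
x if x > 3
x * (x + 3) / 6 if -3 <= x <= 3

It is a faster approximation of the silu activation, obtained by replacing the sigmoid in x * sigmoid(x) with its piecewise linear hard sigmoid counterpart.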
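The relationship between this definition and the hard sigmoid can be verified with a short plain-Python sketch. Both helpers are illustrative, written from the formulas on this page and not taken from the Keras source.

```python
def hard_sigmoid(x):
    """Piecewise linear hard sigmoid: clamp x / 6 + 0.5 to [0, 1]."""
    return min(max(x / 6.0 + 0.5, 0.0), 1.0)


def hard_silu(x):
    """Hard SiLU written directly from its three-branch definition."""
    if x < -3:
        return 0.0
    if x > 3:
        return x
    return x * (x + 3.0) / 6.0


# The middle branch equals x * hard_sigmoid(x), since x / 6 + 0.5 == (x + 3) / 6.
for x in (-4.0, -3.0, -1.0, 0.0, 1.0, 3.0, 5.0):
    print(f"x={x:5.1f}  hard_silu={hard_silu(x): .4f}  x*hard_sigmoid={x * hard_sigmoid(x): .4f}")
```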

Arguments

x: Input tensor.

Reference

A. Howard, 2019

`hard_tanh` function

keras.activations.hard_tanh(x)

HardTanh activation function.

It is defined as: hard_tanh(x) = -1 for x < -1, hard_tanh(x) = x for -1 <= x <= 1, hard_tanh(x) = 1 for x > 1.

Arguments

x: Input tensor.

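`leaky_relu` function

keras.activations.leaky_relu(x, negative_slope=0.2)

Leaky ReLU activation function.

Arguments

x: Input tensor.
negative_slope: A float that controls the slope for negative inputs. Defaults to 0.2.

`linear` function

keras.activations.linear(x)

Linear activation function (pass-through).

A "linear" activation is an identity function: it returns the input, unmodified.

Arguments

x: Input tensor.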

`log_sigmoid` function

keras.activations.log_sigmoid(x)

Logarithm of the sigmoid activation function.

It is defined as f(x) = log(1 / (1 + exp(-x))).
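Evaluating that expression literally overflows for large negative x, because exp(-x) grows without bound. The sketch below shows one common stable rewrite, min(x, 0) - log1p(exp(-|x|)), in plain Python. It illustrates the identity only and makes no claim about how Keras computes the function.

```python
import math


def log_sigmoid(x):
    """Stable log(sigmoid(x)) using min(x, 0) - log1p(exp(-|x|))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def log_sigmoid_naive(x):
    """Direct translation of the definition; overflows for very negative x."""
    return math.log(1.0 / (1.0 + math.exp(-x)))


print(log_sigmoid(2.0), log_sigmoid_naive(2.0))
print(log_sigmoid(-1000.0))  # the naive form raises OverflowError here
```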

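Arguments

x: Input tensor.

`log_softmax` function

keras.activations.log_softmax(x, axis=-1)

Log-Softmax activation function.

Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.

Arguments

x: Input tensor.
axis: Integer, axis along which the softmax is applied.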

`mish` function

keras.activations.mish(x)

Mish activation function.

It is defined as

mish(x) = x * tanh(softplus(x))

where softplus is defined as

softplus(x) = log(exp(x) + 1)

Arguments

x: Input tensor.

Reference

Misra, 2019

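`relu` function

keras.activations.relu(x, negative_slope=0.0, max_value=None, threshold=0.0)

Applies the rectified linear unit activation function.

With default values, this returns the standard ReLU activation: max(x, 0), the element-wise maximum of 0 and the input tensor.

Modifying the default parameters lets you use a non-zero threshold, change the maximum value of the activation, and use a non-zero multiple of the input for values below the threshold.

Examples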

>>> x = [-10, -5, 0.0, 5, 10]
>>> keras.activations.relu(x)
[ 0.,  0.,  0.,  5., 10.]
>>> keras.activations.relu(x, negative_slope=0.5)
[-5. , -2.5,  0. ,  5. , 10. ]
>>> keras.activations.relu(x, max_value=5.)
[0., 0., 0., 5., 5.]
>>> keras.activations.relu(x, threshold=5.)
[-0., -0.,  0.,  0., 10.]

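Arguments

x: Input tensor.
negative_slope: A float that controls the slope for values lower than the threshold.
max_value: A float that sets the saturation threshold (the largest value the function will return).
threshold: A float giving the threshold value of the activation function, below which values are damped or set to zero.

Returns

A tensor with the same shape and dtype as input x.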
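The way the three optional arguments interact can be sketched in plain Python. This is a list-based illustration, not the Keras implementation. The strict `>` comparison at the threshold is an assumption inferred from the `threshold=5.` example in this section, where the input 5 maps to 0.

```python
def relu(x, negative_slope=0.0, max_value=None, threshold=0.0):
    """List-based sketch of the generalized ReLU described above."""
    out = []
    for v in x:
        if v > threshold:
            # Pass the value through, capped at max_value when one is given.
            y = v if max_value is None else min(v, max_value)
        else:
            # Below (or at) the threshold: scaled distance from the threshold.
            y = negative_slope * (v - threshold)
        out.append(float(y))
    return out


x = [-10, -5, 0.0, 5, 10]
print(relu(x))
print(relu(x, negative_slope=0.5))
print(relu(x, max_value=5.0))
print(relu(x, threshold=5.0))
```

Under these assumptions the four calls give the same values as the examples in this section, up to the sign of zero.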

`relu6` function

keras.activations.relu6(x)

Relu6 activation function.

It is the ReLU function, but truncated to a maximum value of 6.

Arguments

x: Input tensor.

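`selu` function

keras.activations.selu(x)

Scaled Exponential Linear Unit (SELU).

The Scaled Exponential Linear Unit (SELU) activation function is defined as

scale * x if x > 0
scale * alpha * (exp(x) - 1) if x <= 0

where alpha and scale are pre-defined constants (alpha=1.67326324 and scale=1.05070098).

Basically, the SELU activation function multiplies scale (> 1) with the output of the keras.activations.elu function to ensure a slope larger than one for positive inputs.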
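The "scale times ELU" reading can be written out directly. This plain-Python sketch uses the two constants quoted above; the scalar `elu` and `selu` helpers are illustrative and not the Keras implementation.

```python
import math

ALPHA = 1.67326324  # constant quoted in the definition above
SCALE = 1.05070098  # constant quoted in the definition above


def elu(x, alpha=1.0):
    """Scalar ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """SELU expressed as SCALE * elu(x, ALPHA)."""
    return SCALE * elu(x, ALPHA)


for x in (-50.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:6.1f}  selu={selu(x): .6f}")
```

For positive inputs the slope is SCALE, and for very negative inputs the output saturates near -SCALE * ALPHA.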

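The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers, as long as the weights are initialized correctly (see the keras.initializers.LecunNormal initializer) and the number of input units is "large enough" (see the reference paper for more information).

Arguments

x: Input tensor.

Notes

To be used together with the keras.initializers.LecunNormal initializer.
To be used together with the dropout variant keras.layers.AlphaDropout (rather than regular dropout).

Reference

Klambauer et al., 2017

`sigmoid` function

keras.activations.sigmoid(x)

Sigmoid activation function.

It is defined as: sigmoid(x) = 1 / (1 + exp(-x)).

For small values (<-5), sigmoid returns a value close to zero, and for large values (>5) the result of the function gets close to 1.

Sigmoid is equivalent to a 2-element softmax, where the second element is assumed to be zero. The sigmoid function always returns a value between 0 and 1.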
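The equivalence with a 2-element softmax can be checked numerically. In this plain-Python sketch, `softmax_pair` is a hypothetical helper that returns the first component of softmax([x, 0]); neither function is the Keras implementation.

```python
import math


def sigmoid(x):
    """Numerically stable scalar sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax_pair(x):
    """First component of softmax([x, 0.0]), shifted by the max for stability."""
    m = max(x, 0.0)
    e_x, e_0 = math.exp(x - m), math.exp(0.0 - m)
    return e_x / (e_x + e_0)


for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:5.1f}  sigmoid={sigmoid(x):.6f}  softmax([x, 0])[0]={softmax_pair(x):.6f}")
```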

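Arguments

x: Input tensor.

`silu` function

keras.activations.silu(x)

Swish (or Silu) activation function.

It is defined as: swish(x) = x * sigmoid(x).

The Swish (or Silu) activation function is a smooth, non-monotonic function that is unbounded above and bounded below.

Arguments

x: Input tensor.

Reference

Ramachandran et al., 2017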

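`softmax` function

keras.activations.softmax(x, axis=-1)

Softmax converts a vector of values to a probability distribution.

The elements of the output vector are in the range [0, 1] and sum to 1.

Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.

Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution.

The softmax of each vector x is computed as exp(x) / sum(exp(x)).

The input values act as logits: the difference between two inputs equals the log-odds between the corresponding output probabilities.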
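A short plain-Python sketch of the computation for a single vector follows. Subtracting the maximum before exponentiating is a standard stability trick that leaves the result unchanged; it is an addition for this example and is not something stated on this page about Keras.

```python
import math


def softmax(x):
    """Softmax of one vector: exp(x) / sum(exp(x)), shifted by max(x)."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, -1.0]
probs = softmax(logits)
print(probs, sum(probs))

# Differences between inputs reappear as log-ratios of the outputs.
print(math.log(probs[0] / probs[1]), logits[0] - logits[1])
```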

Arguments

x: Input tensor.
axis: Integer, axis along which the softmax is applied.

`soft_shrink` function

keras.activations.soft_shrink(x, threshold=0.5)

Soft Shrink activation function.

It is defined as

soft_shrink(x) = x - threshold if x > threshold, soft_shrink(x) = x + threshold if x < -threshold, soft_shrink(x) = 0 otherwise.

Arguments

x: Input tensor.
threshold: Threshold value. Defaults to 0.5.

`softplus` function

keras.activations.softplus(x)

Softplus activation function.

It is defined as: softplus(x) = log(exp(x) + 1).

Arguments

x: Input tensor.

`softsign` function

keras.activations.softsign(x)

Softsign activation function.

Softsign is defined as: softsign(x) = x / (abs(x) + 1).

Arguments

x: Input tensor.

`sparse_plus` function

keras.activations.sparse_plus(x)

SparsePlus activation function.

SparsePlus is defined as

sparse_plus(x) = 0 for x <= -1, sparse_plus(x) = (1/4) * (x + 1)^2 for -1 < x < 1, sparse_plus(x) = x for x >= 1.

Arguments

x: Input tensor.

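`sparsemax` function

keras.activations.sparsemax(x, axis=-1)

Sparsemax activation function.

For each batch i and class j, the sparsemax activation function is defined as

sparsemax(x)[i, j] = max(x[i, j] - τ(x[i, :]), 0),

where τ(x[i, :]) is the threshold chosen so that the outputs for row i sum to 1.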
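The definition above depends on the threshold τ, which is the value that makes the outputs of each row sum to 1. The sketch below computes it for a single vector with the sort-based closed form from the sparsemax paper cited in this section. It is a plain-Python illustration of that procedure, not the Keras implementation.

```python
def sparsemax(z):
    """Sparsemax of one vector: max(z - tau, 0) with tau found by sorting."""
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # Entry k stays in the support while 1 + k * z_(k) > sum of the top k.
        if 1.0 + k * z_k > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in z]


print(sparsemax([1.0, 0.5, -1.0]))  # [0.75, 0.25, 0.0]
print(sparsemax([2.0, 0.0]))        # [1.0, 0.0]
```

Unlike softmax, low-scoring entries receive exactly zero probability, which is where the "sparse" in the name comes from.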

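Arguments

x: Input tensor.
axis: int, axis along which the sparsemax operation is applied.

Returns

A tensor, output of the sparsemax transformation. Has the same type and shape as x.

Reference

Martins et al., 2016

`squareplus` function

keras.activations.squareplus(x, b=4)

Squareplus activation function.

The Squareplus activation function is defined as

f(x) = (x + sqrt(x^2 + b)) / 2

where b is a smoothness parameter.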
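The effect of b is easy to see numerically. In this plain-Python sketch of the formula above, setting b to 0 reduces the expression to (x + |x|) / 2, which is ReLU, while larger b rounds off the corner at the origin. The scalar helper is illustrative only.

```python
import math


def squareplus(x, b=4.0):
    """Scalar Squareplus: (x + sqrt(x^2 + b)) / 2."""
    return (x + math.sqrt(x * x + b)) / 2.0


for b in (0.0, 1.0, 4.0):
    values = [round(squareplus(v, b), 4) for v in (-3.0, -1.0, 0.0, 1.0, 3.0)]
    print(f"b={b}: {values}")
```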

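Arguments

x: Input tensor.
b: Smoothness parameter. Defaults to 4.

Reference

Ramachandran et al., 2021

`tanh` function

keras.activations.tanh(x)

Hyperbolic tangent activation function.

It is defined as: tanh(x) = sinh(x) / cosh(x), i.e. tanh(x) = ((exp(x) - exp(-x)) / (exp(x) + exp(-x))).

Arguments

x: Input tensor.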

`tanh_shrink` function

keras.activations.tanh_shrink(x)

Tanh shrink activation function.

It is defined as

f(x) = x - tanh(x).

Arguments

x: Input tensor.

`threshold` function

keras.activations.threshold(x, threshold, default_value)

Threshold activation function.

It is defined as

threshold(x) = x if x > threshold, threshold(x) = default_value otherwise.

Arguments

x: Input tensor.
threshold: The value that decides when to retain or replace x.
default_value: Value to assign when x <= threshold.

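Creating custom activations

You can also use a callable as an activation (in this case it should take a tensor and return a tensor of the same shape and dtype):

model.add(layers.Dense(64, activation=keras.ops.tanh))

About "advanced activation" layers

Activations that are more complex than a simple function (e.g. learnable activations, which maintain a state) are available as Advanced Activation layers.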
