► 開發人員指南 / 循序模型

循序模型

作者： fchollet
建立日期 2020/04/12
上次修改日期 2023/06/25
描述： 循序模型的完整指南。

設定

import keras
from keras import layers
from keras import ops

何時使用循序模型

Sequential 模型適用於層的簡單堆疊，其中每一層都有正好一個輸入張量和一個輸出張量。

示意圖來看，以下的 Sequential 模型

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = ops.ones((3, 3))
y = model(x)

等同於此函數

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = ops.ones((3, 3))
y = layer3(layer2(layer1(x)))

當以下情況時，循序模型不適用：

您的模型有多個輸入或多個輸出
您的任何層有多個輸入或多個輸出
您需要執行層共享
您需要非線性拓撲（例如殘差連接、多分支模型）

建立循序模型

您可以通過將層列表傳遞給 Sequential 建構函式來建立循序模型

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

它的層可以通過 layers 屬性訪問

model.layers

[<Dense name=dense, built=False>,
 <Dense name=dense_1, built=False>,
 <Dense name=dense_2, built=False>]

您也可以通過 add() 方法以遞增方式建立循序模型

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

請注意，還有一個對應的 pop() 方法可以移除層：循序模型的行為非常像層的列表。

model.pop()
print(len(model.layers))  # 2

另請注意，與 Keras 中的任何層或模型一樣，Sequential 建構函式接受 name 引數。這對於使用語義上有意義的名稱來註釋 TensorBoard 圖形很有用。

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

預先指定輸入形狀

一般來說，Keras 中的所有層都需要知道其輸入的形狀，以便能夠建立其權重。因此，當您像這樣建立層時，最初它沒有權重

layer = layers.Dense(3)
layer.weights  # Empty

[]

它在第一次在輸入上呼叫時建立其權重，因為權重的形狀取決於輸入的形狀

# Call layer on a test input
x = ops.ones((1, 4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

[<KerasVariable shape=(4, 3), dtype=float32, path=dense_6/kernel>,
 <KerasVariable shape=(3,), dtype=float32, path=dense_6/bias>]

當然，這也適用於循序模型。當您在沒有輸入形狀的情況下實例化循序模型時，它不會被「建構」：它沒有權重（且呼叫 model.weights 會產生指出此情況的錯誤）。當模型第一次看到一些輸入資料時，就會建立權重

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!

# At this point, you can't do this:
# model.weights

# You also can't do this:
# model.summary()

# Call the model on a test input
x = ops.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6

Number of weights after calling the model: 6

一旦模型被「建構」，您可以呼叫其 summary() 方法來顯示其內容

model.summary()

Model: "sequential_3"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ dense_7 (Dense)                 │ (1, 2)                    │         10 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_8 (Dense)                 │ (1, 3)                    │          9 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_9 (Dense)                 │ (1, 4)                    │         16 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 35 (140.00 B)

 Trainable params: 35 (140.00 B)

 Non-trainable params: 0 (0.00 B)

但是，當以遞增方式建立循序模型時，能夠顯示目前模型的摘要（包括目前輸出形狀）會非常有用。在這種情況下，您應該通過將 Input 物件傳遞給您的模型來啟動模型，使其從一開始就了解其輸入形狀

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

Model: "sequential_4"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ dense_10 (Dense)                │ (None, 2)                 │         10 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 10 (40.00 B)

 Trainable params: 10 (40.00 B)

 Non-trainable params: 0 (0.00 B)

請注意，Input 物件不會顯示為 model.layers 的一部分，因為它不是層

model.layers

[<Dense name=dense_10, built=True>]

像這樣使用預定義的輸入形狀建構的模型始終具有權重（甚至在看到任何資料之前），且始終具有定義的輸出形狀。

一般來說，如果您知道循序模型的輸入形狀，建議的最佳做法是始終預先指定它。

常見的偵錯工作流程：`add()` + `summary()`

當建立新的循序架構時，可以使用 add() 以遞增方式堆疊層，並經常列印模型摘要，這會很有用。例如，這使您可以監控 Conv2D 和 MaxPooling2D 層的堆疊如何對圖像特徵圖進行降採樣

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()

# The answer was: (40, 40, 32), so we can keep downsampling...

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

# And now?
model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))

Model: "sequential_5"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 123, 123, 32)      │      2,432 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_1 (Conv2D)               │ (None, 121, 121, 32)      │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 40, 40, 32)        │          0 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 11,680 (45.62 KB)

 Trainable params: 11,680 (45.62 KB)

 Non-trainable params: 0 (0.00 B)

Model: "sequential_5"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 123, 123, 32)      │      2,432 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_1 (Conv2D)               │ (None, 121, 121, 32)      │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 40, 40, 32)        │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_2 (Conv2D)               │ (None, 38, 38, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_3 (Conv2D)               │ (None, 36, 36, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 12, 12, 32)        │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_4 (Conv2D)               │ (None, 10, 10, 32)        │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d_5 (Conv2D)               │ (None, 8, 8, 32)          │      9,248 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 4, 4, 32)          │          0 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 48,672 (190.12 KB)

 Trainable params: 48,672 (190.12 KB)

 Non-trainable params: 0 (0.00 B)

非常實用，對吧？

建立模型後該怎麼做

一旦您的模型架構準備就緒，您將需要：

訓練您的模型、評估它並執行推論。請參閱我們使用內建迴圈進行訓練和評估的指南
將您的模型儲存到磁碟並還原它。請參閱我們的序列化與儲存指南。

使用循序模型進行特徵提取

一旦循序模型被建構，它的行為就像函數式 API 模型。這表示每一層都有 input 和 output 屬性。這些屬性可用於執行很棒的事情，例如快速建立一個模型，提取循序模型中所有中間層的輸出

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

這是一個類似的範例，僅從一層提取特徵

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

使用循序模型進行遷移學習

遷移學習包括凍結模型中的底部層，且僅訓練頂層。如果您不熟悉它，請務必閱讀我們的遷移學習指南。

以下是兩個涉及循序模型的常見遷移學習藍圖。

首先，假設您有一個循序模型，並且您想要凍結除最後一層之外的所有層。在這種情況下，您只需迭代 model.layers 並在每一層上設定 layer.trainable = False，最後一層除外。像這樣

model = keras.Sequential([
    keras.Input(shape=(784)),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Presumably you would want to first load pre-trained weights.
model.load_weights(...)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
  layer.trainable = False

# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)

另一個常見的藍圖是使用循序模型來堆疊預訓練模型和一些新初始化的分類層。像這樣

# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)