Keras 3 API 文件 / Keras 應用程式

Keras 應用程式

Keras 應用程式是深度學習模型,它們與預訓練的權重一起提供。這些模型可用於預測、特徵提取和微調。

當實例化模型時,權重會自動下載。它們儲存在 ~/.keras/models/

在實例化時,模型將根據您的 Keras 設定檔 ~/.keras/keras.json 中設定的影像資料格式來建構。例如,如果您已設定 image_data_format=channels_last,則從此儲存庫載入的任何模型都將根據「高度-寬度-深度」的資料格式慣例來建構。

可用模型

模型 大小 (MB) Top-1 準確率 Top-5 準確率 參數 深度 每次推論步驟的時間 (毫秒) (CPU) 每次推論步驟的時間 (毫秒) (GPU)
Xception 88 79.0% 94.5% 22.9M 81 109.4 8.1
VGG16 528 71.3% 90.1% 138.4M 16 69.5 4.2
VGG19 549 71.3% 90.0% 143.7M 19 84.8 4.4
ResNet50 98 74.9% 92.1% 25.6M 107 58.2 4.6
ResNet50V2 98 76.0% 93.0% 25.6M 103 45.6 4.4
ResNet101 171 76.4% 92.8% 44.7M 209 89.6 5.2
ResNet101V2 171 77.2% 93.8% 44.7M 205 72.7 5.4
ResNet152 232 76.6% 93.1% 60.4M 311 127.4 6.5
ResNet152V2 232 78.0% 94.2% 60.4M 307 107.5 6.6
InceptionV3 92 77.9% 93.7% 23.9M 189 42.2 6.9
InceptionResNetV2 215 80.3% 95.3% 55.9M 449 130.2 10.0
MobileNet 16 70.4% 89.5% 4.3M 55 22.6 3.4
MobileNetV2 14 71.3% 90.1% 3.5M 105 25.9 3.8
DenseNet121 33 75.0% 92.3% 8.1M 242 77.1 5.4
DenseNet169 57 76.2% 93.2% 14.3M 338 96.4 6.3
DenseNet201 80 77.3% 93.6% 20.2M 402 127.2 6.7
NASNetMobile 23 74.4% 91.9% 5.3M 389 27.0 6.7
NASNetLarge 343 82.5% 96.0% 88.9M 533 344.5 20.0
EfficientNetB0 29 77.1% 93.3% 5.3M 132 46.0 4.9
EfficientNetB1 31 79.1% 94.4% 7.9M 186 60.2 5.6
EfficientNetB2 36 80.1% 94.9% 9.2M 186 80.8 6.5
EfficientNetB3 48 81.6% 95.7% 12.3M 210 140.0 8.8
EfficientNetB4 75 82.9% 96.4% 19.5M 258 308.3 15.1
EfficientNetB5 118 83.6% 96.7% 30.6M 312 579.2 25.3
EfficientNetB6 166 84.0% 96.8% 43.3M 360 958.1 40.4
EfficientNetB7 256 84.3% 97.0% 66.7M 438 1578.9 61.6
EfficientNetV2B0 29 78.7% 94.3% 7.2M - - -
EfficientNetV2B1 34 79.8% 95.0% 8.2M - - -
EfficientNetV2B2 42 80.5% 95.1% 10.2M - - -
EfficientNetV2B3 59 82.0% 95.8% 14.5M - - -
EfficientNetV2S 88 83.9% 96.7% 21.6M - - -
EfficientNetV2M 220 85.3% 97.4% 54.4M - - -
EfficientNetV2L 479 85.7% 97.5% 119.0M - - -
ConvNeXtTiny 109.42 81.3% - 28.6M - - -
ConvNeXtSmall 192.29 82.3% - 50.2M - - -
ConvNeXtBase 338.58 85.3% - 88.5M - - -
ConvNeXtLarge 755.07 86.3% - 197.7M - - -
ConvNeXtXLarge 1310 86.7% - 350.1M - - -

Top-1 和 top-5 準確率是指模型在 ImageNet 驗證資料集上的效能。

深度是指網路的拓撲深度。這包括激活層、批次正規化層等。

每次推論步驟的時間是 30 個批次和 10 次重複的平均值。

  • CPU:AMD EPYC 處理器 (具 IBPB) (92 核心)
  • RAM:1.7T
  • GPU:Tesla A100
  • 批次大小:32

深度計算具有參數的層數。


影像分類模型的使用範例

使用 ResNet50 分類 ImageNet 類別

import keras
from keras.applications.resnet50 import ResNet50
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]

使用 VGG16 提取特徵

import keras
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np

model = VGG16(weights='imagenet', include_top=False)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)

使用 VGG19 從任意中間層提取特徵

from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

block4_pool_features = model.predict(x)

在新的一組類別上微調 InceptionV3

from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
   print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
   layer.trainable = False
for layer in model.layers[249:]:
   layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers
model.fit(...)

在自訂輸入張量上建構 InceptionV3

from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input

# this could also be the output a different Keras model or layer
input_tensor = Input(shape=(224, 224, 3))

model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)