Deep Dream

Author: fchollet
Date created: 2016/01/13
Last modified: 2020/05/02
Description: Generating Deep Dreams with Keras.

ⓘ This example uses Keras 3


Introduction

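"Deep dream" is an image-filtering technique that takes an image classification model and runs gradient ascent on an input image, trying to maximize the activations of specific layers (and sometimes specific units within those layers) for that input. It produces hallucination-like visuals.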

It was first introduced by Alexander Mordvintsev of Google in July 2015.

Process

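  • Load the original image.
  • Define a number of processing scales ("octaves"), from smallest to largest.
  • Resize the original image to the smallest scale.
  • For every scale, starting with the smallest (i.e. the current one):
      - Run gradient ascent.
      - Upscale the image to the next scale.
      - Reinject the detail that was lost during upscaling.
  • Stop when we are back at the original size.

To recover the detail lost during upscaling, we take the original image, shrink it, upscale it again, and compare the result to the (resized) original image.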
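
The detail-reinjection step can be illustrated without any deep learning framework. The following is a minimal NumPy sketch, not the resizing used in this example: the helpers `downscale` and `upscale` are hypothetical and use simple pixel skipping and pixel repetition in place of `tf.image.resize`.

```python
import numpy as np


def downscale(img, factor):
    # Hypothetical helper: keep every `factor`-th pixel along both axes.
    return img[::factor, ::factor]


def upscale(img, factor):
    # Hypothetical helper: repeat each pixel `factor` times along both axes.
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


rng = np.random.default_rng(0)
original = rng.random((8, 8))

small = downscale(original, 2)   # shrunk original, shape (4, 4)
blurry = upscale(small, 2)       # back at (8, 8), but fine detail is gone
lost_detail = original - blurry  # what the shrink/upscale round trip destroyed

# Adding the lost detail back to the upscaled image recovers the original.
restored = blurry + lost_detail
print(np.allclose(restored, original))
```

In the actual loop the lost detail is added to the dreamed image rather than to the plain upscaled original, so the fine structure of the photo survives each change of scale.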

Setup

import os

os.environ["KERAS_BACKEND"] = "tensorflow"

import numpy as np
import tensorflow as tf
import keras
from keras.applications import inception_v3

base_image_path = keras.utils.get_file("sky.jpg", "https://i.imgur.com/aGBdQyK.jpg")
result_prefix = "sky_dream"

# These are the names of the layers
# for which we try to maximize activation,
# as well as their weight in the final loss
# we try to maximize.
# You can tweak these settings to obtain new visual effects.
layer_settings = {
    "mixed4": 1.0,
    "mixed5": 1.5,
    "mixed6": 2.0,
    "mixed7": 2.5,
}

# Playing with these hyperparameters will also allow you to achieve new effects
step = 0.01  # Gradient ascent step size
num_octave = 3  # Number of scales at which to run gradient ascent
octave_scale = 1.4  # Size ratio between scales
iterations = 20  # Number of ascent steps per scale
max_loss = 15.0  # Stop ascent at a given scale once the loss exceeds this value
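
To see how `num_octave` and `octave_scale` determine the processing scales, here is a small worked example in plain Python. It assumes a base image of 640 x 960 pixels (height x width), which is the size reported for the final octave in the run log further down; a different image gives different shapes.

```python
num_octave = 3
octave_scale = 1.4
original_shape = (640, 960)  # assumed (height, width) of the base image

successive_shapes = [original_shape]
for i in range(1, num_octave):
    # Each extra octave divides both dimensions by another factor of octave_scale.
    shape = tuple(int(dim / (octave_scale**i)) for dim in original_shape)
    successive_shapes.append(shape)
successive_shapes = successive_shapes[::-1]  # smallest scale first

print(successive_shapes)  # [(326, 489), (457, 685), (640, 960)]
```

Raising `num_octave` or `octave_scale` makes the smallest scale smaller, so the first ascent passes act on coarser structure.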

This is our base image:

from IPython.display import Image, display

display(Image(base_image_path))

Let's set up some image preprocessing/deprocessing utilities:

def preprocess_image(image_path):
    # Util function to open, resize and format pictures
    # into appropriate arrays.
    img = keras.utils.load_img(image_path)
    img = keras.utils.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = inception_v3.preprocess_input(img)
    return img


def deprocess_image(x):
    # Util function to convert a NumPy array into a valid image.
    x = x.reshape((x.shape[1], x.shape[2], 3))
    # Undo inception v3 preprocessing
    x /= 2.0
    x += 0.5
    x *= 255.0
    # Convert to uint8 and clip to the valid range [0, 255]
    x = np.clip(x, 0, 255).astype("uint8")
    return x
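
The arithmetic in `deprocess_image` is easier to follow next to the forward transform it undoes. The sketch below assumes that `inception_v3.preprocess_input` maps pixel values from [0, 255] to [-1, 1] via `x / 127.5 - 1`; that formula is stated here as an assumption, not taken from this example.

```python
import numpy as np

pixels = np.array([0.0, 64.0, 127.5, 255.0])

# Assumed forward transform: [0, 255] -> [-1, 1].
scaled = pixels / 127.5 - 1.0

# Inverse: the same three steps as in deprocess_image.
restored = scaled / 2.0  # [-1, 1] -> [-0.5, 0.5]
restored += 0.5          # [-0.5, 0.5] -> [0, 1]
restored *= 255.0        # [0, 1] -> [0, 255]

print(scaled)
print(np.allclose(restored, pixels))
```

After gradient ascent the values are no longer guaranteed to lie in [-1, 1], which is why `deprocess_image` clips to [0, 255] before casting to `uint8`.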

Compute the Deep Dream loss

First, build a feature extraction model to retrieve the activations of our target layers given an input image.

# Build an InceptionV3 model loaded with pre-trained ImageNet weights
model = inception_v3.InceptionV3(weights="imagenet", include_top=False)

# Get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict(
    [
        (layer.name, layer.output)
        for layer in [model.get_layer(name) for name in layer_settings.keys()]
    ]
)

# Set up a model that returns the activation values for every target layer
# (as a dict)
feature_extractor = keras.Model(inputs=model.inputs, outputs=outputs_dict)

The actual loss computation is very simple:

def compute_loss(input_image):
    features = feature_extractor(input_image)
    # Initialize the loss
    loss = tf.zeros(shape=())
    for name in features.keys():
        coeff = layer_settings[name]
        activation = features[name]
        # We avoid border artifacts by only involving non-border pixels in the loss.
        scaling = tf.reduce_prod(tf.cast(tf.shape(activation), "float32"))
        loss += coeff * tf.reduce_sum(tf.square(activation[:, 2:-2, 2:-2, :])) / scaling
    return loss
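
To make the per-layer term concrete, the following NumPy sketch applies the same formula to a toy activation tensor. The function name `layer_loss` and the all-ones input are illustrative assumptions; the real loss sums this term over every layer in `layer_settings`.

```python
import numpy as np


def layer_loss(activation, coeff):
    # Normalize by the total number of elements in the activation tensor.
    scaling = np.prod(activation.shape)
    # Drop a 2-pixel border on both spatial axes before summing the squares.
    interior = activation[:, 2:-2, 2:-2, :]
    return coeff * np.sum(np.square(interior)) / scaling


# Toy activation: batch of 1, 8 x 8 spatial grid, 3 channels, all ones.
activation = np.ones((1, 8, 8, 3), dtype="float32")

# Interior is 4 x 4 x 3 = 48 values; the full tensor has 192 elements.
print(layer_loss(activation, coeff=1.0))  # 0.25
```

Note that the normalizer counts every element, including the border pixels excluded from the sum, so the result is slightly smaller than a mean over the interior alone.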

Set up the gradient ascent loop for one octave

@tf.function
def gradient_ascent_step(img, learning_rate):
    with tf.GradientTape() as tape:
        tape.watch(img)
        loss = compute_loss(img)
    # Compute gradients.
    grads = tape.gradient(loss, img)
    # Normalize gradients.
    grads /= tf.maximum(tf.reduce_mean(tf.abs(grads)), 1e-6)
    img += learning_rate * grads
    return loss, img


def gradient_ascent_loop(img, iterations, learning_rate, max_loss=None):
    for i in range(iterations):
        loss, img = gradient_ascent_step(img, learning_rate)
        if max_loss is not None and loss > max_loss:
            break
        print("... Loss value at step %d: %.2f" % (i, loss))
    return img
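
The effect of the gradient normalization can be seen on a toy problem. This is a minimal NumPy sketch in which a sum of squares stands in for the activation-based loss and the gradient is written by hand; `toy_loss`, `toy_grad`, and `ascent_step` are hypothetical names, not part of the example above.

```python
import numpy as np


def toy_loss(x):
    # Stand-in for the activation-based loss: sum of squares.
    return np.sum(np.square(x))


def toy_grad(x):
    # Analytic gradient of toy_loss with respect to x.
    return 2.0 * x


def ascent_step(x, learning_rate):
    grads = toy_grad(x)
    # Same normalization as gradient_ascent_step: rescale the gradient so its
    # mean absolute value is 1, guarding against division by zero.
    grads = grads / np.maximum(np.mean(np.abs(grads)), 1e-6)
    return x + learning_rate * grads


x = np.array([1.0, -2.0, 0.5])
losses = [toy_loss(x)]
for _ in range(5):
    x = ascent_step(x, learning_rate=0.01)
    losses.append(toy_loss(x))

# Moving along the gradient increases the loss at every step.
print(all(b > a for a, b in zip(losses, losses[1:])))
```

Because the gradient is rescaled to unit mean magnitude, the average per-pixel change in each step is set by `learning_rate` alone, regardless of how large the raw gradients are.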

Run the training loop, iterating over different octaves

original_img = preprocess_image(base_image_path)
original_shape = original_img.shape[1:3]

successive_shapes = [original_shape]
for i in range(1, num_octave):
    shape = tuple([int(dim / (octave_scale**i)) for dim in original_shape])
    successive_shapes.append(shape)
successive_shapes = successive_shapes[::-1]
shrunk_original_img = tf.image.resize(original_img, successive_shapes[0])

img = tf.identity(original_img)  # Make a copy
for i, shape in enumerate(successive_shapes):
    print("Processing octave %d with shape %s" % (i, shape))
    img = tf.image.resize(img, shape)
    img = gradient_ascent_loop(
        img, iterations=iterations, learning_rate=step, max_loss=max_loss
    )
    upscaled_shrunk_original_img = tf.image.resize(shrunk_original_img, shape)
    same_size_original = tf.image.resize(original_img, shape)
    lost_detail = same_size_original - upscaled_shrunk_original_img

    img += lost_detail
    shrunk_original_img = tf.image.resize(original_img, shape)

keras.utils.save_img(result_prefix + ".png", deprocess_image(img.numpy()))
Processing octave 0 with shape (326, 489)
... Loss value at step 0: 0.45
... Loss value at step 1: 0.63
... Loss value at step 2: 0.91
... Loss value at step 3: 1.24
... Loss value at step 4: 1.57
... Loss value at step 5: 1.91
... Loss value at step 6: 2.20
... Loss value at step 7: 2.50
... Loss value at step 8: 2.82
... Loss value at step 9: 3.11
... Loss value at step 10: 3.40
... Loss value at step 11: 3.70
... Loss value at step 12: 3.95
... Loss value at step 13: 4.20
... Loss value at step 14: 4.48
... Loss value at step 15: 4.72
... Loss value at step 16: 4.99
... Loss value at step 17: 5.23
... Loss value at step 18: 5.47
... Loss value at step 19: 5.69
Processing octave 1 with shape (457, 685)
... Loss value at step 0: 1.11
... Loss value at step 1: 1.77
... Loss value at step 2: 2.35
... Loss value at step 3: 2.82
... Loss value at step 4: 3.25
... Loss value at step 5: 3.67
... Loss value at step 6: 4.05
... Loss value at step 7: 4.44
... Loss value at step 8: 4.79
... Loss value at step 9: 5.15
... Loss value at step 10: 5.50
... Loss value at step 11: 5.84
... Loss value at step 12: 6.18
... Loss value at step 13: 6.49
... Loss value at step 14: 6.82
... Loss value at step 15: 7.12
... Loss value at step 16: 7.42
... Loss value at step 17: 7.71
... Loss value at step 18: 8.01
... Loss value at step 19: 8.30
Processing octave 2 with shape (640, 960)
... Loss value at step 0: 1.27
... Loss value at step 1: 2.02
... Loss value at step 2: 2.63
... Loss value at step 3: 3.15
... Loss value at step 4: 3.66
... Loss value at step 5: 4.12
... Loss value at step 6: 4.58
... Loss value at step 7: 5.01
... Loss value at step 8: 5.42
... Loss value at step 9: 5.80
... Loss value at step 10: 6.19
... Loss value at step 11: 6.54
... Loss value at step 12: 6.89
... Loss value at step 13: 7.22
... Loss value at step 14: 7.57
... Loss value at step 15: 7.88
... Loss value at step 16: 8.21
... Loss value at step 17: 8.53
... Loss value at step 18: 8.80
... Loss value at step 19: 9.10

Display the result.

You can use the trained model hosted on Hugging Face Hub and try the demo on Hugging Face Spaces.

display(Image(result_prefix + ".png"))
