StableDiffusion image-generation model

`StableDiffusion` class

keras_cv.models.StableDiffusion(img_height=512, img_width=512, jit_compile=True)

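Keras implementation of Stable Diffusion.

Note that the StableDiffusion API, as well as the APIs of StableDiffusion sub-components (e.g. ImageEncoder, DiffusionModel), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image generation model that can be used, among other things, to generate pictures according to a short text description (called a "prompt").

Arguments

img_height: int, height of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.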
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.
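
The rounding described for `img_height` and `img_width` can be pictured with a small helper. This is only a sketch: `nearest_multiple_of_128` is a hypothetical name that is not part of KerasCV, and the handling of values exactly halfway between two multiples is an assumption (Python's built-in `round` is used here).

```python
def nearest_multiple_of_128(size):
    """Round a requested pixel size to the closest multiple of 128.

    Hypothetical helper for illustration; KerasCV performs its own
    rounding internally and may treat exact ties differently.
    """
    return round(size / 128) * 128


# Show what a few requested sizes would become under this assumption.
for requested in (512, 500, 600, 700):
    print(requested, "->", nearest_multiple_of_128(requested))
```

Under this assumption, a request such as `img_height=500` would be generated at 512 pixels, so it may be simpler to pass a multiple of 128 directly.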

Example

from keras_cv.models import StableDiffusion
from PIL import Image

model = StableDiffusion(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

`StableDiffusionV2` class

keras_cv.models.StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)

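Keras implementation of Stable Diffusion v2.

Note that the StableDiffusion API, as well as the APIs of StableDiffusionV2 sub-components (e.g. ImageEncoder, DiffusionModelV2), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image generation model that can be used, among other things, to generate pictures according to a short text description (called a "prompt").

Arguments

img_height: int, height of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Note that only multiples of 128 are supported; the value provided will be rounded to the nearest valid value. Defaults to 512.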
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.

Example

from keras_cv.models import StableDiffusionV2
from PIL import Image

model = StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

`Decoder` class

keras_cv.models.stable_diffusion.Decoder(
    img_height, img_width, name=None, download_weights=True
)

`Sequential` groups a linear stack of layers into a `Model`.

Examples

import keras
import numpy as np

model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))

# Note that you can also omit the initial `Input`.
# In that case the model doesn't have any weights until the first call
# to a training/evaluation method (since it isn't yet built):
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
# model.weights not created yet

# Whereas if you specify an `Input`, the model gets built
# continuously as you are adding layers:
model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))
len(model.weights)  # Returns "2"

# When using the delayed-build pattern (no input shape specified), you can
# choose to manually build your model by calling
# `build(batch_input_shape)`:
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
model.build((None, 16))
len(model.weights)  # Returns "4"

# Note that when using the delayed-build pattern (no input shape specified),
# the model gets built the first time you call `fit`, `evaluate`, or `predict`,
# or the first time you call the model on some input data.
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(1))
model.compile(optimizer='sgd', loss='mse')
# Placeholder training data (not part of the original example):
# 64 samples with 16 features each and one target value per sample.
x = np.random.rand(64, 16)
y = np.random.rand(64, 1)
# This builds the model for the first time:
model.fit(x, y, batch_size=32, epochs=10)
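
The weight counts quoted in the comments above (2 and 4) follow from each `Dense` layer owning two weight tensors, a kernel and a bias, once it is built. The sketch below reproduces that arithmetic in plain Python without Keras; `dense_weight_shapes` and `stack_weight_shapes` are hypothetical helpers, and the sketch assumes the default `use_bias=True`.

```python
def dense_weight_shapes(input_dim, units):
    """Shapes of the weight tensors of a built Dense layer: kernel, then bias.

    Hypothetical helper for illustration only; assumes use_bias=True.
    """
    return [(input_dim, units), (units,)]


def stack_weight_shapes(input_dim, layer_units):
    """Collect weight shapes for a stack of Dense layers fed `input_dim` features."""
    shapes = []
    for units in layer_units:
        shapes.extend(dense_weight_shapes(input_dim, units))
        input_dim = units  # the next layer consumes this layer's output
    return shapes


print(len(stack_weight_shapes(16, [8])))     # one Dense layer
print(len(stack_weight_shapes(16, [8, 4])))  # two Dense layers
print(stack_weight_shapes(16, [8, 4]))
```

Before the input size is known, the kernel shape cannot be determined, which is why a delayed-build model has no weights until it is built or called.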

`DiffusionModel` class

keras_cv.models.stable_diffusion.DiffusionModel(
    img_height, img_width, max_text_length, name=None, download_weights=True
)

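A model grouping layers into an object with training/inference features.

There are three ways to instantiate a `Model`:

With the "Functional API"

You start from `Input`, chain layer calls to specify the model's forward pass, and finally create your model from the inputs and outputs: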

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

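Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of the model.

Example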

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

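Note that the `backbone` and `activations` models are not created with `keras.Input` objects, but with tensors that originate from `keras.Input` objects. Under the hood, the layers and weights are shared across these models, so you can train `full_model` and use `backbone` or `activations` for feature extraction. The inputs and outputs of the model can also be nested structures of tensors, and the created models are standard Functional API models that support all the existing APIs.

By subclassing the `Model` class

In that case, you should define your layers in `__init__()` and implement the model's forward pass in `call()`.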

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

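If you subclass `Model`, you can optionally have a `training` argument (boolean) in `call()`, which you can use to specify different behavior in training and inference: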

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()
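
To make the effect of the `training` flag concrete, here is a plain-Python sketch of what a dropout step does in each mode. It is not Keras's implementation: `dropout` is a hypothetical function, and scaling the surviving values by `1 / (1 - rate)` (inverted dropout) is the commonly used convention, stated here as an assumption.

```python
import random


def dropout(values, rate=0.5, training=False, rng=None):
    """Inverted dropout on a list of floats (illustrative sketch).

    With training=False the input passes through unchanged. With
    training=True each value is zeroed with probability `rate`, and the
    surviving values are scaled by 1 / (1 - rate).
    """
    if not training:
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * scale for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
# Inference mode: the values are returned as they came in.
print(dropout(activations, training=False))
# Training mode: each entry is either 0.0 or doubled (rate=0.5).
print(dropout(activations, training=True, rng=random.Random(0)))
```

This is why `call()` forwards `training` to the dropout layer: the same model object can behave stochastically under `fit()` and deterministically under `predict()`.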

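Once the model is created, you can configure its losses and metrics with `model.compile()`, train it with `model.fit()`, or use it to make predictions with `model.predict()`.

With the `Sequential` class

In addition, `keras.Sequential` is a special case of model where the model is purely a stack of single-input, single-output layers.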

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])

`ImageEncoder` class

keras_cv.models.stable_diffusion.ImageEncoder(download_weights=True)

ImageEncoder is the VAE Encoder for StableDiffusion.

`NoiseScheduler` class

keras_cv.models.stable_diffusion.NoiseScheduler(
    train_timesteps=1000,
    beta_start=0.0001,
    beta_end=0.02,
    beta_schedule="linear",
    variance_type="fixed_small",
    clip_sample=True,
)

Arguments

train_timesteps: number of diffusion steps used to train the model.
beta_start: the starting `beta` value of inference.
beta_end: the final `beta` value.
beta_schedule: the beta schedule, a mapping from a beta range to a
    sequence of betas for stepping the model. Choose from `linear` or
    `quadratic`.
variance_type: options to clip the variance used when adding noise to
    the de-noised sample. Choose from `fixed_small`, `fixed_small_log`,
    `fixed_large`, `fixed_large_log`, `learned` or `learned_range`.
clip_sample: option to clip predicted sample between -1 and 1 for
    numerical stability.
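
As a rough illustration of how `beta_start`, `beta_end`, `train_timesteps` and `beta_schedule` relate, the sketch below builds a beta sequence in plain Python. It is a sketch under stated assumptions rather than the KerasCV code: `make_betas` and `cumulative_alphas` are hypothetical names, the `linear` schedule is taken to be an even spacing between the two endpoints, and `quadratic` is taken to be an even spacing of the square roots that is then squared, as in common DDPM-style schedulers.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (assumes num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def make_betas(beta_start=0.0001, beta_end=0.02, train_timesteps=1000,
               beta_schedule="linear"):
    """Build one beta per training timestep (hypothetical helper)."""
    if beta_schedule == "linear":
        return linspace(beta_start, beta_end, train_timesteps)
    if beta_schedule == "quadratic":
        # Space the square roots evenly, then square them (assumed convention).
        roots = linspace(beta_start ** 0.5, beta_end ** 0.5, train_timesteps)
        return [r ** 2 for r in roots]
    raise ValueError(f"Unsupported beta_schedule: {beta_schedule!r}")


def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. the share of signal left per step."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


betas = make_betas()
alphas_cumprod = cumulative_alphas(betas)
print(len(betas), betas[0], betas[-1])
print(alphas_cumprod[0], alphas_cumprod[-1])
```

With both schedules the endpoints are the same; under these assumptions the quadratic variant keeps the early betas smaller, so noise is added more gently at the start.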

`SimpleTokenizer` class

keras_cv.models.stable_diffusion.SimpleTokenizer(bpe_path=None)

`TextEncoder` class

keras_cv.models.stable_diffusion.TextEncoder(
    max_length, vocab_size=49408, name=None, download_weights=True
)

A model grouping layers into an object with training/inference features.

`TextEncoderV2` class

keras_cv.models.stable_diffusion.TextEncoderV2(
    max_length, vocab_size=49408, name=None, download_weights=True
)

A model grouping layers into an object with training/inference features.
