Notebook 07: Cannot save model checkpoint for FoodVision Big #449
Comments
After troubleshooting this for a while, it seems there may be something up with the … What exactly, I'm not sure. It could be due to the use of … In saying that, a fix I've found to demonstrate the "cloning" and loading of weights is to create a copy of the model by using the exact same code to create it:

import tensorflow as tf
from tensorflow.keras import layers

# Create a function to recreate the original model
def create_model():
    # Create base model
    input_shape = (224, 224, 3)
    base_model = tf.keras.applications.efficientnet.EfficientNetB0(include_top=False)
    base_model.trainable = False  # freeze base model layers

    # Create Functional model
    inputs = layers.Input(shape=input_shape, name="input_layer")
    # Note: EfficientNetBX models have rescaling built-in but if your model didn't you could have a layer like below
    # x = layers.Rescaling(1./255)(x)
    x = base_model(inputs, training=False)  # set base_model to inference mode only
    x = layers.GlobalAveragePooling2D(name="pooling_layer")(x)
    x = layers.Dense(len(class_names))(x)  # want one output neuron per class
    # Separate activation of output layer so we can output float32 activations
    outputs = layers.Activation("softmax", dtype=tf.float32, name="softmax_float32")(x)
    model = tf.keras.Model(inputs, outputs)
    return model

# Create and compile a new version of the original model (new weights)
created_model = create_model()
created_model.compile(loss="sparse_categorical_crossentropy",
                      optimizer=tf.keras.optimizers.Adam(),
                      metrics=["accuracy"])

# Load the saved weights
created_model.load_weights(checkpoint_path)

# Evaluate the model with loaded weights
results_created_model_with_loaded_weights = created_model.evaluate(test_data)

# Compare results with original model
import numpy as np
assert np.isclose(results_feature_extract_model, results_created_model_with_loaded_weights).all(), "Loaded weights results are not close to original model."  # check if all elements in array are close

In short, instead of using …
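
For anyone wondering why the comparison uses `np.isclose` rather than `==`: evaluation results are floats, so two runs over the same weights can differ by tiny rounding amounts. Here's a minimal sketch of how the check behaves. The `[loss, accuracy]` numbers are made up for illustration and are not results from the notebook.

```python
import numpy as np

# Hypothetical [loss, accuracy] pairs standing in for model.evaluate() outputs
results_original = [1.0881, 0.7065]
results_reloaded = [1.0881000001, 0.7065]  # same weights, tiny float difference
results_different = [1.2510, 0.6620]       # e.g. weights that were never loaded

# np.isclose compares element-wise within a tolerance (defaults: rtol=1e-05, atol=1e-08)
close_reloaded = np.isclose(results_original, results_reloaded)
close_different = np.isclose(results_original, results_different)

# .all() is True only if every element (loss and accuracy) is close
all_close = bool(close_reloaded.all())
any_mismatch = not bool(close_different.all())
```

If the assertion fails with values that are only slightly off, the tolerance can be loosened through the `rtol` and `atol` arguments of `np.isclose`.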
Continuing this here: #550
In short, it looks like TensorFlow 2.13+ (available via …
Getting an error when training FoodVision Big:
Looks like it's an issue with the model_checkpoint callback. This causes the assertion for the cloned model later on to fail:
Need to update the model checkpoint to make sure it can save a model whilst training.
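
One way to sanity-check that a checkpoint callback saves weights during training and that they can be loaded back is to try it on a tiny stand-in model first. The sketch below is an assumption-heavy toy, not the FoodVision Big setup: `create_toy_model`, `compile_model`, the random data, and the `.weights.h5` filename are all hypothetical, and whether this resolves the error reported in this issue is not confirmed here. It uses a weights-only `tf.keras.callbacks.ModelCheckpoint`, then rebuilds the model from the same code and loads the saved weights.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def create_toy_model(num_classes=3):
    """Small stand-in model (hypothetical, not the EfficientNetB0 model from the notebook)."""
    inputs = layers.Input(shape=(8,), name="input_layer")
    x = layers.Dense(16, activation="relu", name="hidden_layer")(inputs)
    outputs = layers.Dense(num_classes, activation="softmax", name="output_layer")(x)
    return tf.keras.Model(inputs, outputs)


def compile_model(model):
    """Compile with the same loss/optimizer/metrics settings used in the thread."""
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=["accuracy"])
    return model


# Random stand-in data (assumption: replaces the real training/test datasets)
rng = np.random.default_rng(42)
X = rng.normal(size=(64, 8)).astype("float32")
y = rng.integers(0, 3, size=(64,))

# Weights-only checkpoint; the ".weights.h5" suffix is an assumption chosen
# because newer Keras versions require it for weights-only saving
checkpoint_path = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.weights.h5")
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path,
    save_weights_only=True,  # save weights only, not the full model
    save_best_only=False,    # save at the end of every epoch
    verbose=0)

# Train briefly with the checkpoint callback attached
model = compile_model(create_toy_model())
model.fit(X, y, epochs=2, batch_size=16, verbose=0, callbacks=[model_checkpoint])
results_original = model.evaluate(X, y, verbose=0)

# Recreate the model with the same code, then load the saved weights
created_model = compile_model(create_toy_model())
created_model.load_weights(checkpoint_path)
results_loaded = created_model.evaluate(X, y, verbose=0)

# Both the raw weights and the evaluation results should match
weights_match = all(np.allclose(a, b)
                    for a, b in zip(model.get_weights(), created_model.get_weights()))
```

Because the checkpoint is written at the end of every epoch here, the file on disk holds the final weights, so the recreated model should evaluate the same as the original. With `save_best_only=True` the saved weights could come from an earlier epoch, and the comparison would need to be made against the best epoch's results instead.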