Saving & Loading
A fitted GlassBoxUMAP is a PyTorch model, plus the bits of state that connect raw input features to that model (PCA basis, feature means, training hyperparameters). The save and load methods round-trip all of this to a single file on disk.
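To make the single-file idea concrete, here is a minimal sketch of bundling model weights with preprocessing state and round-tripping the bundle through one file. It does not show GlassBoxUMAP's actual checkpoint layout. The key names (`weights`, `pca_basis`, `feature_means`, `hyperparameters`) are hypothetical, and the standard library's `pickle` stands in for PyTorch's serialization so the example runs without extra dependencies.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical bundle: everything needed to go from raw features to an embedding.
# These keys are illustrative assumptions, not the library's real checkpoint format.
bundle = {
    "weights": {"layer0": [[0.1, -0.2], [0.3, 0.4]]},  # stand-in for learned parameters
    "pca_basis": [[1.0, 0.0], [0.0, 1.0]],             # stand-in for the PCA components
    "feature_means": [0.5, 0.5],                       # stand-in for the centering vector
    "hyperparameters": {"n_components": 2},            # stand-in for training settings
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "bundle.pkl"
    # Write the whole bundle to a single file...
    path.write_bytes(pickle.dumps(bundle))
    # ...and read it back as one unit.
    restored = pickle.loads(path.read_bytes())

# The restored bundle carries the same state as the original.
round_trip_ok = restored == bundle
```

Keeping the preprocessing state next to the weights matters because the weights alone cannot reproduce an embedding from raw features.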
Fit a model
We’ll use scikit-learn’s digits dataset, which contains 1,797 handwritten digit images, each flattened to 64 features (8x8 grayscale pixels).
from glass_box_umap import GlassBoxUMAP
def load_data():
    from sklearn.datasets import load_digits
    from sklearn.preprocessing import StandardScaler
    digits, _ = load_digits(return_X_y=True)
    return StandardScaler().fit_transform(digits)
X = load_data()
embedder = GlassBoxUMAP(quiet=True)
embedder.fit(X)
Save the model
save takes a path and writes a PyTorch checkpoint.
from pathlib import Path
model_path = Path.cwd() / "embedder.pt"
embedder.save(model_path)
print(f"saved model ({model_path.stat().st_size / 1024:.1f} KiB) to {model_path.name}")
saved model (167.2 KiB) to embedder.pt
Load the model
GlassBoxUMAP.load is a classmethod that reconstructs the embedder from a checkpoint.
loaded = GlassBoxUMAP.load(model_path)
The reloaded embedder is functionally identical to the original. We can confirm that with ==:
loaded == embedder
True
Note
== performs a semantic comparison: it checks that both embedders describe the same trained model (matching architecture, hyperparameters, learned weights, PCA fit, and centering vector) and ignores incidental runtime state such as device placement, logging verbosity, and DataLoader workers. Two loads of the same file therefore always compare equal, even if one was moved to CPU and the other left on GPU.
Concretely, that means transform and compute_contributions produce bitwise-identical outputs from either embedder:
import numpy as np
Z_original = embedder.transform(X)
Z_loaded = loaded.transform(X)
assert np.array_equal(Z_original, Z_loaded)
C_original = embedder.compute_contributions(X)
C_loaded = loaded.compute_contributions(X)
assert np.array_equal(C_original, C_loaded)
print("embeddings and contributions match exactly")
embeddings and contributions match exactly
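The semantic comparison described in the note above can be illustrated with a toy class. This is only a sketch of the general pattern, not GlassBoxUMAP's implementation: `ToyEmbedder` and its fields are hypothetical, and it relies on the standard library's `dataclasses.field(compare=False)` to leave runtime state out of the generated `__eq__`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEmbedder:
    # Fields that define the trained model take part in ==.
    hyperparameters: dict
    weights: tuple
    feature_means: tuple
    # Incidental runtime state is excluded from the generated __eq__.
    device: str = field(default="cpu", compare=False)
    quiet: bool = field(default=False, compare=False)


a = ToyEmbedder({"n_components": 2}, (0.1, 0.2), (0.5, 0.5), device="cuda", quiet=True)
b = ToyEmbedder({"n_components": 2}, (0.1, 0.2), (0.5, 0.5), device="cpu")
c = ToyEmbedder({"n_components": 2}, (0.1, 0.9), (0.5, 0.5))

same_model = a == b       # True: the two differ only in runtime state
different_model = a == c  # False: the learned weights differ
```

A real embedder would compare tensors rather than tuples, so its equality check would need an element-wise comparison in place of the plain `==` used here.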
Caveats
Custom encoders
If you fit a model with a custom encoder, the process that loads it must register the same encoder before calling load. See Saving/Loading in the Custom Encoders guide for details.
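One common reason for a register-before-load requirement is that a checkpoint stores only the encoder's name, and the loader looks that name up in a registry to rebuild the class. The sketch below shows that pattern in plain Python. Everything in it (`ENCODER_REGISTRY`, `register_encoder`, `load_from_checkpoint`, `TinyMLP`, and the checkpoint keys) is hypothetical and is not GlassBoxUMAP's API; the Custom Encoders guide describes the real mechanism.

```python
# Hypothetical registry mapping encoder names to classes; not GlassBoxUMAP's API.
ENCODER_REGISTRY = {}


def register_encoder(name):
    """Class decorator that records an encoder class under `name`."""
    def decorator(cls):
        ENCODER_REGISTRY[name] = cls
        return cls
    return decorator


def load_from_checkpoint(checkpoint):
    """Rebuild an encoder from a checkpoint that stores only its registered name."""
    name = checkpoint["encoder_name"]
    if name not in ENCODER_REGISTRY:
        raise KeyError(f"encoder {name!r} is not registered; register it before loading")
    return ENCODER_REGISTRY[name](**checkpoint["encoder_kwargs"])


checkpoint = {"encoder_name": "tiny_mlp", "encoder_kwargs": {"hidden": 16}}

# Loading before registration fails: the loader cannot resolve the name.
try:
    load_from_checkpoint(checkpoint)
    failed_before_registration = False
except KeyError:
    failed_before_registration = True


@register_encoder("tiny_mlp")
class TinyMLP:
    def __init__(self, hidden):
        self.hidden = hidden


# After registration, the same checkpoint loads.
encoder = load_from_checkpoint(checkpoint)
```

In practice this usually means importing the module that defines and registers the encoder at the top of any script that loads the checkpoint.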