
# VAE Training

The Variational Autoencoder (VAE) is the first stage of the Chemeleon2 pipeline. It learns to encode crystal structures into a continuous latent space.

## What VAE Does

The encoder maps each crystal structure to a low-dimensional latent vector; the decoder reconstructs the structure from that vector. Training balances reconstruction accuracy against a KL divergence regularizer (weighted by `kl_weight`). For architectural details, see VAE Module.
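
To make the objective concrete, here is a minimal, hypothetical sketch of a VAE forward pass and loss. The class and function names are illustrative, not Chemeleon2's actual code; only `latent_dim`, `hidden_dim`, and `kl_weight` correspond to the hyperparameters listed under Configuration below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVAE(nn.Module):
    """Illustrative VAE skeleton, not the actual Chemeleon2 implementation."""

    def __init__(self, input_dim: int, hidden_dim: int = 512, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)      # posterior mean
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)  # posterior log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, input_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients w.r.t. mu/logvar.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar, kl_weight: float = 1e-5):
    recon = F.mse_loss(x_hat, x)
    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl
```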

## Quick Start

```bash
# Train VAE on MP-20 dataset (src/train_vae.py)
python src/train_vae.py experiment=mp_20/vae_dng
```

- Training script: `src/train_vae.py`
- Example config: `configs/experiment/mp_20/vae_dng.yaml`

## Training Commands

### Basic Training

```bash
# Use experiment config
python src/train_vae.py experiment=mp_20/vae_dng

# Override training parameters
python src/train_vae.py experiment=mp_20/vae_dng \
    trainer.max_epochs=3000 \
    data.batch_size=128
```

### Resume from Checkpoint

```bash
python src/train_vae.py experiment=mp_20/vae_dng \
    ckpt_path=ckpts/vae_checkpoint.ckpt
```
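
If a resumed run fails to load, it can help to inspect the checkpoint first. This is a generic PyTorch Lightning checkpoint sketch, not Chemeleon2-specific; the keys shown (`epoch`, `global_step`, `state_dict`) are standard Lightning fields.

```python
import torch

# Load on CPU to inspect without a GPU. weights_only=False is needed on
# newer PyTorch because Lightning checkpoints contain non-tensor objects.
ckpt = torch.load("ckpts/vae_checkpoint.ckpt", map_location="cpu", weights_only=False)

# Standard Lightning checkpoints store training progress alongside weights.
print("epoch:", ckpt.get("epoch"))
print("global step:", ckpt.get("global_step"))
print("num weight tensors:", len(ckpt["state_dict"]))
```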

## Configuration

### Key Hyperparameters

| Parameter | Default | Description |
| --- | --- | --- |
| `latent_dim` | 8 | Dimension of the latent space |
| `hidden_dim` | 512 | Hidden dimension in encoder/decoder (`d_model`) |
| `num_layers` | 8 | Number of transformer layers |
| `kl_weight` | 1e-5 | KL divergence loss weight |

### Example Config Override

```bash
python src/train_vae.py experiment=mp_20/vae_dng \
    vae_module.latent_dim=16 \
    vae_module.kl_weight=1e-4
```
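
To check what a set of overrides resolves to before launching a run, Hydra's compose API can print the merged config. This is a generic Hydra pattern; the `config_path` and root `config_name` below are assumptions about the repo layout, not verified values.

```python
from hydra import initialize, compose
from omegaconf import OmegaConf

# Assumed layout: adjust config_path/config_name to match the repo.
with initialize(version_base=None, config_path="configs"):
    cfg = compose(
        config_name="train_vae",  # hypothetical root config name
        overrides=[
            "experiment=mp_20/vae_dng",
            "vae_module.latent_dim=16",
            "vae_module.kl_weight=1e-4",
        ],
    )

# Print the resolved VAE settings before committing to a training run.
print(OmegaConf.to_yaml(cfg.vae_module))
```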

## Available Experiments

| Experiment | Dataset | Description |
| --- | --- | --- |
| `mp_20/vae_dng` | MP-20 | VAE for de novo generation |

## Training Tips

### Monitoring

Key metrics to watch in WandB include the reconstruction loss and the KL divergence term; the exact metric names depend on the experiment's logger configuration.
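
As one way to pull those curves for offline analysis, here is a sketch using the public wandb API; the `entity/project/run_id` path and the metric keys are placeholders to replace with your own.

```python
import wandb

# Connect to the W&B API (requires `wandb login` beforehand).
api = wandb.Api()

# Placeholder run path: replace with your entity/project/run_id.
run = api.run("my-entity/chemeleon2/abc123")

# Fetch logged history as a pandas DataFrame; these metric names are
# placeholders and depend on the experiment's logger configuration.
history = run.history(keys=["train/loss", "val/loss"])
print(history.tail())
```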

### Typical Training

## Next Steps

After training the VAE:

1. Note the checkpoint path (one way to locate the latest checkpoint is sketched below).

2. Proceed to LDM Training using your VAE checkpoint.
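
Checkpoint locations depend on the logger/callback configuration; a minimal sketch for finding the most recent checkpoint, assuming they land under `ckpts/` as in the resume example above, might look like this:

```python
from pathlib import Path

# Assumption: checkpoints land under ckpts/, matching the resume
# example above; adjust the directory and pattern to your setup.
ckpt_dir = Path("ckpts")
ckpts = (
    sorted(ckpt_dir.glob("**/*.ckpt"), key=lambda p: p.stat().st_mtime)
    if ckpt_dir.is_dir()
    else []
)

if ckpts:
    print("Latest VAE checkpoint:", ckpts[-1])  # pass this path to the LDM stage
else:
    print(f"No checkpoints found under {ckpt_dir}/")
```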