This guide covers evaluating generated crystal structures against reference datasets using the available metrics.
Prerequisites¶
Before running evaluation metrics, download and extract the reference dataset files containing structure embeddings, composition features, and phase diagram data.
Download from Figshare¶
You can download directly from the web:
Download benchmarks_mp_20.tar.gz from Figshare
Or use the command line (from project root):
# Download the reference dataset
curl -L -A "Mozilla/5.0" -o benchmarks_mp_20.tar.gz https://figshare.com/ndownloader/files/59462369
# Extract the dataset
tar -zxvf benchmarks_mp_20.tar.gz
This will create the following directory structure:
benchmarks/
└── assets/
├── mp_20_all_composition_features.pt # VAE composition embeddings for diversity metrics
├── mp_20_all_structure_features.pt # VAE structure embeddings for diversity metrics
├── mp_20_all_structure.json.gz # MP-20 reference structures for novelty checking
├── mp_all_unique_structure_250416.json.gz # All MP unique structures for novelty checking
└── ppd-mp_all_entries_uncorrected_250409.pkl.gz # Phase diagram data for energy above hull
These files contain the reference data required for computing evaluation metrics against the MP-20 dataset.
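Before running any metrics, you can quickly verify that all five asset files were extracted. This is a minimal sanity-check sketch; the paths simply mirror the directory tree above:
from pathlib import Path

# Expected reference assets (paths mirror the directory tree above)
assets = Path("benchmarks/assets")
expected = [
    "mp_20_all_composition_features.pt",
    "mp_20_all_structure_features.pt",
    "mp_20_all_structure.json.gz",
    "mp_all_unique_structure_250416.json.gz",
    "ppd-mp_all_entries_uncorrected_250409.pkl.gz",
]
missing = [name for name in expected if not (assets / name).exists()]
print("All assets present" if not missing else f"Missing: {missing}")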
Generate Samples¶
Generate crystal structures using a pre-trained LDM model. (The default model is trained on the alex-mp-20 dataset.)
# Generate 10000 samples with 2000 batch size using DDIM sampler
python src/sample.py --num_samples=10000 --batch_size=2000 --output_dir=outputs/samples
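Once sampling finishes, you can inspect the output before evaluating it. This is a minimal sketch that assumes the sampler writes a structures.json.gz file into --output_dir, matching the file layout used in the Python API example below:
from monty.serialization import loadfn

# Assumed output filename; matches the pattern in the Python API example below
structures = loadfn("outputs/samples/structures.json.gz")
print(f"Generated {len(structures)} structures")
print(structures[0])  # Inspect the first generated structure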
Evaluate Models¶
Evaluate generated structures against reference datasets (e.g., MP-20) to assess quality and diversity.
Generate and Evaluate Together¶
Generate new structures and evaluate them in one command:
python src/evaluate.py \
--model_path=ckpts/mp_20/ldm/ldm_null.ckpt \
--structure_path=outputs/eval_samples \
--reference_dataset=mp-20 \
--num_samples=10000 \
--batch_size=2000
Evaluate Pre-generated Structures¶
If you already have generated structures:
python src/evaluate.py \
--structure_path=outputs/dng_samples \
--reference_dataset=mp-20 \
--output_file=benchmark/results/my_results.csv
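Because the results are written as CSV, they are easy to post-process. This is a minimal sketch using pandas; the column names depend on which metrics were computed:
import pandas as pd

# Load the evaluation results written by src/evaluate.py
results = pd.read_csv("benchmark/results/my_results.csv")
print(results.head())
print(results.columns.tolist())  # Inspect which metric columns were produced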
Evaluation Metrics¶
The evaluation script computes several metrics to assess generation quality, including uniqueness, novelty, energy above hull, and composition validity.
For implementation details, see src/utils/metrics.py.
Python API Usage¶
You can also compute metrics using the Python API directly:
from monty.serialization import loadfn
from src.utils.metrics import Metrics
# Load generated structures
gen_structures = loadfn("outputs/eval_samples/structures.json.gz")
# Create metrics object
metrics = Metrics(
metrics=["unique", "novel", "e_above_hull", "composition_validity"],
reference_dataset="mp-20",
phase_diagram="mp-all",
metastable_threshold=0.1,
progress_bar=True,
)
# Compute metrics
results = metrics.compute(gen_structures=gen_structures)
# Save results
metrics.to_csv("outputs/results.csv")
# Or get as DataFrame
df = metrics.to_dataframe()
print(df.head())
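You can also reduce the per-structure results to a single summary number, for example the fraction of metastable structures. This is a sketch; the column name e_above_hull is an assumption about the DataFrame layout:
# Hypothetical post-processing: assumes the DataFrame has a per-structure
# "e_above_hull" column (name and eV/atom units are assumptions)
metastable_fraction = (df["e_above_hull"] <= 0.1).mean()
print(f"Metastable fraction: {metastable_fraction:.1%}")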
Reference Datasets¶
Available reference datasets:
mp-20: Materials Project structures with ≤20 atoms
alex-mp-20: Alexandria MP structures with ≤20 atoms
Results are saved to the specified output file in CSV format for further analysis.
Benchmarks for Chemeleon2 DNG¶
Pre-computed benchmark results for de novo generation (DNG) are available in the benchmarks/dng/ directory:
MP-20:
benchmarks/dng/chemeleon2_rl_dng_mp_20.json.gz - 10,000 generated structures using RL-trained model on MP-20
Alex-MP-20:
benchmarks/dng/chemeleon2_rl_dng_alex_mp_20.json.gz - 10,000 generated structures using RL-trained model on Alex-MP-20
Loading Benchmark Data¶
These files contain generated crystal structures in compressed JSON format:
from monty.serialization import loadfn
# Load benchmark structures
structures = loadfn("benchmarks/dng/chemeleon2_rl_dng_mp_20.json.gz")
print(f"Loaded {len(structures)} structures")