Learn to create a custom reward that maximizes atomic density in generated crystals.
Objective¶
Create a reward function that encourages denser crystal structures:
Higher density = more mass packed per unit volume.
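pymatgen's Structure.density already gives the mass density in g/cm³, so it can be used directly as the reward signal. For example (bcc Fe chosen purely as an illustration, not part of the repo):

```python
from pymatgen.core import Lattice, Structure

# bcc Fe conventional cell, used only to show what Structure.density returns
fe = Structure(Lattice.cubic(2.87), ["Fe", "Fe"], [[0, 0, 0], [0.5, 0.5, 0.5]])
print(f"{fe.density:.2f} g/cm³")  # ≈ 7.9 g/cm³, close to the experimental value for iron
```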
Step 1: Understand the CustomReward Class¶
The CustomReward class in src/rl_module/components.py is a placeholder for user-defined logic:
```python
class CustomReward(RewardComponent):
    """Wrapper for user-defined custom reward functions."""

    def compute(self, gen_structures: list[Structure], **kwargs) -> torch.Tensor:
        """Placeholder for custom reward function."""
        return torch.zeros(len(gen_structures))
```
The compute() method receives:
- `gen_structures`: list of pymatgen `Structure` objects
- Additional kwargs like `batch_gen`, `device`, `metrics_obj`
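As a toy illustration of this interface (the class name AtomCountReward is hypothetical, and whether compute() must honor the device kwarg is an assumption), a component that simply rewards atom count could look like:

```python
import torch
from pymatgen.core import Structure

from src.rl_module.components import RewardComponent


class AtomCountReward(RewardComponent):
    """Toy example: reward = number of atoms per structure."""

    def compute(self, gen_structures: list[Structure], **kwargs) -> torch.Tensor:
        device = kwargs.get("device", "cpu")  # assumption: placing rewards on this device is optional
        return torch.tensor([len(s) for s in gen_structures], dtype=torch.float32, device=device)
```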
Step 2: Implement Atomic Density Reward¶
Edit src/rl_module/components.py and modify the CustomReward class:
```python
class CustomReward(RewardComponent):
    """Atomic density reward - maximize atoms per unit volume."""

    def compute(self, gen_structures: list[Structure], **kwargs) -> torch.Tensor:
        """
        Compute atomic density for each structure.
        Returns higher rewards for denser structures.
        """
        rewards = []
        for structure in gen_structures:
            density = structure.density  # atomic mass / volume [g/cm³]
            rewards.append(density)
        return torch.as_tensor(rewards)
```
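A quick way to sanity-check the implementation outside of training (assuming RewardComponent can be instantiated with no arguments; the 2-atom cell below is a toy structure, not a real material):

```python
from pymatgen.core import Lattice, Structure

from src.rl_module.components import CustomReward

toy = Structure(Lattice.cubic(5.43), ["Si", "Si"], [[0, 0, 0], [0.25, 0.25, 0.25]])
print(CustomReward().compute([toy, toy]))  # one density value (g/cm³) per structure
```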
Step 3: Create Configuration File¶
See configs/custom_reward/atomic_density.yaml:
```yaml
# @package _global_
# RL Custom Reward Experiment Configuration

data:
  data_dir: ${paths.data_dir}/mp-20
  batch_size: 5

trainer:
  max_steps: 200

rl_module:
  ldm_ckpt_path: ${hub:alex_mp_20_ldm_base}
  vae_ckpt_path: ${hub:alex_mp_20_vae}
  rl_configs:
    num_group_samples: 64
    group_reward_norm: true
  reward_fn:
    normalize_fn: std
    components:
      - _target_: custom_reward.atomic_density.AtomicDensityReward

logger:
  wandb:
    name: rl_custom_reward
```
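Note that `_target_` is a dotted import path, so it must point at a class that is actually importable. The config above references custom_reward.atomic_density.AtomicDensityReward rather than the CustomReward class edited in Step 2; if you prefer the standalone layout, a minimal custom_reward/atomic_density.py mirroring the Step 2 logic (following the same pattern as the TargetDensityReward example later on) could look like the sketch below. Alternatively, point `_target_` at src.rl_module.components.CustomReward.

```python
"""Atomic density reward (hypothetical file: custom_reward/atomic_density.py)."""
import torch
from pymatgen.core import Structure

from src.rl_module.components import RewardComponent


class AtomicDensityReward(RewardComponent):
    """Reward each generated structure by its mass density in g/cm³."""

    def compute(self, gen_structures: list[Structure], **kwargs) -> torch.Tensor:
        return torch.as_tensor([s.density for s in gen_structures], dtype=torch.float32)
```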
Step 4: Run Training¶
```bash
python src/train_rl.py custom_reward=atomic_density
```
Training script: `src/train_rl.py`
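Because the configs are Hydra-based (note the `# @package _global_` header), individual values can normally be overridden on the command line as well; for example (keys taken from the config in Step 3):

```bash
# Standard Hydra override syntax; adjust keys to your config
python src/train_rl.py custom_reward=atomic_density trainer.max_steps=500 data.batch_size=8
```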
Step 5: Monitor Training¶
In WandB, watch these metrics:
| Metric | Description |
|---|---|
| train/reward | Mean reward from reward function (should increase) |
| val/reward | Validation reward |
| train/advantages | Normalized rewards used for policy gradient |
| train/kl_div | KL divergence from reference policy |
| train/entropy | Policy entropy |
| train/loss | Total policy loss |
As training progresses, the model should generate increasingly dense structures.
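For intuition on train/advantages: with num_group_samples: 64 and group_reward_norm: true, rewards are presumably standardized within each group of generated samples (GRPO-style). A rough sketch of that normalization, offered as an assumption rather than the repo's actual code:

```python
import torch

rewards = torch.randn(64)  # rewards for one group of num_group_samples generations
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
```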
Step 6: Evaluate Results¶
Generate Samples¶
```bash
python src/sample.py \
  --ldm_ckpt_path=logs/train_rl/runs/<your-run>/checkpoints/last.ckpt \
  --num_samples=10 \
  --output_dir=outputs/rl_samples
```
Analyze Density¶
```python
from monty.serialization import loadfn
import numpy as np

structures = loadfn("outputs/rl_samples/generated_structures.json.gz")
densities = [s.density for s in structures]
print(f"Mean density: {np.mean(densities):.3f} g/cm³")
print(f"Max density: {np.max(densities):.3f} g/cm³")
```
Extensions¶
Target Density¶
Instead of maximizing density, optimize toward a specific target. Create custom_reward/target_density.py:
"""Target density reward for RL training."""
import torch
from pymatgen.core import Structure
from src.rl_module.components import RewardComponent
class TargetDensityReward(RewardComponent):
"""Reward based on distance from target density."""
def __init__(self, target_density: float = 0.05, **kwargs):
super().__init__(**kwargs)
self.target_density = target_density
def compute(self, gen_structures: list[Structure], **kwargs) -> torch.Tensor:
rewards = []
for structure in gen_structures:
density = len(structure) / structure.lattice.volume
# Negative distance from target (higher = closer to target)
reward = -abs(density - self.target_density)
rewards.append(reward)
return torch.tensor(rewards, dtype=torch.float32)Create a config file (configs/custom_reward/rl_target_density.yaml):
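Note the units: unlike the g/cm³ mass density used earlier, this component scores number density in atoms/ų, consistent with its 0.05 default. For orientation (illustrative values, not from the repo), typical inorganic solids fall roughly between 0.03 and 0.1 atoms/ų:

```python
from pymatgen.core import Lattice, Structure

# Rock-salt NaCl conventional cell (8 atoms) as a reference point for number density
nacl = Structure.from_spacegroup(
    "Fm-3m", Lattice.cubic(5.64), ["Na", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]]
)
print(len(nacl) / nacl.lattice.volume)  # ≈ 0.045 atoms/ų
```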
Create a config file (configs/custom_reward/rl_target_density.yaml):
```yaml
# @package _global_
rl_module:
  reward_fn:
    components:
      - _target_: custom_reward.target_density.TargetDensityReward
        target_density: 0.05  # atoms/ų, matching the number density computed in compute()
```
Combining with Built-in Reward Components¶
Encourage structures that are not only dense but also stable and varied by combining the custom reward with the built-in EnergyReward and StructureDiversityReward:
```yaml
# @package _global_
rl_module:
  reward_fn:
    components:
      - _target_: custom_reward.atomic_density.AtomicDensityReward
        weight: 1.0
        normalize_fn: norm
      - _target_: src.rl_module.components.EnergyReward
        weight: 0.5
        normalize_fn: norm
      - _target_: src.rl_module.components.StructureDiversityReward
        weight: 0.5
        normalize_fn: norm
```
This encourages the model to generate structures that are dense, low-energy, and diverse.
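How weight and normalize_fn interact is not spelled out here; a plausible reading, sketched purely as an assumption (not repo-verified behavior), is a weighted sum of per-component normalized rewards:

```python
import torch

# Assumed combination rule: normalize each component's rewards, scale by its
# weight, and sum the results into a single reward per structure.
def combine(component_rewards: dict[str, torch.Tensor], weights: dict[str, float]) -> torch.Tensor:
    def norm(r: torch.Tensor) -> torch.Tensor:
        return (r - r.mean()) / (r.std() + 1e-8)

    total = torch.zeros_like(next(iter(component_rewards.values())))
    for name, r in component_rewards.items():
        total = total + weights[name] * norm(r)
    return total
```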
Summary¶
- Create your reward class in the `custom_reward/` folder
- Create a config in `configs/custom_reward/` referencing your reward
- Run training: `python src/train_rl.py custom_reward=your_config`
- Combine with other components for multi-objective optimization
Next Steps¶
- DNG Reward - Multi-objective optimization from the paper
- Predictor Reward - Use ML models as reward