Model Card for mlpf-clic-clusters-v2.2.0
This model reconstructs particles in a detector, based on the tracks and calorimeter clusters recorded by the detector.
Model Details
Performance is measured against truth-level jets and missing transverse energy (MET), i.e. jets and MET computed from generator-level Pythia particles. The main change with respect to v2.1.0 is the inclusion of a sqrt(pT) weight in the pT and energy loss terms.
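To illustrate what a sqrt(pT) weight does in a regression loss, the sketch below weights a per-particle squared error by the square root of the target pT, so that high-pT particles contribute more to the total. The function name, the squared-error base loss, and the normalization are assumptions made for illustration only; the loss actually used is defined in the particleflow repository at tag v2.2.0.

```python
import math


def sqrt_pt_weighted_loss(pred_pt, target_pt):
    """Mean squared error on pT, each term weighted by sqrt(target pT).

    Hypothetical helper: the base loss and the normalization are
    assumptions, not the exact v2.2.0 implementation.
    """
    if len(pred_pt) != len(target_pt):
        raise ValueError("pred_pt and target_pt must have the same length")
    if not target_pt:
        raise ValueError("at least one particle is required")
    total = 0.0
    for pred, target in zip(pred_pt, target_pt):
        weight = math.sqrt(target)  # the sqrt(pT) weight
        total += weight * (pred - target) ** 2
    return total / len(target_pt)


# The same 1 GeV error costs more on a 100 GeV particle than on a 1 GeV one.
low = sqrt_pt_weighted_loss([2.0], [1.0])       # weight sqrt(1) = 1
high = sqrt_pt_weighted_loss([101.0], [100.0])  # weight sqrt(100) = 10
print(low, high)
```

With such a weight, the optimizer is pushed to prioritize the energetic particles that dominate jet and MET sums.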
Jet performance
![ttbar jet resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_ttbar_pf/jet_response_iqr_over_med_pt.png)
![qq jet resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_qq_pf/jet_response_iqr_over_med_pt.png)
![WW fully hadronic jet resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_ww_fullhad_pf/jet_response_iqr_over_med_pt.png)
MET performance
![ttbar MET resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_ttbar_pf/met_response_iqr_over_med.png)
![qq MET resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_qq_pf/met_response_iqr_over_med.png)
![WW fully hadronic MET resolution](/jpata/particleflow/resolve/main/clic/clusters/v2.2.0/pyg-clic_20250106_193536_269746/plots_checkpoint-05-1.995116/clic_edm_ww_fullhad_pf/met_response_iqr_over_med.png)
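The plot file names (`jet_response_iqr_over_med_pt`, `met_response_iqr_over_med`) suggest that resolution is quantified as the interquartile range of the response divided by its median, where the response is the reconstructed value over the truth-level value. The sketch below shows that calculation using only the Python standard library; the function name, the toy numbers, and the quantile convention are assumptions, and the plots themselves are produced by the evaluation code in the repository, which also bins the jets in pT.

```python
import statistics


def response_iqr_over_median(reco, gen):
    """Return IQR/median of the per-object response reco/gen.

    Hypothetical helper: objects with non-positive truth values are skipped.
    """
    response = [r / g for r, g in zip(reco, gen) if g > 0]
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3
    q1, median, q3 = statistics.quantiles(response, n=4, method="inclusive")
    return (q3 - q1) / median


# Toy example: five matched jets, truth-level and reconstructed pT in GeV.
gen_pt = [20.0, 40.0, 60.0, 80.0, 100.0]
reco_pt = [18.0, 40.0, 63.0, 76.0, 110.0]
resolution = response_iqr_over_median(reco_pt, gen_pt)
print(f"IQR/median = {resolution:.3f}")
```

A smaller value means a narrower response distribution, i.e. better resolution; the median itself indicates the scale (bias) of the reconstruction.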
Model Description
- Developed by: Joosep Pata, Eric Wulff, Farouk Mokhtar, Mengke Zhang, David Southwick, Maria Girone, Javier Duarte, Michael Kagan
- Model type: transformer
- License: Apache License
Model Sources
Uses
Direct Use
This model may be used to study the physics and computational performance of ML-based reconstruction in simulation.
Out-of-Scope Use
This model is not intended for physics measurements on real data.
Bias, Risks, and Limitations
The model has only been trained on simulation data and has not been validated against real data. The model has not been peer reviewed or published in a peer-reviewed journal.
How to Get Started with the Model
Use the code below to get started with the model.
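# get the code
git clone https://github.com/jpata/particleflow
cd particleflow
git checkout v2.2.0
# get the models (large files on the Hugging Face Hub are tracked with Git LFS, so git-lfs should be installed first)
git clone https://huggingface.co/jpata/particleflow models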
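After cloning, the weights for this version still have to be located inside `models`. The sketch below searches the clone for checkpoint files; it assumes that checkpoints use the `.pth` extension (as in the evaluation script) and that the files for this release sit below a directory named `v2.2.0` (as in the plot URLs). The helper name is hypothetical and the exact layout of the model repository may differ.

```python
from pathlib import Path


def find_checkpoints(models_dir, version="v2.2.0"):
    """List .pth files that sit below a directory named `version`.

    Hypothetical helper: returns an empty list if models_dir is missing.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.pth") if version in p.parts)


# Run from the particleflow directory after the clone commands.
for path in find_checkpoints("models"):
    print(path)
```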
Training Details
Trained on a single A100 GPU for 5 epochs over ~6 days. Because of a job runtime limit, training was resumed from a checkpoint.
Training Data
The following datasets were used:
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/1/2.5.0
4.8G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/2/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/3/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/4/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/5/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/6/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/7/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/8/2.5.0
4.7G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/9/2.5.0
4.8G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_qq_pf/10/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/1/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/2/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/3/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/4/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/5/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/6/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/7/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/8/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/9/2.5.0
9.3G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ttbar_pf/10/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/1/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/2/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/3/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/4/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/5/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/6/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/7/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/8/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/9/2.5.0
7.4G /scratch/persistent/joosep/tensorflow_datasets/clic_edm_ww_fullhad_pf/10/2.5.0
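The listing adds up to about 214 GB on disk. As a quick cross-check, the sketch below sums `du -sh`-style lines per dataset; the helper name is hypothetical, and the input lines are rebuilt from the sizes listed above rather than read from disk.

```python
from collections import defaultdict


def total_size_gb(lines):
    """Sum sizes from '<size>G <path>' lines, grouped by dataset name."""
    totals = defaultdict(float)
    for line in lines:
        size, path = line.split()
        if not size.endswith("G"):
            raise ValueError(f"expected a size in gigabytes, got {size!r}")
        dataset = path.split("/")[-3]  # .../<dataset>/<index>/<version>
        totals[dataset] += float(size[:-1])
    return dict(totals)


base = "/scratch/persistent/joosep/tensorflow_datasets"
sizes = {
    "clic_edm_qq_pf": [4.7, 4.8, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.8],
    "clic_edm_ttbar_pf": [9.3] * 10,
    "clic_edm_ww_fullhad_pf": [7.4] * 10,
}
lines = [
    f"{size}G {base}/{name}/{index}/2.5.0"
    for name, values in sizes.items()
    for index, size in enumerate(values, start=1)
]
totals = total_size_gb(lines)
for name, gigabytes in totals.items():
    print(f"{name}: {gigabytes:.1f} GB")
print(f"total: {sum(totals.values()):.1f} GB")
```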
The datasets were generated using Key4HEP with the following scripts:
- https://github.com/HEP-KBFI/key4hep-sim/releases/tag/v1.1.0
- https://github.com/HEP-KBFI/key4hep-sim/blob/v1.1.0/clic/run_sim.sh
Training Procedure
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:a100:1
#SBATCH --mem-per-gpu 250G
#SBATCH -o logs/slurm-%x-%j-%N.out
IMG=/home/software/singularity/pytorch.simg:2024-12-03
cd ~/particleflow
ulimit -n 100000
singularity exec -B /scratch/persistent --nv \
--env PYTHONPATH="$(pwd)" \
--env KERAS_BACKEND=torch \
$IMG python3 mlpf/pipeline.py --gpus 1 \
--data-dir /scratch/persistent/joosep/tensorflow_datasets --config parameters/pytorch/pyg-clic.yaml \
--train --conv-type attention \
--gpu-batch-multiplier 256 --checkpoint-freq 1 --num-workers 8 --prefetch-factor 100 --comet --ntest 2000 --test-datasets clic_edm_qq_pf
Evaluation
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:a100-mig:1
#SBATCH --mem-per-gpu 100G
#SBATCH -o logs/slurm-%x-%j-%N.out
IMG=/home/software/singularity/pytorch.simg:2024-12-03
cd ~/particleflow
WEIGHTS=experiments/pyg-clic_20250106_193536_269746/checkpoints/checkpoint-05-1.995116.pth
singularity exec -B /scratch/persistent --nv \
--env PYTHONPATH="$(pwd)" \
--env KERAS_BACKEND=torch \
$IMG python3 mlpf/pipeline.py --gpus 1 \
--data-dir /scratch/persistent/joosep/tensorflow_datasets --config parameters/pytorch/pyg-clic.yaml \
--test --make-plots --gpu-batch-multiplier 100 --load $WEIGHTS --dtype bfloat16 --num-workers 0 --ntest 50000
Citation
Glossary
- PF: particle flow reconstruction
- MLPF: machine learning for particle flow
- CLIC: Compact Linear Collider
Model Card Contact
Joosep Pata, [email protected]