license: mit
datasets:
- jet-universe/jetclass2
tags:
- particle physics
- jet tagging
Model Card: Sophon
The Sophon model is a jet tagging model pre-trained on a 188-class classification task using the JetClass-II dataset. It is based on the Particle Transformer architecture. This model represents the first practical implementation under the Sophon (Signature-Oriented Pre-training for Heavy-resonance ObservatioN) methodology.
For more details, refer to the following links: [Paper], [GitHub].
Try out this [demo on Colab] to get started with the model.
Model Details
The Sophon model functions both as a generic jet tagging model and as a pre-trained model tailored to LHC analysis needs. As a jet tagger, the model is trained to distinguish among 188 classes.
Key features of the model include:
- Training in a mass-decorrelated scenario, achieved by (1) ensuring the training dataset covers a wide range of jet transverse momentum (pT) and soft-drop mass (mSD), and (2) reweighting samples in each major training class to achieve similar jet distributions for pT and mSD.
- Enhanced "scale invariance" through the use of normalized 4-vectors as input.
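The reweighting idea in the first bullet can be illustrated with a small sketch. It assumes a flat target distribution and a simple rectangular binning in pT and mSD; the function names, bin edges, and toy values are hypothetical, and the weights actually used for training come from the make_weight step of the training script, not from this code.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(value, edges):
    """Index of the bin containing value; out-of-range values go to the edge bins."""
    return min(max(bisect_right(edges, value) - 1, 0), len(edges) - 2)


def flattening_weights(pt, msd, pt_edges, msd_edges):
    """Per-jet weights that make one class's binned (pT, mSD) distribution flat.

    Each jet is weighted by 1 / (number of jets in its bin), so every occupied
    bin ends up with a total weight of 1. This is an illustrative assumption,
    not the exact scheme used to train Sophon.
    """
    bins = [(bin_index(p, pt_edges), bin_index(m, msd_edges)) for p, m in zip(pt, msd)]
    counts = Counter(bins)
    return [1.0 / counts[b] for b in bins]


# Toy jets for one class (values in GeV, invented for illustration).
pt = [250.0, 260.0, 270.0, 900.0]
msd = [40.0, 45.0, 120.0, 130.0]
weights = flattening_weights(pt, msd, pt_edges=[200, 500, 2500], msd_edges=[20, 100, 500])
print(weights)
```

Applying the same procedure to each major class gives all classes the same binned (pT, mSD) shape, which is what removes the kinematic handle a tagger could otherwise use to infer the jet mass.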
Uses and Impact
The Sophon model is valuable for future LHC phenomenological research, particularly for estimating physics measurement sensitivity using fast-simulation (Delphes) datasets. For a quick example of using this model in Python, or of integrating it into C++ workflows to process Delphes files, see [here].
This model also offers insights for the future development of generic and foundational AI models for particle physics experiments.
Training Details
Install dependencies
The Sophon model is based on the Particle Transformer (ParT) architecture. It is implemented in PyTorch, with training based on the weaver framework for dataset loading and transformation. To install weaver, run:
pip install git+https://github.com/hqucms/weaver-core.git@dev/custom_train_eval
Note: We are temporarily using a development branch of weaver.
For instructions on setting up Miniconda and installing PyTorch, refer to the weaver page.
Download Sophon repository
git clone https://github.com/jet-universe/sophon.git
cd sophon
Download dataset
Download the JetClass-II dataset from [Hugging Face Dataset]. Only the training and validation files are used in this work; the test files are not.
Ensure that all ROOT files are accessible from:
./datasets/JetClassII/Pythia/{Res2P,Res34P,QCD}_*.root
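Before starting the training, it can help to confirm that the files are where the path pattern above expects them. The following is a minimal sketch of such a check. The helper name find_jetclass2_files is hypothetical and not part of the Sophon repository. Python's glob module does not expand the {Res2P,Res34P,QCD} brace list, so the three prefixes are written out explicitly.

```python
import glob
import os


def find_jetclass2_files(base_dir="./datasets/JetClassII/Pythia"):
    """Return {prefix: sorted list of matching ROOT files} for the three sample groups."""
    prefixes = ("Res2P", "Res34P", "QCD")
    return {
        prefix: sorted(glob.glob(os.path.join(base_dir, f"{prefix}_*.root")))
        for prefix in prefixes
    }


# Report how many files were found for each group under the default location.
for prefix, files in find_jetclass2_files().items():
    status = "ok" if files else "MISSING"
    print(f"{prefix:7s} {len(files):5d} files  [{status}]")
```

A group reported as MISSING usually means the download is incomplete or the files sit in a different directory than the one the training script reads from.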
Training
Step 1: Generate dataset sampling weights according to the weights section in the data configuration. The processed config with pre-calculated weights will be saved to data/JetClassII.
./train_sophon.sh make_weight
Step 2: Start training.
./train_sophon.sh train
Note: Depending on your machine and GPU configuration, additional settings may be useful. Here are a few examples:
- Enable PyTorch's DDP for parallel training (the learning rate should be scaled according to DDP_NGPUS), e.g.:
  CUDA_VISIBLE_DEVICES=0,1,2,3 DDP_NGPUS=4 ./train_sophon.sh train --start-lr 2e-3
- Configure the number of data loader workers, the fetch step for loading each ROOT file, and the dataset split number to alleviate memory burden, e.g.:
  ./train_sophon.sh train --num-workers 8 --fetch-step 0.02 --data-split-num 4
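The note on DDP says the learning rate should be scaled with DDP_NGPUS but does not state the rule. One common convention, assumed here and not confirmed by the source, is linear scaling: multiply a single-GPU rate by the number of GPUs. The sketch below builds the launch line under that assumption. The base rate of 5e-4 is hypothetical, chosen only so that four GPUs reproduce the 2e-3 in the example command, and both function names are illustrative.

```python
def scaled_start_lr(base_lr: float, n_gpus: int) -> float:
    """Linear scaling rule (an assumption): single-GPU rate times the GPU count."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return base_lr * n_gpus


def ddp_train_command(base_lr: float, n_gpus: int) -> str:
    """Build a DDP launch line in the same form as the example command."""
    devices = ",".join(str(i) for i in range(n_gpus))
    lr = scaled_start_lr(base_lr, n_gpus)
    return (
        f"CUDA_VISIBLE_DEVICES={devices} DDP_NGPUS={n_gpus} "
        f"./train_sophon.sh train --start-lr {lr:g}"
    )


# base_lr=5e-4 is a hypothetical single-GPU value, not taken from the source.
print(ddp_train_command(5e-4, 4))
```

If training becomes unstable at the scaled rate, a smaller value than the linear rule suggests is a reasonable thing to try.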
Step 3 (optional): Convert the model to ONNX.
./train_sophon.sh convert
Evaluation
The Sophon model has been evaluated on several LHC experimental tasks. The evaluation dataset is a dedicated Standard Model dataset, collected using a generic large-R jet trigger that selects large-R (R = 0.8) jets with pT > 400 GeV and trimmed mass mtrim > 50 GeV.
Key evaluation results include:
- Superior performance in directly tagging X→bb jets against QCD background jets, and in tagging X→bs jets against QCD jets after fine-tuning. The model outperforms the best experimental taggers for X→bb and X→bs.
- Significant potential for searching for unknown heavy resonances by constructing various tagging discriminants, selecting data, and performing generic bump hunts.
- Excellent results in anomaly detection using a weakly-supervised training approach, showing greater sensitivity to signals at very low signal injection levels and improved significance with adequate signal.
For more details, refer to the [Paper].
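The tagging discriminants mentioned in the evaluation results can be sketched as follows. A form commonly used for jet taggers is the ratio of the summed signal-class scores to the summed signal and background scores. This form, the five-class toy output, and the class indices are illustrative assumptions; the discriminants actually used, and the ordering of the 188 output classes, are defined in the paper and the repository.

```python
import math


def softmax(logits):
    """Convert raw model outputs into class probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminant(probs, signal_idx, background_idx):
    """Signal-vs-background discriminant: sum(sig) / (sum(sig) + sum(bkg))."""
    sig = sum(probs[i] for i in signal_idx)
    bkg = sum(probs[i] for i in background_idx)
    return sig / (sig + bkg)


# Toy 5-class output; the indices below are hypothetical, not Sophon's class layout.
toy_logits = [2.0, 0.5, 0.1, 1.0, 0.3]
toy_probs = softmax(toy_logits)
score = discriminant(toy_probs, signal_idx=[0], background_idx=[3, 4])
print(f"discriminant = {score:.3f}")
```

Because classes outside the chosen signal and background sets drop out of the ratio, the same model output can be reused to build a different discriminant for each signature of interest without retraining.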
Citation
If you use the JetClass-II dataset or the Sophon model, please cite:
@article{Li:2024htp,
author = "Li, Congqiao and Agapitos, Antonios and Drews, Jovin and Duarte, Javier and Fu, Dawei and Gao, Leyun and Kansal, Raghav and Kasieczka, Gregor and Moureaux, Louis and Qu, Huilin and Suarez, Cristina Mantilla and Li, Qiang",
title = "{Accelerating Resonance Searches via Signature-Oriented Pre-training}",
eprint = "2405.12972",
archivePrefix = "arXiv",
primaryClass = "hep-ph",
month = "5",
year = "2024"
}