Object Condensation Loss Functions and Utilities#
The Object Condensation loss, developed by Jan Kieseler, is now used by several groups in high energy physics for both track reconstruction and shower reconstruction in calorimeters.
Several implementations of this idea already exist, but they are often maintained by very few people. This repository aims to provide an easy-to-use implementation for both the TensorFlow and PyTorch backends.
Existing implementations:

- cms-pepr [TensorFlow]
- gnn-tracking [PyTorch]
Installation#
```bash
python3 -m pip install 'object_condensation[pytorch]'
# or
python3 -m pip install 'object_condensation[tensorflow]'
```
Development setup#
For the development setup, clone this repository and also add the `dev` and `testing` extra options, e.g.,
python3 -m pip install -e '.[pytorch,dev,testing]'
Please also install pre-commit:
python3 -m pip install pre-commit
pre-commit install # in top-level directory of repository
Conventions#
Implementations#
> [!NOTE]
> For a comparison of the performance of the different implementations, see the docs.
Default#
`condensation_loss` is a straightforward implementation that is easy to read and to verify. It is used as a baseline.
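To make the idea concrete, here is a minimal, hypothetical sketch of the attractive/repulsive core of the condensation loss. The function name, the normalization, and the omission of the noise/coward terms are simplifications for illustration; this is not the library's exact implementation.

```python
import torch


def toy_condensation_loss(beta, x, object_id, q_min=0.5):
    """Illustrative attractive/repulsive potentials only; the library's
    condensation_loss additionally handles noise hits, per-hit weights,
    and the coward/noise terms, and may normalize differently."""
    q = torch.arctanh(beta) ** 2 + q_min  # charge per hit
    v_att = x.new_zeros(())
    v_rep = x.new_zeros(())
    for oid in object_id.unique():
        in_obj = object_id == oid
        # Condensation point (CP): the hit with the highest beta in the object
        alpha = torch.argmax(beta * in_obj)
        d = torch.norm(x - x[alpha], dim=1)  # distance of every hit to the CP
        # Pull same-object hits toward the CP ...
        v_att = v_att + (d**2 * q * q[alpha] * in_obj).sum()
        # ... and push all other hits away within a unit radius (hinge loss)
        v_rep = v_rep + (torch.relu(1 - d) * q * q[alpha] * (~in_obj)).sum()
    n_hits = len(beta)
    return {"attractive": v_att / n_hits, "repulsive": v_rep / n_hits}
```

Each object's highest-beta hit acts as its condensation point; hits of the same object are attracted to it, while all other hits are repelled if they come closer than the unit radius.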
Tiger#
`condensation_loss_tiger` saves memory by "masking instead of multiplying".
Consider the repulsive loss: it aggregates potentials between condensation points (CPs) and individual nodes. If these potentials are taken to be hinge losses `relu(1 - dist)`, then they vanish for most CP-node pairs (assuming a sufficiently well-trained model).
Now compare the following two implementation strategies (where `dist` is the CP-node distance matrix):
```python
# Simplified by assuming that all pairs contribute to the repulsive potential
# Strategy 1
v_rep = torch.relu(1 - dist).sum()
# Strategy 2 (tiger)
rep_mask = dist < 1  # boolean mask; not tracked by autograd
v_rep = (1 - dist)[rep_mask].sum()
```
In strategy 1, PyTorch keeps all elements of `dist` in memory for backpropagation (even though most of the `relu` differentials will be 0). In strategy 2, because the mask is detached from the computational graph, the number of elements to backpropagate through is greatly reduced.
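A quick numerical sanity check (with a toy distance matrix of assumed size) shows that the two strategies agree on the value and differ only in their autograd bookkeeping:

```python
import torch

torch.manual_seed(0)
# Toy CP-node distance matrix; roughly a third of the entries fall below 1
dist = 3 * torch.rand(100, 100, requires_grad=True)

# Strategy 1: relu keeps all 10,000 elements alive for backpropagation
v_rep_1 = torch.relu(1 - dist).sum()

# Strategy 2 (tiger): the comparison produces a boolean mask that autograd
# does not track, so only the masked elements enter the graph
rep_mask = dist < 1
v_rep_2 = (1 - dist)[rep_mask].sum()

assert torch.isclose(v_rep_1, v_rep_2)
```

Both scalars are differentiable with respect to `dist`; the saving is purely in how many intermediate elements autograd has to retain.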
However, one problem remains: what if the latent space collapses at some point (for example at the beginning of training)? This would produce batches with greatly increased memory consumption, possibly crashing the run. To counter this, an additional parameter `max_n_rep` is introduced. If the number of repulsive pairs (`rep_mask.sum()` in the example above) exceeds `max_n_rep`, then `rep_mask` samples only `max_n_rep` elements and upweights them by `n_rep/max_n_rep`. To check whether this approximation is being used, `condensation_loss_tiger` returns `n_rep` in addition to the losses.
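The safeguard can be sketched as follows. This is a hypothetical illustration (the function name and structure are assumptions, not the library's internals): at most `max_n_rep` repulsive pairs are kept, and the survivors are upweighted by `n_rep / max_n_rep` so the loss remains unbiased in expectation.

```python
import torch


def repulsive_loss_capped(dist: torch.Tensor, max_n_rep: int = 0):
    """Illustrative sketch of the max_n_rep safeguard in tiger."""
    rep_mask = dist < 1
    terms = (1 - dist)[rep_mask]
    n_rep = terms.numel()
    if max_n_rep and n_rep > max_n_rep:
        # Sample max_n_rep pairs and upweight to keep the expectation unchanged
        idx = torch.randperm(n_rep, device=terms.device)[:max_n_rep]
        terms = terms[idx] * (n_rep / max_n_rep)
    return terms.sum(), n_rep
```

Comparing the returned `n_rep` against `max_n_rep` tells you whether the sampling approximation was active for a given batch.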
Benchmarks#
Tracking#
We use the TrackML dataset from the Codalab challenge.
| Parameter | Pixel | Full detector |
|---|---|---|
| N objects | 10k | 11k |
| N hits | 66k | 126k |
The benchmarks below are for a GNN model with 1.9M parameters trained on the pixel dataset. The exact numbers might depend on the model architecture and the edges that were built.
| Data | Loss fct | Max memory (GB) | Persistent memory (GB) | Time (s) | Notes |
|---|---|---|---|---|---|
| Pixel | default | 25 | 14 | 0.1 | |
| Pixel | tiger, no compilation | 12.3 | 4 | 0.58 | |
| Pixel | tiger, compiled | 9 | 6 | 0.14 | |
| Full | tiger, compiled | 26 | 8 | 0.18 | |

Persistent memory is an estimate of the memory consumption after the loss has been evaluated but before backpropagation. Max memory is the maximum memory consumption encountered during evaluation of the loss function, minus the memory just before the evaluation started.
PyTorch API#
- object_condensation.pytorch.losses.condensation_loss(*, beta, x, object_id, weights=None, q_min, noise_threshold=0)#
Condensation losses
- Parameters:
beta (Tensor) – Condensation likelihoods
x (Tensor) – Clustering coordinates
object_id (Tensor) – Labels for objects.
weights (Tensor) – Weights per hit, multiplied to attractive/repulsive potentials
q_min (float) – Minimal charge
noise_threshold (int) – Threshold for noise hits. Hits with object_id <= noise_threshold are considered to be noise
- Return type:
dict[str, Tensor]
- Returns:
Dictionary of scalar tensors; see readme.
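As a usage sketch, the inputs above can be built as one row per hit. The shapes here are illustrative assumptions (2D latent space, six hits); the call itself is shown commented out since it requires the package to be installed.

```python
import torch

# Hypothetical toy inputs matching the parameter docs above
n_hits = 6
beta = torch.rand(n_hits)                     # condensation likelihoods in [0, 1)
x = torch.randn(n_hits, 2)                    # clustering coordinates
object_id = torch.tensor([0, 1, 1, 2, 2, 2])  # 0 = noise (with noise_threshold=0)

# With the package installed:
# from object_condensation.pytorch.losses import condensation_loss
# losses = condensation_loss(beta=beta, x=x, object_id=object_id, q_min=0.5)
# total = sum(losses.values())  # or weight the loss terms as desired
# total.backward()
```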
- object_condensation.pytorch.losses.condensation_loss_tiger(*, beta, x, object_id, weights=None, q_min, noise_threshold=0, max_n_rep=0, torch_compile=False)#
Condensation losses
- Parameters:
beta (T) – Condensation likelihoods
x (T) – Clustering coordinates
object_id (T) – Labels for objects.
weights (T | None) – Weights per hit, multiplied to attractive/repulsive potentials
q_min (float) – Minimal charge
noise_threshold (int) – Threshold for noise hits. Hits with object_id <= noise_threshold are considered to be noise
max_n_rep (int) – Maximum number of elements to consider for the repulsive loss. Set to 0 to disable.
torch_compile (bool) – Compile the loss function with torch.compile. This is recommended, but might not work with older PyTorch versions or with cutting-edge Python versions.
- Return type:
dict[str, T]
- Returns:
Dictionary of scalar tensors (see readme), plus n_rep: the number of repulsive elements (before sampling).
TensorFlow API#
- object_condensation.tensorflow.losses.condensation_loss(*, q_min, object_id, beta, x, weights=None, noise_threshold=0)#
Condensation losses
- Parameters:
beta (Tensor) – Condensation likelihoods
x (Tensor) – Clustering coordinates
object_id (Tensor) – Labels for objects. Objects with object_id <= 0 are considered noise
weights (Tensor) – Weights per hit, multiplied to attractive/repulsive potentials
q_min (float) – Minimal charge
noise_threshold (int) – Threshold for noise hits. Hits with object_id <= noise_threshold are considered to be noise
- Return type:
dict[str, Tensor]
- Returns:
Dictionary of scalar tensors:
attractive: Averaged per object, then averaged over all objects.
repulsive: Averaged like attractive.
coward: Averaged over all objects.
noise: Averaged over all noise hits.