
Add option to reweight loss to account for differences in relation frequency #1142

Draft · wants to merge 23 commits into master
Conversation

sbonner0
Contributor

Sharing some initial work I had done around adding support for reweighting the loss to account for relation frequency. The current code is very much work in progress but I wanted to share at this early stage for initial feedback.

TODO and ideas list:

  • Add support to the remaining loss functions (only MRL and BCE done currently)
  • Add to LCWA training
  • Documentation
  • Add support for custom sample weights

Link to the relevant feature request

This was initially discussed here: #973

Description of the Change

To help address the issues associated with relation imbalance, this PR adds the ability to reweight the loss value at the triple level. For relation type r, the weight is computed as: weight_r = n_triples / (n_unique_relations * count(r))
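
For illustration, a minimal sketch of how such weights could be computed from a tensor of mapped (head, relation, tail) triples; the helper name is hypothetical and not part of this PR:

import torch

def relation_balanced_weights(mapped_triples: torch.LongTensor) -> torch.FloatTensor:
    # weight_r = n_triples / (n_unique_relations * count(r)), one weight per triple
    relations = mapped_triples[:, 1]
    unique, inverse, counts = relations.unique(return_inverse=True, return_counts=True)
    n_triples = relations.numel()
    n_unique_relations = unique.numel()
    return n_triples / (n_unique_relations * counts[inverse].float())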

Because the reduction currently happens inside each loss class, reweighting support has to be added to every loss class separately. So far I have only added it to MRL and BCE.

Reweighted training can be enabled for these two losses as follows:

from pykeen.pipeline import pipeline

pipeline_result = pipeline(
    dataset='Nations',
    model='ComplEx',
    loss='BCEWithLogitsLoss',
    loss_kwargs=dict(reweight_triples=True),
)

I am certain that I have not done this in the most efficient manner possible and would welcome any and all feedback! For example, I wonder if this process makes sense for more advanced loss functions, such as the adversarial ones, where the triple loss is already being modified?

Possible Drawbacks

Adding more complexity to the pipeline for a niche use-case perhaps.

Verification Process

I ran a series of datasets with quite heavy imbalance in relation frequency and noticed improved performance on rarer relation types, for example Hetionet and the Compound-treats-Disease relation.

Release Notes

  • Added the ability to weight the loss of triples when training based on the frequency of the relation type.

@sbonner0 sbonner0 marked this pull request as draft October 16, 2022 17:18
@francesco-tuveri

Hi @sbonner0,

Thanks for working on this; having some form of loss weighting would definitely be useful from my point of view.

I think that the main question in terms of design is: what are we weighting?

In my opinion the samples that drive the loss are always the positive ones. The negatives are a sort of function of the positives, so maybe just using positive weights (like you propose for the margin loss) could simplify things.
For something like the BCE loss, the negative weights would then be 1 - pos_triple_weights. This is also the approach usually used for weighting losses in standard classification tasks/frameworks, so it might make it easier to generalise this to the other losses in PyKEEN.
However, I am aware that with KG embeddings the training is more complicated, so there might be some complications I haven't thought of 🙂
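
For reference, a rough sketch of the classification-style per-element weighting mentioned above, using the weight argument of torch.nn.functional.binary_cross_entropy_with_logits; the tensors and the 1 - weight choice for negatives are illustrative assumptions, not PyKEEN internals:

import torch
import torch.nn.functional as F

scores = torch.randn(8)                                   # placeholder logits for one batch
labels = torch.tensor([1., 1., 1., 1., 0., 0., 0., 0.])   # 1 = positive triple, 0 = negative
pos_triple_weights = torch.rand(8)                        # hypothetical per-triple weights in [0, 1]

# positives keep their weight, negatives get 1 - weight, as suggested above
weights = torch.where(labels > 0, pos_triple_weights, 1.0 - pos_triple_weights)
loss = F.binary_cross_entropy_with_logits(scores, labels, weight=weights)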

At a high level the API looks good to me. Maybe a potential improvement, to bring this feature in line with the rest of the codebase, would be to allow the pipeline parameter to either be a string (e.g. balanced as a default) or a callable/class that can be used to compute the weights. I would also call the parameter edge_weights (or something similar) to account for the flexibility.

I would be happy to hear other people's thoughts.

Hope this helps,
Francesco

@mberr
Member

mberr commented Oct 21, 2022

Hi @sbonner0 ,

first of all, thanks for starting with a proposal. 🚀

I like the idea of having (positive) triple weights, too, since it allows users to provide any (static, precalculated?) weights, including inverse entity or relation frequency weighting. In that case, it would make sense to extend the training instances, defined in pykeen.triples.instances by another field for weights, and make sure to forward them through the training loop to the loss function.

I think the easiest option is to re-use these weights for the negative examples generated from a positive one. Some losses, e.g., the adversarial one(s), calculate some dynamic negative loss weights, which could then be multiplied with the corresponding positive triple weight.
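
A rough sketch of how the positive triple weights could be broadcast over the negatives generated from them; shapes and names are illustrative assumptions, not the PR's implementation:

import torch

batch_size, num_negs = 4, 16
pos_weights = torch.rand(batch_size)            # one static weight per positive triple
pos_losses = torch.rand(batch_size)             # unreduced losses of the positives
neg_losses = torch.rand(batch_size, num_negs)   # unreduced losses of the generated negatives

weighted_pos = pos_weights * pos_losses
# each negative inherits the weight of the positive it was generated from;
# any dynamic (e.g., adversarial) negative weights could be multiplied on top
weighted_neg = pos_weights.unsqueeze(-1) * neg_losses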

@sbonner0
Contributor Author

Hey @francesco-tuveri - thanks so much for your interest and feedback. I take your point on weighting the positive versus negative samples; however, I still feel there might be some value in also allowing the negatives to be weighted separately. I really agree with your suggestion to improve the API in relation to the pipeline, and I'll look into doing something along these lines.

Hey @mberr - thanks for the suggestion to add weights into the training instances. I guess that the appropriate weights could then be returned from the data loader via __getitem__? Thinking further on how it would be possible to extend this to allow any custom pre-calculated weights to be provided by users, do you think the best approach is to allow them to be passed to the pipeline object?

I was wondering if there is a way in which the weights could be applied in a more uniform way across all the loss functions? As it stands, the weights would need to be applied in each loss function, which feels like a lot of redundant code. It would be ideal if loss functions returned the unreduced values, which could then, if required, be modified by the weights before being reduced via the chosen method.

@mberr
Member

mberr commented Nov 6, 2022

Hey @mberr - thanks for the suggestion to add weights into the training instances. I guess that the appropriate weights could then be returned from the data loader via __getitem__? Thinking further on how it would be possible to extend this to allow any custom pre-calculated weights to be provided by users, do you think the best approach is to allow them to be passed to the pipeline object?

To start, it might be easier to begin with support for relation-balanced triple weights:

  • adding (optional) instance weights to the training instances in pykeen.triples.instances.py

  • introduce an instance weighting component, e.g.,

import torch
from class_resolver import ClassResolver

from pykeen.typing import MappedTriples


class InstanceWeighting:
    def calculate_weights(self, mapped_triples: MappedTriples) -> torch.FloatTensor:
        raise NotImplementedError


class RelationBalancedInstanceWeighting(InstanceWeighting):
    def calculate_weights(self, mapped_triples: MappedTriples) -> torch.FloatTensor:
        # one weight per triple: the inverse frequency of its relation (column 1)
        inverse, counts = mapped_triples[:, 1].unique(return_inverse=True, return_counts=True)[1:]
        return counts.reciprocal()[inverse]


instance_weighting_resolver = ClassResolver.from_subclasses(InstanceWeighting)
  • add instance_weighting: Hint[InstanceWeighting] parameter to CoreTriplesFactory.create_slcwa_instances and CoreTriplesFactory.create_lcwa_instances, which instantiates a weighting if necessary (via make_safe) and calculates weights to pass to the instantiation of training instances

  • pass instance_weighting from TrainingLoop.train down via TrainingLoop._train, TrainingLoop._create_training_data_loader to CoreTriplesFactory.create_(s)lcwa_instances. This will allow passing, e.g. training_kwargs=dict(instance_weighting="relation-balanced") to pipeline

  • extend the loss functions to be able to process weights. On a high level, your proposal sounds good (i.e., let loss functions calculate non-reduced losses first (maybe aggregating negatives to a single value?), then apply the weights, and then calculate the aggregation). For the latter, we may need some small adjustments, e.g., mean reduction would need to calculate sum(weighted_losses) / sum(weights), as in the sketch below.
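
As a concrete example of the adjusted mean reduction from the last point, a minimal weighted-mean sketch, not tied to any particular PyKEEN loss class:

import torch

def weighted_mean(unreduced_losses: torch.FloatTensor, weights: torch.FloatTensor) -> torch.FloatTensor:
    # mean reduction with weights: sum(weighted_losses) / sum(weights)
    return (unreduced_losses * weights).sum() / weights.sum()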

To support custom weights, we could now add

class CustomInstanceWeighting(InstanceWeighting):
    def __init__(self, weights: torch.FloatTensor):
        self.weights = weights

    def calculate_weights(self, mapped_triples: MappedTriples) -> torch.FloatTensor:
        # check for consistency
        return self.weights

although there are some things to consider, e.g., about inverse relations.

@sbonner0
Contributor Author

Hey @mberr, sorry for the delay on this, but I finally have some free time and would love to try to make some progress from my end!

Thanks for your fantastic reply above. It has been some time since I looked at the codebase, however: should the InstanceWeighting classes be placed in pykeen.triples.instances.py? I am also unsure what changes would be needed in that file to add support for instance weights.

@mberr
Member

mberr commented Jan 28, 2023

Hey @mberr, sorry for the delay on this, but I finally have some free time and would love to try to make some progress from my end!

Thanks for your fantastic reply above. It has been some time since I looked at the codebase, however: should the InstanceWeighting classes be placed in pykeen.triples.instances.py?

The place seems to be fine for now.

I am also unsure what changes would be needed in that file to add support for instance weights.

This file contains the code to transform triples into torch.utils.data.Dataset, and to define custom collation methods, if necessary.

The datasets are generally "map-style datasets", i.e., they define a __getitem__ method to get the i-th training sample. In the case of sLCWA, this is a triple of a positive triple, negative triples, and an optional mask. For LCWA, this is a pair of a (head, relation) / (relation, tail) combination and a dense target.

For sample weights, we thus need to extend the dataset (variants) to add an optional additional entry, the sample weight.

In places where we need custom collators (i.e., functions which combine samples into a batch), this needs to be considered, too.

The Instances (i.e., sub-classes of torch.utils.data.Dataset) are later used by the training loops to obtain data loaders for them (these are built-in classes from PyTorch that take care of preparing training batches in parallel).

In the training loop, we iterate over the batches from these data loaders and use them to calculate batch losses. Here, we may need some adaptations to make sure that the "sample weight" part of the batch is not given to the model but forwarded to the loss function.
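
To make the shape of these changes concrete, here is a rough sketch of a map-style dataset whose samples optionally carry a weight, together with a matching collate function; the class and field names are assumptions for illustration, not the actual classes in pykeen.triples.instances:

from typing import NamedTuple, Optional, Sequence

import torch
from torch.utils.data import Dataset


class WeightedSLCWASample(NamedTuple):
    positive: torch.LongTensor            # shape: (3,)
    negatives: torch.LongTensor           # shape: (num_negs, 3)
    weight: Optional[torch.FloatTensor]   # scalar sample weight, or None


class WeightedSLCWAInstances(Dataset):
    def __init__(self, positives, negatives, weights=None):
        self.positives, self.negatives, self.weights = positives, negatives, weights

    def __len__(self) -> int:
        return self.positives.shape[0]

    def __getitem__(self, index: int) -> WeightedSLCWASample:
        weight = None if self.weights is None else self.weights[index]
        return WeightedSLCWASample(self.positives[index], self.negatives[index], weight)


def collate(samples: Sequence[WeightedSLCWASample]) -> WeightedSLCWASample:
    # stack per-sample tensors into batch tensors; the weight entry stays None if absent
    weight = None if samples[0].weight is None else torch.stack([s.weight for s in samples])
    return WeightedSLCWASample(
        positive=torch.stack([s.positive for s in samples]),
        negatives=torch.stack([s.negatives for s in samples]),
        weight=weight,
    )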


@vcharpenay vcharpenay left a comment


Hi,

I'm also interested in having weighted loss (for different purposes than what @sbonner0 describes) so I looked at the current state of the PR. Here are a few comments.

@@ -83,6 +84,7 @@ def _process_batch_static(
stop: Optional[int],
label_smoothing: float = 0.0,
slice_size: Optional[int] = None,
relation_weights: dict = None,


the term relation is a bit confusing here. Does it refer to the fact that the weights are based on relation frequencies? If so, I assume the argument should eventually be renamed instance_weights, which is more generic.

Do I understand it correctly?

@@ -57,6 +57,7 @@ def _create_training_data_loader(
dataset=triples_factory.create_slcwa_instances(
batch_size=batch_size,
shuffle=kwargs.pop("shuffle", True),
instance_weighting="test",


As I see it, the TrainingLoop gets a reference to some InstanceWeighting passed via train(). Then it must inject it into the data loader, which injects it into the batched instances (the dataset) via create_(s)lcwa_instances. But the one object that actually uses the InstanceWeighting is the training loop itself, when processing a batch.

Is the multi-step dependency injection necessary here? Instead, the TrainingLoop could simply keep a reference to the InstanceWeighting and pass it as an argument to _process_batch_static().


by the way, I assume the weighting procedure should be passed via train (i.e. via training_kwargs) and not via training_loop_kwargs because the same TrainingLoop may be used multiple times with different weights. Is that right?

@@ -181,6 +181,12 @@ def __init__(

logger.debug("we don't really need the triples factory: %s", triples_factory)

# Give all unique relation types equal weight by weighting each triple inversely related to its relations


eventually, this piece of code should be removed (and put in RelationBalancedInstanceWeighting.calculate_weights() instead), shouldn't it?

@sbonner0
Contributor Author

Hi,

I'm also interested in having weighted loss (for different purposes than what @sbonner0 describes) so I looked at the current state of the PR. Here are a few comments.

Hey @vcharpenay, thanks for all your input and sorry for the slow reply. I will try to address your comments later on, but I just wanted to say that I would be very happy to receive some help with this PR, so please do feel free to contribute and change whatever I have done thus far!

@mberr mberr mentioned this pull request Feb 19, 2024