
Fix batch computation when the global index is written in the output buffer #400

Merged
merged 16 commits into beehive-lab:develop from fix/batch_index on May 13, 2024

Conversation

mairooni (Collaborator)

Description

This PR provides a fix for a corner case in batch processing, exposed by the uk.ac.manchester.tornado.unittests.batches.TestBatches.testBatchNotEven unit test. The issue occurs when the loop index is written directly into the output buffer, e.g.:

for (@Parallel int i = 0; i < data.getSize(); i++) {
      data.set(i, i);
}

Since, in batch processing, the loop bound of each kernel is equal to the batch size, the value of i written in the output buffer is correct only for the first batch. To solve this, i is offset by the batch number.
For instance, the generated code for the example shown above is changed as follows:
data.set(i, i) -> data.set(i, i + batchNumber * batchSize)
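
For illustration, here is a rough plain-Java view of what the per-batch kernels compute once the index is offset; the concrete sizes and the even chunking are assumptions for this sketch, not the code emitted by the compiler:

import uk.ac.manchester.tornado.api.types.arrays.IntArray;

public class BatchIndexOffsetSketch {
    public static void main(String[] args) {
        IntArray data = new IntArray(180000); // three even batches of 60000, for simplicity
        long batchSize = 60000;               // loop bound of every generated kernel
        int numberOfBatches = (int) (data.getSize() / batchSize);

        for (int batchNumber = 0; batchNumber < numberOfBatches; batchNumber++) {
            for (int i = 0; i < batchSize; i++) {
                int globalIndex = (int) (i + batchNumber * batchSize);
                // Without the offset, every batch would store the local i (0..59999);
                // with the fix, the stored value matches the global position.
                data.set(globalIndex, globalIndex);
            }
        }
    }
}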

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

Run tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches


public TornadoHighTierContext(Providers providers, PhaseSuite<HighTierContext> graphBuilderSuite, OptimisticOptimizations optimisticOpts, ResolvedJavaMethod method, Object[] args,
-        TaskMetaData meta, boolean isKernel, long batchThreads) {
+        TaskMetaData meta, boolean isKernel, long batchThreads, int batchNumber, long batchSize) {
mikepapadim (Member)

Can we have a BatchConfig class or maybe a record to pass around long batchThreads, int batchNumber, long batchSize?

jjfumero (Member) left a comment

I agree with @mikepapadim. Let's create a class called BatchCompilationConfig and store all the related fields there. Then, what we pass around is an object of type BatchCompilationConfig.
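
A minimal sketch of what such a container could look like, assuming a Java record is acceptable; the field names follow this discussion, and the class the PR eventually adds may differ:

public record BatchCompilationConfig(long batchThreads, int batchNumber, long batchSize) {

    // Hypothetical convenience check: batch processing is considered active
    // only when a batch size has been set.
    public boolean isBatchProcessingEnabled() {
        return batchSize > 0;
    }
}

The TornadoHighTierContext constructor shown above would then take a single BatchCompilationConfig argument instead of the three separate values.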

return indexInWrite;
}

public void indexInWrite() {
Member

Suggested change:
- public void indexInWrite() {
+ public void getIndexInWrite() {

Member

If you do this change, you need to refactor all dependencies.

@@ -180,7 +180,11 @@ private static Sketch buildSketch(ResolvedJavaMethod resolvedMethod, Providers p
mergeAccesses(methodAccesses, invoke.callTarget(), sketch.getArgumentsAccess());
});

- return new Sketch(graph.copy(TornadoCoreRuntime.getDebugContext()), methodAccesses);
+ Sketch sketch = new Sketch(graph.copy(TornadoCoreRuntime.getDebugContext()), methodAccesses);
Member

Pass the indexInWrite to the Sketch constructor. Also, I think the name should be changed: this is more like batchWriteThreadIndex, and the getter and setter should have get and set prefixes.
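
A small sketch of the renaming being suggested, assuming the flag is a boolean held by the Sketch; names and the placeholder comment are illustrative only:

public final class Sketch {
    // ... existing fields ...
    private boolean batchWriteThreadIndex; // renamed from indexInWrite (illustrative)

    public boolean getBatchWriteThreadIndex() {
        return batchWriteThreadIndex;
    }

    public void setBatchWriteThreadIndex(boolean batchWriteThreadIndex) {
        this.batchWriteThreadIndex = batchWriteThreadIndex;
    }
}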

}

@Override
public void setBatchSize(long batchSize) {
Member

What is the difference between batchSize and batchThreads?

void setBatchThreads(long batchThreads);

void setBatchSize(long batchSize);

mairooni (Collaborator, Author)

The value of batchThreads changes when the chunks are not even, whereas batchSize is always the initial batch size. For example, if we have 3 batches of 60000, 60000, and 20000 elements, batchSize stays 60000 throughout the computation, while batchThreads changes from 60000 to 20000 for the final chunk.
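
To make that concrete, a small standalone sketch using the numbers from this example (illustrative only, not code from the PR):

public class BatchThreadsVsBatchSize {
    public static void main(String[] args) {
        long totalElements = 140000; // 60000 + 60000 + 20000
        long batchSize = 60000;      // stays constant for the whole computation

        int numberOfBatches = (int) ((totalElements + batchSize - 1) / batchSize);
        for (int batchNumber = 0; batchNumber < numberOfBatches; batchNumber++) {
            // batchThreads is the actual loop bound of the kernel for this chunk.
            long batchThreads = Math.min(batchSize, totalElements - batchNumber * batchSize);
            System.out.printf("batch %d: batchSize=%d, batchThreads=%d%n", batchNumber, batchSize, batchThreads);
        }
        // Prints 60000/60000, 60000/60000 and 60000/20000 for the three batches.
    }
}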

jjfumero (Member) commented May 2, 2024

All batch tests passing for all supported backends (OpenCL, SPIR-V and PTX).

jjfumero (Member) commented May 2, 2024

What happens when you have an expression that uses the thread-ID to store a value?

For example:

for (@Parallel int i = 0; i < data.getSize(); i++) {
      data.set(i, i * 20 + beta);
}

mairooni (Collaborator, Author) commented May 9, 2024

This case also works; I have added a unit test to check it. To test it, run
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#testBatchThreadIndex
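
For reference, a rough sketch of what such a test could look like with the TornadoVM TaskGraph API; the class name, kernel, array size, and the "1MB" batch string are illustrative and not the exact unit test added in this PR:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.arrays.IntArray;

public class BatchThreadIndexSketch {

    // Kernel that uses the thread-ID inside an expression, as in the question above.
    public static void computeWithThreadIndex(IntArray data, int beta) {
        for (@Parallel int i = 0; i < data.getSize(); i++) {
            data.set(i, i * 20 + beta);
        }
    }

    @Test
    public void testBatchThreadIndex() {
        IntArray data = new IntArray(1_000_000); // ~4MB of int data
        int beta = 3;

        TaskGraph taskGraph = new TaskGraph("s0") //
                .transferToDevice(DataTransferMode.FIRST_EXECUTION, data) //
                .task("t0", BatchThreadIndexSketch::computeWithThreadIndex, data, beta) //
                .transferToHost(DataTransferMode.EVERY_EXECUTION, data);

        ImmutableTaskGraph immutableTaskGraph = taskGraph.snapshot();
        TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(immutableTaskGraph);
        executionPlan.withBatch("1MB") // split the ~4MB buffer into several batches
                .execute();

        for (int i = 0; i < data.getSize(); i++) {
            assertEquals(i * 20 + beta, data.get(i));
        }
    }
}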

mairooni (Collaborator, Author) commented May 9, 2024

All comments have been addressed. This PR is ready for another review.

jjfumero (Member) left a comment

Missing license header

jjfumero requested a review from stratika on May 10, 2024 at 08:24
mairooni (Collaborator, Author)

Missing license header

Done

@@ -0,0 +1,74 @@
/*
* Copyright (c) 2024 APT Group, Department of Computer Science,
stratika (Collaborator)

Suggested change:
- * Copyright (c) 2024 APT Group, Department of Computer Science,
+ * Copyright (c) 2024, APT Group, Department of Computer Science,

@@ -0,0 +1,107 @@
/*
* Copyright (c) 2024 APT Group, Department of Computer Science,
stratika (Collaborator)

Suggested change:
- * Copyright (c) 2024 APT Group, Department of Computer Science,
+ * Copyright (c) 2024, APT Group, Department of Computer Science,

stratika (Collaborator)

We forgot the comma after the year. Please update all files that have new headers.

appendPhase(new TornadoTaskSpecialisation(canonicalizer));
appendPhase(new TornadoBatchGlobalIndexOffset());
stratika (Collaborator)

You mentioned that this phase is batch-specific. However, it is also invoked when batch processing is not used, right?

jjfumero (Member)

@stratika, anything pending from your side? Can we merge this PR?

stratika (Collaborator)

From my side, only the suggestion for the headers of the new files and some comments that have not been answered yet.

mairooni (Collaborator, Author)

I added the commas in the headers.

jjfumero merged commit 772905a into beehive-lab:develop on May 13, 2024
1 check passed
mairooni deleted the fix/batch_index branch on May 13, 2024 at 08:35