Race condition in submitting a job #43

Open
smillidge opened this issue Nov 11, 2015 · 3 comments
Comments

@smillidge
Contributor

When using JBatch within a Java EE environment, there is a potential race condition when submitting a batch job. Consider an EJB that submits a job (this example is from the Cargo Tracker example app):

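import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;
import javax.ejb.Schedule;
import javax.ejb.Stateless;

@Stateless
public class UploadDirectoryScanner {

    @Schedule(minute = "*/2", hour = "*") // Runs every two minutes
    public void processFiles() {
        JobOperator jobOperator = BatchRuntime.getJobOperator();
        // Runs inside the container-managed EJB transaction
        jobOperator.start("EventFilesProcessorJob", null);
    }
}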

Calling jobOperator.start inserts a row into the EXECUTIONINSTANCEDATA table and then submits the job via the JobExecutorService.

The race condition occurs because the row is inserted into the EXECUTIONINSTANCEDATA table within the scope of the EJB transaction, so it is not visible in the database until that transaction commits, typically when the method exits. In the case above that happens very quickly, but in the general case we do not know the delay between calling jobOperator.start and the row being committed. In fact, the EJB could even mark the transaction for rollback.

However, the submission of the job to the executor thread pool is not synchronized with the outer EJB transaction. This means the job can start on a separate executor thread before the row in the EXECUTIONINSTANCEDATA table is visible. This causes foreign key constraint violations when the job executes a step and tries to insert a row into the STEPEXECUTIONINSTANCEDATA table that references the EXECUTIONINSTANCEDATA table.

The resolution is either to synchronise the job submission with the EJB transaction using the javax.transaction.TransactionSynchronizationRegistry, or to move the logic that inserts the row into the EXECUTIONINSTANCEDATA table into the code run on the executor service thread pool.
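To make the first option concrete, here is a minimal, self-contained sketch of deferring the executor submission until the surrounding transaction has committed. It does not use the real JTA types: DemoTransaction and DeferredSubmitDemo are hypothetical stand-ins invented for illustration. In a real container the equivalent hook would presumably be a javax.transaction.Synchronization registered via TransactionSynchronizationRegistry.registerInterposedSynchronization, submitting the job from afterCompletion only when the status is committed. This is not how the RI is implemented, just one way the idea could look.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a JTA transaction: it only tracks
// after-commit callbacks and a rollback-only flag.
final class DemoTransaction {
    private final List<String> log;
    private final List<Runnable> afterCommit = new ArrayList<>();
    private boolean rollbackOnly;

    DemoTransaction(List<String> log) {
        this.log = log;
    }

    void registerAfterCommit(Runnable callback) {
        afterCommit.add(callback);
    }

    void setRollbackOnly() {
        rollbackOnly = true;
    }

    // Callbacks fire only after a successful commit; a rollback drops them.
    void complete() {
        if (rollbackOnly) {
            log.add("rollback");
            return;
        }
        log.add("commit");
        afterCommit.forEach(Runnable::run);
    }
}

final class DeferredSubmitDemo {
    // Simulates jobOperator.start inside an EJB transaction and returns
    // the ordered list of events that occurred.
    static List<String> run(boolean rollback) {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        DemoTransaction tx = new DemoTransaction(log);

        // The instance row is written inside the caller's transaction.
        log.add("insert instance row");
        // Hand the job to the executor only once the row is committed.
        tx.registerAfterCommit(() -> executor.execute(() -> log.add("job runs")));

        if (rollback) {
            tx.setRollbackOnly();
        }
        tx.complete();

        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

With this ordering the job can never observe a missing parent row, and a transaction that is marked for rollback never starts the job at all.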

@scottkurz
Member

Hi Steve,

We did note on the spec mailing list recently that we'd failed to consider this scenario (job start within a transaction); see here.

The solution, it seemed to me, is to have the batch container suspend the transaction. I don't think whether one is using a transactional (e.g. DB) or non-transactional job repository should be part of the batch programming model.

I also mentioned that in WebSphere Liberty we suspend around the "read" operations as well.
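As a rough illustration of the suspend-around idea, the sketch below wraps a job repository operation so that it runs outside the caller's transaction and then restores that transaction afterwards. DemoTxManager is a hypothetical interface that only mirrors the suspend()/resume(...) pair found on javax.transaction.TransactionManager, and RecordingTxManager is a fake used to show the call order. None of this is taken from the RI or from WebSphere Liberty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in mirroring suspend()/resume(...) on
// javax.transaction.TransactionManager; not the real JTA type.
interface DemoTxManager {
    Object suspend();       // returns the suspended transaction, or null if none

    void resume(Object tx);
}

final class SuspendingRepositoryCalls {
    private SuspendingRepositoryCalls() {
    }

    // Runs a repository operation with the caller's transaction suspended,
    // resuming it afterwards even if the operation throws.
    static <T> T runSuspended(DemoTxManager tm, Supplier<T> repositoryOp) {
        Object suspended = tm.suspend();
        try {
            return repositoryOp.get();
        } finally {
            if (suspended != null) {
                tm.resume(suspended);
            }
        }
    }
}

// Fake manager that records calls so the ordering can be inspected.
final class RecordingTxManager implements DemoTxManager {
    final List<String> calls = new ArrayList<>();
    private Object current = "callerTx";

    @Override
    public Object suspend() {
        calls.add("suspend");
        Object tx = current;
        current = null;
        return tx;
    }

    @Override
    public void resume(Object tx) {
        calls.add("resume");
        current = tx;
    }

    boolean hasActiveTransaction() {
        return current != null;
    }
}
```

Because the repository write happens with no caller transaction associated with the thread, it would commit independently, so the row is already visible by the time the job reaches an executor thread. The trade-off is that a later rollback of the caller's transaction would not undo the job start.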

I was thinking of adding this to the 1.1 RI on start and restart. I'm not sure about the "read" operations, as I'm also not sure how exactly to update the 1.1 spec.

What do you think?

@smillidge
Contributor Author

Thanks for that, Scott. Does that mean the RI should suspend the transaction as well?

@scottkurz
Member

Yes, I was planning to introduce that change, suspending around all operations. I was wondering what you thought.
