James Agnew edited this page Apr 11, 2022 · 1 revision

Batch2 Terminology

  • Job Definition ID: Each kind of job has a unique, hardcoded identifier. This ID is an arbitrary string of characters, but should include only URL-friendly characters. For example, the resource reindexing job uses the definition ID REINDEX.
  • Job Definition Version: The framework supports multiple versions of a job definition, which allows for safe migration if a job changes in a meaningful way (e.g. by adding new steps). So far all job definitions have a version of 1, but new versions of existing jobs can be added in the future.
  • Job Step: For a given job definition, there will be a series of steps.
  • Job Parameters: The parameters used to start a particular job instance.
  • Work Chunk: A value object that is passed asynchronously between job steps.
  • Job Instance: An actual running (or queued, or stopped, etc.) instance of a job definition.
  • Step Worker: A class that actually implements step logic.
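
The relationships between these terms can be sketched with simplified stand-in records. These are NOT the real HAPI FHIR Batch2 classes (only the REINDEX definition ID comes from the text above; the class, record, and step names are illustrative):

```java
import java.util.List;

// Simplified stand-in records illustrating the terminology above; these are
// NOT the real HAPI FHIR Batch2 classes, just a sketch of how they relate.
public class Batch2TerminologySketch {

    // A Job Definition: a hardcoded ID, a version, and an ordered series of steps.
    record JobDefinition(String jobDefinitionId, int jobDefinitionVersion,
                         List<String> stepIds) {}

    // A Job Instance: one actual run (queued, running, stopped, ...) of a
    // definition with concrete parameters.
    record JobInstance(JobDefinition definition, String parametersJson,
                       String status) {}

    public static JobInstance startReindexExample() {
        // The reindex job really does use the definition ID "REINDEX";
        // the step IDs and parameters JSON below are illustrative only.
        JobDefinition reindex = new JobDefinition(
                "REINDEX", 1, List.of("generate-ranges", "load-ids", "reindex"));
        return new JobInstance(reindex, "{\"url\":\"Patient?\"}", "QUEUED");
    }

    public static void main(String[] args) {
        JobInstance instance = startReindexExample();
        System.out.println(instance.definition().jobDefinitionId()
                + " v" + instance.definition().jobDefinitionVersion()
                + " [" + instance.status() + "]");
        // prints: REINDEX v1 [QUEUED]
    }
}
```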

Steps for creating a new Batch2 job

  • Create a new package in hapi-fhir-storage-batch2-jobs
  • Define your start parameters object and call it FooParameters (it needs to be an IModelJson object; see BulkImportJobParameters for an example).
  • It is important to add validation logic so that nobody can submit an invalid job that will blow up asynchronously. Add JSR-380 validation annotations to your parameters object. See this page for details on these annotations. See BulkImportJobParameters for an example parameters object that uses these annotations.
  • If you have validation logic that can't be satisfied by simple annotations, you can also implement programmatic validation by creating an implementation of IJobParametersValidator<ParametersType>. See ReindexJobParametersValidator for an example.
  • The first step of your job will take these parameters as input and emit 0..* work chunks for processing in the subsequent step. Next, define your data model for the work chunks emitted by the first step. This too needs to be an IModelJson object. See NdJsonFileJson for an example.
  • Create an initial step worker class that uses the start parameters as input and emits your work chunks. This class needs to implement IFirstJobStepWorker<ParametersType, OutputChunkModelType>. See GenerateRangeChunksStep for an example.
  • Your job must have at least two steps (one to create the work chunks and one to process them), but it may have more intermediate steps as well. Technically it may have an unlimited number of steps, but in practice most job types will have only 2-3 steps. If your job will have more than 2 steps, for each step before the final step:
    • Each step uses the previous step's work chunk model as input, and produces a new work chunk model as output. First, define a new work chunk model for the output of the new step.
    • Then, create a worker class. It will implement IJobStepWorker<ParametersType, InputChunkModelType, OutputChunkModelType>. See LoadIdsStep for an example.
  • For your final step, no output work chunk model is required or allowed, since the final step can only consume data, not produce it. Create a final step worker that implements IJobStepWorker<ParametersType, InputChunkModelType, VoidModel>. See ReindexStep for an example.
  • Create an @Configuration class to wire up your job. See ReindexAppCtx for an example of this.
  • Add your new Configuration class to Batch2JobsConfig.
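
The parameter-validation step described above can be sketched as follows. This is a simplified stand-in loosely modeled on IJobParametersValidator<T>, not the real interface (the real parameters object would implement IModelJson and carry JSR-380 annotations; the class, field, and method names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of programmatic parameter validation, loosely modeled on
// IJobParametersValidator<T>. The interface shape here is simplified;
// see ReindexJobParametersValidator for the real contract.
public class FooParametersValidatorSketch {

    // Stand-in for a parameters object (the real one implements IModelJson
    // and would also carry JSR-380 annotations such as @NotBlank / @Min).
    record FooParameters(String resourceType, int batchSize) {}

    // Simplified validator: returns a list of error messages, empty if valid.
    // Rejecting bad input here keeps an invalid job from blowing up
    // asynchronously after submission.
    public static List<String> validate(FooParameters params) {
        List<String> errors = new ArrayList<>();
        if (params.resourceType() == null || params.resourceType().isBlank()) {
            errors.add("resourceType must not be blank");
        }
        if (params.batchSize() < 1) {
            errors.add("batchSize must be at least 1");
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(new FooParameters("Patient", 100))); // []
        System.out.println(validate(new FooParameters("", 0)).size());   // 2
    }
}
```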
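
The multi-step chunk flow described above (first step emits chunks, an intermediate step transforms them, the final step only consumes) can be sketched synchronously. The real framework passes each chunk asynchronously over a queue and uses the IFirstJobStepWorker / IJobStepWorker interfaces; the record and method names below are hypothetical stand-ins:

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of a three-step pipeline, loosely modeled on the
// GenerateRangeChunksStep -> LoadIdsStep -> ReindexStep shape described
// above. The real framework runs each step asynchronously over a
// work-chunk queue; this sketch wires the steps synchronously to show
// how each step's output model becomes the next step's input model.
public class FooJobPipelineSketch {

    record FooParameters(int total) {}
    record RangeChunk(int start, int end) {}   // output model of the first step
    record IdChunk(List<Integer> ids) {}       // output model of the middle step

    // First step: parameters in, 0..* range chunks out.
    public static List<RangeChunk> generateRanges(FooParameters p) {
        List<RangeChunk> out = new ArrayList<>();
        for (int i = 0; i < p.total(); i += 10) {
            out.add(new RangeChunk(i, Math.min(i + 10, p.total())));
        }
        return out;
    }

    // Intermediate step: previous step's chunk in, a new chunk model out.
    public static IdChunk loadIds(RangeChunk range) {
        List<Integer> ids = new ArrayList<>();
        for (int i = range.start(); i < range.end(); i++) {
            ids.add(i);
        }
        return new IdChunk(ids);
    }

    // Final step: consumes chunks only (VoidModel output in the real framework).
    public static int process(IdChunk chunk) {
        return chunk.ids().size();
    }

    public static void main(String[] args) {
        int processed = generateRanges(new FooParameters(25)).stream()
                .map(FooJobPipelineSketch::loadIds)
                .mapToInt(FooJobPipelineSketch::process)
                .sum();
        System.out.println("Processed " + processed + " ids"); // Processed 25 ids
    }
}
```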