Primary Author: Thomas
This service handles `ExtractionRequestMessage`, which is a request to extract a set of images identified by a collection of key tag values (e.g. 5000 SeriesInstanceUID values). It is the job of the Cohort Extractor to identify the images that correspond to the requested key values and to generate output messages for the downstream processes responsible for anonymising the images.
There can be multiple datasets from which matching images should be sourced, e.g. MR / CT, and these could even reside on different servers. Datasets are identified and distinguished from one another through the RDMP `ICatalogue`, which already exists as part of the data load process (see DicomRelationalMapper).
- Clone the project and build. Any NuGet dependencies should be downloaded automatically.
- Edit `default.yaml` with the configuration for your environment.
- Pick an implementation of `IAuditExtractions`, e.g. `Microservices.CohortExtractor.Audit.NullAuditExtractions`, and enter the full Type name into `default.yaml` under `AuditorType`.
- Pick an implementation of `IExtractionRequestFulfiller`, e.g. `Microservices.CohortExtractor.Execution.RequestFulfillers.FromCataloguesExtractionRequestFulfiller`, and enter the full Type name into `default.yaml` under `RequestFulfillerType`.
- Specify the mapping RDMP catalogue database.
- Optionally specify a list of Catalogue IDs in `CataloguesToExtractFrom` (or set it to `*` to use any). Depending on your `IExtractionRequestFulfiller`, this value might be ignored.
Read/Write | Type | Config setting |
---|---|---|
Read | ExtractionRequestMessage | CohortExtractorOptions.QueueName |
Write | ExtractFileMessage | CohortExtractorOptions.ExtractFilesProducerOptions |
Write | ExtractFileCollectionInfoMessage | CohortExtractorOptions.ExtractFilesInfoProducerOptions |
Command Line Options | Purpose |
---|---|
CliOptions | Allows overriding of which yaml file is loaded. |
YAML Section | Purpose |
---|---|
RabbitOptions | Describes the location of the rabbit server for sending messages to |
RDMPOptions | Describes the location of the Microsoft SQL Server RDMP platform databases, which keep track of load configurations, the datasets (tables) available to extract images from, etc. |
CohortExtractorOptions | Which Catalogues to extract, which classes to instantiate to do the extraction |
The set of images that could be extracted is controlled by the `IExtractionRequestFulfiller`.
The current recommended implementation is `FromCataloguesExtractionRequestFulfiller`. This fulfiller looks up one or more tables or multi-table joins (Catalogues) and searches them for the provided extraction key (e.g. SeriesInstanceUID = x).
The matched records are what will be reported on, e.g. "for x UIDs we found y available images". A subset of this result set will then be rejected (because you have made row-level decisions not to extract particular images). This is handled by the Rejector.
Configure the fulfiller in your options yaml:
OnlyCatalogues: 1,2,3
RequestFulfillerType: Microservices.CohortExtractor.Execution.RequestFulfillers.FromCataloguesExtractionRequestFulfiller
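As a rough illustration of the lookup the fulfiller performs (not the service's actual C# implementation), the Python sketch below searches several in-memory "catalogues" for a set of key values and produces the kind of summary quoted above. The function name `find_matches`, the dictionary-based catalogues and the `FilePath` column are all assumptions made for this example.

```python
from collections import defaultdict


def find_matches(catalogues, key_tag, key_values):
    """Return {key value: [matching rows]} across all catalogues.

    `catalogues` maps a catalogue ID to a list of row dicts; this stands in
    for the tables or joins a real fulfiller would query.
    """
    wanted = set(key_values)
    matches = defaultdict(list)
    for catalogue_id, rows in catalogues.items():
        for row in rows:
            # Rows that lack the key column simply contribute nothing.
            if row.get(key_tag) in wanted:
                matches[row[key_tag]].append({**row, "CatalogueId": catalogue_id})
    return dict(matches)


# Hypothetical data: two datasets (e.g. MR and CT) holding image records.
catalogues = {
    1: [{"SeriesInstanceUID": "1.2.3", "FilePath": "mr/a.dcm"},
        {"SeriesInstanceUID": "1.2.3", "FilePath": "mr/b.dcm"}],
    2: [{"SeriesInstanceUID": "4.5.6", "FilePath": "ct/c.dcm"}],
}
requested = ["1.2.3", "4.5.6", "7.8.9"]

matches = find_matches(catalogues, "SeriesInstanceUID", requested)
image_count = sum(len(rows) for rows in matches.values())
summary = f"for {len(requested)} UIDs we found {image_count} available images"
print(summary)
```

Key values with no matching record (here `7.8.9`) are simply absent from the result, so a caller can compare the requested keys with the returned ones to see what was not found.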
Records matched by the Fulfiller are passed to the `IRejector` (if any is configured). This class can make last-minute decisions on a row-by-row level to either extract or forbid (with a specific provided reason) the extraction of an image.
The currently recommended implementation is DynamicRejector. To use the dynamic rejector edit your options yaml as follows:
RejectorType: Microservices.CohortExtractor.Execution.RequestFulfillers.Dynamic.DynamicRejector
Using the DynamicRejector also requires you to configure a file DynamicRules.txt in the execution directory of your binary. An example is provided (see DynamicRules.txt).
Rules are written in C# and can only index fields that appear in the records returned by the Fulfiller.
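The row-level decision can be pictured as a function that receives one matched record and returns either nothing (extract) or a reason (forbid). The real rules are C# in DynamicRules.txt; the Python sketch below only mirrors that shape, and the function names and the `IsExtractable` and `Reason` columns are hypothetical.

```python
def example_rule(row):
    """Return None to allow extraction, or a string explaining the rejection."""
    # Indexing a column the fulfiller did not return raises KeyError,
    # mirroring the restriction that rules can only use returned fields.
    if not row["IsExtractable"]:
        return row["Reason"] or "No reason given"
    return None


def apply_rule(rows, rule):
    """Split rows into accepted rows and (row, reason) pairs for rejected rows."""
    accepted, rejected = [], []
    for row in rows:
        reason = rule(row)
        if reason is None:
            accepted.append(row)
        else:
            rejected.append((row, reason))
    return accepted, rejected


# Hypothetical records as they might come back from a fulfiller.
rows = [
    {"FilePath": "a.dcm", "IsExtractable": True, "Reason": None},
    {"FilePath": "b.dcm", "IsExtractable": False, "Reason": "Burned-in text"},
]
accepted, rejected = apply_rule(rows, example_rule)
print(len(accepted), "accepted;", [reason for _, reason in rejected])
```

Keeping the reason alongside each rejected row means the rejection can be reported, not just silently dropped.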
All matching of request criteria is handled by `IExtractionRequestFulfiller`.
All audit is handled by `IAuditExtractions`.
The extraction destination is handled by `IProjectPathResolver`.
- No files matching a given tag X
- ???
- No value in patient id substitution Y
- ???
- Others? TODO
- Operation on loss of RabbitMQ connection:
- No special logic
- Operation on loss of access to catalogues:
- Any Exception thrown by the `ISwapIdentifiers` will not be caught, triggering a Fatal on the `Consumer`.
- Any Exception thrown by the