This is a distributed version of the ColorMIP_Mask_Search Fiji plugin, running on Apache Spark.
./mvnw install
or just
./mvnw package
This will produce a jar called target/colormipsearch-<VERSION>-jar-with-dependencies.jar
which can be run
either with Spark or on the local host or on the cluster by using bsub.
Before running the release script make sure you have an server entry for the janelia-repo in your default maven settings.xml, typically located at ~/.m2/settings.xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
<servers>
<server>
<id>janelia-releases</id>
<username>yourusername</username>
<password>yourpassword</password>
</server>
</servers>
</settings>
If you don't want the password in clear in settings.xml, maven offers a mechanism to encrypt it using:
mvn --encrypt-master-password <password>
to create a master password and then use
mvn --encrypt-password <password>
which you can enter in place of your password. Check maven documentation how you can do this.
To release the artifacts simply run:
./release.sh <version>
This command will also tag the repository with the <version>
.
java -Xmx120G -Xms120G -jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
searchLocalFiles \
-i /nrs/jacs/jacsData/filestore/system/ColorDepthMIPs/JRC2018_Unisex_20x_HR/flylight_splitgal4_drivers/GMR_MB242A-20121208_31_H4-20x-Brain-JRC2018_Unisex_20x_HR-1846510864534863970-CH1_CDM.png \
-m /nrs/jacs/jacsData/filestore/system/ColorDepthMIPs/JRC2018_Unisex_20x_HR/flyem_hemibrain/5901194966_RT_18U.tif \
--maskThreshold 100 \
-od /groups/jacs/jacsDev/devstore/goinac/cdtest/test3
Here's another example to compare images from a zip archive in which we compare the first 1000
images from /nrs/jacs/jacsData/filestore/system/flylight_40xMCFO_Segmented_PackBits.zip
with
all images from /groups/scicomp/scicompsoft/otsuna/Brain/EM_Hemibrain/Masayoshi_selected.zip
java -Xmx480G -Xms480G -jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
searchLocalFiles \
-i /nrs/jacs/jacsData/filestore/system/flylight_40xMCFO_Segmented_PackBits.zip:0:1000 \
-m /groups/scicomp/scicompsoft/otsuna/Brain/EM_Hemibrain/Masayoshi_selected.zip \
-od local/testData/masayoshi-cds \
-gp /nrs/jacs/jacsData/filestore/system/flylight_40xMCFO_gradient_20px.zip \
-result masayoshiResult-0-1000 \
-lps 1000
java -jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
gradientScore \
-rf local/testData/results/qq.json \
-gp local/testData/flylight_40xMCFO_gradient_20px.zip \
-rd local/testData/results.withscore
A more detailed description can be found in PrecomputedData.md.
This section shows the steps and the commands used for generating the precomputed color depth search data
The MIPs metadata as explained in PrecomputedData.md is important for LM line or EM body lookup.
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
groupMIPsByPublishedName \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_split_gal4_published \
--segmented-image-handling 0 \
-od local/testData/mips \
-lmdir split_gal4_lines
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
groupMIPsByPublishedName \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_gen1_mcfo_published \
--segmented-image-handling 0 \
-od local/testData/mips \
-lmdir gen1_mcfo_lines
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
groupMIPsByPublishedName \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flyem_hemibrain \
-od local/testData/mips \
-emdir em_bodies
The steps to generate the precomputed color depth search results are the following:
- Prepare the input file for the color depth matches
- Calculate color depth matches between EM MIPs and LM MIPs
- this step outputs one set of color depth matches grouped by EM MIP ID and one set of matches grouped by the LM MIP ID
- Calculate gradient based score for the matches from EM MIPs to LM MIPs, considering only color depth matches grouped by EM MIP ID.
- Update the gradient score for the LM to EM matches using the gradient score from the corresponding EM to LM match calculated in the previous step
- Merge results - if the color depth search results between EM and various LM libraries were computed separately, for example EM vs Split-Gal4 matches were calculated separately from EM vs MCFO, then this step will merge the EM vs Split-Gal4 and EM vs MCFO in a single result file so that all EM matches can be ranked appropriately
- Rank final results and cleanup - this basically sorts the final results based on their gradient score and cleans up data that is not necessary for the release, such as image file names.
For MCFO and SplitGal4 input we use the corresponding segmented images as input (see the '--segmented-mips-base-dir' argument).
Prepare MCFO input:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
createColorDepthSearchJSONInput \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_gen1_mcfo_published \
--segmented-mips-base-dir /nrs/jacs/jacsData/filestore/system/40x_MCFO_Segmented_PackBits_forPublicRelease.zip \
--segmented-image-handling 0 \
-od local/testData/mips
Prepare SplitGal4 input:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
createColorDepthSearchJSONInput \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_split_gal4_published \
--segmented-mips-base-dir /nrs/jacs/jacsData/filestore/system/SS_Split/SS_Split_ALL_Segmented_CDM \
--segmented-image-handling 0 \
-od local/testData/mips
Prepare EM input:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
createColorDepthSearchJSONInput \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flyem_hemibrain \
-od local/testData/mips
If you only need to run the color depth search for some new MIPS that have been added to the library you can do that using the following command:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
createColorDepthSearchJSONInput \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_split_gal4_published \
-od local/testData/mips \
--output-filename new_splitgal4_published \
--excluded-mips local/testData/mips/existing_splitgal4_published.json
The command assumes that "local/testData/mips/existing_splitgal4_published.json" is a JSON file that contains the metadat for all MIPs that you don't need to compute and it will put the MIPs that require color depth search in the "new_splitgal4_published.json" ("--excluded-mips" argument) under "local/testData/mips" directory ("-od" argument)
"createColorDepthSearchJSONInput" also supports an "--excluded-libraries" which checks the case in which a MIP is part of multiple libraries and if it is part of any library specified in the --excluded-lubraries list then the MIP will not be included in the output JSON.
Sometime the image URLs need to be replaced with image URLs that reference MIPs generated with a different gamma value that are better suited for viewing the matched regions. Image URLs can be updated at any time but the sooner we apply this the fewer files that need to be updated downstream.
This step requires a JSON file that contains the new image URLs that will replace the ones that are set in the JSON created from 'flylight_gen1_mcfo_published' library. The new URLs for example could come from the MIPs generated with a different gamma value. There are two sets of JSON files that will have to have the imageURLs update:
- the MCFO lines will have to be changed to reference the new URLs
- the EM to MCFO color depth search results reference the image of the MCFO MIP that matches the EM MIP entry, so these will have to be updated to reference the new URL.
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
createColorDepthSearchJSONInput \
--jacsURL http://goinac-ws1.int.janelia.org:8800/api/rest-v2 \
--authorization "Bearer tokenvalue" \
-l flylight_gen1_mcfo_case_1_gamma1_4 \
-od local/testData/mips
Update image URLs for the lines:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
replaceAttributes \
-attrs local/testData/mips/flylight_gen1_mcfo_case_1_gamma1_4.json \
--input-dirs local/testData/mips/gen1_mcfo_lines \
--id-field id \
--fields-toUpdate imageURL thumbnailURL \
-od local/testData/mips.gamma1.4/gen1_mcfo_lines
Update image URLs for the input MIPs:
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
replaceAttributes \
-attrs local/testData/mips/flylight_gen1_mcfo_case_1_gamma1_4.json \
--input-file local/testData/mips/flylight_gen1_mcfo_published.json \
--id-field id \
--fields-toUpdate imageURL thumbnailURL \
-od local/testData/mips.gamma1.4
This step can also be applied later to replace the imageURLs in the result files. As a reminder the imageURLs in the result files reference the matched image so the field that we lookup for a match is "matchedId"
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
replaceAttributes \
-attrs local/testData/cdsresult/flylight_gen1_mcfo_case_1_gamma1_4.json \
--input-dirs local/testData/cdsresults.merged/flyem_hemibrain-vs-flylight \
--id-field id \
--fields-toUpdate imageURL thumbnailURL \
-od local/testData/cdsresults/flyem_hemibrain.gamma1_4
Even though this process is run only once it outputs two sets of result files: one for the EM -> LM results and one for the LM -> EM results, indexed by the EM mip IDs or LM mip IDs respectively.
Calculate EM vs SplitGal4 results
java -Xmx180G -Xms180G \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
searchFromJSON \
-m local/testData/mips/flyem_hemibrain.json:0:5000 \
-i local/testData/mips/flylight_gen1_mcfo_published.json:0:40000 \
--maskThreshold 20 \
--dataThreshold 30 \
--xyShift 2 \
--pixColorFluctuation 1 \
--pctPositivePixels 1 \
--mirrorMask \
--processingPartitionSize 4000 \
--perMaskSubdir flyem_hemibrain-vs-gen1_mcfo \
--perLibrarySubdir flylight_gen1_mcfo_published \
-od local/testData/cdsresults
java -Xmx180G -Xms180G \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
searchFromJSON \
-m local/testData/mips/flyem_hemibrain.json:0:35000 \
-i local/testData/mips/flylight_split_gal4_published.json:0:7800 \
--maskThreshold 20 \
--dataThreshold 30 \
--xyShift 2 \
--pixColorFluctuation 1 \
--pctPositivePixels 1 \
--mirrorMask \
--processingPartitionSize 4000 \
--perMaskSubdir flyem_hemibrain-vs-split_gal4 \
--perLibrarySubdir flylight_split_gal4_published \
-od local/testData/cdsresults
java -Xmx240G -Xms240G \
-Dlog4j.configuration=file:///groups/scicompsoft/home/goinac/Work/color-depth-spark/local/log4j.properties \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
--cacheSize 5000 --cacheExpirationInSeconds 60 \
gradientScore \
--maskThreshold 20 \
--negativeRadius 20 \
--mirrorMask \
--topPublishedNameMatches 500 \
--topPublishedSampleMatches 0 \
--processingPartitionSize 5 \
-gp /nrs/jacs/jacsData/filestore/system/40x_MCFO_Segmented_PackBits_forPublicRelease_gradient.zip \
-zgp /nrs/jacs/jacsData/filestore/system/40x_MCFO_Segmented_PackBits_forPublicRelease_20pxRGBMAX.zip --zgapSuffix _20pxRGBMAX \
-rd local/testData/cdsresults/flyem_hemibrain-vs-gen1_mcfo:1400:20 \
-od local/testData/cdsresults.ga/flyem_hemibrain-vs-gen1_mcfo
java -Xmx240G -Xms240G \
-Dlog4j.configuration=file:///groups/scicompsoft/home/goinac/Work/color-depth-spark/local/log4j.properties \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
--cacheSize 5000 --cacheExpirationInSeconds 60 \
gradientScore \
--maskThreshold 20 \
--negativeRadius 20 \
--mirrorMask \
--topPublishedNameMatches 500 \
--topPublishedSampleMatches 0 \
--processingPartitionSize 5 \
-gp /nrs/jacs/jacsData/filestore/system/SS_Split/SS_Split_ALL_Segmented_gradient \
-zgp /nrs/jacs/jacsData/filestore/system/SS_Split/SS_Split_ALL_Segmented_20pxRGB --zgapSuffix _20pxRGB \
-rd local/testData/cdsresults/flyem_hemibrain-vs-split_gal4:1400:100 \
-od local/testData/cdsresults.ga/flyem_hemibrain-vs-split_gal4
In order to simplify the parallelization options for Steps 2 and 3 colormipsearch-tools module has a set of bash scripts that allow the user to the color depth matches and the gradient scoring jobs into smaller jobs and run them on a host with enough resources or on the grid. The commands to run these steps are:
sh colormipsearch-tools/src/main/scripts/submitCDSBatch.sh
and
sh colormipsearch-tools/src/main/scripts/submitGABatch.sh
respectively.
Before running this commands make sure to set the appropriate parameters
in colormipsearch-tools/src/main/scripts/cdsparams.sh
which is imported
in the above submit scripts.
This step actually transfers the gradient score from the ones calculated for EM -> LM results to the corresponding LM -> EM results.
java -Xms480G -Xmx480G \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
gradientScoresFromMatchedResults \
-rd local/testData/cdsresults/flylight_gen1_mcfo_published \
-revd local/testData/cdsresults.ga/flyem_hemibrain-vs-gen1_mcfo \
-od local/testData/cdsresults.ga/flylight_gen1_mcfo_published \
-ps 500
java -Xms480G -Xmx480G \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
gradientScoresFromMatchedResults \
-rd local/testData/cdsresults/flylight_split_gal4_published \
-revd local/testData/cdsresults.ga/flyem_hemibrain-vs-split_gal4 \
-od local/testData/cdsresults.ga/flylight_split_gal4_published \
-ps 500
When we search EM matches we want to see both Spligal4 matches and MCFO matches, therefore the flyem-vs-mcfo and flyem-vs-sgal4 will have to be merged in a single directory.
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
mergeResults \
-rd \
local/testData/cdsresults.ga/flyem_hemibrain-vs-gen1_mcfo \
local/testData/cdsresults.ga/flyem_hemibrain-vs-split_gal4 \
-od local/testData/cdsresults.merged/flyem_hemibrain-vs-flylight
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
normalizeScores \
-rd local/testData/updateURLS/cdsresults/flyem_hemibrain-vs-flylight \
--pctPositivePixels 2.0 \
-cleanup \
-od local/testData/cdsresults.normalized/flyem_hemibrain-vs-flylight
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
normalizeScores \
-rd local/testData/cdsresults.ga/flylight_split_gal4_published \
--pctPositivePixels 2.0 \
-cleanup \
-od local/testData/cdsresults.normalized/flylight_split_gal4_published
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
normalizeScores \
-rd local/testData/cdsresults.ga/flylight_gen1_mcfo_published \
--pctPositivePixels 2.0 \
-cleanup \
-od local/testData/cdsresults.normalized/flylight_gen1_mcfo_published
This step requires aws-cli installed on the host from which we upload the data to S3.
Upload MIPs
# generate publishedNames.txt
find . local/testData/mips \
-type f \
-name "*.json" \
-printf "%f\n" | sed s/.json// | sort -u > publishedNames.txt
aws s3 cp publishedNames.txt s3://janelia-neuronbridge-data-prod/publishedNames.txt
# create and upload data version file
echo 1.0.0 > local/testData/DATA_VERSION
aws s3 cp \
local/testData/DATA_VERSION \
s3://janelia-neuronbridge-data-prod/metadata/cdsresults/DATA_VERSION
# upload MCFO lines
aws s3 cp \
local/testData/mips.gamma1.4/gen1_mcfo_lines \
s3://janelia-neuronbridge-data-prod/metadata/by_line --recursive
# upload SplitGal4 lines
aws s3 cp \
local/testData/mips/split_gal4_lines \
s3://janelia-neuronbridge-data-prod/metadata/by_line --recursive
# upload EM skeletons
aws s3 cp \
local/testData/mips/em_bodies \
s3://janelia-neuronbridge-data-prod/metadata/by_body --recursive
Upload Results
aws s3 cp \
local/testData/cdsresults.normalized/flyem_hemibrain-vs-flylight \
s3://janelia-neuronbridge-data-prod/metadata/cdsresults --recursive
aws s3 cp \
local/testData/cdsresults.normalized/flylight_split_gal4_published \
s3://janelia-neuronbridge-data-prod/metadata/cdsresults --recursive
aws s3 cp \
local/testData/cdsresults.normalized/flylight_gen1_mcfo_published \
s3://janelia-neuronbridge-data-prod/metadata/cdsresults --recursive
PPP_IMPORT_DIR=/nrs/saalfeld/kainmuellerd/flymatch/all_hemibrain_1.2_NB/setup22_nblast_20/results/68
PPP_RESULTS_DIR=/nrs/neuronbridge/v2.2/pppresults/flyem-to-flylight
java \
-jar target/colormipsearch-2.8.0-jar-with-dependencies.jar \
convertPPPResults \
-ps 30 \
--jacs-read-batch-size 3000 \
-rd ${PPP_IMPORT_DIR} \
-od ${PPP_RESULTS_DIR}