New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New IDAaaS data cache index file loading #37187
New IDAaaS data cache index file loading #37187
Conversation
- Created a very simplified implementation and a unit test
Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk> Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- I made the change because header files from DataHandling could not be imported into files in the API, due to DataHandling being missing from target_link_libraries in the CMakeLists.txt API file. Trying to fix this causes a circular dependency. Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk> Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk> Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added section to look for files in hard coded Linux path - Changed search function to use relative paths - Corrected unit test Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk> Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- The name of the directories in the cache uses the extended version i.e. 'MARI' instead of 'MAR', changed the function to account for this - Changed split into instr and run to handle edge case of 'SANS2D' (digit in instrument name) Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk> Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Function getInstrument() in the ConfigService is used to retrieve the instrument info from a string argument. - The argument of the function is either the name or short name of the instrument - Added option to pass the name or short name + delimiter and retrieve the correct instrument Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added unit test for case when there is a delimiter between instrument name and run number - Also checks that short name `PG3` gets properly expanded to `POWGEN` Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added some fail proof checks - Added unit tests for these cases Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added a test class that creates a dummy directory with the same structure as the data cache and performs unit tests on the path outputs - Introduced safety checks in ISISInstrDataCache to cover for edge cases - Added the data cache path to Mantid properties file Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- When user does not have access permissions to a folder, it will print an error informing the user Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Deleted unecessary changes that I had made - Replaced instrument finder by one that already accounts for delimiters and other edge cases - Removed unecessary notice() Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
c3291b0
to
1a189a0
Compare
- Added datacachesearch.directory to properties file documentation Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Organized catching of exceptions for bad inputs - Added check for when user does not have permission to open directory - Changed logging from notice level to debug level Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Made a quick fix for the getInstrument function, to make it throw the exception instead of using a default value - Not proud of this fix, but when I tried to take out the exception handling out of getInstrument, a few other things were breaking Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good already, some things to improve.
Other than the comments I've added, I think we can call these files / classes ISISInstrumentDataCache
etc. We don't need to cut down on file name length (sorry this is a bit of a pain).
Also, I think getFileParentDirPath
could be broken up into smaller functions, I've added one suggestion but there might be other places you could do this.
Co-authored-by: Jonathan Haigh <35813666+jhaigh0@users.noreply.github.com>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an intermittent issue with json parsing on IDAaaS that we are investigating.
Also be aware there seems to be an MSBuild warning - https://builds.mantidproject.org/job/pull_requests-conda-windows/5886/msbuild/ |
- Check for permission denied is now done inside the loop that directly tries to access files - Added check for non existing parent directory - Improved error message when file is not found whithin parent directory Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- If for some reason the index file on IDAaaS is currupted (eg. being updated) then the error is caught - Allows for the search to continue on the archive Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for holding this up, just have a few suggestions, then happy for this to be merged
The issue with json parsing has been fixed by the DAaaS team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! @jhaigh0 if you're happy then I'll enable auto merge
Description of work
Implemented search in data cache for files before defaulting to search in the data archive. This implementation leaves the previous implementation mostly untouched, and only adds a search in the data cache before advancing to the search in the archive. So the order for searching is User Directories > Data Cache > Data archive. This gives priority to local changes that users have made.
Summary of work
Added
ISISInstrDataCache.h
andISISInstrDataCache.cpp
which is where the implementation resides. Other than that, changedFileFinder.cpp
to call the search. Added unit tests.Purpose of work
To give access to data files through the data cache on IDAaaS even when the user does not have access to the archive.
Fixes #37013
Further detail of work
Added a few comments and print statements to make it clear what is going on. We can remove them during the review stage.
To test:
Build to test: https://builds.mantidproject.org/job/build_packages_from_branch/792/
mamba install -c mantid/label/idaas_data_cache_debug mantidworkbench
/data/instrument/ALF/ALF_index.json
. The numbers on the left side correspond to available run numbersALF_index.json
file. You don't need to clickRun
, if theOutputWorkspace
field gets updated then the file was successfully found. Try this several times and check they all successfully found the file.99999
ALF
,1234ALF
MAR27340
Reviewer
Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.
Code Review
Functional Tests
Does everything look good? Mark the review as Approve. A member of
@mantidproject/gatekeepers
will take care of it.Gatekeeper
If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.