Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New IDAaaS data cache index file loading #37187

Merged
merged 40 commits into from May 10, 2024

Conversation

GuiMacielPereira
Copy link
Contributor

@GuiMacielPereira GuiMacielPereira commented Apr 19, 2024

Description of work

Implemented search in data cache for files before defaulting to search in the data archive. This implementation leaves the previous implementation mostly untouched, and only adds a search in the data cache before advancing to the search in the archive. So the order for searching is User Directories > Data Cache > Data archive. This gives priority to local changes that users have made.

Summary of work

Added ISISInstrDataCache.h and ISISInstrDataCache.cpp which is where the implementation resides. Other than that, changed FileFinder.cpp to call the search. Added unit tests.

Purpose of work

To give access to data files through the data cache on IDAaaS even when the user does not have access to the archive.

Fixes #37013

Further detail of work

Added a few comments and print statements to make it clear what is going on. We can remove them during the review stage.

To test:

Build to test: https://builds.mantidproject.org/job/build_packages_from_branch/792/

  1. Log into IDAaaS
  2. Create a new conda environment and install build with mamba install -c mantid/label/idaas_data_cache_debug mantidworkbench
  3. Open workbench
  4. Open the file /data/instrument/ALF/ALF_index.json. The numbers on the left side correspond to available run numbers
  5. In the workbench, change the default instrument to ALF and turn off access to the data archive.
  6. Open the Load GUI and type in several run numbers of your choice chosen from the ALF_index.json file. You don't need to click Run, if the OutputWorkspace field gets updated then the file was successfully found. Try this several times and check they all successfully found the file.
  7. Try some bad inputs (the error message will be displayed when hovering over the red asterisk):
    • A run number that doesn't exist, for example 99999
    • Filename with wrong format, eg. ALF, 1234ALF
    • File we don't have access to, eg. MAR27340

Reviewer

Please comment on the points listed below (full description).
Your comments will be used as part of the gatekeeper process, so please comment clearly on what you have checked during your review. If changes are made to the PR during the review process then your final comment will be the most important for gatekeepers. In this comment you should make it clear why any earlier review is still valid, or confirm that all requested changes have been addressed.

Code Review

  • Is the code of an acceptable quality?
  • Does the code conform to the coding standards?
  • Are the unit tests small and test the class in isolation?
  • If there is GUI work does it follow the GUI standards?
  • If there are changes in the release notes then do they describe the changes appropriately?
  • Do the release notes conform to the release notes guide?

Functional Tests

  • Do changes function as described? Add comments below that describe the tests performed?
  • Do the changes handle unexpected situations, e.g. bad input?
  • Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

Gatekeeper

If you need to request changes to a PR then please add a comment and set the review status to "Request changes". This will stop the PR from showing up in the list for other gatekeepers.

@GuiMacielPereira GuiMacielPereira linked an issue Apr 19, 2024 that may be closed by this pull request
5 tasks
@GuiMacielPereira GuiMacielPereira added the ISIS Team: Core Issue and pull requests managed by the Core subteam at ISIS label Apr 19, 2024
@GuiMacielPereira GuiMacielPereira added this to the Release 6.10 milestone Apr 19, 2024
GuiMacielPereira and others added 13 commits April 19, 2024 16:01
- Created a very simplified implementation and a unit test
Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- I made the change because header files from DataHandling could not be imported into files in the API, due to DataHandling being missing from target_link_libraries in the CMakeLists.txt API file. Trying to fix this causes a circular dependency.

Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added section to look for files in hard coded Linux path
- Changed search function to use relative paths
- Corrected unit test

Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- The name of the directories in the cache uses the extended version i.e. 'MARI' instead of 'MAR', changed the function to account for this
- Changed split into instr and run to handle edge case of 'SANS2D' (digit in instrument name)

Co-authored-by: thomashampson <thomas.hampson@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Function getInstrument() in the ConfigService is used to retrieve the instrument info from a string argument.
- The argument of the function is either the name or short name of the instrument
- Added option to pass the name or short name + delimiter and retrieve the correct instrument

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added unit test for case when there is a delimiter between instrument name and run number
- Also checks that short name `PG3` gets properly expanded to `POWGEN`

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added some fail proof checks
- Added unit tests for these cases

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Added a test class that creates a dummy directory with the same structure as the data cache and performs unit tests on the path outputs
- Introduced safety checks in ISISInstrDataCache to cover for edge cases
- Added the data cache path to Mantid properties file

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- When user does not have access permissions to a folder, it will print an error informing the user

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Deleted unecessary changes that I had made
- Replaced instrument finder by one that already accounts for delimiters and other edge cases
- Removed unecessary notice()

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
@GuiMacielPereira GuiMacielPereira force-pushed the 37013-new-idaaas-data-cache-index branch from c3291b0 to 1a189a0 Compare April 19, 2024 15:22
GuiMacielPereira and others added 4 commits April 19, 2024 16:45
- Added datacachesearch.directory to properties file documentation

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Organized catching of exceptions for bad inputs
- Added check for when user does not have permission to open directory
- Changed logging from notice level to debug level

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- Made a quick fix for the getInstrument function, to make it throw the exception instead of using a default value
- Not proud of this fix, but when I tried to take out the exception handling out of getInstrument, a few other things were breaking

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Copy link
Contributor

@jhaigh0 jhaigh0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good already, some things to improve.
Other than the comments I've added, I think we can call these files / classes ISISInstrumentDataCache etc. We don't need to cut down on file name length (sorry this is a bit of a pain).
Also, I think getFileParentDirPath could be broken up into smaller functions, I've added one suggestion but there might be other places you could do this.

Framework/API/inc/MantidAPI/ISISInstrDataCache.h Outdated Show resolved Hide resolved
Framework/API/inc/MantidAPI/FileFinder.h Outdated Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Outdated Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Show resolved Hide resolved
Framework/API/src/ISISInstrDataCache.cpp Outdated Show resolved Hide resolved
Framework/API/test/ISISInstrDataCacheTest.h Outdated Show resolved Hide resolved
Framework/API/test/ISISInstrDataCacheTest.h Outdated Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Outdated Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Show resolved Hide resolved
GuiMacielPereira and others added 3 commits April 24, 2024 10:35
Co-authored-by: Jonathan Haigh <35813666+jhaigh0@users.noreply.github.com>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
GuiMacielPereira and others added 2 commits April 24, 2024 16:08
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
@GuiMacielPereira GuiMacielPereira marked this pull request as ready for review April 26, 2024 09:31
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
jhaigh0
jhaigh0 previously approved these changes Apr 29, 2024
Copy link
Contributor

@thomashampson thomashampson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an intermittent issue with json parsing on IDAaaS that we are investigating.

@sf1919
Copy link
Contributor

sf1919 commented May 7, 2024

Also be aware there seems to be an MSBuild warning - https://builds.mantidproject.org/job/pull_requests-conda-windows/5886/msbuild/

@sf1919 sf1919 self-assigned this May 9, 2024
GuiMacielPereira and others added 2 commits May 9, 2024 10:49
- Check for permission denied is now done inside the loop that directly tries to access files
- Added check for non existing parent directory
- Improved error message when file is not found whithin parent directory

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
- If for some reason the index file on IDAaaS is currupted (eg. being updated) then the error is caught
- Allows for the search to continue on the archive

Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
Framework/API/src/FileFinder.cpp Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Show resolved Hide resolved
GuiMacielPereira and others added 2 commits May 9, 2024 13:36
Co-authored-by: Jonathan Haigh <jonathan.haigh@stfc.ac.uk>
jhaigh0
jhaigh0 previously approved these changes May 10, 2024
Copy link
Contributor

@robertapplin robertapplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for holding this up, just have a few suggestions, then happy for this to be merged

Framework/API/inc/MantidAPI/ISISInstrumentDataCache.h Outdated Show resolved Hide resolved
Framework/API/inc/MantidAPI/ISISInstrumentDataCache.h Outdated Show resolved Hide resolved
Framework/API/src/FileFinder.cpp Outdated Show resolved Hide resolved
buildconfig/CMake/CppCheck_Suppressions.txt.in Outdated Show resolved Hide resolved
@sf1919 sf1919 dismissed thomashampson’s stale review May 10, 2024 11:11

The issue with json parsing has been fixed by the DAaaS team.

@jhaigh0 jhaigh0 changed the base branch from main to release-next May 10, 2024 12:52
Copy link
Contributor

@robertapplin robertapplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! @jhaigh0 if you're happy then I'll enable auto merge

@robertapplin robertapplin merged commit 240c7a6 into release-next May 10, 2024
10 checks passed
@robertapplin robertapplin deleted the 37013-new-idaaas-data-cache-index branch May 10, 2024 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ISIS Team: Core Issue and pull requests managed by the Core subteam at ISIS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support new IDAaaS data cache index files
5 participants