Multiclient for making waveform requests to an arbitrary amount of clients based on a config and the requested SEED ID #3304

Open
wants to merge 7 commits into
base: master

megies
Member

@megies megies commented May 22, 2023

What does this PR do?

This PR is motivated by automated workflows that process waveform data from a changing set of stations whose data are available through a variety of different means.
For data hosted at data centers, tools like the EIDA routing client or the IRIS FDSN Federator are in many cases sufficient to pull the data together. Often enough, however, an automated analysis task has to combine locally stored data (usually in SDS structures or similarly well-ordered directory trees) with data from non-public FDSN services (e.g. provided by non-public SeisComP instances).

To make this easy, the user specifies all details once in a config file, and any ObsPy-based workflow can then simply rely on MultiClient with that config file to handle where and how the data are fetched. The configuration file can specify an arbitrary number of clients (currently of type FDSN, SeedLink, or SDS filesystem, but fully extensible in user code), each with its own set of initialization parameters (varying FDSN URLs, varying SDS directory trees), together with lookup information that maps SEED IDs to a specific client. The lookup can be based simply on the network code or on the station level (useful e.g. if certain stations are not available on public servers and have to be pulled from the file system).

Still needs some minor tweaking, work on the docs, and some tests.

Example config:

[lookup]                                                                        
US = fdsn_iris                                                                  
GR = fdsn_bgr                                                                   
GR.WET = sds1                                                                   
                                                                                
[fdsn_iris]                                                                     
type = fdsn                                                                     
base_url = IRIS                                                                 
user_agent = LMU                                                                
timeout = 30                                                                    
                                                                                
[fdsn_bgr]                                                                       
type = fdsn                                                                      
base_url = http://eida.bgr.de                                                    
user_agent = LMU                                                                 
timeout = 20                                                                     
                                                                                
[sds1]                                                                          
type = sds                                                                      
sds_root = /path/to/SDS/archive                                                 
sds_type = D                                                                    
format = MSEED                                                                  
fileborder_seconds = 30                                                         
fileborder_samples = 5000                                                       
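
A minimal sketch of how the [lookup] section above could be resolved, using only the standard library. It is not the code from this PR: the helper name `resolve_client_key` is hypothetical, and the rule that a station-level entry takes precedence over a network-level one is an assumption drawn from the example (GR goes to fdsn_bgr, except GR.WET, which goes to sds1). Wildcards in the requested network or station are not handled here.

```python
from configparser import ConfigParser

# Only the [lookup] part of the example config is needed for this sketch.
CONFIG_TEXT = """
[lookup]
US = fdsn_iris
GR = fdsn_bgr
GR.WET = sds1
"""


def resolve_client_key(lookup, network, station):
    """Return the client section name for a network/station pair, or None.

    Hypothetical helper: assumes a "NET.STA" entry overrides a "NET" entry.
    """
    for key in (f"{network}.{station}", network):
        if key in lookup:
            return lookup[key]
    return None


parser = ConfigParser()
# ConfigParser lowercases option names by default; keep SEED codes as written.
parser.optionxform = str
parser.read_string(CONFIG_TEXT)
lookup = dict(parser["lookup"])

print(resolve_client_key(lookup, "GR", "WET"))  # station-level entry
print(resolve_client_key(lookup, "GR", "FUR"))  # falls back to network level
print(resolve_client_key(lookup, "XX", "ABC"))  # no entry for this network
```

Setting `optionxform` matters because the lookup keys are upper-case SEED codes, which `ConfigParser` would otherwise fold to lower case.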

Example usage

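from obspy import UTCDateTime
from obspy.clients.multiclient import MultiClient

# config file with the [lookup] section and one section per client
config = "/home/megies/.multiclientrc"

t = UTCDateTime("2023-05-05T05:05:05")

# each request: network, station, location, channel, starttime, endtime
requests = [
    ["US", "KSU1", "*", "BH*", t, t+10],
    ["GR", "FUR", "", "HH?", t, t+10],
    ["GR", "WET", "", "H*", t, t+10],
    ]

# the client used for each request is selected via the [lookup] section
client = MultiClient(config)
for args in requests:
    st = client.get_waveforms(*args)
    print(st)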

Why was it initiated? Any relevant Issues?

I have been doing this in one of my workflows for quite a while. Recently, while working on a new workflow, I decided to extract the concept and propose adding it to ObsPy properly, to reduce code duplication and duplicated work.

PR Checklist

  • Correct base branch selected? master for new features, maintenance_... for bug fixes
  • This PR is not directly related to an existing issue (which has no PR yet).
  • While the PR is still work-in-progress, the no_ci label can be added to skip CI builds
  • If the PR is making changes to documentation, docs pages can be built automatically.
    Just add the build_docs tag to this PR.
    Docs will be served at docs.obspy.org/pr/{branch_name} (do not use master branch).
    Please post a link to the relevant piece of documentation.
  • If all tests including network modules (e.g. clients.fdsn) should be tested for the PR,
    just add the test_network tag to this PR.
  • All tests still pass.
  • Any new features or fixed regressions are covered via new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGELOG.txt .
  • First time contributors have added your name to CONTRIBUTORS.txt .
  • If the changes affect any plotting functions you have checked that the plots
    from all the CI builds look correct. Add the "upload_plots" tag so that plotting
    outputs are attached as artifacts.
  • New modules, add the module to CODEOWNERS with your github handle
  • Add the yellow ready for review label when you are ready for the PR to be reviewed.

makes logic much easier, especially considering wildcards being used in
the requested data and use cases that need to select client on
location or channel level are probably very very rare
@megies megies added the enhancement (feature request) and .clients (issue related to our network modules) labels May 22, 2023
@megies megies added this to the 1.5.0 milestone May 22, 2023
@megies megies added the build_docs label (Docs will be automatically built and deployed in github actions on pushes to the PR) May 26, 2023
 - no need for "no value" in config, can just leave out or comment out
   unwanted init parameters for clients
 - SafeConfigParser was replaced by ConfigParser at some point
 - remove mention for file-like objects for config, no need atm
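
As an illustration of the first two points, a minimal sketch with the standard library `ConfigParser`: a commented-out option simply does not appear in the parsed section, so no explicit "no value" marker is needed. The helper `client_settings` is hypothetical and is not taken from this PR. All values come back as strings, so something like `timeout` would still need converting before being passed to a client.

```python
from configparser import ConfigParser

# Same client section as in the example config, with timeout commented out.
CONFIG_TEXT = """
[fdsn_bgr]
type = fdsn
base_url = http://eida.bgr.de
user_agent = LMU
# timeout = 20
"""


def client_settings(parser, section):
    """Split a client section into its type and the remaining init parameters.

    Hypothetical helper for illustration only.
    """
    options = dict(parser[section])
    client_type = options.pop("type")
    return client_type, options


parser = ConfigParser()
parser.read_string(CONFIG_TEXT)
client_type, init_kwargs = client_settings(parser, "fdsn_bgr")

print(client_type)
print(init_kwargs)  # contains no "timeout" key, since that line is a comment
```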
@megies
Member Author

megies commented May 26, 2023

note to self: might need to add to docs .rst pages
