Check full SEED id of picks against Trace header information #388

Open
cjhopp opened this issue Apr 27, 2020 · 4 comments · May be fixed by #481

Comments

@cjhopp
Member

cjhopp commented Apr 27, 2020

Describe the bug
Currently, picks used when generating a Template are only checked against the Station and Channel attributes in the Trace header. For cases (this station, for instance) where multiple sensors exist at a given set of lat/lon coordinates (thereby requiring use of the underappreciated SEED location code), multiple picks with the same phase_hint will fulfill the following conditions:

tr.stats.station == pk.waveform_id.station_code and
tr.stats.channel == pk.waveform_id.channel_code

Expected behavior
We instead want only the picks on a single sensor, where the sensors are distinguished only by their location code.

Ditto Network information, which likely produces a similar result if you happen to find stations of the same name but in different networks.
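A minimal sketch of the stricter match described above, comparing all four SEED components instead of two. The `Trace` and `Pick` stand-ins are simplified tuples invented for illustration, not the ObsPy or EQcorrscan classes, and the station codes are made up. With real ObsPy objects the equivalent strings would presumably come from `tr.id` and `pk.waveform_id.get_seed_string()`.

```python
from collections import namedtuple

# Simplified stand-ins for illustration only (not the ObsPy classes).
Trace = namedtuple("Trace", "network station location channel")
Pick = namedtuple("Pick", "network station location channel phase_hint")


def seed_id(obj):
    """Join the four SEED components as NET.STA.LOC.CHA."""
    return ".".join((obj.network, obj.station, obj.location, obj.channel))


def picks_for_trace(trace, picks, full_id=True):
    """Return the picks that belong to `trace`.

    full_id=False reproduces the station+channel-only match from the report.
    """
    if full_id:
        return [pk for pk in picks if seed_id(pk) == seed_id(trace)]
    return [pk for pk in picks
            if pk.station == trace.station and pk.channel == trace.channel]


# Two sensors at one site, distinguished only by location code (made-up codes).
trace = Trace("XX", "STA1", "10", "HHZ")
picks = [Pick("XX", "STA1", "10", "HHZ", "P"),
         Pick("XX", "STA1", "20", "HHZ", "P")]

loose = picks_for_trace(trace, picks, full_id=False)   # both P picks match
strict = picks_for_trace(trace, picks, full_id=True)   # only location "10"
```

With the station+channel match both P picks are attached to the trace; with the full id only the pick from the same sensor remains.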

Desktop (please complete the following information):

  • Operating System: Arch
  • Python version: 3.7
  • EQcorrscan version: 0.4.0
@calum-chamberlain
Member

Definitely a "feature" not a bug 😝

But yes, would be great to check the full seed-id.

As stated on gitter, a key here will be in doing this change "gently": add it as an optional opt-in for the next minor release, with a warning that it will become the default (with optional opt-out) in the next major release. Then when we break everyone's codes in the next major release we can at least say:

we told you so...
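One way the "gentle" transition could look, as a sketch only: the function name and the `check_full_seed` keyword are hypothetical and not part of EQcorrscan. Leaving the keyword unset keeps the old station+channel behaviour but emits a `FutureWarning` announcing the upcoming default.

```python
import warnings


def match_pick_to_trace(pick_id, trace_id, check_full_seed=None):
    """Compare two 'NET.STA.LOC.CHA' strings.

    check_full_seed is a hypothetical keyword. None means the caller made no
    choice, so the old behaviour is kept and a warning is issued.
    """
    if check_full_seed is None:
        warnings.warn(
            "Only station and channel are compared; full SEED id matching "
            "will become the default in a future major release. Pass "
            "check_full_seed=True to opt in now, or False to keep the "
            "current behaviour.",
            FutureWarning,
            stacklevel=2,
        )
        check_full_seed = False
    if check_full_seed:
        return pick_id == trace_id
    # Legacy comparison: ignore network and location.
    _, p_sta, _, p_cha = pick_id.split(".")
    _, t_sta, _, t_cha = trace_id.split(".")
    return (p_sta, p_cha) == (t_sta, t_cha)


# Made-up ids that differ only in location code.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = match_pick_to_trace("XX.STA1.10.HHZ", "XX.STA1.20.HHZ")

strict = match_pick_to_trace("XX.STA1.10.HHZ", "XX.STA1.20.HHZ",
                             check_full_seed=True)
```

In the major release the default would flip to `True`, and passing `False` would be the opt-out.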

@flixha
Collaborator

flixha commented Apr 28, 2020

I had the impression that for several other steps where the location and network codes can be an issue, one has to manually modify them to a common name anyway (e.g., for correlating templates and continuous traces). But I guess one reason that the network+location code may not be checked for the Picks is that in Nordic format (from Seisan), such information is not stored anyway, because it usually doesn't matter for the pick (of course it could if one looks at sensors that are a few 100 m apart and nearby earthquakes are the target).
It's good to keep in mind all kinds of changes that can happen to location codes, e.g., changes through the lifetime of a station, differences between data repositories for the same network (EIDA vs IRIS), and differences between different transmission systems for the same network from the same data provider (e.g., FTP vs FDSNWS). Hence I feel it could also be useful to allow an option to respect or ignore location+network codes globally across the code. What do you think?
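A rough sketch of what such a global respect/ignore switch might reduce to: a single helper that turns an id into the tuple actually compared. The helper name and its two flags are hypothetical, and the ids are made up.

```python
def comparison_key(seed_id, use_network=True, use_location=True):
    """Reduce 'NET.STA.LOC.CHA' to the parts that should be compared.

    Components that are switched off are replaced by None so that they
    never cause a mismatch.
    """
    net, sta, loc, cha = seed_id.split(".")
    return (net if use_network else None,
            sta,
            loc if use_location else None,
            cha)


# Same channel as served by two repositories: one with an empty location
# code, one with "00" (made-up example).
id_a = "XX.STA1..HHZ"
id_b = "XX.STA1.00.HHZ"

same_strict = comparison_key(id_a) == comparison_key(id_b)
same_relaxed = (comparison_key(id_a, use_location=False)
                == comparison_key(id_b, use_location=False))
```

Routing every comparison through one helper like this would let a single setting control the behaviour across template generation and correlation.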

@cjhopp cjhopp mentioned this issue Apr 28, 2020
10 tasks
@cjhopp
Member Author

cjhopp commented Apr 29, 2020

Hi @flixha!

The approach @calum-chamberlain outlines above would (I think) allow you to opt out of network+location checks even after the next minor release, but would change the default behavior to opt-in.

Still, your point is well taken wrt location code changes with time and between different repositories. I agree that my case is not common and could see an argument for leaving the network+location check 'opt-in' only. However, it is technically correct, I think, to always check for the full SEED id, and ask that users bring their data into line with the standard, instead of accommodating the deviations (albeit common) from that standard.

On a side note, @calum-chamberlain, I have to look a bit closer, but it looks like we're already checking full SEED-ids via tr.id in _prep_data_for_correlation here. I'm sure I'm missing a lot here, but how then does a nordic-derived template with no net or location not throw an error at that stage? I guess it's assumed that template_gen is passed waveforms that do have net and location in their header, and therefore pass the checks against continuous data?

@flixha
Collaborator

flixha commented May 7, 2020

Hi @cjhopp,
sorry for the delayed answer - I didn't see the thread. I totally agree that making it an opt-in/opt-out is a very good change to make things robust and, when needed, technically correct.

Regarding your note on Seisan: I had a quick look into _template_gen. There is no check for network or location code there, so if there is a pick it will be assigned to a trace of the same station and channel (vertical or horizontal etc.). So I guess what you're referring to is the check of the template trace vs. the continuous trace, where the full id is checked as you say.

Right now I'm looking at data that covers a longer time period and a lot of "alternative" network/station/location/channel codes. I need to see how the functions that I'm writing could become robust enough to work not only for my specific problem, and then it may be a good addition to make the process of checking/adjusting all the codes a bit easier.
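For illustration, one possible shape for such a checking/adjusting helper is an alias table that maps alternative ids to a canonical one before picks and traces are compared. This is only a sketch: the table, the function names, and all codes are invented here and do not describe the functions mentioned above.

```python
# Hypothetical alias table: alternative id -> canonical id (made-up codes).
ALIASES = {
    "XX.STA1..HHZ": "XX.STA1.00.HHZ",    # empty location code in one archive
    "YY.STA1.00.HHZ": "XX.STA1.00.HHZ",  # network code changed over time
}


def canonical(seed_id, aliases=ALIASES):
    """Return the canonical id, or the id itself if no alias is known."""
    return aliases.get(seed_id, seed_id)


def unmatched_picks(pick_ids, trace_ids, aliases=ALIASES):
    """List pick ids that match no trace even after canonicalisation."""
    known = {canonical(t, aliases) for t in trace_ids}
    return [p for p in pick_ids if canonical(p, aliases) not in known]


trace_ids = ["XX.STA1.00.HHZ"]
pick_ids = ["XX.STA1..HHZ", "YY.STA1.00.HHZ", "XX.STA2.00.HHZ"]
leftover = unmatched_picks(pick_ids, trace_ids)
```

Reporting the leftover ids, rather than silently dropping them, makes it easier to spot codes that still need an alias entry.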

@calum-chamberlain calum-chamberlain added this to the 0.5.0 milestone Aug 14, 2020
@calum-chamberlain calum-chamberlain linked a pull request Dec 9, 2021 that will close this issue
12 tasks