Check full SEED id of picks against Trace header information #388

cjhopp · 2020-04-27T22:37:50Z

Describe the bug
Currently, picks used when generating a Template are only checked against the Station and Channel attributes in the Trace header. For cases (this station, for instance) where multiple sensors exist at a given set of lat/lon coordinates (thereby requiring use of the underappreciated SEED location code), multiple picks with the same phase_hint will fulfill the following conditions:

tr.stats.station == pk.waveform_id.station_code and
tr.stats.channel == pk.waveform_id.channel_code

Expected behavior
We instead want only the picks on a single sensor, where the sensors are distinguished only by their location code.

Ditto Network information, which likely produces a similar result if you happen to find stations of the same name but in different networks.

Desktop (please complete the following information):

Operating System: Arch
Python version: 3.7
EQcorrscan version: 0.4.0


calum-chamberlain · 2020-04-28T00:42:17Z

Definitely a "feature" not a bug 😝

But yes, would be great to check the full seed-id.

As stated on gitter, a key here will be in doing this change "gently": change with optional opt-in for next minor release with a warning that opt-in will be ~~mandatory~~ the default (with optional opt-out) in the next major release. Then when we break everyone's codes in the next major release we can at least say:

we told you so...
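One way the "gentle" transition could look, sketched under assumptions: the function name `pick_matches_trace` and the keyword `check_full_seed` are hypothetical and not part of EQcorrscan's API. Leaving the keyword unset keeps the current station+channel behaviour but emits a `FutureWarning` announcing the planned change of default.

```python
import warnings
from types import SimpleNamespace


def pick_matches_trace(tr, pick, check_full_seed=None):
    """Compare a pick with a trace; the full SEED id check is opt-in."""
    if check_full_seed is None:
        # Hypothetical deprecation path: warn now, flip the default later.
        warnings.warn(
            "Picks are matched on station and channel only; matching on the "
            "full SEED id will become the default in the next major release. "
            "Pass check_full_seed=True or False to silence this warning.",
            FutureWarning, stacklevel=2)
        check_full_seed = False
    wid = pick.waveform_id
    match = (tr.stats.station == wid.station_code
             and tr.stats.channel == wid.channel_code)
    if check_full_seed:
        # Treat missing (None) codes as empty strings.
        match = match and (tr.stats.network == (wid.network_code or "")
                           and tr.stats.location == (wid.location_code or ""))
    return match


def _make_pick(net, sta, loc, cha):
    # Hypothetical stand-in for a pick object with a waveform_id.
    return SimpleNamespace(waveform_id=SimpleNamespace(
        network_code=net, station_code=sta,
        location_code=loc, channel_code=cha))


tr = SimpleNamespace(stats=SimpleNamespace(
    network="XX", station="STA1", location="10", channel="HHZ"))
pick_same = _make_pick("XX", "STA1", "10", "HHZ")
pick_other_loc = _make_pick("XX", "STA1", "20", "HHZ")

print(pick_matches_trace(tr, pick_other_loc, check_full_seed=False))
print(pick_matches_trace(tr, pick_other_loc, check_full_seed=True))
```

Using `None` as the sentinel lets the warning fire only for callers who have not yet made an explicit choice.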

flixha · 2020-04-28T07:06:27Z

I had the impression that for several other steps where the location and network codes can be an issue, one has to manually modify them to a common name anyway (e.g., for correlating templates and continuous traces). But I guess one reason that the network+location code may not be checked for the picks is that in Nordic format (from Seisan), such information is not stored anyway, because it usually doesn't matter for the pick (of course it could if one looks at sensors that are a few hundred metres apart and the targets are nearby earthquakes).
It's good to keep in mind all kinds of changes that can happen to location codes, e.g., changes through the lifetime of a station, differences between data repositories for the same network (EIDA vs IRIS), and differences between different transmission systems for the same network from the same data provider (e.g., FTP vs FDSNWS). Hence I feel there could also be a use of allowing an option to respect or ignore location+network codes globally across the code. What do you think?
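A rough sketch of what such a global respect/ignore option could reduce to: normalizing ids before they are compared. The function name `normalize_seed_id` and its keywords are hypothetical, and the example id is made up; this is only meant to illustrate the idea, not an existing EQcorrscan option.

```python
def normalize_seed_id(seed_id, ignore_network=False, ignore_location=False):
    """Blank the network and/or location part of a NET.STA.LOC.CHA id."""
    net, sta, loc, cha = seed_id.split(".")
    if ignore_network:
        net = ""
    if ignore_location:
        loc = ""
    return ".".join((net, sta, loc, cha))


# The same channel as delivered by two hypothetical sources that disagree
# on the location code.
id_a = "XX.STA1.00.HHZ"
id_b = "XX.STA1..HHZ"

print(id_a == id_b)
print(normalize_seed_id(id_a, ignore_location=True)
      == normalize_seed_id(id_b, ignore_location=True))
```

Applying the same normalization to both picks and traces before any comparison would let one switch control the behaviour everywhere ids are compared.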

cjhopp · 2020-04-29T00:06:00Z

Hi @flixha!

The approach @calum-chamberlain outlines above would (I think) allow you to opt out of network+location checks even after the next minor release, but would change the default behavior to opt-in.

Still, your point is well taken wrt location code changes with time and between different repositories. I agree that my case is not common, and I could see an argument for keeping the network+location check 'opt-in' only. However, I think it is technically correct to always check the full SEED id and to ask that users bring their data into line with the standard, instead of accommodating the (albeit common) deviations from that standard.

On a side note, @calum-chamberlain, I have to look a bit closer, but it looks like we're already checking full SEED-ids via tr.id in _prep_data_for_correlation here. I'm sure I'm missing a lot here, but how then does a nordic-derived template with no net or location not throw an error at that stage? I guess it's assumed that template_gen is passed waveforms that do have net and location in their header, and therefore pass the checks against continuous data?

flixha · 2020-05-07T15:16:16Z

Hi @cjhopp ,
sorry for the delayed answer - didn't see the thread. I totally agree that making it an opt-in/opt-out is a very good change to make things robust and, when needed, technically correct.

Regarding your note on Seisan: I had a quick look into _template_gen. There is no check for network or location code there, so if there is a pick it will be assigned to a trace of the same station and channel (vertical or horizontal etc.). So I guess what you're referring to is the check of the template trace vs. the continuous trace, where the full id is checked as you say.

Right now I'm looking at data that covers a longer time period and a lot of "alternative" network/station/location/channel codes. I need to see how the functions that I'm writing could become robust enough to work not only for my specific problem; then they may be a good addition to make the process of checking/adjusting all the codes a bit easier.

calum-chamberlain added core.match_filter core.template_gen enhancement labels Apr 28, 2020

calum-chamberlain assigned cjhopp Apr 28, 2020

cjhopp mentioned this issue Apr 28, 2020

WIP: Check full seed #391

Closed

10 tasks

calum-chamberlain added this to the 0.5.0 milestone Aug 14, 2020

calum-chamberlain linked a pull request Dec 9, 2021 that will close this issue

Check full SEED ID when comparing picks and waveforms #481

Open

12 tasks
