A formatter to match named timezones or their abbreviations, such as GMT #140

Open
Earnestly opened this issue Feb 15, 2022 · 6 comments

@Earnestly

I have a few date strings which contain the named zones GMT and UTC. Would it be possible for %Z to match named timezones or abbreviations as strftime does?

I realise I could repeat the -i flags with both UTC and GMT permuted among them, but that seems a bit awkward.

@hroptatyr
Owner

Hi, what do you mean by match? Just eat the string, or actually evaluate the string so it acts as --from-zone?

@Earnestly
Author

In my case I have <pubDate>1 Feb 2022 12:00:00 GMT</pubDate>, where GMT is equivalent to UTC, so either name could simply be skipped.

But your suggestion of --from-zone is probably more appropriate and closer to what people would expect. The matching I'm referring to is via -i '...'; currently I repeat the date formats:

    ...
    -i '<d>%a, %d %b %Y %T %Z</d>' \
    -i '<d>%a, %d %b %Y %T GMT</d>' \
    -i '<d>%a, %d %b %Y %T UTC</d>' \
    ...

@hroptatyr
Owner

I see. I mean you could just not specify it; then (depending on the presence of -S) it will be part of the output again. Are you hoping to rewrite, say, BST to +01:00 or PRC to +08:00?

Using --from-zone you'd only be able to specify one zone for the entire input. What you suggested sounded like a "per-line" --from-zone.

@Earnestly
Author

Earnestly commented Feb 15, 2022

In this particular case I'm hoping to convert all timestamps to UTC (UTC0) in RFC 3339 form, specifically %FT%TZ.

Many of the formats I deal with follow RFC 822 to a degree, with zone offsets included as either +0000 or +00:00 (where the zeros are placeholders for the actual offset).
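
For reference, the conversion described here (RFC 822-style stamps carrying either a named GMT/UTC zone or a numeric offset, rewritten as %FT%TZ) can be sketched with Python's standard library. This only illustrates the arithmetic and is not how dateutils works. The function name `to_rfc3339_utc` is hypothetical, and treating a stamp without usable zone information as UTC is an assumption.

```python
import re
from datetime import timezone
from email.utils import parsedate_to_datetime

# Matches a trailing offset written with a colon, e.g. "+01:00".
_COLON_OFFSET = re.compile(r"([+-]\d{2}):(\d{2})\s*$")


def to_rfc3339_utc(stamp: str) -> str:
    """Convert an RFC 822-style timestamp to UTC in %FT%TZ form."""
    # email.utils only understands offsets without a colon, so rewrite
    # "+01:00" as "+0100" before parsing.
    stamp = _COLON_OFFSET.sub(r"\1\2", stamp.strip())
    dt = parsedate_to_datetime(stamp)
    if dt.tzinfo is None:
        # Assumption: no usable zone information means UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_rfc3339_utc("1 Feb 2022 12:00:00 GMT"))          # 2022-02-01T12:00:00Z
print(to_rfc3339_utc("Tue, 1 Feb 2022 12:00:00 +0100"))   # 2022-02-01T11:00:00Z
print(to_rfc3339_utc("Tue, 1 Feb 2022 12:00:00 +01:00"))  # 2022-02-01T11:00:00Z
```

The numeric-offset cases are a plain subtraction, and GMT and UTC both resolve to a zero offset, so no zone file is consulted for any of these inputs.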

@hroptatyr
Owner

I see. The case with explicit zone offsets is easy (and already supported) because it's just a calculation.

Using zone names isn't too difficult either, but I feel quite uncomfortable with the idea of (potentially) opening a new zone file for every line of input. Or, to make it as comfortable as possible for the user, you'd have to open all zone files, because those daylight saving names are inside the files. And you might need more disambiguation measures in place, e.g. for the famous AEST vs EST (Australia) vs EST (North America).
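
The ambiguity problem can be made concrete with a small lookup sketch. The table `CANDIDATES` and the function `resolve_abbreviation` are hypothetical and cover only a handful of well-known abbreviations. This is not how dateutils resolves zones, and a real implementation would draw on the zone files rather than a hand-written table.

```python
from datetime import timedelta

# Hypothetical, hand-picked table: abbreviation -> {region hint: UTC offset}.
CANDIDATES = {
    "UTC": {"universal": timedelta(0)},
    "GMT": {"universal": timedelta(0)},
    "AEST": {"Australia": timedelta(hours=10)},
    "EST": {
        "North America": timedelta(hours=-5),
        "Australia": timedelta(hours=10),
    },
    "BST": {
        "United Kingdom": timedelta(hours=1),
        "Bangladesh": timedelta(hours=6),
    },
}


def resolve_abbreviation(abbr, region=None):
    """Return the UTC offset for abbr, refusing to guess when ambiguous."""
    options = CANDIDATES.get(abbr.upper())
    if options is None:
        raise KeyError(f"unknown zone abbreviation: {abbr}")
    if region is not None:
        # An explicit hint selects one candidate (KeyError if it is not listed).
        return options[region]
    if len(options) > 1:
        raise ValueError(f"{abbr} is ambiguous: {sorted(options)}")
    return next(iter(options.values()))
```

With this table, `resolve_abbreviation("GMT")` returns a zero offset, while `resolve_abbreviation("EST")` raises `ValueError` unless a region hint such as `"Australia"` is supplied. A per-line zone name would need that hint to come from somewhere.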

@Earnestly
Author

Earnestly commented Feb 15, 2022

That certainly does sound fairly onerous (the zone files might be amenable to a hash table, but the set is pretty large: 5M on my system).

I might just stick with my repeated -i in that case because it's not too terrible. Both UTC and GMT are ultimately UTC0 anyway, so when converting to UTC they can simply be removed.
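
Since GMT and UTC both mean a zero offset, another option is to rewrite them before the input reaches the parser. The filter below is a minimal sketch with a hypothetical function name, `normalise_named_utc`. It assumes the zone name directly follows an HH:MM:SS time, and that a numeric +0000 is then handled by the existing %Z format, which the owner's remark about explicit offsets suggests but which is not verified here.

```python
import re

# "GMT" or "UTC" as a whole word, directly after an HH:MM:SS time.
_NAMED_UTC = re.compile(r"(?<=\d{2}:\d{2}:\d{2}) (?:GMT|UTC)\b")


def normalise_named_utc(line: str) -> str:
    """Replace a named GMT/UTC zone with the numeric offset +0000."""
    return _NAMED_UTC.sub(" +0000", line)


print(normalise_named_utc("<pubDate>1 Feb 2022 12:00:00 GMT</pubDate>"))
# <pubDate>1 Feb 2022 12:00:00 +0000</pubDate>
```

Lines that already carry a numeric offset pass through unchanged, so the filter can sit in front of mixed input.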
