Multiple fall back input formats for dateconv #138

Open
Earnestly opened this issue Feb 11, 2022 · 3 comments

Comments

@Earnestly

dateconv -S is particularly useful as a filter over large amounts of input. When the input is known to contain a few date formats, it would help if dateconv tried each one in turn until one succeeds.

The alternative would be to execute dateconv (and perhaps strptime) for each line of input.

(The ultimate solution would be for dateconv to detect the format automatically, as Date.parse does in JavaScript or dateutil does in Python.)
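
To make the request concrete, here is a minimal Python sketch of "try each format in turn until one succeeds". It is not how dateconv works internally; the format list and the helper names (`parse_with_fallback`, `to_utc_iso`) are hypothetical, and the formats are written in Python's strptime syntax rather than dateconv's.

```python
from datetime import datetime, timezone

# Hypothetical candidate formats, in Python strptime syntax (not dateconv's).
FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",
    "%a, %d %b %Y %H:%M:%S %z",
    "%d %b %Y %H:%M:%S %z",
    "%Y-%m-%d",
]

def parse_with_fallback(text, formats=FORMATS):
    """Return the datetime from the first format that parses text, else None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not fit; try the next one
    return None

def to_utc_iso(dt):
    """Render in UTC as YYYY-MM-DDTHH:MM:SSZ; naive values are assumed to be UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

The order of the list matters here: the first format that parses wins, so more specific formats should come before looser ones such as a bare date.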

@hroptatyr
Owner

Hi, thanks for the report. That's what -i|--input-format is for.

@Earnestly
Author

Oh, I'm silly: I missed that -i can be given multiple times. I'll have to give this a try.

@Earnestly
Author

Earnestly commented Feb 14, 2022

It doesn't appear to operate in a fallback manner: it attempts to apply every input format to every line instead of stopping after the first success.

I.e. given this input:

Sun, 26 Sep 2021 00:00:00 +1000 http://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html The Speed of Time

Currently dateconv will apply the %F input format to the URL, which is fair enough, since -S matches anything in the line.

% dateconv -Sf %FT%TZ -i %FT%T%Z -i '%a, %d %b %Y %T %Z' -i %FT%TZ -i '%d %b %Y %T %Z' -i %F
2021-09-25T14:00:00Z http://www.brendangregg.com/blog/2021-09-26T00:00:00Z/the-speed-of-time.html The Speed of Time

Ideally I would hope for something like this, where it breaks after the first success.

% dateconv -Sf %FT%TZ -i %FT%T%Z -i '%a, %d %b %Y %T %Z' -i %FT%TZ -i '%d %b %Y %T %Z' -i %F
2021-09-25T14:00:00Z http://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html The Speed of Time
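
The "break after the first success" behaviour hoped for here can be sketched in Python as follows. This only illustrates the idea and says nothing about dateconv's implementation; the `RULES` table and the `convert_first` helper are hypothetical, and each rule pairs a regex that locates a candidate with a Python strptime format that parses it.

```python
import re
from datetime import datetime, timezone

# Hypothetical (locator regex, strptime format) pairs, tried in order.
RULES = [
    (re.compile(r"[A-Z][a-z]{2}, \d{1,2} [A-Z][a-z]{2} \d{4} "
                r"\d{2}:\d{2}:\d{2} [+-]\d{4}"),
     "%a, %d %b %Y %H:%M:%S %z"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "%Y-%m-%d"),
]

def convert_first(line, rules=RULES):
    """Rewrite only the first date found by the first rule that succeeds."""
    for pattern, fmt in rules:
        match = pattern.search(line)
        if match is None:
            continue
        try:
            dt = datetime.strptime(match.group(0), fmt)
        except ValueError:
            continue  # looked like a date but did not parse; try the next rule
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        stamp = dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        # Stop here, so later rules never see this line.
        return line[:match.start()] + stamp + line[match.end():]
    return line  # no rule matched; pass the line through unchanged
```

With the input line above, the first rule consumes the leading timestamp and the function returns before the date-only rule can touch the URL.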

But this is all heuristic, and it seems the only proper solution would be to support fields, as sort -k (and sort -t) do. Another option might be to add "anchors" to the "general specs": alongside %n for newline, perhaps %a+ and %a- could represent the regex anchors ^ and $.

To work around this, I've devised a scheme: I ensure titles cannot contain tabs and insert a tab between the date and the rest of the line. Then -i can include this tab via %t in the match, and -f can re-insert the space. This seems to work consistently with my inputs.
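
The same field-based idea can be sketched in Python, assuming the date is the first tab-separated field on each line. The helper name `convert_leading_field` and the format list are hypothetical, and this is a stand-in for the dateconv invocation, not a reproduction of it.

```python
from datetime import datetime, timezone

# Hypothetical candidate formats, in Python strptime syntax (not dateconv's).
FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",
    "%a, %d %b %Y %H:%M:%S %z",
    "%d %b %Y %H:%M:%S %z",
    "%Y-%m-%d",
]

def convert_leading_field(line, formats=FORMATS, sep="\t"):
    """Convert only the field before the first tab; rejoin with a space."""
    head, found, rest = line.partition(sep)
    for fmt in formats:
        try:
            dt = datetime.strptime(head, fmt)
        except ValueError:
            continue  # the whole field must match, so try the next format
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        stamp = dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return stamp + (" " + rest if found else "")
    return line  # leading field is not a known date; leave the line alone
```

Because strptime must consume the whole leading field, the separator effectively acts as the anchor: nothing after the first tab is ever examined, so dates inside URLs or titles are left untouched.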


2 participants