A scalar starting with 00 (double zero) is interpreted as octal number #152

Open
rafalkrupinski opened this issue Oct 8, 2022 · 8 comments


@rafalkrupinski

When parsing an unquoted value that starts with 00 (zero zero), yq tries to parse it as an octal number.
According to the specification, octal numbers start with 0o (zero oscar), so I expected the value to be parsed as a string.

file.yaml:

009
yq . data1.yml
yq: Error running jq: ValueError: invalid literal for int() with base 8: '009'.
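
The error text above matches what Python's built-in `int()` raises when it is asked to read a string as base 8 and meets a digit outside 0-7. The following is a minimal sketch of that behavior only; the helper name `parse_octal` is hypothetical and is not taken from yq's source.

```python
def parse_octal(text):
    """Return the value of text read as base 8, or None if Python rejects it."""
    try:
        return int(text, 8)
    except ValueError as exc:
        # For "009" this prints the same message that appears in the yq error.
        print(f"{text!r}: {exc}")
        return None


parse_octal("007")  # accepted: every digit is in the range 0-7
parse_octal("009")  # rejected: 9 is not an octal digit
```

This suggests the scalar was classified as an octal literal before its digits were validated, which is consistent with the report.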

Thank you!

@conao3
Contributor

conao3 commented Oct 23, 2023

Not only 00: any number starting with 0 is interpreted as an octal number.

$ echo 'hour: 08' | yq .
yq: Error running jq: ValueError: invalid literal for int() with base 8: '08'.

PyYAML parses 08 as the string '08'.

$ echo 'hour: 08' | python -c 'import yaml; print(yaml.safe_load(input()))'
{'hour': '08'}
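
One plausible explanation for the PyYAML result is that, under YAML 1.1-style rules, `08` fits neither the octal form (a leading 0 followed by digits 0-7) nor the decimal form (no leading zero), so it falls back to a string. The pattern below is a simplified approximation written for illustration; it is an assumption and is not copied from PyYAML's resolver.

```python
import re

# Simplified YAML 1.1-style integer forms (assumption: an approximation only).
YAML11_INT = re.compile(
    r"""^(?:
          [-+]?0b[01_]+              # binary
        | [-+]?0[0-7_]+              # octal: leading 0, then octal digits only
        | [-+]?(?:0|[1-9][0-9_]*)    # decimal: a lone 0 or no leading zero
        | [-+]?0x[0-9a-fA-F_]+       # hexadecimal
    )$""",
    re.VERBOSE,
)


def resolves_as_int_yaml11(scalar):
    """Return True if the plain scalar matches one of the integer forms above."""
    return YAML11_INT.match(scalar) is not None


for scalar in ("07", "08", "8", "009"):
    kind = "int" if resolves_as_int_yaml11(scalar) else "str"
    print(scalar, "->", kind)
```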

@c00kiemon5ter

You can always set the type explicitly:

$ cat file.yml
!str 009

$ yq . file.yml
"009"

@kislyuk
Owner

kislyuk commented Apr 15, 2024

I don't think yq will be doing anything to address this issue directly. Under the hood, yq uses PyYAML with developmental YAML 1.2 grammar regular expressions to match numeric literals. We will not be rolling our own regular expressions to address these edge cases, and we will not be backing away from 1.2 support and reverting to plain PyYAML with its 1.1 defaults, because that would cause even worse usability issues (like the "on" -> True problem).

The solution for this issue will come from using a fully YAML 1.2 compliant parser.
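
For reference, here is a rough sketch of how integer resolution looks under the YAML 1.2 core schema as I read it: decimal is `[-+]?[0-9]+`, octal is `0o[0-7]+`, and hexadecimal is `0x[0-9a-fA-F]+`. The function name `resolve_core_int` is hypothetical, and this is not how yq or any particular parser implements it.

```python
import re

# (pattern, base) pairs for the YAML 1.2 core schema integer forms,
# stated here as an assumption about the spec, not as parser source code.
CORE_INT_PATTERNS = [
    (re.compile(r"[-+]?[0-9]+"), 10),
    (re.compile(r"0o[0-7]+"), 8),
    (re.compile(r"0x[0-9a-fA-F]+"), 16),
]


def resolve_core_int(scalar):
    """Return the integer value of a plain scalar, or None if it is not an int."""
    for pattern, base in CORE_INT_PATTERNS:
        if pattern.fullmatch(scalar):
            # Strip the two-character "0o" / "0x" prefix before converting.
            digits = scalar if base == 10 else scalar[2:]
            return int(digits, base)
    return None


for scalar in ("009", "08", "0o17", "0x1F", "on"):
    print(scalar, "->", resolve_core_int(scalar))
```

Under this reading, a plain `009` or `08` would resolve to the decimal integers 9 and 8 rather than to strings, so values that must stay strings would still need quoting.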

@kislyuk
Owner

kislyuk commented Apr 15, 2024

Note: if you're encountering this issue with YAML that is emitted by yq -y, a possible workaround is to use YAML 1.2 as the output grammar: yq -y --yaml-output-grammar-version=1.2. The output grammar is still set to 1.1 for compatibility with tools that expect 1.1-like behavior (although it will be changed to default to 1.2 in the future).

@conao3
Contributor

conao3 commented Apr 16, 2024

hmm, "OK" but...

$ echo '{hours: ["05", "06", "07", "08", "09", "10"]}' | yq . -y | yq .
{
  "hours": [
    "05",
    "06",
    "07",
    "08",
    "09",
    "10"
  ]
}

$ brew upgrade python-yq
==> Upgrading 1 outdated package:
python-yq 3.3.0 -> 3.3.1
==> Downloading https://ghcr.io/v2/homebrew/core/python-yq/manifests/3.3.1
############################################################################################ 100.0%
==> Fetching python-yq
==> Downloading https://ghcr.io/v2/homebrew/core/python-yq/blobs/sha256:a3b2e22c6978bf8a606b45d378496ccf587c617686e64457e997fe7ff8797be5
############################################################################################ 100.0%
==> Upgrading python-yq
  3.3.0 -> 3.3.1 
==> Pouring python-yq--3.3.1.arm64_ventura.bottle.tar.gz
==> Caveats
zsh completions have been installed to:
  /opt/homebrew/share/zsh/site-functions
==> Summary
🍺  /opt/homebrew/Cellar/python-yq/3.3.1: 105 files, 852KB

$ echo '{hours: ["05", "06", "07", "08", "09", "10"]}' | yq . -y | yq .
yq: Error running jq: ValueError: invalid literal for int() with base 8: '08'.

$ echo '{hours: ["05", "06", "07", "08", "09", "10"]}' | yq . -y --yaml-output-grammar-version=1.2 | yq .
{
  "hours": [
    "05",
    "06",
    "07",
    "08",
    "09",
    "10"
  ]
}

@kislyuk
Owner

kislyuk commented Apr 16, 2024

@conao3 yes, I am aware of this issue. The workaround is to use yq -y --yaml-output-grammar-version=1.2.

@kislyuk
Owner

kislyuk commented Apr 17, 2024

@conao3 I committed a fix for the issue where yq emits unquoted string scalars that start with 08 and 09, and released it in v3.4.0. This removes the need to use --yaml-output-grammar-version=1.2; yq will now use what amounts to a modified version 1.1 for output with this quoting behavior as the main change.

To be clear, this doesn't address the issue with how these unquoted scalars are parsed in the input, and I don't expect to address this issue within yq.
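
To illustrate the general idea of output-side quoting, the sketch below quotes any string scalar that consists of an optional sign followed by digits, so that a reader cannot re-interpret it as a number. The function `emit_string_scalar` and its pattern are hypothetical and are not the change that was committed to yq.

```python
import re

# Strings that look like plain integers, including leading-zero forms such as "08".
LOOKS_NUMERIC = re.compile(r"[-+]?[0-9][0-9_]*")


def emit_string_scalar(value):
    """Return value as a YAML scalar, double-quoted if it could be read as a number."""
    if LOOKS_NUMERIC.fullmatch(value):
        return '"' + value + '"'
    return value


for value in ("05", "08", "09", "10", "noon"):
    print(emit_string_scalar(value))
```

A real emitter has to handle many more cases (floats, booleans, nulls, special characters); this only covers the digit strings discussed in this thread.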

@conao3
Contributor

conao3 commented Apr 18, 2024

OK, thanks!
