Support formatting using PDG rounding rules #45

Open
davidchall opened this issue Dec 7, 2020 · 3 comments

@davidchall

davidchall commented Dec 7, 2020

The GUM is rather vague about how to select an appropriate number of significant digits to display when reporting measurements with their uncertainty:

7.2.6 The numerical values of the estimate y and its standard uncertainty u_c(y) or expanded uncertainty U should not be given with an excessive number of digits. It usually suffices to quote u_c(y) and U [as well as the standard uncertainties u(x_i) of the input estimates x_i] to at most two significant digits, although in some cases it may be necessary to retain additional digits to avoid round-off errors in subsequent calculations.

In reporting final results, it may sometimes be appropriate to round uncertainties up rather than to the nearest digit. For example, u_c(y) = 10,47 mΩ might be rounded up to 11 mΩ. However, common sense should prevail and a value such as u(x_i) = 28,05 kHz should be rounded down to 28 kHz. Output and input estimates should be rounded to be consistent with their uncertainties; for example, if y = 10,057 62 Ω with u_c(y) = 27 mΩ, y should be rounded to 10,058 Ω. Correlation coefficients should be given with three-digit accuracy if their absolute values are near unity.

However, the Particle Data Group provides explicit recommendations about how to select an appropriate number of significant digits:

5.3. Rounding : While the results shown in the Particle Listings are usually exactly those published by the experiments, the numbers that appear in the Summary Tables (means, averages and limits) are subject to a set of rounding rules.

The basic rule states that if the three highest order digits of the error lie between 100 and 354, we round to two significant digits. If they lie between 355 and 949, we round to one significant digit. Finally, if they lie between 950 and 999, we round up to 1000 and keep two significant digits. In all cases, the central value is given with a precision that matches that of the error. So, for example, the result (coming from an average) 0.827 ± 0.119 would appear as 0.83 ± 0.12, while 0.827 ± 0.367 would turn into 0.8 ± 0.4.
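For illustration, the quoted rule can be sketched in a few lines. This is written in Python rather than R purely as a language-neutral sketch; the function name `pdg_round` is hypothetical and is not part of the errors package. It assumes the "three highest order digits" are obtained by truncation and that ties are rounded half up, neither of which the quoted text spells out.

```python
from decimal import Decimal, ROUND_HALF_UP


def pdg_round(value, error):
    """Return (value, error) as strings rounded per the quoted PDG rule.

    Hypothetical helper for illustration only.
    """
    val = Decimal(str(value))
    err = Decimal(str(error))
    if err <= 0:
        raise ValueError("error must be positive")

    # Decimal exponent of the leading digit of the error, e.g. 0.119 -> -1.
    lead = err.adjusted()
    # Three highest-order digits as an integer in [100, 999].
    # Assumption: further digits are truncated, not rounded.
    three = int(err.scaleb(2 - lead))

    if three <= 354:
        sig = 2
    elif three <= 949:
        sig = 1
    else:
        # 950..999: round up to 1000 and keep two significant digits.
        lead += 1
        err = Decimal(1).scaleb(lead)
        sig = 2

    # Place value of the last digit that is kept.
    quantum = Decimal(1).scaleb(lead - sig + 1)
    err_r = err.quantize(quantum, rounding=ROUND_HALF_UP)
    # The central value is given to the same precision as the error.
    val_r = val.quantize(quantum, rounding=ROUND_HALF_UP)
    return format(val_r, "f"), format(err_r, "f")


print(pdg_round(0.827, 0.119))  # ('0.83', '0.12')
print(pdg_round(0.827, 0.367))  # ('0.8', '0.4')
```

The two printed cases are the examples given in the PDG text. `Decimal` is used instead of floats so that the digit extraction is not affected by binary floating-point representation.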

Do you think this would be a good addition to the format() method? (or perhaps a global option?)

@Enchufa2
Member

Enchufa2 commented Dec 8, 2020

The package already supports the digits argument, which is expected to be numeric (and by default is equal to 1), and can be controlled globally via options(errors.digits=<some number>). We could add support for digits="pdg" and, potentially, other formats.

However, I'm curious... Why these specific rules? I mean, why 354 and not e.g. 237, etc.? I don't see any explanation in the referenced document. (EDIT: or, more specifically, why 354 and not 10*sqrt(1000), which at least is the middle point in log-scale).

@davidchall
Author

👍 I was thinking along the same lines with format(x, digits = "pdg") and options(errors.digits = "pdg").

I've been unable to find a derivation for the rounding heuristic. I always assumed it was defined by a threshold on the logarithmic scale, like you suggested. Indeed, I still think this is the case. Here's my reasoning:

  • 10 * sqrt(1000) is 316.2278
  • Naively, this suggests we should round <=316 to 2 significant figures and >316 to 1 significant figure
  • However, this yields self-contradictory loss of precision: 316 -> 320 and 317 -> 300
  • This leaves us with 2 options: set the threshold at either 304 or 354
  • I think they chose the latter to be conservative in how precision is lost
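The argument in the list above can be checked numerically. The following is a rough Python sketch, not a derivation of the PDG rule. It treats the three leading digits of the error as an integer from 100 to 999 and tests, for a given threshold, whether the rounded result ever decreases as the input increases. It adds one assumption that is not stated in the thread: a candidate threshold must end in 4, so that all inputs sharing the same two-digit rounding fall on the same side of it. All function names are hypothetical.

```python
import math


def round_sig(n, sig):
    """Round a positive integer n to `sig` significant digits, half up."""
    step = 10 ** (len(str(n)) - sig)
    return (n + step // 2) // step * step


def rounded(n, threshold):
    """Two significant digits up to `threshold`, one digit above it."""
    return round_sig(n, 2 if n <= threshold else 1)


def is_monotonic(threshold):
    """True if the rounded result never decreases over 100..999."""
    values = [rounded(n, threshold) for n in range(100, 1000)]
    return all(a <= b for a, b in zip(values, values[1:]))


def candidates():
    # Assumption: thresholds end in 4 (104, 114, ..., 994), i.e. they sit
    # on a boundary of two-significant-digit rounding.
    return [t for t in range(104, 1000, 10) if is_monotonic(t)]


log_midpoint = 10 * math.sqrt(1000)
below = max(t for t in candidates() if t < log_midpoint)
above = min(t for t in candidates() if t > log_midpoint)

print(rounded(316, 316), rounded(317, 316))  # 320 300
print(round(log_midpoint, 4), below, above)  # 316.2278 304 354
```

Under that extra assumption, the nearest acceptable thresholds on either side of the logarithmic midpoint are 304 and 354, which matches the two options named above. Without the alignment assumption, other thresholds such as 349 would also avoid the contradiction, so this is only one possible reading of how 354 was arrived at.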

@Enchufa2
Member

Enchufa2 commented Dec 9, 2020

Right, makes sense. I'll have to rework the format method a little bit to support this and potentially other rules, but I think it's doable (even if my past self did not bother to leave me some comments...).
