Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: Add tests in pyhf-validation-root-base Docker image #9

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

matthewfeickert
Copy link
Member

@matthewfeickert matthewfeickert commented Mar 14, 2020

Add tests that run inside of the pyhf/pyhf-validation-root-base Docker image (where ROOT 6.22.02 is installed). This is done by using the container job option, which allow for specification of an arbitrary Docker image to launch into.

A more in depth example of container based workflow can be found here.

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR
* Add testing to the CI using the 'pyhf/pyhf-validation-root-base' Docker image
* Uses the 'container' option for GitHub Actions
* Fix replacement of lumi nuisance parameter in nuisance parameter comparison scripts

@matthewfeickert matthewfeickert added CI CI systems, GitHub Actions tests pytest labels Mar 14, 2020
@matthewfeickert matthewfeickert self-assigned this Mar 14, 2020
@codecov
Copy link

codecov bot commented Mar 14, 2020

Codecov Report

Merging #9 (2ae3c60) into master (e7e3b3d) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master       #9   +/-   ##
=======================================
  Coverage   90.90%   90.90%           
=======================================
  Files           3        3           
  Lines          11       11           
=======================================
  Hits           10       10           
  Misses          1        1           
Flag Coverage Δ
unittests 90.90% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e7e3b3d...2ae3c60. Read the comment docs.

@lukasheinrich
Copy link

we could create ROOT workspaces using json2xml + xml2workspace from the public sbottom likelihood

@matthewfeickert
Copy link
Member Author

we could create ROOT workspaces using json2xml + xml2workspace from the public sbottom likelihood

Good idea. I should have thought of this given that you and @kratsg had already suggested this when we originally posted some TODOs in the Issues.

@matthewfeickert
Copy link
Member Author

scripts/JSON_to_workspace.sh: line 45: hist2workspace: command not found

To my embarrassment, somehow the current version of the pyhf/pyhf-validation-root-base Docker image knows about PyROOT but doesn't have root or hist2workspace in PATH. :? This is very strange.

@matthewfeickert
Copy link
Member Author

matthewfeickert commented Mar 16, 2020

Self note: Things to follow up on later (edit: after fixing lumi string conversion)

  • compare_nuisance.py gives
Nuisance params unique to pyhf:
staterror_CRtt_meff_2
staterror_CRtt_meff_1
staterror_CRtt_meff_0

Nuisance params unique to root:
  • compare_fitted_nuisance.py gives
########### Printing nuisance parameter comparisons to screen #############

param                                     pyhf val          root val          abs diff          % diff            

lumi                                      9.999041e-01      9.999047e-01      -5.789377e-07     -0.000058         
EG_RESOLUTION_ALL                         4.612360e-03      4.704046e-03      -9.168577e-05     -1.987828         
EG_SCALE_ALL                              -7.499355e-03     -7.619664e-03     1.203096e-04      -1.604266         
EL_EFF_ChargeIDSel_TOTAL_1NPCOR_PLUS_UNCOR0.000000e+00      0.000000e+00      0.000000e+00      0.000000          
EL_EFF_ID_TOTAL_1NPCOR_PLUS_UNCOR         -2.050934e-03     -2.061238e-03     1.030395e-05      -0.502403         
EL_EFF_Iso_TOTAL_1NPCOR_PLUS_UNCOR        -3.020117e-02     -3.018264e-02     -1.853162e-05     0.061361          
EL_EFF_Reco_TOTAL_1NPCOR_PLUS_UNCOR       3.148846e-04      3.296803e-04      -1.479566e-05     -4.698755         
EL_EFF_TriggerEff_TOTAL_1NPCOR_PLUS_UNCOR 0.000000e+00      0.000000e+00      0.000000e+00      0.000000          
EL_EFF_Trigger_TOTAL_1NPCOR_PLUS_UNCOR    0.000000e+00      0.000000e+00      0.000000e+00      0.000000          
FT_EFF_B_systematics                      -8.226148e-03     -7.876455e-03     -3.496934e-04     4.250998          
FT_EFF_C_systematics                      -2.304282e-02     -2.247763e-02     -5.651925e-04     2.452793          
FT_EFF_Light_systematics                  -1.442350e-01     -1.435668e-01     -6.681248e-04     0.463220          
FT_EFF_extrapolation                      -1.017101e-01     -1.015515e-01     -1.585868e-04     0.155920          
FT_EFF_extrapolation_from_charm           -7.146362e-02     -7.149188e-02     2.826320e-05      -0.039549         
JET_EtaIntercalibration_NonClosure_highE  5.297120e-05      6.585029e-05      -1.287909e-05     -24.313382        
JET_EtaIntercalibration_NonClosure_negEta 1.053397e-03      1.075789e-03      -2.239210e-05     -2.125704         
JET_EtaIntercalibration_NonClosure_posEta -6.598275e-04     -6.669702e-04     7.142699e-06      -1.082510         
JET_Flavor_Response                       -1.211315e-01     -1.229163e-01     1.784819e-03      -1.473456         
JET_GroupedNP_1                           3.895620e-02      3.980564e-02      -8.494396e-04     -2.180499         
JET_GroupedNP_2                           8.902315e-02      9.045789e-02      -1.434742e-03     -1.611651         
JET_GroupedNP_3                           -8.649500e-02     -8.762272e-02     1.127723e-03      -1.303802         
JET_JER_DataVsMC                          3.874304e-03      3.914144e-03      -3.983995e-05     -1.028312         
JET_JER_EffectiveNP_1                     -1.367855e-02     -1.353319e-02     -1.453564e-04     1.062659          
JET_JER_EffectiveNP_2                     -9.716172e-03     -9.529539e-03     -1.866322e-04     1.920841          
JET_JER_EffectiveNP_3                     -1.004252e-02     -9.914179e-03     -1.283391e-04     1.277957          
JET_JER_EffectiveNP_4                     -1.157892e-02     -1.146194e-02     -1.169860e-04     1.010336          
JET_JER_EffectiveNP_5                     -1.125749e-02     -1.106993e-02     -1.875624e-04     1.666111          
JET_JER_EffectiveNP_6                     -7.609317e-03     -7.515067e-03     -9.425056e-05     1.238621          
JET_JER_EffectiveNP_7restTerm             -8.836154e-03     -8.466758e-03     -3.693960e-04     4.180507          
JET_JvtEfficiency                         6.653292e-03      6.788719e-03      -1.354272e-04     -2.035491         
MET_SoftTrk_ResoPara                      -1.122048e-02     -1.120073e-02     -1.974629e-05     0.175984          
MET_SoftTrk_ResoPerp                      -5.237057e-03     -5.170010e-03     -6.704759e-05     1.280253          
MET_SoftTrk_Scale                         2.668171e-02      2.720399e-02      -5.222818e-04     -1.957452         
MUON_EFF_BADMUON_STAT                     1.150481e-08      0.000000e+00      1.150481e-08      100.000000        
MUON_EFF_BADMUON_SYS                      4.984433e-07      0.000000e+00      4.984433e-07      100.000000        
MUON_EFF_ISO_STAT                         -2.213125e-04     -2.165684e-04     -4.744056e-06     2.143600          
MUON_EFF_ISO_SYS                          2.798518e-04      2.900454e-04      -1.019354e-05     -3.642476         
MUON_EFF_RECO_STAT                        1.182098e-04      1.312382e-04      -1.302839e-05     -11.021413        
MUON_EFF_RECO_SYS                         5.625014e-04      5.741601e-04      -1.165876e-05     -2.072663         
MUON_EFF_TTVA_STAT                        5.297959e-07      1.151321e-05      -1.098341e-05     -2073.140503      
MUON_EFF_TTVA_SYS                         -5.713945e-06     1.160404e-05      -1.731798e-05     303.082742        
MUON_EFF_TrigStatUncertainty              1.094673e-07      0.000000e+00      1.094673e-07      100.000000        
MUON_EFF_TrigSystUncertainty              1.872494e-07      0.000000e+00      1.872494e-07      100.000000        
MUON_ID                                   -5.514388e-03     -5.604341e-03     8.995379e-05      -1.631256         
MUON_MS                                   -7.825303e-03     -8.072155e-03     2.468518e-04      -3.154534         
MUON_SAGITTA_RESBIAS                      1.558436e-02      1.520452e-02      3.798361e-04      2.437291          
MUON_SAGITTA_RHO                          0.000000e+00      0.000000e+00      0.000000e+00      0.000000          
MUON_SCALE                                -2.634326e-03     -2.654062e-03     1.973623e-05      -0.749195         
SigRad                                    4.952022e-05      -2.134732e-04     2.629935e-04      531.082941        
ttH_theory                                2.189601e-02      2.325871e-02      -1.362701e-03     -6.223512         
ttZ_theory                                8.329884e-03      8.688973e-03      -3.590890e-04     -4.310852         
ttbar_FSR                                 -3.424808e-01     -3.421057e-01     -3.750301e-04     0.109504          
ttbar_Gen                                 5.526187e-01      5.601142e-01      -7.495502e-03     -1.356361         
ttbar_ISR_Down                            6.604958e-01      6.682535e-01      -7.757710e-03     -1.174528         
ttbar_ISR_Up                              1.528636e-01      1.462661e-01      6.597444e-03      4.315903          
ttbar_PS                                  -1.729353e-01     -1.746909e-01     1.755532e-03      -1.015138         
staterror_SR_meff_0                       1.015946e+00      1.016074e+00      -1.282399e-04     -0.012623         
staterror_SR_meff_1                       9.810488e-01      9.810100e-01      3.874903e-05      0.003950          
staterror_SR_meff_2                       9.923182e-01      9.921207e-01      1.974996e-04      0.019903          
mu_ttbar                                  9.580629e-01      9.584793e-01      -4.163798e-04     -0.043461         
mu_SIG                                    3.662478e-14      3.293900e-07      -3.293900e-07     -899363724.041528 

Things that aren't clear to me:

  • Why ROOT doesn't have CRtt nuisance parameters
  • Why some nuisance parameters are exactly zero for both ROOT and pyhf
  • Why some nuisance parameters are exactly zero for only ROOT
  • Why some nuisance parameters differ between ROOT and pyhf by as much as O(e-03)

@matthewfeickert matthewfeickert added the fix A bug fix label Mar 16, 2020
@kratsg
Copy link

kratsg commented Mar 16, 2020

  • Why ROOT doesn't have CRtt nuisance parameters

because they're zero and automatically pruned. We don't do that in pyhf yet.

  • Why some nuisance parameters are exactly zero for both ROOT and pyhf

depends on the workspace/analysis. not concerning as long as both ROOT and pyhf agree

  • Why some nuisance parameters are exactly zero for only ROOT

likely ROOT auto-prunes below a certain value to 0 having it be negligible, and we just can't tell.

  • Why some nuisance parameters differ between ROOT and pyhf by as much as O(e-03)

all of these seem to be at the 1% difference level, so I'm not generally concerned.

@matthewfeickert matthewfeickert force-pushed the ci/add-Docker-based-CI-testing branch 2 times, most recently from 83f62ad to 35dab99 Compare March 20, 2020 21:13
@matthewfeickert matthewfeickert marked this pull request as ready for review March 28, 2020 03:58
@matthewfeickert matthewfeickert force-pushed the ci/add-Docker-based-CI-testing branch from ed230f6 to 4527e56 Compare May 6, 2020 04:55
@matthewfeickert matthewfeickert requested review from danikam and kratsg and removed request for kratsg and danikam May 6, 2020 04:59
@matthewfeickert matthewfeickert force-pushed the ci/add-Docker-based-CI-testing branch from 4527e56 to 0a6238a Compare May 6, 2020 05:55
Copy link

@kratsg kratsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON_to_workspace seems smarter to me to have a Makefile variant of that instead of a shell script.

@matthewfeickert
Copy link
Member Author

JSON_to_workspace seems smarter to me to have a Makefile variant of that instead of a shell script.

I don't normally think of Makefiles as taking command line arguments outside of the target. What is the advantage here to hardcoding something in the Makefile? Or am I misunderstanding what you're suggesting?

@kratsg
Copy link

kratsg commented May 6, 2020

I don't normally think of Makefiles as taking command line arguments outside of the target. What is the advantage here to hardcoding something in the Makefile? Or am I misunderstanding what you're suggesting?

Makefiles don't take arguments really. You can do like make root-workspace and that will make files from input files -> output files. Think of it like a compilation, right? There's not a huge reason why a makefile wouldn't work instead?

@matthewfeickert
Copy link
Member Author

Makefiles don't take arguments really

This is kinda my point. If we were to remove the Bash script and move all of that logic into a Make target then we'd have to hardcode everything.

There's not a huge reason why a makefile wouldn't work instead?

scripts/JSON_to_workspace.sh works for generic arguments. In the CI we can do

bash scripts/JSON_to_workspace.sh \
  sbottom \
  https://doi.org/10.17182/hepdata.89408.v1/r2 \
  RegionA/BkgOnly.json \
  RegionA/patch.sbottom_1300_205_60.json \
  RegionA/sbottom_1300_205_60.json

for the specific test there, but we can use this for analyses in the future. Why do we want to hardcode this one particular example case in a Makefile?

@kratsg
Copy link

kratsg commented May 7, 2020

This is kinda my point. If we were to remove the Bash script and move all of that logic into a Make target then we'd have to hardcode everything.

Makefiles don't hardcode names. They hardcode patterns. You can define a pattern target, e.g. patch.%.json would be an example, and that would automatically extract out sbottom_1300_205_60 and so on.

@matthewfeickert
Copy link
Member Author

You can define a pattern target, e.g. patch.%.json would be an example, and that would automatically extract out sbottom_1300_205_60 and so on.

Doesn't this still hardcode a Region and a URL or require a user to download everything? This seems like a loss of functionality. Maybe I should be asking a different question(s): What do you see us gaining by making this all a Make target? What do you see as the disadvantages of using a Bash script that requires user arguments?

@kratsg
Copy link

kratsg commented May 7, 2020

No advantages or disadvantages. The thing with the bash script is you'll regenerate when you re-run (versus a Makefile which only re-runs / re-makes things that have their source changing).

significantly smaller
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI CI systems, GitHub Actions fix A bug fix tests pytest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants