
Calculation of METRIC.Precision #157

Open · b-math opened this issue May 18, 2022 · 1 comment

b-math commented May 18, 2022

Dear hap.py development team,

Could you please tell me how METRIC.Precision is calculated for the hap.py summary statistics?

According to the docs, the following formula is used:
`Precision = TP / (TP + FP)`

So I tried to calculate the precision as TRUTH.TP / (TRUTH.TP + QUERY.FP); however, the values I get differ from METRIC.Precision. Is this behaviour expected? If so, could you please provide the formula (or the correct column names) that reproduces METRIC.Precision?
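For reference, this is roughly how I recompute the value (a minimal sketch; the file name `summary.csv` is a placeholder for any hap.py `*.summary.csv` output, and I assume it contains the columns shown in the tables below):

```python
import pandas as pd

# Load a hap.py summary file (placeholder path for any *.summary.csv output).
df = pd.read_csv("summary.csv")

# Recompute precision from the documented formula, using the TRUTH.TP
# and QUERY.FP columns.
df["RECOMPUTED"] = df["TRUTH.TP"] / (df["TRUTH.TP"] + df["QUERY.FP"])

# Compare against the reported METRIC.Precision column.
print(df[["Type", "Filter", "METRIC.Precision", "RECOMPUTED"]])
```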

See below for examples from your git repo. The discrepancy is observed for

vcfeval ...

| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|
| INDEL | ALL | 8929 | 7968 | 961 | 11812 | 227 | 0.892373 | **0.972637** | 0.930778 | **0.9723001830384381** |
| INDEL | PASS | 8929 | 7660 | 1269 | 9971 | 175 | 0.857879 | **0.978155** | 0.914077 | **0.9776643267389917** |
| SNP | ALL | 52494 | 52174 | 320 | 90092 | 504 | 0.993904 | **0.990444** | 0.992171 | **0.9904324385891644** |
| SNP | PASS | 52494 | 46955 | 5539 | 48078 | 90 | 0.894483 | **0.998089** | 0.94345 | **0.9980869380380487** |

... and happy

| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|
| INDEL | ALL | 8937 | 7839 | 1098 | 11812 | 343 | 0.87714 | **0.958635** | 0.916079 | **0.9580787093620142** |
| INDEL | PASS | 8937 | 7550 | 1387 | 9971 | 283 | 0.844803 | **0.964656** | 0.90076 | **0.9638708030128942** |
| SNP | ALL | 52494 | 52125 | 369 | 90092 | 582 | 0.992971 | **0.988966** | 0.990964 | **0.9889578234390118** |
| SNP | PASS | 52494 | 46920 | 5574 | 48078 | 143 | 0.893816 | **0.996963** | 0.942576 | **0.9969615196651297** |

... but not for unhappy

| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|
| INDEL | ALL | 8937 | 7060 | 1877 | 11812 | 1232 | 0.789974 | **0.851423** | 0.819548 | **0.8514230583695128** |
| INDEL | PASS | 8937 | 6850 | 2087 | 9971 | 1157 | 0.766476 | **0.855501** | 0.808546 | **0.8555014362432871** |
| SNP | ALL | 52494 | 52105 | 389 | 90092 | 639 | 0.99259 | **0.987885** | 0.990232 | **0.9878848779008039** |
| SNP | PASS | 52494 | 46908 | 5586 | 48078 | 178 | 0.893588 | **0.99622** | 0.942117 | **0.9962196831329907** |
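As a quick sanity check, using only the numbers from the vcfeval INDEL/ALL row above: inverting the documented formula shows that the FP count implied by the reported METRIC.Precision does not match QUERY.FP.

```python
# Numbers from the vcfeval INDEL / ALL row above.
truth_tp = 7968
query_fp = 227
reported_precision = 0.972637

# If METRIC.Precision were TRUTH.TP / (TRUTH.TP + FP), the implied
# FP count would be:
implied_fp = truth_tp * (1 - reported_precision) / reported_precision
print(implied_fp)  # ~224.2, but QUERY.FP is 227
```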

Thank you very much for your time and I'm looking forward to hearing from you soon.

Best regards
Barbara

@skDooley commented

I am seeing the same thing. I tried playing with the numbers but it just doesn't add up...
