- This version was validated using EHR-based real-world data on rare diseases
6
+
- Added function for semantic enrichment to improve the quality and clarity of generated reports
7
+
- Improved the generation of DQ reports and the layout of the generated reports
8
+
- Updated package documentation
9
+
10
+
**Full set of changes:** [v1.3.1...v1.5.0](https://github.com/KaisTahar/dqLib/compare/v1.3.1...v1.5.0)
11
+
1
12
# dqLib 1.3.1
2
13
3
14
**New changes and tests**
4
-
- This version was successfully tested using synthetic data across multiple hospitals
15
+
- This version was successfully tested using synthetic data across multiple hospitals. The conceptual framework for harmonized DQ assessments and the findings from the distributed DQ assessments have been published under [DOI: 10.1055/a-2006-1018](https://www.thieme-connect.com/products/ejournals/abstract/10.1055/a-2006-1018)
16
+
- Implemented functions for assessing data completeness, plausibility, and uniqueness were validated using real-world EHR data, and the results have been published under [DOI: 10.3233/SHTI230121](https://ebooks.iospress.nl/doi/10.3233/SHTI230121)
5
17
- Added new functions to assess the completeness of cases and subjects
6
18
- Added a new function to evaluate the concordance indicator
7
19
- Fixed bug
@@ -11,5 +23,5 @@
11
23
12
24
# dqLib 1.0.0
13
25
14
-
- First version of the R-Package dqLib, which was tested independently using synthetic data
15
-
- Test results indicated the correctness of the implemented DQ indicators as reported in [DOI: 10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)
26
+
- First version of the data quality library (dqLib), which was tested independently using synthetic data
27
+
- Test results indicated the correctness of the implemented data quality (DQ) indicators as reported in [DOI: 10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)
README.md (9 additions, 22 deletions)
@@ -1,13 +1,11 @@
1
1
# `dqLib`
2
2
3
-
The data quality library (`dqLib`) is an R package for data quality (DQ) assessment and reporting.
4
-
`dqLib` provides methods for calculating DQ metrics and generating reports on detected DQ issues, especially in [`CORD-MI`](https://www.medizininformatik-initiative.de/de/CORD).
3
+
The data quality library (`dqLib`) is an R package for data quality (DQ) assessment and reporting. The library provides methods for calculating DQ metrics and generating reports on detected DQ issues, especially in the field of rare diseases (RDs).
5
4
6
5
Acknowledgment: This work was done within the “Collaboration on Rare Diseases” of the Medical Informatics Initiative (CORD-MI) funded by the German Federal Ministry of Education and Research (BMBF), under grant number: 01ZZ1911R, FKZ-01ZZ1911R
7
6
8
7
## DQ Metrics and Reports
9
-
`dqLib` provides functions for creating specific reporting scripts that enable user to select desired DQ dimensions, indicators and parameters. The DQ reports provide adequate information to find the data quality violations and the causes of these violations.
10
-
`dqLib` also enables annual assessments of selected DQ metrics. The following DQ metrics are already implemented:
8
+
`dqLib` provides functions that enable users to select desired dimensions, indicators, and parameters as well as to define specific DQ reports. The DQ reports include the resulting DQ metrics and adequate information to identify potential DQ violations. `dqLib` also enables annual assessments of selected DQ metrics. The following DQ metrics are already implemented:
11
9
12
10
| Dimension | DQ Indicator |
13
11
| ------------- | ------------- |
@@ -37,19 +35,19 @@ Acknowledgment: This work was done within the “Collaboration on Rare Diseases
37
35
| P16 | RD cases | number of RD cases |
38
36
| P17 | ambiguous RD cases | number of ambiguous RD cases |
39
37
| P18 | duplicated RD cases | number of duplicated RD cases |
40
-
| P19 | tracer cases | number of tracer cases per year |
38
+
| P19 | tracer cases | number of tracer cases |
41
39
| P20 | Orpha cases | number of available Orpha-coded cases |
42
40
| P21 | RD cases rel. frequency| relative frequency of RD cases |
43
41
| P22 | tracer cases rel. frequency| relative frequency of tracer cases normalized to 100.000 inpatient cases |
44
42
| P23 | Orpha cases rel. frequency| relative frequency of Orpha cases normalized to 100.000 inpatient cases |
45
43
| P24 | minimal tracer cases in reference values| min. rel. frequency of tracer cases normalized to 100.000 inpatient cases found in the literature |
46
44
| P25 | maximal tracer cases in reference values| max. rel. frequency of tracer cases normalized to 100.000 inpatient cases found in the literature |
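
The normalized relative frequencies in rows P22 and P23 can be illustrated with a minimal R sketch. The helper `rel_frequency_per_100k` and the case counts below are hypothetical and are not part of the `dqLib` API; the sketch only assumes that the metric is the number of cases divided by the number of inpatient cases, scaled to 100,000.

```r
# Hypothetical helper for illustration only; not a dqLib function.
# Returns the relative frequency of cases per 100,000 inpatient cases.
rel_frequency_per_100k <- function(n_cases, n_inpatient_cases) {
  stopifnot(
    is.numeric(n_cases),
    is.numeric(n_inpatient_cases),
    all(n_cases >= 0),
    all(n_inpatient_cases > 0)
  )
  # Multiply first so that whole-number inputs stay exact where possible
  n_cases * 100000 / n_inpatient_cases
}

# Made-up counts: 25 tracer cases among 50,000 inpatient cases
tracer_rate <- rel_frequency_per_100k(n_cases = 25, n_inpatient_cases = 50000)
print(tracer_rate)
```

A value computed this way could then be compared with the minimal and maximal reference values from the literature (P24 and P25), although how `dqLib` performs that comparison is not shown here.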
47
45
48
-
The following references are required to assess the quality of RD documentation: (1) Current Version of Alpha-ID-SE Terminology [1] and (2) A reference for tracer diagnoses such as the list provided in [2].
46
+
The following references are required to assess the quality of RD documentation: (1) Current Version of Alpha-ID-SE Terminology [1] and (2) a reference for tracer diagnoses such as the list provided in [2].
49
47
50
48
[1] BfArM - Alpha-ID-SE [Internet]. [cited 2022 May 23]. Available from: [BfArM](https://www.bfarm.de/EN/Code-systems/Terminologies/Alpha-ID-SE/_node.html)
51
49
52
-
[2] Tahar K, Martin T, Mou Y, et al. Distributed Data Quality Assessment Across CORD-MI Consortia.[doi:10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)
50
+
[2] Tahar et al. Rare Diseases in Hospital Information Systems — An Interoperable Methodology for Distributed Data Quality Assessments. Methods Inf Med. 2023 Sep;62(3/4):71–89. [DOI: 10.1055/a-2006-1018](https://www.thieme-connect.com/products/ejournals/abstract/10.1055/a-2006-1018)
[Here](https://github.com/KaisTahar/cordDqChecker-MIM/tree/methods_dqTools) are examples of DQ assessments using `dqLib`
71
-
-[cordDQCheck.R](https://github.com/KaisTahar/cordDqChecker-MIM/blob/methods_dqTools/Local/cordDqChecker.R): A reporting script for DQ assessment in CORD-MI
72
-
- Here you can see [the resulting files](https://github.com/KaisTahar/cordDqChecker-MIM/tree/methods_dqTools/Local/Data/Export)
68
+
- [cordDQCheck](https://github.com/KaisTahar/cordDqChecker/blob/bmc_dqTools/Local/cordDqChecker.R): A reporting tool for DQ assessment on RD data implemented using `dqLib`
69
+
- Here are some [examples](https://github.com/KaisTahar/cordDqChecker/tree/bmc_dqTools/Local/Data/Export) of DQ reports generated using `dqLib`
73
70
74
71
## Note
75
-
The default data quality dimensions are completeness, plausibility, uniqueness and concordance. However, this framework allows the user to select desired DQ dimensions and metrics as well as to generate user defined annual reports.
72
+
- The default data quality dimensions are completeness, plausibility, uniqueness, and concordance. However, this framework allows the user to select desired DQ dimensions and metrics as well as to generate user-defined annual reports.
76
73
77
-
To cite `dqLib`, please use the following **BibTeX** entry:
78
-
79
-
```
80
-
@software{Tahar_dqLib,
81
-
author = {Tahar, Kais},
82
-
title = {{dqLib}},
83
-
url = {https://github.com/KaisTahar/dqLib},
84
-
year = {2021}
85
-
}
86
-
```
87
-
See also: [`CORD-MI`](https://www.medizininformatik-initiative.de/de/CORD)
74
+
- To cite `dqLib`, please use the CITATION file located in the folder `./inst`.