Commit 8d277f2

update package documentation
1 parent 9fd430b commit 8d277f2

5 files changed: +71 -39 lines changed

CITATION.cff

Lines changed: 0 additions & 10 deletions
This file was deleted.

DESCRIPTION

Lines changed: 10 additions & 4 deletions
@@ -1,11 +1,17 @@
 Package: dqLib
 Type: Package
 Title: Tools for data quality assessment and reporting
-Version: 1.3.1
-Author: Kais Tahar
-Maintainer: The package maintainer <kais.tahar@med.uni-goettingen.de>
-Description: Contains methods for data quality analysis and reporting, especially in CORD-MI project.
+Version: 1.5.0
+Date/Publication: 2024-10-11
+Authors@R: c(
+person( given = "Kais",
+family = "Tahar",
+email = "kais.tahar@med.uni-goettingen.de",
+role = c("aut","cre"),
+comment=c(ORCID = "0000-0001-9683-0575")))
+Description: Contains methods for data quality analysis and reporting, especially in the field of Rare Diseases(RDs).
 License: GPL-3
+URL: https://github.com/KaisTahar/dqLib
 Encoding: UTF-8
 LazyData: true
 Imports: openxlsx, stringi

NEWS.md

Lines changed: 15 additions & 3 deletions
@@ -1,7 +1,19 @@
+# dqLib 1.5.0
+
+**New changes and tests**
+
+- This version was validated using EHR-based real-world data on rare diseases
+- Added function for semantic enrichment to improve the quality and clarity of generated reports
+- Improvement of the functionality for generating DQ reports and the layout of the created reports
+- Updated package documentation
+
+**Full set of changes:** [v1.3.1...v1.5.0](https://github.com/KaisTahar/dqLib/compare/v1.3.1...v1.5.0)
+
 # dqLib 1.3.1
 
 **New changes and tests**
-- This version was successfully tested using synthetic data across multiple hospitals
+- This version was successfully tested using synthetic data across multiple hospitals. The conceptual framework for harmonized DQ assessments and the findings from the distributed DQ assessments have been published under [DOI: 10.1055/a-2006-1018](https://www.thieme-connect.com/products/ejournals/abstract/10.1055/a-2006-1018)
+- Implemented functions for assessing data completeness, plausibility, and uniqueness were validated using real-world EHR data, and the results have been published under [DOI: 10.3233/SHTI230121](https://ebooks.iospress.nl/doi/10.3233/SHTI230121)
 - Added new functions to assess the completeness of cases and subjects
 - Added a new function to evaluate the concordance indicator
 - Fixed bug
@@ -11,5 +23,5 @@
 
 # dqLib 1.0.0
 
-- First version of the R-Package dqLib, which was tested independently using synthetic data
-- Test results indicated the correctness of the implemented DQ indicators as reported in [DOI: 10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)
+- First version of the data quality library (dqLib), which was tested independently using synthetic data
+- Test results indicated the correctness of the implemented data quality (DQ) indicators as reported in [DOI: 10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)

README.md

Lines changed: 9 additions & 22 deletions
@@ -1,13 +1,11 @@
 # `dqLib`
 
-The data quality library (`dqLib`) is an R package for data quality (DQ) assessment and reporting.
-`dqLib` provides methods for calculating DQ metrics and generating reports on detected DQ issues, especially in [`CORD-MI`](https://www.medizininformatik-initiative.de/de/CORD).
+The data quality library (`dqLib`) is an R package for data quality (DQ) assessment and reporting. The library provides methods for calculating DQ metrics and generating reports on detected DQ issues, especially in the field of Rare Diseases(RDs)
 
 Acknowledgment: This work was done within the “Collaboration on Rare Diseases” of the Medical Informatics Initiative (CORD-MI) funded by the German Federal Ministry of Education and Research (BMBF), under grant number: 01ZZ1911R, FKZ-01ZZ1911R
 
 ## DQ Metrics and Reports
-`dqLib` provides functions for creating specific reporting scripts that enable user to select desired DQ dimensions, indicators and parameters. The DQ reports provide adequate information to find the data quality violations and the causes of these violations.
-`dqLib` also enables annual assessments of selected DQ metrics. The following DQ metrics are already implemented:
+`dqLib` provides functions that enable users to select desired dimensions, indicators, and parameters as well as to define specific DQ reports. The DQ reports include the resulting DQ metrics and adequate information to identify potential DQ violations.`dqLib` also enables annual assessments of selected DQ metrics. The following DQ metrics are already implemented:
 
 | Dimension | DQ Indicator |
 | ------------- | ------------- |
@@ -37,19 +35,19 @@ Acknowledgment: This work was done within the “Collaboration on Rare Diseases
 | P16 | RD cases | number of RD cases |
 | P17 | ambiguous RD cases | number of ambiguous RD cases |
 | P18 | duplicated RD cases | number of duplicated RD cases |
-| P19 | tracer cases | number of tracer cases per year |
+| P19 | tracer cases | number of tracer cases |
 | P20 | Orpha cases | number of available orpha-coded cases|
 | P21 | RD cases rel. frequency| relative frequency of RD cases |
 | P22 | tracer cases rel. frequency| relative frequency of tracer cases normalized to 100.000 inpatient cases |
 | P23 | Orpha cases rel. frequency| relative frequency of Orpha cases normalized to 100.000 inpatient cases |
 | P24 | minimal tracer cases in reference values| min. rel. frequency of tracer cases normalized to 100.000 inpatient cases found in the literature |
 | P25 | maximale tracer cases in reference values| max. rel. frequency of tracer cases normalized to 100.000 inpatient cases found found in the literature |
 
-The following references are required to assess the quality of RD documentation: (1) Current Version of Alpha-ID-SE Terminology [1] and (2) A reference for tracer diagnoses such as the list provided in [2].
+The following references are required to assess the quality of RD documentation: (1) Current Version of Alpha-ID-SE Terminology [1] and (2) a reference for tracer diagnoses such as the list provided in [2].
 
 [1] BfArM - Alpha-ID-SE [Internet]. [cited 2022 May 23]. Available from: [BfArM](https://www.bfarm.de/EN/Code-systems/Terminologies/Alpha-ID-SE/_node.html)
 
-[2] Tahar K, Martin T, Mou Y, et al. Distributed Data Quality Assessment Across CORD-MI Consortia.[doi:10.3205/22gmds116](https://www.egms.de/static/en/meetings/gmds2022/22gmds116.shtml)
+[2] Tahar et al. Rare Diseases in Hospital Information Systems — An Interoperable Methodology for Distributed Data Quality Assessments. Methods Inf Med. 2023 Sep;62(3/4):71–89. [DOI: 10.1055/a-2006-1018](https://www.thieme-connect.com/products/ejournals/abstract/10.1055/a-2006-1018)
 
 ------------------------------------------------------------------------
 
@@ -67,21 +65,10 @@ devtools::install_github("https://github.com/KaisTahar/dqLib")
 ```
 ## Example
 
-[Here](https://github.com/KaisTahar/cordDqChecker-MIM/tree/methods_dqTools) are examples of DQ assessments using `dqLib`
-- [cordDQCheck.R](https://github.com/KaisTahar/cordDqChecker-MIM/blob/methods_dqTools/Local/cordDqChecker.R): A reporting script for DQ assessment in CORD-MI
-- Here you can see [the resulting files](https://github.com/KaisTahar/cordDqChecker-MIM/tree/methods_dqTools/Local/Data/Export)
+- [cordDQCheck](https://github.com/KaisTahar/cordDqChecker/blob/bmc_dqTools/Local/cordDqChecker.R): A reporting tool for DQ assessment on RD data implemented using `dqLib`
+- Here are some [examples](https://github.com/KaisTahar/cordDqChecker/tree/bmc_dqTools/Local/Data/Export) of DQ reports generated using `dqLib`
 
 ## Note
-The default data quality dimensions are completeness, plausibility, uniqueness and concordance. However, this framework allows the user to select desired DQ dimensions and metrics as well as to generate user defined annual reports.
+- The default data quality dimensions are completeness, plausibility, uniqueness and concordance. However, this framework allows the user to select desired DQ dimensions and metrics as well as to generate user defined annual reports.
 
-To cite `dqLib`, please use the following **BibTeX** entry:
-
-```
-@software{Tahar_dqLib,
-author = {Tahar, Kais},
-title = {{dqLib}},
-url = {https://github.com/KaisTahar/dqLib}
-year = {2021}
-}
-```
-See also: [`CORD-MI`](https://www.medizininformatik-initiative.de/de/CORD)
+- To cite `dqLib`, please use the CITATION file located in the folder `./inst`.

inst/CITATION

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+citHeader("To cite dqLib in publications, please use:")
+
+## Metadata generated from the DESCRIPTION file
+if(!exists("meta") || is.null(meta)) meta <- packageDescription("dqLib")
+if (is.null(meta$Date)) meta$Date <- Sys.Date()
+year <- format(as.Date(meta$Date), "%Y")
+title <- paste("dqLib:", meta$Title)
+auth <- paste(meta$Author)
+note <- paste("R package version", meta$Version)
+url <- meta$URL
+
+## Software
+bibentry(
+bibtype = "manual",
+title = title,
+author = auth,
+year = year,
+note = note,
+url = url
+)
+
+## Journal article
+bibentry(bibtype = "Article",
+title = "Rare Diseases in Hospital Information Systems — An Interoperable Methodology for Distributed Data Quality Assessments",
+author = c(as.person("Kais Tahar"),
+as.person("Tamara Martin"),
+as.person("Yongli Mou"),
+as.person("Raphael Verbuecheln"),
+as.person("Holm Graessner"),
+as.person("Dagmar Krefting")),
+journal = "Methods of Information in Medicine",
+volume = "62",
+number = "3",
+pages = "71-89",
+year = "2023",
+publisher = "Georg Thieme Verlag KG",
+doi = "10.1055/a-2006-1018")
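
Review note: the software entry in `inst/CITATION` is derived from DESCRIPTION metadata. The sketch below reproduces that derivation outside the package so the resulting BibTeX can be inspected. It is only an illustration under stated assumptions: `meta` is a hand-written stand-in for what `packageDescription("dqLib")` would return, the `Author` value is simplified to a plain name (the field R generates from `Authors@R` usually also carries roles and the ORCID), and the date is read explicitly from the `Date/Publication` field shown in the DESCRIPTION diff.

```r
# Stand-in for the metadata of an installed dqLib 1.5.0 (assumption: values
# copied from the DESCRIPTION diff in this commit, Author simplified).
meta <- list(
  Title = "Tools for data quality assessment and reporting",
  Version = "1.5.0",
  Author = "Kais Tahar",
  URL = "https://github.com/KaisTahar/dqLib",
  `Date/Publication` = "2024-10-11"
)

# Same derivations as in inst/CITATION, with the date field read by its full name.
year <- format(as.Date(meta[["Date/Publication"]]), "%Y")

software <- utils::bibentry(
  bibtype = "Manual",
  title = paste("dqLib:", meta$Title),
  author = utils::as.person(meta$Author),
  year = year,
  note = paste("R package version", meta$Version),
  url = meta$URL
)

# Render the entry as BibTeX.
print(utils::toBibtex(software))
```

With the package installed, `citation("dqLib")` and `toBibtex(citation("dqLib"))` are the usual way to obtain both entries defined in the file.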
