Question: download zip files #75
The drug event downloads are partitioned on an internal key called |
Follow-up question: do you keep or remove earlier reports from the same safetyid? Thanks
The reports are processed in order from earliest to latest, and only the latest one is kept.
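A minimal sketch of that keep-the-latest behaviour, for illustration only. It is not the pipeline's actual code: the function name `keep_latest` and the sample records (ids and dates) are made up, and it assumes the input is already in processing order and that reports are grouped by `safetyreportid`.

```python
def keep_latest(reports):
    """Return one report per safetyreportid, keeping the last one seen.

    Assumes `reports` is already in processing order (earliest first).
    """
    latest = {}
    for report in reports:
        # A later report with the same id overwrites the earlier one.
        latest[report["safetyreportid"]] = report
    return list(latest.values())


# Hypothetical sample data, not taken from any real download.
reports = [
    {"safetyreportid": "1001", "receivedate": "20040105"},
    {"safetyreportid": "1002", "receivedate": "20040110"},
    {"safetyreportid": "1001", "receivedate": "20040220"},
]
kept = keep_latest(reports)
for report in kept:
    print(report["safetyreportid"], report["receivedate"])
```

With this approach an id that is updated later still appears only once in the output, carrying the fields of its most recent version.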
https://api.fda.gov/drug/event.json?search=receivedate:[20040101+TO+20140131]+AND+safetyreportid:4261828
In the original ASCII 2004Q1, the id has only one record with
Does the receipt date also come from MFR_DT, or from the newer FDA_DT? My original guess was that this is caused by the change of data structure.
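For anyone repeating the lookup, the query in the link above can be assembled from its parts. This is only a sketch: `build_search_url` is a hypothetical helper, and it simply reproduces the URL syntax used in this comment without sending a request.

```python
BASE_URL = "https://api.fda.gov/drug/event.json"


def build_search_url(start, end, safetyreportid):
    """Build a drug event query for one report id within a receivedate range.

    Dates are YYYYMMDD strings; the `+` characters stand for spaces,
    exactly as in the URL quoted in this thread.
    """
    search = (
        f"receivedate:[{start}+TO+{end}]"
        f"+AND+safetyreportid:{safetyreportid}"
    )
    return f"{BASE_URL}?search={search}"


url = build_search_url("20040101", "20140131", "4261828")
print(url)
```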
I do not know the answer to this one. The answer might be in the PDF files that come with the downloads. There is a sort of data dictionary in there that may prove useful in answering this question. You could also ask the openFDA team, since they are in regular contact with the FDA and can forward your question to the internal group responsible for the drug event data. Good luck.
@yunstat, the pipeline pulls drug adverse events from the FAERS XML/SGML files, not the ASCII ones (the latter are only used for report id to case number conversion in some cases). So in the example you provided the dates come from AERS_SGML_2004q1.zip/sgml/ADR04M01.SGM:
So the drug event data should not be affected by the change in the ASCII structure you described.
Thank you. After seeing the earlier response, I checked the NTS files to find the reason for the smaller receipt_date. The FAERS system started in 2012 Q4. Before that, the system was called AERS, or LAERS.
From 2004Q1 to 2012Q3, the NTS file describes
Therefore, receivedate >= receiptdate through 2012Q3, and receivedate <= receiptdate from 2012Q4 onward. This answers my question about receiptdate being smaller than receivedate in 2004. I have been thinking about the effect on aggregated counts. It acts like a shift of the time frame, and the effect I observed is small. Since openFDA has been using the ASCII files, there are date fields in ASCII, such as FDA_DT, that might be useful to anchor the reports.
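The ordering described above can be checked per record with a few lines of Python. This is a sketch of my reading of the NTS files, not an official rule; the function names and the example dates are made up, and dates are assumed to be full YYYYMMDD strings.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Parse a YYYYMMDD string into a date."""
    return datetime.strptime(value, "%Y%m%d").date()


def matches_observed_ordering(receivedate, receiptdate, year, quarter):
    """Check the ordering described in this comment.

    Through 2012Q3 (AERS):  receivedate >= receiptdate
    From 2012Q4 (FAERS):    receivedate <= receiptdate
    `year` and `quarter` identify the quarterly file the record came from.
    """
    received = parse_yyyymmdd(receivedate)
    receipt = parse_yyyymmdd(receiptdate)
    if (year, quarter) <= (2012, 3):
        return received >= receipt
    return received <= receipt


# Hypothetical dates, for illustration only.
print(matches_observed_ordering("20040210", "20040105", 2004, 1))
print(matches_observed_ordering("20130105", "20130210", 2013, 1))
```

Counting the records for which this returns `False` in each quarter would give a rough idea of how well the pattern holds.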
The format of the FAERS XML files is specified by the DTD in the ICH E2B/M2 V2.1 standard.
By definition, I would think receiptdate >= receivedate.
This example has receiptdate smaller than receivedate. The document, "MAINTENANCE OF THE ICH GUIDELINE ON CLINICAL SAFETY DATA MANAGEMENT: DATA ELEMENTS FOR TRANSMISSION OF INDIVIDUAL CASE SAFETY REPORTS E2B(R2)", provides some more details:
The AERS system started well before 2001. The change of definition in the XML keys may have its own historical reasons. Even now, I am still not sure that receivedate in FAERS can be defined as "the date the retransmitter first received the information". This retransmitter-first-received date may not always be available; for example, the retransmitter may not have the date on record, or may have a lot of missing data. In many situations, the currently used receivedate is still a reasonably good and quick solution for monitoring the trend.
Note that E2B(R3) is currently in use and not v2 - http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM275638.pdf
@cerdman I just downloaded the 2016q1 XML zip file, and both the PDF and the XML files state that version 2.1 is used, not v3. PDF:
XML:
For the zip files packed on 2016/06/01, I downloaded the file named "2004q1/drug-event-0001-of-0002.json.zip", and found that the first three records in the file have strange receivedate values:
"receivedate": "20100729",
"receivedate": "20101129",
"receivedate": "20110614".
Are those zip files packed randomly or according to some rule?
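For reference, this is roughly how such records can be listed. It is a sketch under assumptions: it assumes each zip member is a JSON document with a top-level `results` list, that the folder name (`2004q1`) is read as a calendar quarter, and the function names are my own. The demo archive is built in memory; the first three dates are the ones quoted above and the fourth is a made-up in-range value.

```python
import io
import json
import zipfile


def quarter_bounds(year, quarter):
    """Return inclusive YYYYMMDD string bounds for a calendar quarter.

    Day 31 is used as the upper bound for every month, which is fine
    for plain string comparison of YYYYMMDD values.
    """
    start_month = 3 * (quarter - 1) + 1
    end_month = start_month + 2
    return f"{year}{start_month:02d}01", f"{year}{end_month:02d}31"


def receivedates_outside_quarter(source, year, quarter):
    """List receivedate values that fall outside the given quarter.

    `source` is a path or file-like object for a download zip.
    """
    low, high = quarter_bounds(year, quarter)
    outside = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            with archive.open(name) as handle:
                data = json.load(handle)
            for record in data.get("results", []):
                date = record.get("receivedate", "")
                if not (low <= date <= high):
                    outside.append(date)
    return outside


# In-memory stand-in for a downloaded file (layout is an assumption).
sample = {"results": [
    {"receivedate": "20100729"},
    {"receivedate": "20101129"},
    {"receivedate": "20110614"},
    {"receivedate": "20040115"},  # made-up in-range record
]}
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("drug-event-0001-of-0002.json", json.dumps(sample))
buffer.seek(0)

outside = receivedates_outside_quarter(buffer, 2004, 1)
print(outside)
```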