
SQL SELECT query limesurvey using limer? #49

Open
AmyMikhail opened this issue Mar 4, 2019 · 2 comments

@AmyMikhail

We are in the process of setting up a demo version of LimeSurvey, with a view to testing it for use in our organisation to hold routine surveys. These surveys will always be open (collecting new responses daily). As such, analysts will not normally want to import all the data in one go: after one year, for example, there will be several thousand responses. More typically, we would want to import a subset of the data that falls within a specific range for a given analysis pipeline.

It would therefore be useful to be able to perform an SQL SELECT query, e.g. for responses for a particular survey ID within a specific date range or age range. I (and an increasing number of my colleagues) use R, and ultimately we want to be able to create automated reports from the survey data with R Markdown, which we might run on a regular basis (weekly, quarterly, annually, etc.).

Is it possible to perform an SQL SELECT query with limer, and if so, would it be possible to demonstrate how to do this in a vignette?

@ratnanil

ratnanil commented Mar 7, 2019

> responses for a particular survey ID

You have to provide the survey ID to get_responses().

> perform a select SQL query

Is retrieving all the data and filtering it locally not an option? E.g.:

library(limer)
library(dplyr)

# download all responses for survey 123, then subset locally
data <- get_responses(123)
data_filtered <- filter(data, sex == "female")
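
The same pattern covers the date-range case from the question. A sketch, assuming the export includes LimeSurvey's usual submitdate column (the column name and dates are placeholders; adjust them to your survey):

# keep only responses submitted within a given window
data_recent <- data %>%
  mutate(submitdate = as.Date(submitdate)) %>%
  filter(submitdate >= as.Date("2019-01-01"),
         submitdate <= as.Date("2019-03-31"))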

> we want to be able to create automated reports from the survey data with R Markdown, which we might run on a regular basis (weekly, quarterly, annually, etc.)

We do exactly that with limer and RS Connect.
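
For anyone looking for the skeleton of such a scheduled report, a minimal sketch (the API URL, credentials, and survey ID are placeholders you must replace with your own):

library(limer)

# Placeholders: point these at your own LimeSurvey instance.
options(lime_api = "https://example.org/limesurvey/index.php/admin/remotecontrol")
options(lime_username = "admin")
options(lime_password = "password")

get_session_key()                # authenticate against the RC API
responses <- get_responses(123)  # download all responses for the survey
release_session_key()            # release the session when done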

@r0bis

r0bis commented Apr 13, 2023

Whilst it would be nice to be able to export a range of data, I do not see how this could be done with the current RC API. But I have come across the usefulness of making demo surveys: they help show people how the thing works, and they keep demo surveys and data separate from real, live surveys. Preparing demo data takes time, and when you are giving the demo it often helps to bring your demo data into the recent past.

The process I employed: copy the survey you want to use as a demo. Download the data from the original survey as a VV file (a CSV with a specific header). Edit the responses to suit your purpose using whatever method you prefer (R, Excel, Python) and save the VV file. Then import your demo data (the edited VV file) into the copied demo survey; it will import because the structure is the same. This whole process is a bit of an investment of time. A demonstration usually means you show the survey, ask people to enter their responses, then run the analysis/reporting script and show how the analysis has changed with the data they just submitted. I wrote a script that does two tasks: a) eliminate answers that are not part of the demo dataset, and b) bring the demo dataset's submitdate forward so that the newest response is one week before today (see the sketch below). The process runs quickly and uses the LS RC API. It may be impractical if you are dealing with many thousands of responses, but a demo dataset does not usually need to be that big.
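
For task (b), a sketch of the date shift, assuming the responses are already in an R data frame with a submitdate column (an assumed column name):

library(dplyr)

# Shift all submit dates by a constant offset so the newest demo
# response falls one week before today; "submitdate" is assumed.
shift_demo_dates <- function(responses) {
  responses <- mutate(responses, submitdate = as.POSIXct(submitdate))
  offset <- (Sys.time() - 7 * 24 * 3600) - max(responses$submitdate)
  mutate(responses, submitdate = submitdate + offset)
}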

I would not expect anyone to implement server-side response manipulation via SQL in the RC API. If you are sure you want to do that (I wouldn't), you can open phpMyAdmin and run your SQL there. I agree with ratnanil that for regular reports it is best to download the whole dataset and run SQL on it locally. Automated reports with limer and R are easy. If you do need surveys containing many thousands of responses, I would be somewhat worried about storing all that data on the LS server. You could fairly easily write a backup script that weekly (or daily) downloads all existing responses and stores them locally in your preferred format (CSV or an import into an SQL database). The same script then goes back to the LS server and deletes the responses it has downloaded. That way you have all your data locally (e.g. on a proper database server on your LAN, where you can run SQL). If you want a report that contains all the newest data, it is quite simple: you obtain data from your LAN database and from the LS server and join them together for your analysis.
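
As a sketch of that backup-then-delete step, using limer's generic call_limer() to reach the RC API's delete_response method (the survey ID and file name are placeholders, and the assumption that the export carries an "id" column holding response IDs should be checked against your own export):

library(limer)

# Download everything and write a local copy first.
get_session_key()
responses <- get_responses(123)
write.csv(responses, "responses_backup.csv", row.names = FALSE)

# Delete from the server only after the local copy is safely on disk;
# assumes the export includes an "id" column with the response IDs.
for (rid in responses$id) {
  call_limer(method = "delete_response",
             params = list(iSurveyID = 123, iResponseID = rid))
}
release_session_key()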

My point is: I would not want the LS database to be the single place that stores my valuable data. Every time I run a report, I make a local backup copy of the data. If I expected a massive amount of data, I would put in place a process that moves it out of LS and stores it somewhere less exposed to the internet.
