Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR for feature #58 (access private datasets and be able to upsert and full replace datasets) #64

Closed

Conversation

marks
Copy link
Contributor

@marks marks commented Nov 1, 2015

Please see issue (feature request) #58; This is the re-do of PR #63 since it was not correctly based on the dev branch

Mark Silverberg added 2 commits November 1, 2015 08:59
NOTE the new branch. Original work I did was on a branch that was not
based on `dev`

- Added `email` and `password` parameters to `returnData.R` and its
dependent methods such as those in `errorHandling.R` and `metadata.R`
- Added tests in `test-readPrivateDataset.R` and ran `document()`
- Was able to run `build()` successfully
- created `writeData.R` which defines `write.socrata()`
- Added tests in `test-writeData` and ran `document()`
- Was able to run `build()` successfully
@marks
Copy link
Contributor Author

marks commented Nov 1, 2015

I am seeing some tests fail locally but I think it is code not related to my changes in the dev branch. Please do let me know if this is not the case!

@dmpe
Copy link
Contributor

dmpe commented Nov 1, 2015

}


#' @description Method for updating Socrata datasets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

title is missing

@marks
Copy link
Contributor Author

marks commented Nov 2, 2015

How may I transmit the username/password used for the tests that require a Socrata login? I have been able to successfully test on my own travis-ci.org repo using environment variables.

Thanks!

@dmpe
Copy link
Contributor

dmpe commented Nov 2, 2015

@marks
Copy link
Contributor Author

marks commented Nov 2, 2015

You can't use containerized Travis unless sudo isn't required.. I also think it seems an admin of the repo is best off just adding the environment variables from the Travis interface

Mark Silverberg
(m) 512 826 7004
http://twitter.com/skram

On Nov 2, 2015, at 10:19 AM, John Malc notifications@github.com wrote:

http://kieranhealy.org/blog/archives/2015/10/16/using-containerized-travis-ci-to-check-r-in-rmarkdown-files/ may help or again what i said in #58


Reply to this email directly or view it on GitHub.

@dmpe
Copy link
Contributor

dmpe commented Nov 2, 2015

oh, i forgot, sorry.

---------- Původní zpráva ----------

Od: Mark Silverberg notifications@github.com

Komu: Chicago/RSocrata RSocrata@noreply.github.com

Datum: 2. 11. 2015 16:21:35

Předmět: Re: [RSocrata] PR for feature #58 (access private datasets and be
able to upsert and full replace datasets) (#64)

"
You can't use containerized Travis unless sudo isn't required.. I also think
it seems an admin of the repo is best off just adding the environment
variables from the Travis interface

Mark Silverberg

(m) 512 826 7004

http://twitter.com/skram

On Nov 2, 2015, at 10:19 AM, John Malc notifications@github.com wrote:

http://kieranhealy.org/blog/archives/2015/10/16/using-containerized-travis
-ci-to-check-r-in-rmarkdown-files/ may help or again what i said in #58

Reply to this email directly or view it on GitHub.

Reply to this email directly or view it on GitHub
(#64 (comment)).

"=

@dmpe
Copy link
Contributor

dmpe commented Nov 6, 2015

cc @tomschenkjr

@tomschenkjr
Copy link
Contributor

Still reviewing this pull request. Some issues are showing-up in the process. Here are some quick notes so far:

  • read.socrata(url = "https://data.cityofchicago.org/Health-Human-Services/Food-Inspections/4ijn-s7e5") generates an error because of the rowCount line. Specifically, I'm trying to understand the grander concept around getMetadata(), which is trying to fetch a lot of information, but is using little of it. Does it make sense to just keep it simple to help troubleshooting?
  • read.socrata(url = "https://data.cityofchicago.org/Buildings/Building-Permits/ydr8-5enu") is rather FUBAR, but this is an existing problem. For a reason I cannot determine yet, the list is not being properly transformed into a data frame. Instead, it's being transformed into an ugly list.

There are some other issues, but am looking at these two first.

@marks
Copy link
Contributor Author

marks commented Nov 16, 2015

@tomschenkjr: I am seeing those errors as well but I am not confident they are part of this PR. I am getting those errors on the dev branch (which doesn't have this PR merged in). Do let me know how I can help though!

Thanks,

Mark

@tomschenkjr
Copy link
Contributor

Yeah, it's unrelated to the write functions you added so as an FYI at this point. Once I clear up those other issues I'll be looking at these new functions.

Sent by Outlookhttp://taps.io/outlookmobile for Android

On Mon, Nov 16, 2015 at 4:54 AM -0800, "Mark Silverberg" <notifications@github.commailto:notifications@github.com> wrote:

@tomschenkjrhttps://github.com/tomschenkjr: I am seeing those errors as well but I am not confident they are part of this PR. I am getting those errors on the dev branch (which doesn't have this PR merged in)

Thanks,

Mark

Reply to this email directly or view it on GitHubhttps://github.com//pull/64#issuecomment-157020153.


This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail (or the person responsible for delivering this document to the intended recipient), you are hereby notified that any dissemination, distribution, printing or copying of this e-mail, and any attachment thereto, is strictly prohibited. If you have received this e-mail in error, please respond to the individual sending the message, and permanently delete the original and any copy of any e-mail and printout thereof.

@dmpe
Copy link
Contributor

dmpe commented Nov 16, 2015

Any issue now can be profiled using https://github.com/rstudio/profvis.

devtools::install_github("rstudio/profvis")
library(profvis)
library(RSocrata)

profvis({
  g <- read.socrata(url = "https://data.cityofchicago.org/Health-Human-Services/Food-Inspections/4ijn-s7e5")
  print(g)
})

profvis({
  b <- read.socrata(url = "https://data.cityofchicago.org/Buildings/Building-Permits/ydr8-5enu")
  print(b)
})

For the second issue e.g. like this : socratalist

Specifically, I'm trying to understand the grander concept around getMetadata(), which is trying to fetch a lot of information, but is using little of it.

Yes, there are 2 ways: Either you use execute another query on the desired dataset with =Count(*) (getQueryRowCount does that) or you fetch that information from the socrata's undocumented columns.json view. Here, @marks can provide us some help and tell what is better and more realible (I guess the first one, otherwise there would be no such method). In ither case, we can use the first option only and provide the second (from me) as just another function for end-users.

@marks
Copy link
Contributor Author

marks commented Nov 16, 2015

@dmpe - i would suggest using the COUNT(*) as opposed to using columns.json. Documented metadata APIs coming soon!

@dmpe
Copy link
Contributor

dmpe commented Nov 16, 2015

Thanks, one issue solved. Some code changes will be needed. Assumption must be made that if from me then at earliest in Christmas break.

With the second issue, @marks can also try help us. (Not so helpful and not so nicely summarized and I am sorry for it.) Do you know why do some datasets (and btw. some not; see above) have an "invalid" json file, which may cause our issues. (well, I guess these were attempts to fix them: #19 #3 )

library(jsonlite)
df<-jsonlite::fromJSON("https://data.cityofchicago.org/resource/ydr8-5enu.json")

Cannot test it now, but as i remember it was failing at least some time ago.

@marks
Copy link
Contributor Author

marks commented Nov 17, 2015

@dmpe - could you help me understand this issue by sharing a list of datasets that are giving invalid JSON? I am very confident all JSON that we produce and put out fits the standard but it might not be exactly what jsonlite/R is expecting. I just need a little more info to be able to look into this more with the team. I will likely end up creating an issue at https://github.com/socrata/dev.socrata.com/issues/new if you want to go ahead and do that instead of posting here. Whatever is best for you!

@dmpe
Copy link
Contributor

dmpe commented Nov 17, 2015

Yeeh, code from #3 is blocked so i cannot test it anymore.
At least these are working, now. Most probably an error in our code.

df<-jsonlite::fromJSON("https://data.cityofchicago.org/resource/ydr8-5enu.json")
df1<-jsonlite::fromJSON("https://data.cityofchicago.org/resource/kn9c-c2s2.json")
df1<-jsonlite::fromJSON("https://data.cityofchicago.org/resource/4ijn-s7e5.json")
dfs1<-jsonlite::fromJSON("http://data.undp.org/resource/wxub-qc5k.json")

@dmpe
Copy link
Contributor

dmpe commented Nov 18, 2015

After spending huge amount of time on it, I start to think the error is comming somewhere from httr or curl package.
In fact, when I run code below, I got this:

gM3 <- getMetadata(url = "https://data.cityofchicago.org/resource/6zsd-86xi.json")
 Show Traceback

 Rerun with Debug
 Error in curl::curl_fetch_memory(url, handle = handle) : 
  Couldn't resolve host name 

But the same code for

gM2 <- getMetadata(url = "https://data.cityofboston.gov/resource/awu8-dc52")

is not an issue. WTF is it (with my PC) ?

@marks
Copy link
Contributor Author

marks commented Dec 23, 2015

@dmpe @tomschenkjr is there anything I can do to help keep this moving forward?

Appreciate all your hard work!

@tomschenkjr
Copy link
Contributor

I think we have a bit of scope creep in this PR, but I'm going to continue to evaluate the parts of it to clean-up the errors. Hope to do it this week.

Tom Schenk Jr.

Chief Data Officer

Department of Innovation and Technology

City of Chicago

(312) 744-2770

tom.schenk@cityofchicago.org

data.cityofchicago.org


From: Mark Silverberg notifications@github.com
Sent: Wednesday, December 23, 2015 12:57 PM
To: Chicago/RSocrata
Cc: Schenk, Tom
Subject: Re: [RSocrata] PR for feature #58 (access private datasets and be able to upsert and full replace datasets) (#64)

@dmpehttps://github.com/dmpe @tomschenkjrhttps://github.com/tomschenkjr is there anything I can do to help keep this moving forward?

[https://avatars3.githubusercontent.com/u/1799661?v=3&s=400]https://github.com/dmpe

dmpe (John Malc) · GitHubhttps://github.com/dmpe
github.com
dmpe has 26 repositories written in R, Java, and CSS. Follow their code on GitHub.

Appreciate all your hard work!

Reply to this email directly or view it on GitHubhttps://github.com//pull/64#issuecomment-166970369.


This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail (or the person responsible for delivering this document to the intended recipient), you are hereby notified that any dissemination, distribution, printing or copying of this e-mail, and any attachment thereto, is strictly prohibited. If you have received this e-mail in error, please respond to the individual sending the message, and permanently delete the original and any copy of any e-mail and printout thereof.

dmpe added a commit to dmpe/RSocrata that referenced this pull request Dec 24, 2015
…ow splitted.

update description file & squash all the commits
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants