Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Support Analytics External Data API #14

Open
vancelopez opened this issue Jan 4, 2019 · 5 comments
Open

Feature Request: Support Analytics External Data API #14

vancelopez opened this issue Jan 4, 2019 · 5 comments
Assignees
Labels
enhancement Feature request
Milestone

Comments

@vancelopez
Copy link

Salesforce.com documentation.

Would be great to be able to add Analytics datasets via API.

Particular use case: pull data from onsite SQL Server, transformations, push via salesforcer to Analytics.

@StevenMMortimer StevenMMortimer self-assigned this Jan 4, 2019
@StevenMMortimer StevenMMortimer added the enhancement Feature request label Jan 4, 2019
@StevenMMortimer StevenMMortimer added this to the Future milestone Jan 4, 2019
@StevenMMortimer
Copy link
Owner

@vancelopez Thanks for the request. I'm not very familiar with the Analytics External Data API. It looks like a pretty straightforward HTTP request of a CSV file with JSON header. Do you have an example of the type of CSV data you'd like to upload? I'm not sure what the type of file you want to upload would look like. Thanks.

@vancelopez
Copy link
Author

Thanks, @StevenMMortimer for the quick reply. The CSVs I upload (via web interface) are basic datasets (i.e. iris) that vary in size. The largest I upload is about 1 Gb. It looks like the External Data API wants the data in 10 mb chunks. Also, I believe Salesforce requires at least one "dimension" on external data, so there is a requirement for a character variable.

The metadata JSON could be created programmatically referencing Analytics External Data Format. However, it would also be nice to have the option to upload a user defined JSON file - I've done that with the large dataset because there are over 50 columns and so it becomes cumbersome to define datatypes using the point and click interface.

@StevenMMortimer
Copy link
Owner

StevenMMortimer commented Jan 5, 2019

@vancelopez I did some testing and realized that you can already use the salesforcer package to push data to the Analytics External Data API because it involves just creating a few records as you would normally do with any other object in Salesforce. Below is an example for you to follow.

I'll continue working on creating a specific function in salesforcer that will reduce this routine to a single function that will also do the more exciting things like inferring the data structure's JSON and stuff like that. However, this should be enough for you to start uploading your own data via the API. Let me know if you have any issues.

# load libraries that are required to run this example
library(salesforcer) # needed to interact with the Salesforce APIs
library(nycflights13) # package with a large dataset to test with
library(dplyr) # needed for bind_rows()
library(readr) # needed to write the csv out
library(iotools) # needed to process and create the binary files in <10MB chunks
library(base64enc) # needed to encode those data parts to be sent across the API

# authenticate to Salesforce as you normally would
sf_auth(username = username,
        password = password,
        security_token = security_token,
        login_url = 'https://login.salesforce.com/')

# insert a row into the InsightsExternalData
input_data <- c("Format"="Csv", 
                "EdgemartAlias"="flights", 
                #"MetadataJson"=metadataJson, # FUTURE ENHANCEMENT - CURRENTLY SUPPLY YOUR OWN JSON USING THE jsonlite PACKAGE and encode it using base64encode
                "Operation"="Overwrite", 
                "Action"="None")
ied_header_record <- sf_create(input_data, object_name="InsightsExternalData")

# write the data as a CSV into a temp file
f <- tempfile()
write_csv(flights, f)

# process the CSV file into chunks of < 10MB each and convert them to base64 encoded strings
data_chunks <- chunk.apply(input = f, 
                           CH.MERGE = list,
                           CH.MAX.SIZE = 7000000, #10485760 resulted in ~13mb chunks so use smaller
                           FUN = base64encode)

# for each chunk of the dataset insert it into the InsightsExternalDataPart object
# using the part number and the Id of the InsightsExternalData record we created initially
all_parts <- NULL
for(i in 1:length(data_chunks)){
  input_data <- c("DataFile" = data_chunks[[i]][1],
                  "InsightsExternalDataId" = ied_header_record$id, 
                  "PartNumber" = i)
  this_part <- sf_create(input_data, object_name="InsightsExternalDataPart")
  all_parts <- bind_rows(all_parts, this_part)
}

# stop if not all the parts were successfully uploaded
stopifnot(all(all_parts$success))

# now start the processing of the dataset by Salesforce
input_data <- c("Id"=ied_header_record$id, "Action"="Process")
ied_header_record_start <- sf_update(input_data, object_name="InsightsExternalData")

# keep checking back until the Status="Completed"
sf_retrieve(ied_header_record$id, fields=c("Status","StatusMessage"), object_name="InsightsExternalData")

Here is a screenshot of what the flights dataset looks like inside the Analytics dashboard
image

@vancelopez
Copy link
Author

Incredible. Thanks @StevenMMortimer. I will test.

@StevenMMortimer StevenMMortimer modified the milestones: Future, 0.1.3 Jan 6, 2019
@vancelopez
Copy link
Author

vancelopez commented Jan 10, 2019

@StevenMMortimer Successfully uploaded my own data to Analytics using your example. Thanks for the work on this!

I already had a metadata JSON text file, so included it in the header record with:
metadatajson <- base64encode::base64encode("C:/...")

I also tried first reading in the JSON using read_json(), then converting back with base64encode(toJSON(jsontext)) but it returned an error, "Invalid data in metadata property". I have not worked much with jsonlite.

@StevenMMortimer StevenMMortimer modified the milestones: 0.1.3, Future May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature request
Projects
None yet
Development

No branches or pull requests

2 participants