- Learn how to ingest XDM-formatted data files
- Learn the API interface
- Sample data .parquet file in Profile XDM format: ProfileDataSample.parquet
- Ingest and confirm a .parquet file in the Experience Platform API
While the UI and its visualization tools are useful for analysts and architects, users like the Data Engineer ("Joe") and the App Developer ("Adam") are likely to also use the Adobe Experience Cloud APIs to ingest data. This covers not just batch ingestion of files, but also point-to-point connectors that transfer or stream data from the loyalty program's sign-up forms, sign-in details from the login screen, and transactions from logged-in users on the company's website. We will implement these data ingestion and lookup functions using the API and demonstrate the tools available for integrating with Experience Platform. When developing integrations between Experience Platform and the company's touch points for its customer loyalty program (sign-up forms, information changes, transaction history, web interactions), the APIs provide an automated way to establish new basis profiles or to bring additional customer data into an existing basis profile.
Once customer profile data exists within the platform, integrations with the company's website, as well as internal data science needs, will require the ability to query individual profiles on a point-to-point basis and to filter and select segments (also covered in later steps). We will use the APIs to look up a customer's profile or to filter through data.
- Building on the tasks we completed in Chapter 6, we'll create a dataset using our custom schema type and ingest some data.
- Start by expanding the `Chapter 7` & `List Datasets` folder in Postman under `Adobe Experience Platform`.
- Skip over `Catalog: Get Datasets` and instead select the `Catalog: Get Datasets Limit 5` action and click `Send`. In the response pane you can scroll down and see the top five datasets. We limit the number of datasets returned to five so that the query returns quickly and does not process too much data.
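Outside Postman, the same limited listing can be expressed in a few lines of Python. This is a minimal sketch, not the collection's actual request: the base URL, the `limit` query parameter, and the header names are assumptions taken from Adobe's public Catalog Service documentation, and the credentials are placeholders. The request is only built and printed here; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL for the Catalog Service; confirm it against your Postman environment.
CATALOG_BASE = "https://platform.adobe.io/data/foundation/catalog"


def build_catalog_request(resource, access_token, api_key, org_id, **params):
    """Build (but do not send) a GET request for a Catalog resource."""
    query = f"?{urlencode(params)}" if params else ""
    return Request(
        f"{CATALOG_BASE}/{resource}{query}",
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-api-key": api_key,
            "x-gw-ims-org-id": org_id,
        },
        method="GET",
    )


# Placeholder credentials; substitute the values from your own environment.
request = build_catalog_request(
    "dataSets", "ACCESS_TOKEN", "API_KEY", "ORG_ID@AdobeOrg", limit=5
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would then give the same JSON that Postman shows in the response pane.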
- Next let's run the `Catalog: Get Batches` action by selecting it and clicking `Send`. Datasets can consist of multiple batches.
- Finally let's run the `Catalog: Get Dataset Files` action by selecting it and clicking `Send`. In this case we'll get a list of files in the dataset, and the metadata for each file will include which batch it is from. Now that we've learned how to query datasets, let's get busy creating one.
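To make the dataset, batch, and file relationship concrete, here is a small sketch that groups the entries of a dataset-files response by the batch they came from. It assumes the response is a JSON object keyed by file ID and that each entry carries a `batchId` field; that shape, the `dataSetViewId` field, and all of the sample IDs are hypothetical, so check them against what Postman actually returns.

```python
from collections import defaultdict


def files_by_batch(dataset_files):
    """Group file IDs by the batch that produced them."""
    grouped = defaultdict(list)
    for file_id, metadata in dataset_files.items():
        # Entries without a batchId (an assumed field name) land under "unknown".
        grouped[metadata.get("batchId", "unknown")].append(file_id)
    return dict(grouped)


# Hypothetical response body; every ID below is made up for illustration.
sample_response = {
    "file-001": {"batchId": "batch-A", "dataSetViewId": "view-1"},
    "file-002": {"batchId": "batch-A", "dataSetViewId": "view-1"},
    "file-003": {"batchId": "batch-B", "dataSetViewId": "view-1"},
}

for batch_id, file_ids in files_by_batch(sample_response).items():
    print(batch_id, file_ids)
```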
- Expand the `Create Dataset` folder in Postman under `Chapter 7 - Ingest the Data`, select the `Catalog: Create Dataset` action and click `Send`. The call will create a new dataset and return an ID we can use in future calls. The `"unifiedProfile": ["enabled:true"]` flag within the body ensures that this dataset is automatically included in the Unified Profile Service, which we detailed in Chapter 5. Remember that the dataset is based on the schema you select, such as Profile.
Once created it will conceptually look like this:
- Next we'll call `Catalog: Get Dataset` to inspect the dataset. In the response area of the Postman call you can view the dataset's metadata.
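For readers who want to see what a create-dataset body might contain, the sketch below assembles one in Python. Only the `"unifiedProfile": ["enabled:true"]` flag comes from this chapter; nesting it under `tags`, the `schemaRef` layout, the content type string, and the example name and schema ID are assumptions for illustration, so compare them with the body in the Postman collection before relying on them.

```python
import json


def build_dataset_payload(name, schema_id, enable_profile=True):
    """Return a create-dataset body as a plain dict."""
    payload = {
        "name": name,
        # Assumed layout for pointing the dataset at its XDM schema.
        "schemaRef": {
            "id": schema_id,
            "contentType": "application/vnd.adobe.xed+json;version=1",
        },
    }
    if enable_profile:
        # The flag described above; placing it under "tags" is an assumption.
        payload["tags"] = {"unifiedProfile": ["enabled:true"]}
    return payload


# Hypothetical dataset name and schema ID placeholder.
body = build_dataset_payload("Loyalty Profile Dataset", "SCHEMA_ID")
print(json.dumps(body, indent=2))
```

Building the body from a function keeps the profile flag in one place, so a dataset that should stay out of the Unified Profile Service only needs `enable_profile=False`.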
- Now that we've successfully created a dataset, we'll create a batch so we can start ingesting files. Expand the `Create Batch` folder, select `Ingest: Create Batch` and click `Send`.
- Next we'll upload a file to the batch. Select `Ingest: Upload File`, then click on the `Body` tab.
- From there you'll need to select a file to upload by clicking on `Choose Files` and selecting ProfileDataSample.parquet. Then click `Send` to upload the file. If the upload succeeds you won't see anything in the response section other than a 200 OK.
- Since batches may contain multiple files, we have to make an additional call to close the batch and indicate that it is now ready for processing. Select the `Ingest: Close Batch` action and click `Send`. Once again you won't see anything in the response section of Postman other than a 200 OK.
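The three batch steps above (create, upload, close) can be sketched as three request builders. This is an illustration under assumptions rather than the collection's exact calls: the base URL, paths, the `inputFormat` body, and the `action=COMPLETE` query string follow Adobe's public batch ingestion documentation as best we understand it, and the IDs and credentials are placeholders. The requests are built and printed but never sent.

```python
import json
from urllib.request import Request

# Assumed base URL for the batch ingestion API.
IMPORT_BASE = "https://platform.adobe.io/data/foundation/import"


def create_batch_request(dataset_id, headers):
    """Step 1: open a new batch against a dataset."""
    body = {"datasetId": dataset_id, "inputFormat": {"format": "parquet"}}
    return Request(
        f"{IMPORT_BASE}/batches",
        data=json.dumps(body).encode("utf-8"),
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )


def upload_file_request(batch_id, dataset_id, file_name, file_bytes, headers):
    """Step 2: upload one file into the open batch."""
    return Request(
        f"{IMPORT_BASE}/batches/{batch_id}/datasets/{dataset_id}/files/{file_name}",
        data=file_bytes,
        headers={**headers, "Content-Type": "application/octet-stream"},
        method="PUT",
    )


def close_batch_request(batch_id, headers):
    """Step 3: mark the batch complete so processing can start."""
    return Request(
        f"{IMPORT_BASE}/batches/{batch_id}?action=COMPLETE",
        headers=headers,
        method="POST",
    )


# Placeholder credentials and IDs; a real run would read the file from disk.
auth = {
    "Authorization": "Bearer ACCESS_TOKEN",
    "x-api-key": "API_KEY",
    "x-gw-ims-org-id": "ORG_ID@AdobeOrg",
}
steps = [
    create_batch_request("DATASET_ID", auth),
    upload_file_request(
        "BATCH_ID", "DATASET_ID", "ProfileDataSample.parquet", b"", auth
    ),
    close_batch_request("BATCH_ID", auth),
]
for step in steps:
    print(step.get_method(), step.full_url)
```

Keeping the close call separate mirrors the reason given above: a batch can hold several files, so the upload builder may be used more than once before the batch is closed.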
- If we've done everything right up to this point, the file and batch should be successfully ingested. To check on that we'll open the `Batch Status` folder, select the `Batch by ID` action and click `Send`. In the response section pay close attention to the value of `status`. We want it to say `success`, but if it says `processing` it just means we have to wait a bit and make this Postman request again. If the response instead shows a `failed` status, the ingested file was most likely not in the correct format for that dataset.
- Once we've seen that the file in the batch has been successfully ingested, we can check how many files are in the batch. Select the `Files in Batch` action and click `Send`. The response pane will show you how many files are in this batch. In our case there should be only one file.
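Re-sending the batch status request by hand until it stops saying `processing` is easy to automate. The sketch below polls a caller-supplied function until it reports `success` or `failed`. The function name `wait_for_batch`, its parameters, and the canned status sequence are hypothetical; in a real script `fetch_status` would call the batch endpoint and return the `status` field.

```python
import time


def wait_for_batch(fetch_status, interval_seconds=5.0, max_attempts=20,
                   sleep=time.sleep):
    """Poll until the batch reports success or failed, or attempts run out."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in ("success", "failed"):
            return status
        # Any other status (for example "processing") means: wait and retry.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"batch not finished after {max_attempts} checks")


# Stand-in for the real API call: "processing" twice, then "success".
canned = iter(["processing", "processing", "success"])
result = wait_for_batch(lambda: next(canned), sleep=lambda seconds: None)
print(result)  # prints "success"
```

Passing `sleep` as a parameter keeps the example instant to run; leaving it at the default makes the loop pause between checks.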
- Let's walk back up the containers and check on the status of our dataset. Expand the `Dataset Status` folder, select the `Catalog: Get Dataset` action and click `Send`. Scroll down in the response pane until you find the `lastBatchStatus` property, which should read `success`.
- Things are looking good for our dataset, but what if we want more information about the files stored in it? Open the `Data Access` folder, select the `Data Export: Get File Details` action and hit `Send`. This response contains important metadata like the file name and file size.
- Next, let's take a look at the file stored in data services. Select the `Data Export: Get File` action and hit `Send`. In this case the response is not that useful to us, as .parquet files are binary and difficult to read. However, it should make customers happy to know that whatever data they import into data services can be exported as well.
- Finally, let's use the preview API to get a human-readable version of the dataset we've just created. Select the `Data Export: Get Dataset Preview` action and hit `Send`. This response returns JSON data, a text format that can be read without any additional programs to do the formatting. Note: preview will only show you the results of the latest batch to be ingested into that dataset.
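The three export calls used in the last steps differ only in their URLs, which the sketch below spells out. The base URL, the `path` query parameter, and the `/preview` path are assumptions based on Adobe's public Data Access documentation and may not match the Postman collection exactly; `FILE_ID` and `DATASET_ID` are placeholders.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for the Data Access (export) API.
EXPORT_BASE = "https://platform.adobe.io/data/foundation/export"


def file_details_url(file_id):
    """URL for file metadata such as name and size."""
    return f"{EXPORT_BASE}/files/{quote(file_id, safe='')}"


def file_download_url(file_id, file_name):
    """URL for the raw (binary) contents of one named file."""
    return f"{file_details_url(file_id)}?{urlencode({'path': file_name})}"


def dataset_preview_url(dataset_id):
    """URL for a JSON preview of the dataset's latest batch."""
    return f"{EXPORT_BASE}/datasets/{quote(dataset_id, safe='')}/preview"


print(file_details_url("FILE_ID"))
print(file_download_url("FILE_ID", "ProfileDataSample.parquet"))
print(dataset_preview_url("DATASET_ID"))
```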
Well that was a lot of work to ingest the data. Let's move on to the next step which is querying the data and targeting specific users.
Catalog Service RESTful API Resource
Previous: | Next: |
---|---|
Chapter 6 - API - Schema: Explore and Define XDM Schema | Chapter 8 - Technical: Querying: Using the Profile Query Service |
Return Home: Workbook Index