docs: update readme partials (#6)
anguillanneuf committed Jan 15, 2021
1 parent 500c2dc commit f9cde0a
Showing 1 changed file with 33 additions and 60 deletions.
93 changes: 33 additions & 60 deletions .readme-partials.yaml
custom_content: |
## Requirements
### Enable the Pub/Sub Lite API
Follow [these instructions](https://cloud.google.com/pubsub/lite/docs/quickstart#before-you-begin).
### Creating a new subscription or using an existing subscription
Follow [these instructions](https://cloud.google.com/pubsub/lite/docs/quickstart#create_a_lite_subscription) to create a new subscription or use an existing subscription. If using an existing subscription, the connector will read from the oldest unacknowledged message in the subscription.
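For reference, here is a minimal sketch of creating a Lite subscription with the Pub/Sub Lite Python client library; the project number, zone, and resource IDs are placeholders, and the exact client API should be confirmed against the client library documentation.
```python
from google.cloud.pubsublite import AdminClient, Subscription
from google.cloud.pubsublite.types import CloudRegion, CloudZone, SubscriptionPath, TopicPath

# All identifiers below are illustrative placeholders.
project_number = 123456789
location = CloudZone(CloudRegion("us-central1"), "a")

subscription = Subscription(
    name=str(SubscriptionPath(project_number, location, "test-spark-subscription")),
    topic=str(TopicPath(project_number, location, "test-spark-topic")),
    delivery_config=Subscription.DeliveryConfig(
        # Deliver messages immediately, without waiting for them to be stored.
        delivery_requirement=Subscription.DeliveryConfig.DeliveryRequirement.DELIVER_IMMEDIATELY,
    ),
)

# The admin client is scoped to the region that owns the zone above.
admin_client = AdminClient(CloudRegion("us-central1"))
created = admin_client.create_subscription(subscription)
print(created.name)
```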
### Creating a Google Cloud Dataproc cluster (Optional)
If you do not have an Apache Spark environment, you can create a [Cloud Dataproc](https://cloud.google.com/dataproc/docs) cluster with pre-configured auth. The following examples assume you are using Cloud Dataproc, but you can use `spark-submit` on any cluster.
```
MY_CLUSTER=...
gcloud dataproc clusters create "$MY_CLUSTER"
```
## Downloading and Using the Connector
<!--- TODO(jiangmichael): Add jar link for spark-pubsublite-latest.jar -->
The latest version of the connector (Scala 2.11) will be publicly available in `gs://spark-lib/pubsublite/spark-pubsublite-latest.jar`.
<!--- TODO(jiangmichael): Release on Maven Central and add Maven Central link -->
The connector will also be available from the Maven Central repository. It can be included using the `--packages` option or the `spark.jars.packages` configuration property with the artifact below.
| Scala version | Connector Artifact |
| --- | --- |
| Scala 2.11 | `com.google.cloud.pubsublite.spark:pubsublite-spark-sql-streaming-with-dependencies_2.11:0.1.0` |
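For example, a minimal sketch of pulling the artifact in through `spark.jars.packages` when building a SparkSession in PySpark (the application name is an arbitrary placeholder):
```python
from pyspark.sql import SparkSession

# The artifact and its dependencies are resolved from Maven Central at session start-up.
spark = SparkSession.builder \
    .appName("pubsublite-spark-example") \
    .config(
        "spark.jars.packages",
        "com.google.cloud.pubsublite.spark:pubsublite-spark-sql-streaming-with-dependencies_2.11:0.1.0",
    ) \
    .getOrCreate()
```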
<!--- TODO(jiangmichael): Add example code and brief description here -->
## Usage
### Reading data from Pub/Sub Lite
```python
df = spark.readStream \
  .option("pubsublite.subscription", "projects/$PROJECT_NUMBER/locations/$LOCATION/subscriptions/$SUBSCRIPTION_ID") \
  .format("pubsublite") \
  .load()
```
Note that the connector supports both MicroBatch Processing and [Continuous Processing](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#continuous-processing).
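For instance, a minimal sketch that starts the stream with a continuous trigger and writes to the console sink; the trigger interval and the sink are illustrative choices, not requirements of the connector:
```python
# Omit .trigger(continuous=...) to use the default micro-batch mode instead.
query = df.writeStream \
    .format("console") \
    .outputMode("append") \
    .trigger(continuous="1 second") \
    .start()

query.awaitTermination(60)  # let the query run for up to 60 seconds in this sketch
query.stop()
```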
| publish_timestamp | TimestampType | |
| event_timestamp | TimestampType | Nullable |
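As an illustration of working with the timestamp fields listed above (a sketch; the `effective_timestamp` column name is chosen here for the example):
```python
from pyspark.sql import functions as F

# event_timestamp is nullable, so fall back to publish_timestamp when it is absent.
with_ts = df.withColumn(
    "effective_timestamp",
    F.coalesce(F.col("event_timestamp"), F.col("publish_timestamp")),
)
```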
## Compiling with the connector
To include the connector in your project:
### Maven
```xml
<dependency>
<groupId>com.google.cloud.pubsublite.spark</groupId>
<artifactId>pubsublite-spark-sql-streaming-with-dependencies_2.11</artifactId>
<version>0.1.0</version>
</dependency>
```
### SBT
```sbt
libraryDependencies += "com.google.cloud.pubsublite.spark" % "pubsublite-spark-sql-streaming-with-dependencies_2.11" % "0.1.0"
```
## Building the Connector
The connector is built using Maven. The following command creates a JAR file with shaded dependencies:
```sh
mvn package
```
## FAQ
### What is the cost for Pub/Sub Lite?
See the [Pub/Sub Lite pricing documentation](https://cloud.google.com/pubsub/lite/pricing).
### Can I configure the number of Spark partitions?
No, the number of Spark partitions is set to be the number of Pub/Sub Lite partitions of the topic that the subscription is attached to.
### How do I authenticate outside Google Compute Engine / Cloud Dataproc?
Use a service account JSON key and `GOOGLE_APPLICATION_CREDENTIALS` as described [here](https://cloud.google.com/docs/authentication/getting-started).
Credentials can also be provided directly with the `gcp.credentials.key` option; the key must be passed in as a base64-encoded string.
Example:
```java
spark.readStream.format("pubsublite").option("gcp.credentials.key", "<SERVICE_ACCOUNT_JSON_IN_BASE64>")
```
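For example, a sketch of producing that base64 string from a downloaded service account key file and passing it to the reader; the key path is a placeholder:
```python
import base64

# Read the downloaded service account JSON key and base64-encode it.
with open("/path/to/service-account-key.json", "rb") as f:
    encoded_key = base64.b64encode(f.read()).decode("utf-8")

df = spark.readStream \
    .format("pubsublite") \
    .option("gcp.credentials.key", encoded_key) \
    .option("pubsublite.subscription", "projects/$PROJECT_NUMBER/locations/$LOCATION/subscriptions/$SUBSCRIPTION_ID") \
    .load()
```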
