Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws-s3 coming blank on site and amazons3 not coming , so combining them #825

Merged
merged 1 commit into from May 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 0 additions & 25 deletions docs/connectors/amazons3.md

This file was deleted.

23 changes: 23 additions & 0 deletions docs/connectors/aws-s3.md
@@ -1,2 +1,25 @@
# AWS S3

1. Set a bucket e.g. zingg28032023 and a folder inside it e.g. zingg

2. Create aws access key and export via env vars (ensure that the user with below keys has read/write access to above):

export AWS_ACCESS_KEY_ID=<access key id>
export AWS_SECRET_ACCESS_KEY=<access key>

(if mfa is enabled AWS_SESSION_TOKEN env var would also be needed )

3. Download hadoop-aws-3.1.0.jar and aws-java-sdk-bundle-1.11.271.jar via maven

4. Set above in zingg.conf :
spark.jars=/<location>/hadoop-aws-3.1.0.jar,/<location>/aws-java-sdk-bundle-1.11.271.jar

5. Run using:

./scripts/zingg.sh --phase findTrainingData --properties-file config/zingg.conf --conf examples/febrl/config.json --zinggDir s3a://zingg28032023/zingg
./scripts/zingg.sh --phase label --properties-file config/zingg.conf --conf examples/febrl/config.json --zinggDir s3a://zingg28032023/zingg
./scripts/zingg.sh --phase train --properties-file config/zingg.conf --conf examples/febrl/config.json --zinggDir s3a://zingg28032023/zingg
./scripts/zingg.sh --phase match --properties-file config/zingg.conf --conf examples/febrl/config.json --zinggDir s3a://zingg28032023/zingg

6. Models etc. would get saved in
Amazon S3 > Buckets > zingg28032023 >zingg > 100