Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: multiple buckets / providers support (#48) #177

Merged
merged 7 commits into from Jul 20, 2021

Conversation

hbollon
Copy link
Member

@hbollon hbollon commented Jun 24, 2021

As the title shown, I've added the multiple buckets / providers support into Terraboard.
It's now possible to set multiples S3/MinIO buckets or to mix providers (to have AWS, Gitlab and GCP simultaneously for example), Terraboard will fetch them all on db refresh.

However:

  • To use this functionality, the only configuration method usable is through Yaml config file. Env and flags are still usable for mono bucket configuration (maybe we can use them for multiple providers but it's not appropriated imo)
  • Atm, Terraboard is dependent of AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env variables for AWS provider, which prevents being able to have several backend AWS or others compatible providers like MinIO (we can still have several buckets under the same backend). We could consider making these values configurable via the yaml file however (@raphink what do you think?).

I tested that with MinIO/AWS but not with others providers since I don't have any account for them.
The terraboard-dev-multi-providers service in the test/ docker-compose is configured to use two MinIO bucker (just replace terraboard-dev by it in the compose file)
I also updated the ReadMe with some instructions / advices.

Should meets the needs of #48 😄

@coveralls
Copy link

coveralls commented Jun 24, 2021

Coverage Status

Coverage decreased (-0.4%) to 57.412% when pulling f3a13f8 on hbollon:multi-bucket-support into ad88878 on camptocamp:master.

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated
In the case of AWS, don't forget to set **AWS_ACCESS_KEY_ID** and **AWS_SECRET_ACCESS_KEY** environment variables.

That's it! Terraboard will now fetch these two buckets on DB refresh. You can also mix providers like AWS and Gitlab or anything else.
You can find a ready-to-use docker example with two *MinIO* buckets in the `test/` sub-folder (just swipe the two terraboard services in the docker-compose file).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can find a ready-to-use docker example with two *MinIO* buckets in the `test/` sub-folder (just swipe the two terraboard services in the docker-compose file).
You can find a ready-to-use Docker example with two *MinIO* buckets in the `test/` sub-folder (just swipe the two terraboard services in the docker-compose file).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "swipe the services"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "swipe the services"?

I added in the docker-compose of /test, a Terraboard container configured to use a YAML config file and two MinIO buckets.
However, I commented it in favor of the Terraboard lambda container to avoid having two applications simultaneously.

services:
  terraboard-dev:
    build:
      context: ../
      dockerfile: ./Dockerfile
    environment:
      AWS_ACCESS_KEY_ID: root
      AWS_SECRET_ACCESS_KEY: mypassword
      AWS_BUCKET: test-bucket
      AWS_REGION: eu-west-1
      AWS_ENDPOINT: http://minio:9000/
      AWS_FORCE_PATH_STYLE: "true"
      TERRABOARD_LOG_LEVEL: debug
      TERRABOARD_NO_LOCKS: "true"
      TERRABOARD_NO_VERSIONING: "true"
      DB_PASSWORD: mypassword
      DB_SSLMODE: disable
      GODEBUG: netdns=go
    depends_on:
      db:
        condition: service_healthy
      minio:
        condition: service_started
    volumes:
      - ../static:/static:ro
    ports:
      - "8080:8080"

  # terraboard-dev-multi-providers:
  #   build:
  #     context: ../
  #     dockerfile: ./Dockerfile
  #   environment:
  #     AWS_ACCESS_KEY_ID: root
  #     AWS_SECRET_ACCESS_KEY: mypassword
  #     DB_SSLMODE: disable
  #     CONFIG_FILE: config/config.yml
  #     GODEBUG: netdns=go
  #   depends_on:
  #     db:
  #       condition: service_healthy
  #     minio:
  #       condition: service_started
  #   volumes:
  #     - ../static:/static:ro
  #     - ./config.yml:/config/config.yml:ro
  #   ports:
  #     - "8081:8080"

I'm not really happy with that, maybe we can split the test sub-folder into multiple config folders each with a docker-compose 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so you mean "swap" then, not "swipe" :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes 😆

Comment on lines +158 to +159
c := initDefaultConfig()
parseStructFlagsAndEnv(&c)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for keeping the env / flags configuration possible with single provider Terraboard.
Indeed, now we have providers array in the config struct:

AWS []AWSConfig `group:"AWS Options" yaml:"aws"`

TFE []TFEConfig `group:"Terraform Enterprise Options" yaml:"tfe"`

GCP []GCPConfig `group:"Google Cloud Platform Options" yaml:"gcp"`

Gitlab []GitlabConfig `group:"GitLab Options" yaml:"gitlab"`

And so they don't have any elements in them on Config instanciation which prevent from using env variables or flags.
c := initDefaultConfig() add an empty struct in each of them and parse go-flags stuff in them. If they're still empty or invalid they will be ignored but with that process we can still use Terraboard with others configuration methods as before (although they are incompatible with multi buckets).

parseStructFlagsAndEnv(interface{}) is just a DRY function for go-flags parsing

main.go Outdated
continue
}
log.Debugf("Total providers: %d\n", len(sps))
for i, sp := range sps {
Copy link
Contributor

@raphink raphink Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping inside refreshDB() means providers will be parsed sequentially. Maybe we could parallelize providers instead by looping ouside the call to refreshDB() in main() and keep refreshDB() the way it is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea that I should have thought of!
This will allow to not interrupt all the fetching processes if one of them has an error and reduce the time that the refresh takes
I will do that

hbollon added a commit to hbollon/terraboard that referenced this pull request Jun 25, 2021
@hbollon
Copy link
Member Author

hbollon commented Jun 25, 2021

I've done the following changes:

  • RefreshDB is now parallelized
  • I added mutex lock security on sensible db operations (FirstOrCreate principally, which lead to errors without locks)
  • I split test directory into multiples sub-folder containing different configurations
  • I fixed all ReadMe typo issues

hbollon added a commit to hbollon/terraboard that referenced this pull request Jun 25, 2021
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
db/db.go Show resolved Hide resolved
db/db.go Outdated
err = db.FirstOrCreate(&lineage, types.Lineage{Value: sf.Lineage}).Error
if err != nil || lineage.ID == 0 {
log.Error("Unknown error in stateS3toDB during lineage finding")
log.Errorf("Unknown error in stateS3toDB during lineage finding: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be interesting to use a structured log here (with fields).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

hbollon added a commit to hbollon/terraboard that referenced this pull request Jun 25, 2021
@Shocktrooper
Copy link

As the title shown, I've added the multiple buckets / providers support into Terraboard.
It's now possible to set multiples S3/MinIO buckets or to mix providers (to have AWS, Gitlab and GCP simultaneously for example), Terraboard will fetch them all on db refresh.

However:

  • To use this functionality, the only configuration method usable is through Yaml config file. Env and flags are still usable for mono bucket configuration (maybe we can use them for multiple providers but it's not appropriated imo)
  • Atm, Terraboard is dependent of AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env variables for AWS provider, which prevents being able to have several backend AWS or others compatible providers like MinIO (we can still have several buckets under the same backend). We could consider making these values configurable via the yaml file however (@raphink what do you think?).

I tested that with MinIO/AWS but not with others providers since I don't have any account for them.
The terraboard-dev-multi-providers service in the test/ docker-compose is configured to use two MinIO bucker (just replace terraboard-dev by it in the compose file)
I also updated the ReadMe with some instructions / advices.

Should meets the needs of #48 😄

With AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY being needed does this prevent multiple cross account S3 buckets from being selected? Would they all have to be in the same account and is there work to migrate over to the AWS SDK or something that can automatically handle credentials automatically instead of relying on hardcoded environment variables

@hbollon
Copy link
Member Author

hbollon commented Jul 1, 2021

With AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY being needed does this prevent multiple cross account S3 buckets from being selected?

Actually, with this PR, yes all AWS buckets must be on the same account.

is there work to migrate over to the AWS SDK or something that can automatically handle credentials automatically instead of relying on hardcoded environment variables

This is exactly what I proposed to @raphink 😃

* allow aws multiple buckets to be configured directly in s3 yaml field (now an objects array)
* add another minio instance to multiple-minio-buckets testing env
* update all yaml example config files
@hbollon
Copy link
Member Author

hbollon commented Jul 5, 2021

I've added:

  • AWS access key and secret access key config fields and aws sdk appropriate usage to support multiple AWS accounts configuration
  • AWS Session Token support (should do the trick for feat: support aws_session_token #181)
  • The ability to directly configure multiple buckets for the same AWS account under the "S3" field of the YAML file which is now an objects array

I also updated unit test and docker testing env to match with new config (I also add a second minio container in the multiple-minio-buckets test env)
Like previously, I made sure that the config via environment variables or by flags is always possible for the instances of Terraboard with a single provider

Example of a YAML config file:

aws:
  - access-key: root
    secret-access-key: mypassword
    endpoint: http://minio:9000/
    region: eu-west-1
    s3:
      - bucket: test-bucket
        force-path-style: true
        file-extension: 
          - .tfstate
      - bucket: test-bucket2
        force-path-style: true
        file-extension: 
          - .tfstate

  - access-key: admin
    secret-access-key: password
    endpoint: http://minio-2:9000/
    region: eu-west-1
    s3:
      - bucket: test-bucket
        force-path-style: true
        file-extension: 
          - .tfstate

@mcanevet mcanevet added this to the v2.0.0-alpha1 milestone Jul 7, 2021
config/yaml.go Show resolved Hide resolved
state/aws.go Outdated
sess := session.Must(session.NewSession())

awsConfig := aws_sdk.NewConfig()
func NewAWS(c *config.Config) []*AWS {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it'd be easier to keep the existing NewAWS() method as it is and add a NewAWSCollection() wrapper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the iteration in NewAWS()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and also because generally I'd expect an instantiation method to return a single instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I will do that for AWS but also for all others providers
Each one will have a NewProvider() function wich return a single instance and a NewProviderCollection() one which will iterate and call the other one. (ofc in functions names Provider will be replaced by his name)

README.md Show resolved Hide resolved
@Shocktrooper
Copy link

Is there since there was mention of aws sdk usage of credentials is there an example of how to tell terraboard to assume a role in another account or how to tell it to use an instance profile in the case of S3 buckets

README.md Outdated
1. Environment variables
2. CLI parameters
3. Configuration file (YAML). A configuration file example can be found in the root directory of this repository.
1. Environment variables **(only usable for mono provider configuration)**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. Environment variables **(only usable for mono provider configuration)**
1. Environment variables **(only usable for single provider configuration)**

README.md Outdated
2. CLI parameters
3. Configuration file (YAML). A configuration file example can be found in the root directory of this repository.
1. Environment variables **(only usable for mono provider configuration)**
2. CLI parameters **(only usable for mono provider configuration)**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
2. CLI parameters **(only usable for mono provider configuration)**
2. CLI parameters **(only usable for single provider configuration)**

state/aws.go Outdated
sess := session.Must(session.NewSession())

awsConfig := aws_sdk.NewConfig()
func NewAWS(c *config.Config) []*AWS {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and also because generally I'd expect an instantiation method to return a single instance.

@hbollon hbollon force-pushed the multi-bucket-support branch 3 times, most recently from 0a4c557 to 6e54c13 Compare July 15, 2021 09:01
state/aws.go Outdated Show resolved Hide resolved
state/gcp.go Outdated Show resolved Hide resolved
state/gitlab.go Outdated Show resolved Hide resolved
if len(c.TFE) > 0 {
objs, err := NewTFECollection(c)
if err != nil {
return []Provider{}, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you named the return values of the method, you might just use return here iirc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, if one provider fails, you get no providers at all at the moment.

Copy link
Member Author

@hbollon hbollon Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, if one provider fails, you get no providers at all at the moment.

Do you think it's better to just log an error and continue with the following ones?
Initially, I considered this error as fatal since it would lead to the non-inclusion of an entire provider and because, potentially, it could come from an error in the configuration file which could impact the rest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I say I'd prefer to see some data and get an error somewhere rather than nothing at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you named the return values of the method, you might just use return here iirc

Indeed but in this case I think it's more simple like this since if I name the return values I can't use := anymore with err which will force me to declare the temp variable objs before each time I use it. It's, imo, more complex and furthermore it raise an error on gosimple if I do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I say I'd prefer to see some data and get an error somewhere rather than nothing at all.

Doesn't the user risk not realizing it if a provider fails but the others still bring states in Terraboard?

Copy link
Contributor

@mcanevet mcanevet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you should change one of your commits' commit message with an exclamation mark as it contains a breaking change.

state/tfe.go Outdated Show resolved Hide resolved
@raphink raphink changed the title feat: multiple buckets / providers support (#48) feat!: multiple buckets / providers support (#48) Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants