bramboomen edited this page Mar 26, 2024 · 3 revisions

Etl-tooling git submodule

The etl-tooling repository is meant to be used as a 'submodule' of a pipeline repository, situated at the root of that repository. Once initialized and updated, the submodule looks on the file system like an ordinary subfolder called 'etl-tooling' containing all the files from the etl-tooling repository. To git, however, the submodule is merely a reference to a specific commit of the etl-tooling repository.
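
To see what git actually stores, the following sketch builds two throwaway repositories in a temporary directory and inspects the submodule entry. The repository names `tooling` and `pipeline`, the file `tool.txt` and the demo identity are placeholders, not part of the real setup. The `-c protocol.file.allow=always` option is only there because the demo adds the submodule from a local path, which recent git versions refuse by default.

```shell
demo=$(mktemp -d)

# Stand-in for the etl-tooling repository, with a single commit.
git init -q "$demo/tooling"
echo "placeholder" > "$demo/tooling/tool.txt"
git -C "$demo/tooling" add tool.txt
git -C "$demo/tooling" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "tooling commit"

# Stand-in for a pipeline repository that uses it as a submodule.
git init -q "$demo/pipeline"
cd "$demo/pipeline"
git -c protocol.file.allow=always submodule add "$demo/tooling" etl-tooling
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add etl-tooling"

# On disk: an ordinary folder with the tooling files.
ls etl-tooling

# In git: a single entry of type "commit" (mode 160000) holding a commit hash.
git ls-tree HEAD etl-tooling

# The same hash, as reported by the submodule command.
git submodule status etl-tooling
```

The hash printed by the last two commands should be the commit of the stand-in tooling repository, which is all the pipeline repository records about the submodule contents.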

Add etl-tooling submodule to existing pipeline

Adding the etl-tooling submodule immediately clones the contents of the etl-tooling repository. To record the addition, it needs to be committed to the pipeline repository. The commit covers only the reference to the etl-tooling repository (and the .gitmodules entry that describes it), not the contents.

  1. Add the submodule:
    git submodule add git@github.com:CWTSLeiden/CWTS-ETL-tooling.git etl-tooling
  2. Commit the submodule:
    git add etl-tooling
    git commit -m "add etl-tooling version x.x.x"

Clone existing pipeline containing etl-tooling submodule

When cloning a pipeline repository, the submodule reference is cloned, but not the contents, resulting in a repository with an empty folder where the etl-tooling repository should be. To clone the contents, the submodule needs to be registered and updated. This checks out etl-tooling at the commit to which the reference points.

  1. Clone the pipeline repository:
    git clone git@github.com:CWTSLeiden/$pipeline.git
  2. Register the submodule:
    git submodule init etl-tooling
  3. Clone the submodule:
    git submodule update
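
As a shorthand, git can carry out steps 2 and 3 during the clone itself through the `--recurse-submodules` option of `git clone`. A minimal sketch follows; the function name `clone_with_tooling` is made up for this page and the url is supplied by the caller.

```shell
# Hypothetical helper: clone a pipeline repository and, in the same step,
# register and check out every submodule it references (including etl-tooling).
#   $1: url of the pipeline repository
#   $2: target directory
clone_with_tooling() {
    git clone --recurse-submodules "$1" "$2"
}
```

For example, `clone_with_tooling git@github.com:CWTSLeiden/$pipeline.git $pipeline` should leave the etl-tooling folder populated at the commit recorded by the pipeline.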

Update the etl-tooling submodule to the pipeline version

When another user pushes a commit to a pipeline repository that moves the etl-tooling reference to a new version, pulling the pipeline repository does not update the contents of the submodule. The submodule needs to be updated separately.

  1. Update the pipeline repository
    git pull
  2. Update the etl-tooling submodule
    git submodule update
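
To check whether step 2 is still needed after a pull, `git submodule status` can be consulted: it starts the line with `+` when the checked-out submodule commit differs from the recorded one, with `-` when the submodule has not been initialized, and with a space when everything matches. A minimal sketch, in which the function name `tooling_in_sync` is hypothetical:

```shell
# Hypothetical helper: succeed only if etl-tooling is checked out at the
# commit recorded by the pipeline repository. Run from the pipeline root.
tooling_in_sync() {
    status=$(git submodule status etl-tooling) || return 2
    case "$status" in
        " "*) return 0 ;;  # leading space: checked-out commit matches
        *)    return 1 ;;  # "+", "-" or "U": differs, uninitialized or conflicted
    esac
}
```

A possible use is `tooling_in_sync || git submodule update`, which only runs the update when the submodule is out of step with the pipeline.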

Update the etl-tooling submodule of a pipeline to the latest version

To bring the etl-tooling submodule of a pipeline up-to-date, update and commit the submodule.

  1. Update the etl-tooling submodule
    git submodule update --remote etl-tooling
  2. Stage and commit the new etl-tooling submodule version
    git add etl-tooling
    git commit -m "update etl-tooling"

Make changes to the etl-tooling repository

To make changes to the etl-tooling it is recommended to make the changes in the etl-tooling repository (not in the etl-tooling submodule of a pipeline repository), commit those changes and then update the submodule in the pipeline repository.

  1. Changes in etl-tooling repository
    # in etl-tooling repository
    git commit -m "update etl-tooling function"
    git push
  2. Update etl-tooling submodule changes in pipeline repository
    # in pipeline repository root
    git submodule update --remote etl-tooling
    git add etl-tooling
    git commit -m "update etl-tooling"

Change the remote url of the etl-tooling repository

To change the upstream url of the etl-tooling submodule in a pipeline edit the .gitmodules file in the pipeline repository, synchronize and then update the submodule.

  1. Update the url field in the .gitmodules file in the root of the pipeline repository. If the main branch has a different name than the original branch it is necessary to add the branch = directive.
    [submodule "etl-tooling"]
        path = etl-tooling
        url = git@github.com:CWTSLeiden/CWTS-ETL-tooling.git
        branch = main
    
  2. Synchronize the submodule so that it is tracking the new url (and branch).
    git submodule sync etl-tooling
  3. Update the submodule to pull in the new upstream version.
    git submodule update --remote
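
Instead of editing the file by hand, step 1 can also be done with `git config -f`, which reads and writes any git-style config file. The sketch below works on a scratch copy of `.gitmodules` in a temporary directory; the new url `git@example.com:some-org/etl-tooling.git` is a made-up placeholder.

```shell
# Work in a scratch directory so that no real repository is touched.
demo=$(mktemp -d)
cd "$demo"

# A .gitmodules file like the one shown in step 1, without a branch entry.
cat > .gitmodules <<'EOF'
[submodule "etl-tooling"]
    path = etl-tooling
    url = git@github.com:CWTSLeiden/CWTS-ETL-tooling.git
EOF

# Placeholder url, for illustration only.
new_url="git@example.com:some-org/etl-tooling.git"

# Rewrite the url and add the branch directive.
git config -f .gitmodules submodule.etl-tooling.url "$new_url"
git config -f .gitmodules submodule.etl-tooling.branch main

# Read the values back to confirm the edit.
git config -f .gitmodules --get submodule.etl-tooling.url
git config -f .gitmodules --get submodule.etl-tooling.branch
```

In a real pipeline repository, the two `git config -f .gitmodules` lines that set the values would be run in the repository root, followed by steps 2 and 3 above.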

NOT RECOMMENDED! Make changes to the etl-tooling submodule

When working in a pipeline repository, you can edit the files of the submodule as if it were a regular git repository and commit those changes to the etl-tooling repository.

  1. Make changes to etl-tooling submodule and commit
    # in pipeline repository root
    # make edits to etl-tooling/$file
    cd etl-tooling
    git add $file
    git commit -m "update $file"
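  2. Push changes to etl-tooling repository. (note that we are working with HEAD detached, so the current commit has to be pushed explicitly)
    # in pipeline repository root
    cd etl-tooling
    # HEAD is detached: push the current commit to the remote master branch
    git push origin HEAD:master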
  3. Update etl-tooling submodule changes in pipeline repository
    # in pipeline repository root
    git add etl-tooling
    git commit -m "update etl-tooling"
  4. Set etl-tooling pointer to latest revision to be safe
    git submodule update --remote etl-tooling