Feature request: Mechanism to init keyspaces and pre-populate test data on startup #280

Open
tadhgpearson opened this issue Mar 29, 2024 · 4 comments

Comments

@tadhgpearson

Thanks for building and supporting this image - it's really nice to have a dockerized Cassandra to run that just works 👍

We're using this in combination with Fabric8's Docker Maven plugin to run integration tests supported by docker containers as part of our build. To run our tests effectively, we need to instantiate keyspaces and pre-populate test data for our integration tests. This seems to be a common request - see #31, #65, #104 and plenty of others...

We'd rather do this as part of the image, because then developers can also start the containers using docker-compose if they want to run or debug tests in their IDEs. To date, we've been butchering the entrypoint script to achieve this, but it's hard to read, easy to screw up, and doesn't survive version upgrades well.

Could you consider a more elegant way to support this use case? For example, here's how it works in the Oracle Docker image: https://github.com/oracle/docker-images/blob/main/OracleDatabase/SingleInstance/README.md#running-scripts-after-setup-and-on-startup
This approach even allows users to start the container and run scripts simply by passing them as arguments to docker run. I think it would greatly improve the usability of this image.

@tianon
Member

tianon commented Mar 29, 2024

#122 (comment) is really relevant here 😅

In short, we're really unhappy with how fragile and "not upstream" our existing script is, so the likelihood of us adding more behavior to it is very low. ❤️

@tadhgpearson
Author

That's a pity, because the alternatives suggested there are pretty config-heavy.

If I understand it correctly, the solutions you propose either
(a) require a lot of boilerplate from the end user (me and others) for a common use case (integration tests) - writing a new Dockerfile to build a new image for each script set is considerably more steps and more execution time, and generally becomes copy-pasta across multiple projects - OR
(b) require me to update the docs to send each user into their Docker config and set it up for every project they run, which breaks the "just clone and build" paradigm that makes it easy for developers to start using new projects.

I know what you mean by that script being fragile - having added script execution to it for our project, every time I upgrade the Cassandra image version I need to rewrite it again, and every time I'm scratching my head! But it's definitely possible...

For end users, being able to add a volume of scripts to run after startup would be quick and easy out-of-the-box.
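
Purely to illustrate the request (the cassandra image does not support anything like this today; the init directory below is hypothetical, borrowed from what the postgres/mysql official images do), it could look something like this in docker-compose:

```yaml
services:
  cassandra:
    image: cassandra:5
    volumes:
      # HYPOTHETICAL: the cassandra image has no such init directory today.
      # Mounting a folder of CQL scripts that run once on first startup is
      # exactly what this feature request is asking for.
      - ./src/test/resources/cql:/docker-entrypoint-initdb.d:ro
```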

@LaurentGoderre
Member

You could use an init container to achieve this. You could have the default cassandra container running as the db, and another cassandra container with a custom entrypoint that runs the scripts against the first container over the network. This is a pattern I have used many times.
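
For reference, a minimal docker-compose sketch of that init-container pattern - the image tag, healthcheck details, and the ./scripts path and init.cql name are assumptions for illustration, not taken from this repo:

```yaml
services:
  cassandra:
    image: cassandra:5
    healthcheck:
      # wait until cqlsh can actually talk to the node
      test: ["CMD-SHELL", "cqlsh -e 'DESCRIBE KEYSPACES'"]
      interval: 10s
      retries: 12

  cassandra-init:
    image: cassandra:5
    depends_on:
      cassandra:
        condition: service_healthy
    volumes:
      - ./scripts:/scripts:ro
    # one-shot init container: run the CQL against the db service, then exit
    entrypoint: ["/bin/sh", "-c", "cqlsh cassandra -f /scripts/init.cql"]
    restart: "no"
```

The init service exits once the scripts have run; the tests then talk to the `cassandra` service as usual.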

@tadhgpearson
Author

You certainly can... and as a user I think it's a pretty clumsy solution.

  • it's not obvious. You'd have to write some significant documentation explaining why you have two Cassandra docker images to avoid being questioned every time a new developer opens the docker config
  • in our case, this would need to be duplicated in the Fabric8 Maven Plugin (which we use at build time) and in our docker-compose (which we use when testing in the IDE.)

Honestly, what our setup does at the moment is have a test singleton in Spring Data Cassandra that loads the schema and required data before running the first test, using Spring's CQL Script Runner. This is kinda OK - it separates concerns correctly, etc. - but it still requires quite a bit of boilerplate and super-classing, and it's not obvious what's going on when there's an error in the startup CQL.

Compared to, for example, what Oracle does in their database image, all of these are pretty poor solutions. I think we can do better - hence this issue. I'm saying "we" because I use this docker image multiple times a day. It's the best one out there, I'm invested, and I want to help make it better!
