This is a data pipeline demo project for learning how to work with components such as Kafka, Flink, and Marathon apps on DC/OS to analyze data. We will set up the following demo environment:
This demo is designed to run DC/OS and all required components on a local laptop and is not intended for production use.
You can also find other cool vagrant and DC/OS demos here:
- You'll need vagrant (1.9.4 recommended).
- (Optional) All build steps can be performed on the bootstrap node, but you can also perform build steps on a local docker engine. For example, install docker-machine to build the demo app.
- A system with about 16 GB of RAM and 8 cores. This demo was built on macOS 10.12.3. An SSD is highly desirable, as is an internet connection of 10 Mbit/s or better.
- Allow 60 minutes for vagrant up to complete.
- Allow 30 minutes for DC/OS deployment steps.
- (Optional) Trying to go passwordless? Visit these instructions or run these steps:
curl -o /tmp/passwordless.sh --url 'https://raw.githubusercontent.com/dcos/dcos-vagrant/master/ci/passwordless.sh' && sudo bash /tmp/passwordless.sh; rm -f /tmp/passwordless.sh
- Generate some keys if you haven't done so yet; we'll need them in the keys directory:
NOTE: These keys cannot be encrypted, because the DC/OS installation uses them for an unattended installation.
cd keys && ./genkeys.sh && cd ..
- Run vagrant up in the root directory of this project:
vagrant up
- At this point, you might want to take snapshots of all VMs so that you can repeat the DC/OS deployment iteratively with different configurations.
for v in $(vagrant status|grep '.dcos-demo'|awk '{print $1}'); do vagrant snapshot save "$v" base; done
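Rolling back to the snapshot is the other half of this loop. The following is a minimal sketch, assuming vagrant's built-in `snapshot restore` subcommand (if a snapshot plugin overrides the command, the subcommand name may differ). The function names `demo_vms` and `restore_demo_vms` are hypothetical helpers, not part of this project.

```shell
#!/usr/bin/env bash
# Print the machine names ending in ".dcos-demo" from `vagrant status`
# output read on stdin.
demo_vms() {
  awk '$1 ~ /\.dcos-demo$/ { print $1 }'
}

# Roll every demo VM back to a named snapshot (default: "base").
# Assumption: vagrant's built-in `snapshot restore` subcommand is available.
restore_demo_vms() {
  snapshot_name=${1:-base}
  for vm in $(vagrant status | demo_vms); do
    vagrant snapshot restore "$vm" "$snapshot_name"
  done
}

# Example (run from the project root):
#   restore_demo_vms base
```

Nothing runs until `restore_demo_vms` is called, so the file can be sourced safely before deciding to roll back.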
At this point you should have six nodes set up for deploying DC/OS:
- 1 bootstrap for docker commands and DC/OS installation
- 1 master for DC/OS operations
- 1 public agent for application routing with Marathon LB
- 3 private agents for all component deployments
This should allow us to continue to deploy the data pipeline demo.
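Before moving on, it can help to confirm that all six nodes are actually up. The sketch below counts the machines that `vagrant status` reports as running; it assumes the human-readable status format (name in the first column, state in the second), and the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Count machines named "*.dcos-demo" whose state column is "running".
# Reads `vagrant status` output on stdin.
count_running_demo_vms() {
  awk '$1 ~ /\.dcos-demo$/ && $2 == "running" { n++ } END { print n + 0 }'
}

# Compare the running count against the expected node count (default: 6).
check_demo_nodes() {
  expected=${1:-6}
  running=$(vagrant status | count_running_demo_vms)
  if [ "$running" -eq "$expected" ]; then
    echo "OK: $running of $expected nodes running"
  else
    echo "WARNING: $running of $expected nodes running" >&2
    return 1
  fi
}

# Example (run from the project root):
#   check_demo_nodes 6
```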
The goals are simple: learn how to deploy DC/OS and manage a simple data pipeline.
Before we continue to the DC/OS deployment, here is what the vagrant setup did:
- We installed docker only on the bootstrap node.
- We configured the ssh key in the keys folder to allow for an ssh based deployment.
- We installed minimal packages on all nodes.
- We generated the basic configuration file from the genconf/config.yaml
We perform the installation manually to tune the DC/OS environment just right for the target laptop. A minimal setup has been prepared, but you may need to adjust the configuration.
- Connect to the bootstrap node:
vagrant ssh bootstrap.dcos-demo
- Change to the dcos_install folder:
cd /opt/dcos_install
- Install all the node prerequisites:
sudo bash ./dcos_generate_config.sh --install-prereqs -v
- Verify we can install with pre-flights:
sudo bash ./dcos_generate_config.sh --preflight -v
- Deploy DC/OS:
sudo bash ./dcos_generate_config.sh --deploy -v
- Verify install is good with post-flights:
sudo bash ./dcos_generate_config.sh --postflight -v
- Set up the dcos client.
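The four installer invocations above can also be chained so that a failure stops the sequence early. This is only a sketch of that idea: the `deploy_dcos` function and the `RUN` and `DCOS_INSTALLER` variables are hypothetical names introduced here, and the flags are the ones already used in the steps above. It is meant to be run from /opt/dcos_install on the bootstrap node.

```shell
#!/usr/bin/env bash
# Installer path and the command used to run it.
# Set RUN=echo for a dry run that only prints the commands.
DCOS_INSTALLER=${DCOS_INSTALLER:-./dcos_generate_config.sh}
RUN=${RUN:-sudo bash}

# Run the installer steps in order and stop at the first failure.
deploy_dcos() {
  for step in --install-prereqs --preflight --deploy --postflight; do
    echo "==> $step"
    # $RUN is intentionally unquoted so "sudo bash" splits into two words.
    $RUN "$DCOS_INSTALLER" "$step" -v || {
      echo "Step $step failed; stopping." >&2
      return 1
    }
  done
}

# Example:
#   deploy_dcos
```

Running the steps one at a time, as listed, remains the better choice when you expect to adjust the configuration between steps.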
Once all these steps are complete, try accessing the DC/OS environment with a local browser at the following URL: http://m1.dcos-demo
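The DC/OS UI may take a little while to respond after the deployment finishes. A small retry helper, sketched below, can poll until a command succeeds; `wait_for` is a hypothetical helper, and the retry count and delay in the example are arbitrary assumptions.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds.
# Usage: wait_for TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the master up to 30 times, 10 seconds apart.
#   wait_for 30 10 curl -fsS http://m1.dcos-demo
```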
Tweak the Vagrantfile and genconf/config.yaml for other configurations. For example, 3 masters, 1 agent, 1 public agent.
Now let's do the data pipeline demo!
We can start with this site; however, let's provide some specific help for the vagrant VMs.