This repository has been archived by the owner on Mar 11, 2021. It is now read-only.

Dockerfile for DocumentCloud #119

Open: wants to merge 7 commits into base: master



@stefanw stefanw commented Jul 31, 2014

This PR contains the first steps at dockerizing DocumentCloud for easier setup.

  • Runs DocumentCloud via Thin behind a non-SSL nginx on Ubuntu 14.04
  • PostgreSQL runs in a different container and is configured via env variables
  • FileSystemStore can be used by setting an env variable
  • Solr integration works
  • Background Queue integration works
  • AWS integration works
  • Not yet ready for production use

Documentation on how to run this yourself can be found in README.docker. Some patches to the code base were necessary to make a non-SSL setup work.
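As a rough illustration of the env-variable configuration described in the list above, the sketch below builds PostgreSQL connection settings from environment variables. The variable names (`DC_DB_HOST`, `DC_DB_PORT`, `DC_DB_NAME`, `DC_DB_USER`, `DC_DB_PASSWORD`) and the defaults are assumptions made for this example; they are not taken from this PR or from README.docker.

```ruby
# Hypothetical sketch: build PostgreSQL connection settings from environment
# variables. All variable names and default values below are assumptions.
def database_config(env = ENV)
  {
    adapter:  "postgresql",
    host:     env.fetch("DC_DB_HOST", "localhost"),
    port:     Integer(env.fetch("DC_DB_PORT", "5432")),
    database: env.fetch("DC_DB_NAME", "documentcloud"),
    username: env.fetch("DC_DB_USER", "documentcloud"),
    password: env["DC_DB_PASSWORD"] # nil when the variable is unset
  }
end

# Example: values a linked database container might supply.
config = database_config({ "DC_DB_HOST" => "db", "DC_DB_PORT" => "5433" })
puts config[:host]
puts config[:port]
```

Passing the environment in as an argument keeps the function easy to exercise without touching the real process environment.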

We (@nickstenning and I) will continue working on this and just wanted to start a conversation, in case similar efforts are already underway.

stefanw and others added 6 commits July 16, 2014 15:23
- base the Dockerfile on phusion/baseimage, in order to use the runit
  supervisor and startup script support
- split up a couple of tasks in order to take advantage of Docker build
  caching
Don't even try to count the number of heinous sins committed herein.
You'll be here a while.

Unfortunately DocumentCloud makes all kinds of fairly weird assumptions
about how SSL is terminated in front of it that make running it
*without* SSL (or with SSL terminated by someone else downstream) really
hard.

So we just patched the shit out of it until it worked.

This is probably outright dangerous at this stage. Don't use it.

That said, you can now run DocumentCloud and get to the admin interface
without it blowing up. Which is progress. I think.
@nickstenning

It's worth noting that we've done some really horrible things in here in order to convince DocumentCloud to be served over plain ol' HTTP (rather than requiring HTTPS everywhere). This definitely needs reworking and I'd reiterate @stefanw's point that this is about starting a conversation rather than requesting a merge.

(And, for the avoidance of doubt: of course serving DC over HTTPS is the ideal configuration, but it should be possible to run without, even if that requires setting a THIS_IS_DANGEROUS_ENABLE_INSECURE config option. Unfortunately, even doing this will require some changes to how DC generates URLs.)
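A minimal sketch of what such an explicit opt-in could look like. Only the flag name comes from the comment above; reading it from the environment, accepting only the value `"true"`, and the helper names `https_required?` and `url_scheme` are assumptions made for illustration, not existing DocumentCloud code.

```ruby
# Hypothetical sketch of an explicit opt-in for plain HTTP.
# The flag name is taken from the discussion; everything else is assumed.
INSECURE_FLAG = "THIS_IS_DANGEROUS_ENABLE_INSECURE"

# HTTPS stays mandatory unless the operator sets the flag to exactly "true".
def https_required?(env = ENV)
  env[INSECURE_FLAG] != "true"
end

# Scheme that URL generation would use under the current setting.
def url_scheme(env = ENV)
  https_required?(env) ? "https" : "http"
end

puts url_scheme({})                        # flag unset
puts url_scheme({ INSECURE_FLAG => "true" }) # flag set
```

Defaulting to HTTPS means a missing or misspelled flag fails safe rather than silently downgrading to plain HTTP.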

@jashkenas
Member

Out of curiosity — what are you guys planning on doing with your version of DocumentCloud?

@stefanw stefanw changed the title from "Dockerfile for Documentcloud" to "Dockerfile for DocumentCloud" Jul 31, 2014
@stefanw
Author

stefanw commented Jul 31, 2014

@jashkenas No concrete plans yet, but one likely scenario is internal deployments in German newsrooms.

@knowtheory
Member

Hey @nickstenning, long time!

@stefanw, nice to get a chance to chat, @pudo & Annabel Church mentioned you at srccon last week.

I'd be interested in hearing further thoughts on why you think serving over HTTP is so vital a requirement, because we've actually been thinking about going the other direction, and just serving all pages over HTTPS.

@stefanw
Author

stefanw commented Jul 31, 2014

@knowtheory Hi there!

Serving all pages over HTTPS is exactly what we want, but DC should not know about the security of the connection at all (this is handled transparently by e.g. nginx); DC should be protocol-agnostic. The protocol-switching code inside DC makes setting up a dev server a pain and complicates deployment.

DC should use absolute paths in links/redirects or use a setting for the domain URL (e.g. "https://www.documentcloud.org" or "http://localhost:3000") if needed.
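To make the second option concrete, here is a small sketch of building full URLs from a single configured root, so the application never has to decide between HTTP and HTTPS itself. The helper names `build_url` and `port_suffix` and the example path `/documents/1` are hypothetical; the two root URLs are the ones quoted above.

```ruby
require "uri"

# Hypothetical helper: suffix for a non-default port, empty otherwise.
def port_suffix(uri)
  uri.port == uri.default_port ? "" : ":#{uri.port}"
end

# Hypothetical helper: join a configured root URL with an absolute path.
# The scheme comes only from the setting, never from the incoming request.
def build_url(server_root, path)
  raise ArgumentError, "path must start with '/'" unless path.start_with?("/")
  root = URI.parse(server_root)
  "#{root.scheme}://#{root.host}#{port_suffix(root)}#{path}"
end

puts build_url("https://www.documentcloud.org", "/documents/1")
puts build_url("http://localhost:3000", "/documents/1")
```

With this shape, switching a deployment between a local HTTP dev server and an HTTPS production host is a one-line configuration change.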

@knowtheory
Member

Cool, just had a conversation with @nathanstitt and came to the same conclusion.

Sure, we'd be happy to work towards that as a goal, including being able to proxy to the app server.

If you've poked through our config files you can see we deploy through nginx & passenger.

Additionally, DocumentCloud does have an internal config setting for the server's root URL, and we build all of our links off of that root.

This will cause no end of confusion (being wrong) at some point in the distant future, so I'll just correct it now.
@codeportland

It would be wonderful if someone could push this forward, or at least update and maintain the install file. I know it's a lot of work, but without some kind of help hosting this beast, we can't use the software if the work we're doing doesn't qualify us for an official DocumentCloud website account.

@knowtheory knowtheory mentioned this pull request Feb 27, 2018