
Resources API #68

Closed
aaron-junot opened this issue Apr 8, 2019 · 6 comments · Fixed by #81

Comments

@aaron-junot
Member

The purpose of this issue is to request that the Resources API be deployed on the Operation Code infrastructure.

Description
The Resources API provides access to a database of learning resources for Operation Code members to benefit from. There is an API to get resources based on category, programming language, etc. Authenticated members can create or update resources.

Where it lives
https://github.com/OperationCode/resources_api

Dependencies

  • Docker
  • Make (for management purposes)
  • PostgreSQL database

Docker pulls in the required dependencies, such as pip and related packages; the specific Python dependencies can be viewed in requirements.txt.

Requirements
There is a setup script, run with make setup, that populates the database with the seed data found in resources.yml. The Flask app itself can be run with make run.

@apex-omontgomery
Member

I'd like to use this as a way for me to learn how to add new projects to the current k8s infrastructure, and maybe as a test for CircleCI.

Are there any environment variables that need to be used?
Are there any secrets that need to be stored and do they differ based on environment?
What internal (to operation code) systems do you need to talk to?
What external (to operation code) systems do you need to talk to?
What external systems do you need to allow connections from?
Do you need a staging environment?
-> do you need a separate staging and prod database
-> how will the application know which database it's using, what other environment variables are different?
What do you want the url to connect to the environments to be?
-> backend currently uses api.staging.operationcode.org and api.operationcode.org
-> pybot uses pybot.operationcode.org and something really long for staging.
How are logs being handled?
Is there a strategy for handling alerting and critical events (e.g. Sentry's Raven client, Prometheus)?
How do you want to handle environment escalation to staging and production?

@aaron-junot
Member Author

aaron-junot commented Apr 8, 2019

Are there any environment variables that need to be used?

The environment variables are contained in the .env file. The following need to be changed for production:

SQLALCHEMY_DATABASE_URI=postgresql://user_name:change_password@resources-psql:5432/resources-psql
POSTGRES_USER=user_name
POSTGRES_PASSWORD=change_password
DB_USERNAME=user_name
USER=user_name
PASSWORD=change_password
FLASK_ENV=development
FLASK_DEBUG=1

The database URL will need to change to the actual db url. The usernames and passwords will need to be changed to real credentials. FLASK_ENV should be set to production, and FLASK_DEBUG must be removed.
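To illustrate how these variables feed the app, here is a minimal sketch of a config loader reading them at startup. The helper name is hypothetical and the defaults simply mirror the .env values above; in production the real credentials would come from the deployment environment instead:

```python
import os

def load_config():
    """Build the Flask config from environment variables.

    Defaults mirror the development values in .env; a production
    deployment overrides them via its own environment.
    """
    return {
        "SQLALCHEMY_DATABASE_URI": os.environ.get(
            "SQLALCHEMY_DATABASE_URI",
            "postgresql://user_name:change_password@resources-psql:5432/resources-psql",
        ),
        "ENV": os.environ.get("FLASK_ENV", "production"),
        # FLASK_DEBUG is removed in production, so DEBUG falls back to False
        "DEBUG": os.environ.get("FLASK_DEBUG") == "1",
    }
```

Because FLASK_DEBUG is simply absent in production, debug mode defaults off without any extra code.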

Are there any secrets that need to be stored and do they differ based on environment?

The above lists all the secrets used in the app.

What internal (to operation code) systems do you need to talk to?

The API should talk directly to the OC backend for authentication purposes.
For the MVP, we don't need to get pybot and slack involved just yet. Eventually, we want to connect it such that we can use a slash command or have the bot automatically post resources in certain channels, but I'm not worried about that just yet.

What external (to operation code) systems do you need to talk to?

Currently, the app is self-sufficient but once we get OperationCode/resources_api#83 fixed, there will be external calls to various URLs to verify the resources.

What external systems do you need to allow connections from?

We need to allow requests from anywhere; the app provides its own authentication for routes that require it.

Do you need a staging environment?

A staging environment is a nice-to-have but not a hard requirement at this point.

-> do you need a separate staging and prod database

If we do get a staging environment, we should certainly keep the database separate from prod.

-> how will the application know which database it's using, what other environment variables are different?

The SQLALCHEMY_DATABASE_URI environment variable determines which database it uses (see above).

What do you want the url to connect to the environments to be?
-> backend currently uses api.staging.operationcode.org and api.operationcode.org
-> pybot uses pybot.operationcode.org and something really long for staging.

I'm partial to resources-api.operationcode.org but I'm flexible on this.

How are logs being handled?

There is a logger that writes log files to resources_api/log. We can point the logger at a different location for prod, depending on what is easiest from an ops perspective; it should be trivial to log to wherever we need to.
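As a sketch of making the log location configurable per environment, the logger could read its directory from an environment variable. LOG_DIR and the function name here are hypothetical, not something the app currently reads; the default matches the existing resources_api/log location:

```python
import logging
import os

def configure_logger(name="resources_api"):
    """Create a file logger whose directory is set per environment.

    LOG_DIR is a hypothetical environment variable; if unset, the
    logger falls back to the app's current resources_api/log path.
    """
    log_dir = os.environ.get("LOG_DIR", "resources_api/log")
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(os.path.join(log_dir, "app.log"))
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Ops could then point LOG_DIR at a mounted volume, or we could swap the FileHandler for a StreamHandler so the container runtime collects stdout instead.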

Is there a strategy for handling alerting and critical events (e.g. Sentry's Raven client, Prometheus)?

No current strategy, but this sounds like a good idea. Please advise on what we would need to add to the application to accomplish this.

How do you want to handle environment escalation to staging and production?

I'm not sure I understand this question.

@ohaiwalt
Member

ohaiwalt commented Apr 8, 2019

I'd also ask questions around impacts of downtime, what's in the database, does the database need to be backed up, what happens if it's down, etc.

Does your application have a health check endpoint?

@aaron-junot
Member Author

I'm not really sure about the impacts of downtime; no one is using it for anything yet. But if it were down while someone was trying to get learning resources, they wouldn't be able to, which could increase churn of OC members (supposing they join looking for learning resources, try to use our tools to get them, find out that they can't, and just leave OC altogether).

The database contains thousands of resources for learning about Programming, Cybersecurity, resources for Veterans and Minorities, as well as Career Advice and a variety of other topics relevant to OC members. They are organized by category and (where relevant) programming language.

The database should certainly be backed up. Right now, we have a YAML file to seed the database with the current resources, but the idea is to stop updating the YAML file when the database goes live. Therefore, we need to back it up so we don't lose new resources (or changes to existing resources) if it goes down and needs to be blown away for any reason.

The application does not currently have a health check endpoint, but we can easily add one. Just say the word and I'll open an issue and start working on the PR.
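For reference, a basic health check endpoint in Flask can be very small. This is a sketch, not the app's actual code; the route name /healthz follows the convention discussed later in this thread:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/healthz")
def healthz():
    # A basic liveness check: the process is up and serving requests.
    # Later this could also verify the database connection before
    # reporting healthy.
    return jsonify(status="ok"), 200
```

Kubernetes would then probe this route to decide whether a container is ready to receive traffic.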

@aaron-junot
Member Author

Oh, I may have misunderstood the "impacts of downtime" question. If the question is "Can you use the API at all if the DB is down?" the answer is "no". Every operation is a CRUD operation of some sort on the database. If the DB is down, the code is useless.

@ohaiwalt
Member

@aaron-suarez, @wimo7083 and I had some conversations in Slack, I'll try to capture them here for posterity, in no particular order:

  • We ended up settling on resources.operationcode.org and resources.staging.operationcode.org for URLs. There are several options here, ranging from representing the resources app as a path underneath the OpCode backend (e.g. api.operationcode.org/resources/ using reverse proxies with Kubernetes Ingress) to the currently chosen subdomain model. We can even try both to see how well they work. In some environments the path option makes sense for SEO and product unity, as well as ease of TLS certificates; for us, it's mostly preference.

  • The application needs a health check endpoint that returns an HTTP OK when it's ready. That will be represented on a /healthz handler, and can be very basic for now. In the future, this could be more fully featured and exercise dependencies, such as verifying database connections. This doesn't need to be ready before standing up the infra, but it is important for keeping the application healthy. If you have version A running and deploy a broken version B, without the health check Kubernetes will roll your containers through anyway and the result will be an outage. With a health check, a broken deployment will never complete, because Kubernetes won't terminate the old containers until the new ones are ready.

  • Instrumentation: Gathering data about your running application, and the things you care about. It's how we are able to answer the question "How do you know it's working?". Aaron is going to add a /metrics handler using the Prometheus python client library, since we were able to stand up a Prometheus instance this weekend to begin gathering metrics about our infrastructure. It doesn't need to be fully featured, but allows us to begin to monitor the application.

  • Since we introduced ArgoCD (#78, "Introduce ArgoCD for GitOps workflows") and CI/CD management of DNS records (#67, "Define primary hosted zones in code, with CI/CD"), it's now largely possible to create and iterate on your own resources in our environment solely by interacting with this repository. In the PR for #81 ("Add Resources API"), there's some good conversation and a beginning of doing just that.
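To make the instrumentation point concrete, here is a minimal sketch of a /metrics handler using the Prometheus Python client library mentioned above. The metric name and the instrumented route are hypothetical placeholders, not the app's real names:

```python
from flask import Flask, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest

app = Flask(__name__)

# Hypothetical metric; the real handler would count whatever the team
# decides to measure (requests, errors, latency, ...).
REQUESTS = Counter("resources_api_requests", "Total requests served")

@app.route("/api/v1/resources")
def list_resources():
    REQUESTS.inc()  # instrument the route
    return {"resources": []}

@app.route("/metrics")
def metrics():
    # Prometheus scrapes this endpoint; generate_latest() renders all
    # registered metrics in the text exposition format.
    return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST)
```

The Prometheus instance stood up this weekend could then scrape /metrics on its regular interval, and richer per-route instrumentation can be layered on later.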
