This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

Set up pagerduty for alerting and accountability #2072

Closed
6 tasks
patcon opened this issue Feb 25, 2014 · 15 comments

@patcon
Contributor

patcon commented Feb 25, 2014

In setting up hubot and preparing for him to be given responsibility, I feel we need a good way to make sure that if he ever goes down, someone will be alerted right away, and there will be a chain of responsibility if that someone is unavailable (i.e. an on-call schedule and escalation policy).
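
To make "on-call schedule and escalation policy" concrete, here is a minimal sketch of how an escalation chain decides who gets paged. It is not PagerDuty's API; the `EscalationLevel` type, the `current_responder` function, the responder names, and the timeouts are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationLevel:
    responder: str        # hypothetical name of whoever is on call at this level
    timeout_minutes: int  # time allowed to acknowledge before escalating


def current_responder(policy: List[EscalationLevel],
                      minutes_unacknowledged: int) -> Optional[str]:
    """Return who should be paged once an alert has sat unacknowledged this long."""
    elapsed = 0
    for level in policy:
        elapsed += level.timeout_minutes
        if minutes_unacknowledged < elapsed:
            return level.responder
    # Every level timed out without an acknowledgement.
    return None


# Hypothetical three-level policy, for illustration only.
policy = [
    EscalationLevel("primary-on-call", 15),
    EscalationLevel("secondary-on-call", 15),
    EscalationLevel("project-lead", 30),
]
```

With this policy, an alert that nobody acknowledges for 20 minutes has already moved from the primary to the secondary responder, which is the "chain of responsibility" described above.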

I've had experience with PagerDuty in the past, and they are a great tool, trusted by other great product companies (GitHub being one of them).

I personally wouldn't feel comfortable giving hubot any future control unless he's properly monitored with alerting in place :) I'm really excited to work on implementing on-call schedules and rotations for other critical monitoring areas of Gittip. This will help us distribute responsibility to the right people (and away from Chad). And if you're on an on-call schedule, Gittip gets a little more real for you, which is a good thing :)

Evolving vendor policy: gratipay/inside.gratipay.com#15
Roobot PagerDuty setup issue: gratipay/roobot#8

Todo

@chadwhitacre
Contributor

Go for it. What do you need from me?

@patcon
Contributor Author

patcon commented Feb 27, 2014

Just waiting on @m3matta to have some conversations on his end on Monday. We're not blocked on anything technical here, but I'd like to leave this open until we've sorted out the policy linked above, as I feel like we'll need that to wrap this up tidily.

Also, as part of the convo we had, I've offered to come in and give a Gittip presentation to the @PagerDuty team if they take me up on it (which I hope they do!). They're currently sharing a space with @uken games in Toronto.

@patcon
Contributor Author

patcon commented Feb 27, 2014

Ah wait, we can use this issue for setting up security@gittip.com notifications via pagerduty too, as you suggested in IRC :)

I would need you to set up forwarding to not just your email, but also to the pagerduty service email address (which is tied to the "security-email" service). I've created an escalation policy and on-call schedule that only you are a part of, but you could easily decide to include other people. You'll need to log into your pagerduty account (I sent you an invite last week), and set up phone/text/app alerts:
https://gittip.pagerduty.com/users/PUR9ZXS

I'd recommend putting the app first, as you can get in the habit of treating that comm method as VIP.

@chadwhitacre
Contributor

IRC

@patcon
Contributor Author

patcon commented Mar 5, 2014

Thinking we should have a checklist of security practices that we agree to follow before getting on the security team. Like, if someone has access to privileged information, and we're making it public that it's going to their inbox and machine, what are the expectations we're holding each other to? Maybe we should come to agreement on that. Thoughts @greggles? Would love to hash this out with you.

Is this a blocker guys?

Rough thoughts:

  • using a password vault?
  • x bits of entropy in every password (higher expectations of master password)
  • up-to-date and patched operating system
  • looking unfavourably on windows?
  • understanding of PGP? Email encryption?
  • ssh keys with passphrases (x bits of entropy min)
  • locked phone screen?
  • ???
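
For the "x bits of entropy" items, a rough sketch of the arithmetic may help when picking a value for x. It assumes passwords are generated uniformly at random from a fixed alphabet or word list (human-chosen passwords have much less entropy than this estimate), and the function names are made up for illustration.

```python
import math


def random_password_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a password whose symbols are drawn uniformly at random."""
    return length * math.log2(alphabet_size)


def min_length_for_bits(target_bits: float, alphabet_size: int) -> int:
    """Smallest length that reaches the target entropy for a given alphabet."""
    return math.ceil(target_bits / math.log2(alphabet_size))


# 62 symbols = upper-case + lower-case letters + digits.
length_needed = min_length_for_bits(80, 62)

# Six words from a 7776-word list (the size of a standard Diceware list).
passphrase_bits = random_password_entropy_bits(6, 7776)
```

Under these assumptions, 80 bits needs 14 random alphanumeric characters, and a six-word passphrase from a 7776-word list comes to roughly 77.5 bits.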

Maybe this is another issue. I'm just wary of not having spoken with anyone about their security practices before putting them on a security team. (myself included -- I appreciate being gut-checked)


I recognize that there's only so much we can ask of open source contributors, but at my previous job, I was always enamoured with the way Pivotal Labs deals with developer machines: they make each machine entirely disposable, and reprovision them regularly from scratch using an internal project called sprout. All devs know enough Chef to get their favourite mac config into code, so it's there when they reprovision. They share best practices and security practices by sharing and improving the provisioning code.

I'm getting a new ubuntu laptop this week, and I'll aim to work like this, if anyone else is interested.

@greggles
Contributor

greggles commented Mar 5, 2014

Those all work for me. I'd add

  • something about not using untrusted networks (i.e. insecure wifi)
  • having your machine go into "lock" within X minutes of going "away"
  • A preference for an encrypted filesystem. It seems like a nice thing to ask for, though honestly I've never taken that step personally.
  • Some form of virus scanning (regardless of OS)

I don't think people on modern Windows machines should be excluded. Some of my favorite security people use Windows :)

@patcon
Contributor Author

patcon commented Mar 6, 2014

Rockin. Thanks @greggles

cc: @whit537 (in case you unsubbed previously)

@patcon
Contributor Author

patcon commented Mar 12, 2014

When regenerating my ssh key, I discovered that encrypting the private key (PKCS#8) is a low-effort way to make brute-forcing more difficult. Might be something to add to our security checklist:
http://martin.kleppmann.com/2013/05/24/improving-security-of-ssh-private-keys.html
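
As far as I understand it, the reason this makes brute-forcing harder is key stretching: the passphrase goes through many rounds of a key-derivation function (PBKDF2) before it can unlock the key, so every guess costs the attacker more. The sketch below only illustrates that idea with Python's standard library. It does not read or write SSH keys, and the hash choice, iteration counts, and function names are assumptions made for the example.

```python
import hashlib
import os
import time


def derive_key(passphrase: str, salt: bytes, iterations: int) -> bytes:
    """Stretch a passphrase into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def seconds_per_guess(iterations: int) -> float:
    """Time a single derivation, i.e. the cost of testing one candidate passphrase."""
    salt = os.urandom(16)
    start = time.perf_counter()
    derive_key("example passphrase", salt, iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Illustrative iteration counts; more rounds means a slower guess.
    for rounds in (1, 2_000, 200_000):
        print(f"{rounds:>7} iterations: {seconds_per_guess(rounds):.6f} s per guess")
```

The cost per guess grows roughly linearly with the iteration count, which barely affects the legitimate user (one derivation per unlock) but multiplies the work for anyone trying passphrases in bulk.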

@patcon
Contributor Author

patcon commented Apr 21, 2014

New development in gratipay/inside.gratipay.com#15

@chadwhitacre
Contributor

We're on PagerDuty. IRC

@patcon
Contributor Author

patcon commented Apr 24, 2014

@clone1018 and I walked through how it works:
https://www.youtube.com/watch?v=SNOnTp32ISo

@patcon
Contributor Author

patcon commented Apr 24, 2014

Our action items were to find uptime monitoring services that were more fine-grained than uptimerobot (5 min is the best they can poll). I was going to test-drive port-monitor, and Luke was going to check out statuscake, which both have 1 min checks (port-monitor is free, statuscake costs $6/month for the 1-minute check, free for 5-minute checks).

I had a bad experience with port-monitor, and don't recommend we use them, as they don't seem to handle basic auth and throw weird errors:
https://twitter.com/patconnolly/status/459140032096653312
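
For reference, this is roughly what a check against a basic-auth-protected endpoint has to do, sketched with Python's standard library. It is not how any of the services above work internally; the function names, the URL, and the credentials are placeholders.

```python
import base64
import urllib.error
import urllib.request


def build_check_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def is_up(request: urllib.request.Request, timeout_seconds: float = 10.0) -> bool:
    """Treat any 2xx/3xx response as up; errors and timeouts count as down."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP 4xx/5xx (HTTPError), DNS failures, refused connections, timeouts.
        return False


# Placeholder endpoint and credentials, for illustration only.
request = build_check_request("https://example.com/health", "user", "pass")
```

Running `is_up(request)` once a minute from a scheduler would give the 1-minute granularity discussed above; with a 5-minute interval, an outage can go unnoticed for up to 5 minutes.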

@patcon
Contributor Author

patcon commented Apr 24, 2014

Oh, and @seanlinsley raised some valid concerns on us being vulnerable to trolling and maybe worse since there's no auth on the pagerduty commands:
https://botbot.me/freenode/gittip/msg/13750107/

@clone1018
Contributor

@patcon this is done right?

@patcon
Contributor Author

patcon commented May 2, 2014

Yeah, I think we can close this. Thanks for the grooming
