Set up pagerduty for alerting and accountability #2072

patcon · 2014-02-25T19:55:56Z

In setting up hubot and preparing for him to be given responsibility, I feel we need a good way to make sure that if he ever goes down, someone will be alerted right away, and that there will be a chain of responsibility if that someone is unavailable (i.e. an on-call schedule and escalation policy).

I've had experience with PagerDuty in the past, and it's a great tool, trusted by other great product companies (GitHub being one of them).

I personally wouldn't feel comfortable giving hubot any future control unless he's properly monitored, with alerting in place :) I'm really excited to work on implementing on-call schedules and rotations for other critical monitoring areas of Gittip. This will help us distribute responsibility to the right people (and away from Chad). And if you're on an on-call schedule, Gittip gets a little more real for you, which is a good thing :)

Evolving vendor policy: gratipay/inside.gratipay.com#15
Roobot PagerDuty setup issue: gratipay/roobot#8

Todo

Create user for @whit537
Create user for @greggles
Create user for @bruceadams
Create user for @zwn
set up general downtime notifications via uptimerobot (make sure the schedule matches actual availability)
- @whit537
- @zwn
configure security@gittip.com
- ~~@whit537~~
- @patcon
- @greggles
- @bruceadams
- @zwn

chadwhitacre · 2014-02-27T16:45:26Z

Go for it. What do you need from me?

patcon · 2014-02-27T19:26:00Z

Just waiting on @m3matta to have some conversations on his end on Monday. We're not blocked on anything technical here, but I'd like to leave this open until we've sorted out the policy linked above, as I feel like we'll need that to wrap this up tidily.

Also, as part of the convo we had, I've offered to come in and give a Gittip presentation to the @PagerDuty team if they take me up on it (which I hope they do!). They're currently sharing a space with @uken games in Toronto.

patcon · 2014-02-27T19:49:35Z

Ah wait, we can use this issue for setting up security@gittip.com notifications via pagerduty too, as you suggested in IRC :)

I would need you to set up forwarding to not just your email, but also to the pagerduty service email address (which is tied to the "security-email" service). I've created an escalation policy and on-call schedule that only you are a part of, but you could easily decide to include other people. You'll need to log into your pagerduty account (I sent you an invite last week), and set up phone/text/app alerts:
https://gittip.pagerduty.com/users/PUR9ZXS

I'd recommend putting the app first, as you can get in the habit of treating that comm method as VIP.

chadwhitacre · 2014-03-05T21:06:33Z

IRC

patcon · 2014-03-05T21:29:33Z

Thinking we should have a checklist of security practices that we agree to follow before getting on the security team. Like, if someone has access to privileged information, and we're making it public that it's going to their inbox and machine, what are the expectations we're holding each other to? Maybe we should come to agreement on that. Thoughts @greggles? Would love to hash this out with you.

Is this a blocker guys?

Rough thoughts:

using a password vault?
x bits of entropy in every password (higher expectations of master password)
up-to-date and patched operating system
looking unfavourably on windows?
understanding of PGP? Email encryption?
ssh keys with passphrases (x bits of entropy min)
locked phone screen?
???
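To make the "x bits of entropy" items concrete, here is a minimal sketch of how entropy relates to length for a password whose characters are chosen uniformly at random. The function names and the alphabet are assumptions for illustration, and the target value is left as a parameter because the thread has not settled on a number. The formula does not apply to human-chosen passwords, which are far less random.

```python
import math
import secrets
import string

# Assumed alphabet for illustration: 62 letters and digits.
DEFAULT_ALPHABET = string.ascii_letters + string.digits


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a password whose characters are drawn uniformly at random."""
    return length * math.log2(alphabet_size)


def min_length(target_bits: float, alphabet_size: int) -> int:
    """Shortest uniformly random password that reaches target_bits of entropy."""
    return math.ceil(target_bits / math.log2(alphabet_size))


def generate_password(target_bits: float, alphabet: str = DEFAULT_ALPHABET) -> str:
    """Generate a random password with at least target_bits of entropy."""
    length = min_length(target_bits, len(alphabet))
    # secrets uses the OS CSPRNG, unlike the random module.
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

With this, whatever minimum we agree on for "x" translates directly into a minimum length for a vault-generated password or an ssh key passphrase.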

Maybe this is another issue. I'm just wary of not having spoken with anyone about their security practices before putting them on a security team. (myself included -- I appreciate being gut-checked)

I recognize that there's only so much we can ask of open source contributors, but at my previous job, I was always enamoured with the way pivotal labs deals with developer machines: they make each machine entirely disposable, and reprovision them regularly from scratch using an internal project called sprout. All devs know enough Chef to get their favourite mac config into code, so it's there when they reprovision. They share best practices and security practices by sharing and improving the provisioning code.

I'm getting a new ubuntu laptop this week, and I'll aim to work like this, if anyone else is interested.

greggles · 2014-03-05T21:45:29Z

Those all work for me. I'd add

something about not using untrusted networks (i.e. insecure wifi)
having your machine go into "lock" within X minutes of going "away"
A preference for an encrypted filesystem. It seems like a nice thing to ask for, though honestly I've never taken that step personally.
Some form of virus scanning (regardless of OS)

I don't think people on modern Windows machines should be excluded. Some of my favorite security people use Windows :)

patcon · 2014-03-06T00:42:23Z

Rockin. Thanks @greggles

cc: @whit537 (in case you unsubbed previously)

patcon · 2014-03-12T22:39:47Z

When regenerating my ssh key, discovered that encrypting the private key (PKCS#8) is a low-effort way to make brute-forcing more difficult. Might be something to add to our security checklist
http://martin.kleppmann.com/2013/05/24/improving-security-of-ssh-private-keys.html
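The general idea behind making a passphrase harder to brute-force is key stretching: the encryption key is derived from the passphrase through many iterations of a hash, so each guess costs the attacker more work. The sketch below illustrates that idea with PBKDF2 from Python's standard library. It is not the actual ssh or openssl tooling, and the function name, salt size, and iteration counts are assumptions for illustration.

```python
import hashlib
import os
import time


def derive_key(passphrase: str, salt: bytes, iterations: int) -> bytes:
    """Stretch a passphrase into a 32-byte key using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def seconds_per_guess(iterations: int) -> float:
    """Roughly time one derivation, i.e. the cost of one brute-force guess."""
    salt = os.urandom(16)  # assumed salt size
    start = time.perf_counter()
    derive_key("not my real passphrase", salt, iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Iteration counts are illustrative only, not a recommendation.
    for count in (1, 1_000, 100_000):
        print(f"{count:>7} iterations: {seconds_per_guess(count):.6f} s per guess")
```

The per-guess cost grows roughly linearly with the iteration count, which is why a format with a stronger key derivation step helps even when the passphrase stays the same.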

patcon · 2014-04-21T07:23:25Z

New development in gratipay/inside.gratipay.com#15

chadwhitacre · 2014-04-22T00:44:33Z

We're on PagerDuty. IRC

patcon · 2014-04-24T01:30:48Z

@clone1018 and I walked through how it works:
https://www.youtube.com/watch?v=SNOnTp32ISo

patcon · 2014-04-24T01:34:12Z

Our action items were to find uptime monitoring services that were more fine-grained than uptimerobot (5 min is the best they can poll). I was going to test-drive port-monitor, and luke was going to check out statuscake, which both have 1 min checks (port-monitor is free, statuscake costs $6/month for the 1-minute check, free for 5-minute checks).

I had a bad experience with port monitor, and don't recommend we use them, as they don't seem to handle basic auth and throw weird errors:
https://twitter.com/patconnolly/status/459140032096653312
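For reference, this is roughly what we're asking of a monitoring service: a periodic HTTP check that can send basic auth credentials. The sketch below uses only Python's standard library. The function names, the timeout, and the rough worst-case estimate are assumptions for illustration, not how uptimerobot, port-monitor, or statuscake work internally.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def check_once(url, username=None, password=None, timeout=10.0,
               opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status, else False.

    `opener` is injectable so the check can be exercised without a network.
    """
    request = urllib.request.Request(url)
    if username is not None:
        request.add_header(
            "Authorization", basic_auth_header(username, password or "")
        )
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors (4xx/5xx), DNS failures, refused connections, timeouts.
        return False


def worst_case_detection_seconds(poll_interval: float, timeout: float) -> float:
    """Rough upper bound: outage starts right after a check, next check times out."""
    return poll_interval + timeout
```

Under these assumptions, a 5-minute poll with a 10-second timeout can take a little over 5 minutes to notice an outage, while a 1-minute poll brings that down to a little over 1 minute.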

patcon · 2014-04-24T01:49:48Z

Oh, and @seanlinsley raised some valid concerns on us being vulnerable to trolling and maybe worse since there's no auth on the pagerduty commands:
https://botbot.me/freenode/gittip/msg/13750107/
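One way to address this would be an allowlist check in front of the pagerduty commands. The sketch below is in Python purely for illustration (hubot scripts are not written in Python), and the nicks, function names, and reply strings are all hypothetical.

```python
# Hypothetical allowlist; real entries would be the on-call team's nicks.
AUTHORIZED_NICKS = frozenset({"alice", "bob"})


def is_authorized(nick: str, allowed=AUTHORIZED_NICKS) -> bool:
    """Case-insensitive membership check against the allowlist."""
    return nick.lower() in allowed


def handle_pager_command(nick: str, command: str, allowed=AUTHORIZED_NICKS) -> str:
    """Refuse pager commands from anyone not on the allowlist."""
    if not is_authorized(nick, allowed):
        return f"Sorry {nick}, you are not allowed to run '{command}'."
    return f"Running '{command}' for {nick}."
```

A nick-based check is only as strong as the chat network's identity guarantees, so on IRC it would probably need to be paired with some check that the nick is registered and identified.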

clone1018 · 2014-05-02T01:09:17Z

@patcon this is done right?

patcon · 2014-05-02T20:06:56Z

Yeah, I think we can close this. Thanks for the grooming

patcon self-assigned this Feb 25, 2014

patcon added 3 - Work in Progress labels Feb 25, 2014

chadwhitacre mentioned this issue Mar 3, 2014

Server congestion #2109

Closed

patcon mentioned this issue Mar 5, 2014

Willingness to support sprout on other *nix platforms? pivotal-sprout/sprout-osx-apps#237

Closed

patcon mentioned this issue Apr 19, 2014

we need a security policy #576

Closed

patcon closed this as completed May 2, 2014

chadwhitacre unassigned patcon Apr 17, 2015
