A short pre-lunch session to absorb for a few moments:

John Adams (Twitter) Q&A

Q: How do you log all the info from your APIs?

A: syslog for now, looking at Scribe. Logs are generally summarized, then tossed.

Q: How do you control abusive clients?

A: Rate limiting, apply feature limits to abusers, etc.
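
The answer did not say how the rate limiting is implemented. As a rough illustration only, here is a minimal per-client token bucket, one common way to do it. The class name, parameters, and numbers are all hypothetical and are not taken from the talk.

```python
import time


class TokenBucket:
    """Per-client token bucket (illustrative sketch, not Twitter's implementation).

    Each client gets `capacity` tokens; tokens refill at `rate` per second.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}   # client_id -> (tokens, time of last update)

    def allow(self, client_id):
        """Return True if the request may proceed, False if the client is over its limit."""
        now = self.clock()
        tokens, last = self.buckets.get(client_id, (self.capacity, now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client_id] = (tokens - 1, now)
            return True
        self.buckets[client_id] = (tokens, now)
        return False
```

Applying "feature limits to abusers" could then be as simple as giving flagged clients a limiter with a smaller `rate` and `capacity`, though that too is an assumption about how it might be done.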

Q: What would you do differently?

A: Would have implemented change controls much sooner. The process is much better now, with more control and predictability.

Q: How does your on-call team work?

A: More people means less time on call per person in the rotation. Nagios with alerts and aggregation of alerts. Make alerts actionable (db fails? You see one page for db down, not 500 pages for webservers). This also prevents burnout.
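
To make the "one page for db down, not 500 webservers" idea concrete, here is a small sketch of alert aggregation by root cause. It is not Nagios configuration and not the setup described in the talk; the function name, the check names, and the `depends_on` map are hypothetical.

```python
def pages_to_send(failing, depends_on):
    """Group failing checks under their root cause so each root cause pages once.

    failing:    set of names of checks that are currently failing
    depends_on: dict mapping a check to the single check it depends on (assumed shape)
    Returns a dict of root-cause check -> list of symptom checks it explains.
    """
    pages = {}
    for check in sorted(failing):
        root = check
        seen = {root}
        # Walk up the dependency chain for as long as the parent is also failing.
        # `seen` guards against a cycle in the dependency map.
        while depends_on.get(root) in failing and depends_on[root] not in seen:
            root = depends_on[root]
            seen.add(root)
        pages.setdefault(root, [])
        if check != root:
            pages[root].append(check)
    return pages
```

With a failing `db` and 500 failing web checks that depend on it, this returns a single entry keyed on `db`, so the on-call person gets one actionable page with the web failures listed as symptoms.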

Q: Carry a real pager?

A: Some, mostly SMS. There are escalations if you don’t answer. Always someone from Ops and Eng on the pager chain.




Published

23 June 2009
