LiveBlog: Q&A with Twitter's John Adams
A short pre-lunch session to absorb a few moments:
John Adams (Twitter) Q&A
Q: How do you log all the info from your APIs?
A: syslog; looking at Scribe. Generally summarize and toss the raw logs.
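A minimal sketch of the "summarize and toss" idea, assuming a made-up log format of `METHOD ENDPOINT STATUS` per line. The endpoint names and the `summarize` helper are hypothetical and were not described in the session; the point is only that counts are kept and raw lines are not.

```python
from collections import Counter

def summarize(lines):
    """Count requests per (endpoint, status); raw lines are not retained."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines instead of failing the whole batch
        _method, endpoint, status = parts[:3]
        counts[(endpoint, status)] += 1
    return counts

# Hypothetical sample lines, not real Twitter logs.
sample = [
    "GET /timeline 200",
    "GET /timeline 200",
    "GET /search 503",
    "garbage",
]
summary = summarize(sample)
```

Only `summary` would need to be stored; the input lines can be discarded once counted.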
Q: How do you control abusive clients?
A: Rate limiting, apply feature limits to abusers, etc.
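The answer did not say which rate-limiting algorithm is used. As one common approach, here is a token-bucket sketch; the `TokenBucket` class, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time

class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the bucket can be tested without sleeping
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Usage with a fake clock: two requests pass, the third is rejected,
# and one more passes after a second has elapsed.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
results = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0
results.append(bucket.allow())
```

In practice one bucket would be kept per client (for example in a dict keyed by API key), and abusive clients could be given a smaller `capacity` or `rate`.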
Q: What would you do differently?
A: Would have implemented change controls much sooner. The process is much better now, with more control and predictability.
Q: How does your on-call team work?
A: More people in the rotation means a shorter on-call stretch for each person. Nagios with alerts and aggregation of alerts. Make alerts actionable (db fails? You should see one page for the db being down, not 500 for the webservers). This also prevents burnout.
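The "one page for db down, not 500 webservers" idea can be sketched as root-cause filtering over a dependency map. This is not Nagios configuration (Nagios has its own dependency definitions); the `pages_to_send` function and the service names are hypothetical.

```python
def pages_to_send(failing, depends_on):
    """Return only the failing services that have no failing dependency (likely root causes)."""
    failing = set(failing)
    return sorted(
        service
        for service in failing
        if not any(dep in failing for dep in depends_on.get(service, ()))
    )

# Hypothetical outage: the database and all 500 webservers that depend on it are failing.
webservers = [f"web{i}" for i in range(500)]
depends_on = {name: ["db"] for name in webservers}
pages = pages_to_send(["db"] + webservers, depends_on)
```

Here `pages` contains only `"db"`, so the on-call engineer gets a single actionable page. If a webserver fails while the database is healthy, that webserver is reported on its own.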
Q: Carry a real pager?
A: Some, mostly SMS. There are escalations if you don’t answer. Always someone from Ops and Eng on the pager chain.
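A rough sketch of an escalation chain, assuming each contact is paged in order until someone acknowledges. The `escalate` function, the contact names, and the `acknowledged` callback are illustrative assumptions; a real system would wait for a timeout between steps.

```python
def escalate(chain, acknowledged):
    """Page contacts in order and stop at the first acknowledgement.

    Returns the list of contacts who were paged.
    """
    paged = []
    for contact in chain:
        paged.append(contact)
        if acknowledged(contact):
            break
    return paged

# Hypothetical chain with both Ops and Eng represented.
chain = ["ops-primary", "eng-primary", "ops-manager"]
paged = escalate(chain, acknowledged=lambda contact: contact == "eng-primary")
```

In this example the Ops primary does not answer, so the page escalates to the Eng primary and stops there.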