Nelz's Blog

Mah blogginess

Velocity 2012 - Day 2 Notes

Keynote: “Frying Squirrels & Unspun Gyros” - (Yahoo)

All about HA planning

29% of DataCenter outages caused by UPS failures - point: some defenses create their own failure modes

If you solve for Datacenter-sized outages, the ‘little stuff’ is mostly taken care of

Keynote: Performance of Web vs Apps

Apps are trending to have more impact/viewership/etc than the web

Fragmentation is a scary future with Apps

“Pushing Pixels” (i.e. highly graphical apps) seems to be best served by Apps right now

Responsiveness is very compelling when considering your choice

App lifecycle (approval, etc) is MUCH SLOWER than Web

“Conditional Tier Rendering” – use Node.js to choose client/server rendering

In many scenarios, Web seems to have the advantage right now

Keynote: [Political|Commercial] [Threats|Opportunities] for Faster & Stronger Web (Union Square Ventures)

Ops as viewed by Investors: “Speed is a feature”. Invest in it.

Why has web been so successful:

  • Supportive Legislation
  • Decentralized Internet
  • Open Standards

Now (+ Possible Future)

  • Restrictive Legislation
  • Walled Gardens
  • Controlled Internet

Lame claim by ISPs: “We haven’t invested enough in infrastructure, so therefore we should be able to control what crosses it.” >:–|

App: No “View Source”, No Linking

Historically Recognized Patterns in Technological Adoption: ~20 years in there’s a temporary crash while the world adapts to a structure of Networks over the older Hierarchies

Combat the threats to the internet:

  • Engage (politically)
  • Build!


Keynote: Lightning Talks

Intel - Agile Application Performance Management


Akamai - Akamai Internet Observatory

Basically, they are opening up some subsets of their data to be analyzed by the ‘open data’ community.

Example: in 60 seconds, gather 1.3 BILLION log lines

Their data should give a view of 20-30% of world internet usage

Compuware - UX Management


Citrix - Blah blah Network Utility pitch…


GoDaddy - Blah blah

Unfortunately, it seems that misogyny is a point of company pride for them.

Lightning Demos


  • now has a cmd-line version
  • now open-source
  • new URL:

integrates with PhantomJS; this can then integrate with Continuous Integration tools!!!



you can see low-level Chrome processing info

(can help you speed up a site)

“CrRendererMain” is the thread to keep an eye on

FYI, ~16ms is how often a redraw should happen at current refresh rates (1000ms / 60Hz ≈ 16.7ms). To see examples:


Pretty cool tool to see actual API adherence from many browsers

Changing Culture and Being a Force for Awesome… - (Jesse Robbins)

“The right culture is a requirement for survival and success at web scale.”

Bad news: changing culture takes time ;–)

Jesse’s Rule: Don’t Fight Stupid, Make More Awesome

Changing Culture:

  • start small, build trust & safety
  • create champions
  • use metrics to build confidence
  • celebrate successes
  • exploit compelling events

Hack for Change

Starting Small

  • small is safe
  • call it an experiment

Creating Champions

  • get exec sponsors, start @ your boss
  • give *everyone else* the credit
  • give ‘special status’ (e.g. pins/shirts/hats/etc)

Use Metrics to Build Confidence

  • find KPI that supports change
  • track & use that KPI ruthlessly
  • first, use it to show value
  • later, use it to show cost of lagging
  • tell a story with your data, not just a number

Celebrate Success

  • tell a powerful story
  • always be positive about people and talk about how they overcame a problem
  • NEVER about people who created a problem
  • leave room for people to come to your side

Exploit Compelling Events

  • just wait, it will happen
  • no “I told you so”, just “what do we do now?”

re: Permission - “No” frequently means “I don’t know how to say Yes to you”: Use creativity to hack whatever the barriers are

Wisdom of the Crowd: Real User Measurement

  • “RUM”
  • Look into using BOTH of these: “window.performance” and “window.onerror”
  • Google Analytics has been doing this for ~1 year, you may already have the data!

Leveling Up - Take Your [Ops|Eng] Role to the Next Level

Work: “It’s not just what you do, it’s also how you do it.”

Decent talk, but I didn’t take many notes. Hopefully the slides will be put up later, but this blog post from the speaker covers a lot of the same ground:

Logging as Event Streams - (Rackspace)

Logging meets the cloud

  • many multi-tenant services
  • many servers across many teams

Logging is an ‘Event Emitter’ (as realized when working with Node.js)

Structured Logging

  • since 1995
  • many producers, many consumers
  • many programming languages

JSON

  • pretty much all languages can use it
  • can set up as newline terminated
  • grep works on it
  • can do hierarchical data (recommended: only go 1 level deep)
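
To make the JSON points concrete, here is a minimal Python sketch of newline-terminated JSON logging. It is not the speaker's code; the function names and the field names ("ts", "id", "msg") are assumptions made up for illustration.

```python
import json
import time


def format_event(message_id, message, **fields):
    """Serialize one log event as a single line of JSON.

    The keys "ts", "id" and "msg" are hypothetical; use whatever your
    consumers agree on. Extra fields should stay flat or one level deep.
    """
    event = {"ts": time.time(), "id": message_id, "msg": message}
    event.update(fields)
    # json.dumps escapes newlines inside strings, so the result is always
    # exactly one line and stays friendly to grep and other line tools.
    return json.dumps(event, sort_keys=True)


def parse_events(lines):
    """Yield one dict per non-empty line of a newline-terminated JSON log."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Usage: write `format_event("AH02571", "cert check failed", host="web1") + "\n"` to the log file, and any consumer in any language can read the file back one line at a time.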

Message Tags / IDs

  • Easy to search for strings (e.g. “AH02571”)
  • Important for internal usage, and especially for Open Source

Ref: Zipkin: Uses 64-bit integer as ID
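
A rough Python sketch of both ideas: a fixed message tag that is easy to search for, plus a random 64-bit integer ID for correlating lines. This does not use Zipkin itself; the function names, the `trace=` field and the 16-character hex rendering are assumptions for illustration.

```python
import secrets


def new_trace_id() -> int:
    """Return a random ID that fits in an unsigned 64-bit integer."""
    return secrets.randbits(64)


def format_trace_id(trace_id: int) -> str:
    """Render a 64-bit ID as 16 lowercase hex characters (fixed width)."""
    if not 0 <= trace_id < 2**64:
        raise ValueError("trace_id must fit in 64 bits")
    return f"{trace_id:016x}"


def tag_line(message_id: str, trace_id: int, text: str) -> str:
    """Prefix a log line with its message tag and trace ID.

    The layout "<tag> trace=<id> <text>" is hypothetical; the point is that
    both values are plain strings a reader can search for.
    """
    return f"{message_id} trace={format_trace_id(trace_id)} {text}"
```

The fixed-width hex form keeps IDs the same length in every line, which makes them easier to search for and to align when reading raw logs.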

Shipping Logs

  • ALWAYS write to local disk first (ALWAYS!)
  • svlogd (?)
  • Scribe - Abandoned by Facebook
  • Flume - Hadoop focused
  • syslog - Solid contender, very robust

Chose Scribe

  • does bulk log mv, can buffer / retry / etc
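
The "local disk first, then buffer / retry" idea can be sketched in a few lines of Python. This is not how Scribe is implemented; the class name, the `send` callable and the retry count are all hypothetical, and the sketch assumes a single writer.

```python
import os


class LocalFirstShipper:
    """Append every line to a local spool file, then ship from that file."""

    def __init__(self, spool_path, send, max_attempts=3):
        self.spool_path = spool_path
        self.send = send  # hypothetical callable(list_of_lines); raises OSError on failure
        self.max_attempts = max_attempts
        self.offset = 0  # number of spool bytes already shipped
        open(self.spool_path, "a").close()  # make sure the spool exists

    def log(self, line):
        """Durably write one line to local disk before anything else happens."""
        with open(self.spool_path, "a", encoding="utf-8") as f:
            f.write(line.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())

    def ship(self):
        """Send all unshipped lines in bulk; return True if nothing is left pending."""
        with open(self.spool_path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        if not data:
            return True
        lines = data.decode("utf-8").splitlines()
        for _ in range(self.max_attempts):
            try:
                self.send(lines)
            except OSError:
                continue  # the spool still holds the data, so just try again
            self.offset += len(data)
            return True
        return False  # give up for now; the lines stay on disk for a later ship()
```

Because the offset only advances after a successful send, a failed shipment loses nothing: the next `ship()` call picks up from the same place in the spool file.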


  • opensource
  • many imports
  • indexes into ElasticSearch
  • use regex to build pre-fab notification sets
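
As a rough illustration of the regex point, here is a small Python sketch that groups log lines into named notification sets. It is not the API of the tool from the talk; the rule names and patterns are invented for the example.

```python
import re

# Hypothetical rules: each name maps to a pattern that selects lines
# someone wants to be notified about.
RULES = {
    "disk": re.compile(r"disk (full|failure)", re.IGNORECASE),
    "tagged": re.compile(r"\bAH\d{5}\b"),
}


def build_notification_sets(lines, rules=RULES):
    """Return {rule name: [matching lines]} for the given log lines."""
    matches = {name: [] for name in rules}
    for line in lines:
        for name, pattern in rules.items():
            if pattern.search(line):
                matches[name].append(line)
    return matches
```

A line can land in more than one set, and lines that match no rule are simply ignored.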

Log Data: You don’t need to keep it forever

Scalable System Ops - (Tumblr)

[Note: Lots of great info in this talk, but a lot was out of my experience. (Provisioning servers, etc.) Not gonna recreate it here, but I suggest anyone interested in provisioning look for this talk.]

“Collins” – tool to help w/provisioning

Use a lot of Unix principles when building tools

[Note: By the afternoon I was super tired and worn out, so my notes got very terse, to the point that I will only sum up what I can about the talk.]

Down With the Fancy Pants

“Complex systems fail in complex ways.”

“Premature optimization is the root of all evil.”

Basically, this talk was a caution to critically look at your stack to see if you can reduce weaknesses by removing unproven technologies.

Solving IT Issues with Differential Diagnosis

The speaker talked about his success in adopting learnings from the medical community to apply Differential Diagnosis and the Hypothetico-Deductive Model to IT problems.

Basically, he encourages us to use more scientific rigor when trying to find the root causes of system malfunctions.

Choose Your Own Adventure

This was a very fun talk, appropriately placed at the end of the conference.

Adam brought a long experience in Ops/SysAdmin/Dev to speak about a range of topics chosen by audience members yelling out their preference from a list of choices.

Challenges to Cultural Change

These are slides from a talk that I didn’t attend, but they echo a bunch of good advice from Jesse Robbins’ Keynote presentation. I’m including them because it was well received:

EDIT: 28 June 2300h – light editing for formatting