Nelz's Blog

Mah blogginess

Velocity 2012 - Day 1 Notes

I started the day poorly. Even though I got in on time, I forgot to bring a notebook.

Also, I didn’t properly charge my laptop, so I couldn’t take notes with that. It took me until 10AMish before I found a notebook for jotting down my notes, so the first couple are very incomplete. Mea culpa.

Keynote by Facebook:

I found the discussion of their new hire process very interesting: “New Hire Cave” for 6 weeks w/Mentors and fixing bugs

They also had some interesting results with iteration, even when applied to the building of their datacenters and hardware.

Keynote by Google:

Prediction APIs, giving browser hints as to ‘normal’ or expected flows

“chrome://predictions” – Chrome 20+, can see url suggestion trees

Keynote - Lightning Demos:

http://webpagetest.org

  • “page speed index” = how long until pixels rendered
  • PageSpeed Insights (Google)
  • render order vs load order
  • public as of today
  • can see “critical (render) path”

http://httparchive.org

  • real-life data from ‘in the wild’
  • free
  • see trends
  • based on WebPageTest.org

Keynote - “How Complex Systems Fail”

A.k.a. How Complex Systems DON’T Fail

(Speaker comes from medical background, but research translates to what we do in Ops)

We should be programming for resilience in addition to reliability

I.e. how the operations people can use/fix/tune our apps

Design for Resilience:

  • support continuous maintenance
  • reveal control to operators
  • show the ‘lift points’ (heavy equip analogy)
  • support mental simulation
  • open objects/methods
  • deep-six the “don’t touch me”s

Keynote - “Broadening the User Perspective”

historically web ops have concentrated on load-times

new metrics:

  • when browser starts painting
  • when full screen displayed
  • when window becomes interactive
  • “Web Timings” spec

“Beyond CDN” - (Akamai)

Major ISPs are peering networks; secondary networks have to pay to exchange/interchange data with the main ISPs.

Growth Expectations in the near future:

  • First Mile (Site -> Interchanges) - 20X
  • Last Mile (ISP -> End User) - 50X
  • Middle Mile (Interchange -> Interchange) - only 6X - THIS IS GOING TO BE THE BOTTLENECK!

ISPs are not motivated to invest in the Middle Mile, since it is a cost center

Interesting Example: The two major Brazilian ISPs hate each other, so traffic between them routes via Miami for interchange

“Real Time at Twitter”

monolithic app [‘monorail’ ;–)] –> JVM-based SOA

1st problem: ops visibility

  • was all in Nagios & Ganglia
  • difficult to change (not responsive)

solved: created visibility stack

  • self-service timings infrastructure (w/a query language?)

then abstracted out network substrate

  • started as local interfaces
  • evolved to remote interface

“Zipkin” – GitHub project; trace tool

“Iago” – GitHub project; load generation tool

“Finagle” – GitHub project; enabled much of their SOA

“Thrift” also used as common ‘language’ between services

They started by deploying new features ‘dark’, then slowly turned up usage
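The talk didn't go into how the ‘dark’ dial works, so here is only a minimal sketch of one common way to do it: hash each user into a stable bucket and enable the feature for buckets below the current percentage. The function name `in_rollout`, the feature name `new-timeline`, and the user IDs are all made up for illustration; none of this is Twitter's actual mechanism.

```python
import hashlib


def in_rollout(user_id, feature, percent):
    """Return True if this user falls inside the first `percent` of buckets."""
    # Hash feature + user so every feature gets its own stable bucketing.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000  # 0..9999
    # percent=2.5 enables buckets 0..249, percent=100 enables all of them.
    return bucket < percent * 100


# Hypothetical user population, just to watch the dial being turned up.
users = [f"user-{n}" for n in range(10_000)]
for percent in (0, 1, 10, 50, 100):
    enabled = sum(in_rollout(u, "new-timeline", percent) for u in users)
    print(f"{percent:>3}% dial -> {enabled} of {len(users)} users")
```

Because the bucket depends only on the user and the feature, anyone who sees the feature at 10% still sees it at 50%, which keeps the experience consistent while usage is turned up.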

“Rollbacks: The Impossible Dream”

Entire concept based on transactional db model

No One Tests: Disaster Recovery / Rollback

Preventative Design:

  • Increased Resilience - less need for rollback
  • DevOps - better integration b/w engineers and ops
  • Small, Iterative changes
  • Accept that sometimes failure happens
  • TEST! - There are only a *few* things that truly “can’t be tested”
  • “Assumption is the mother of all fuckups”

Ops Mythologies (“it can’t be done”) come from scar tissue – kill your myths!

“Using Node.js to Improve the Performance of Mobile Apps and Mobile Web”

radio waves suck! – vs fiber/copper

high latency requests b/c of physical movement/constraints

client-side MVC sucks

maybe we should emulate the Google homepage, which ‘barfs’ the whole site out quickly

Interesting: mobile networks try to ‘guarantee’ delivery of data once it’s actually made it to the mobile network itself

Rendering HTML vs rendering JSON have tradeoffs to be considered for mobile

“Mojito” for rendering page fragments via Node.js

“Stability Patterns” - (Michael Nygard)

[Note: this was a subset of the book “Release It!” which everyone should just buy & read]

A killer test harness would be one that throws “Out-of-Spec” errors:

  • E.g. Returns an MP3 instead of XML in response to an XHR
  • E.g. … many other evil considerations
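To make the MP3-instead-of-XML idea concrete, here is a rough sketch of what one such ‘evil’ test could look like. Everything in it is an assumption for illustration: `evil_response` stands in for a misbehaving dependency, `parse_feed` stands in for the client code under test, and the bytes are just a fake MP3-looking header. It is not taken from the talk or the book.

```python
import xml.etree.ElementTree as ET

# A few bytes that look like the start of an MP3 file (ID3 tag + frame sync).
FAKE_MP3 = b"ID3\x03\x00\x00\x00\x00\x00\x0f" + b"\xff\xfb\x90\x00" * 8


def evil_response():
    """Hypothetical out-of-spec dependency: 200 OK, but audio instead of XML."""
    return 200, {"Content-Type": "audio/mpeg"}, FAKE_MP3


def parse_feed(status, headers, body):
    """Client under test: return the parsed XML root, or None if out of spec."""
    content_type = headers.get("Content-Type", "")
    if status != 200 or not content_type.startswith(("application/xml", "text/xml")):
        return None
    try:
        return ET.fromstring(body)
    except ET.ParseError:
        # The header claimed XML but the body was something else entirely.
        return None


# The harness only needs the client to fail cleanly instead of blowing up.
print(parse_feed(*evil_response()))
print(parse_feed(200, {"Content-Type": "application/xml"}, FAKE_MP3))
```

The point of the exercise is that the caller gets a boring `None` it can handle, rather than an unhandled exception (or a hang) deep inside the parsing code.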

“Time to First Tweet” - (Twitter)

Moving initial load from Client-Side Rendering –> Server-Side Rendering

Performance is highly contextual

Used “Navigation Timing API” – supported by IE9, FF7, Chrome

Those browsers represent most users; they pulled the data from a small % of users
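For reference, the Navigation Timing API exposes millisecond timestamps such as `navigationStart`, `responseStart`, `domInteractive` and `loadEventEnd`. A minimal sketch of turning one reported set of timestamps into durations might look like the following; the `derive_metrics` helper, the output key names, and the sample beacon values are assumptions for illustration, not how Twitter processed their data.

```python
def derive_metrics(timing):
    """Convert absolute Navigation Timing timestamps (ms) into durations (ms)."""
    start = timing["navigationStart"]

    def elapsed(field):
        # The spec reports 0 for events that have not happened yet,
        # so treat 0 (or a missing field) as "no measurement".
        value = timing.get(field, 0)
        return value - start if value else None

    return {
        "ttfb_ms": elapsed("responseStart"),
        "dom_interactive_ms": elapsed("domInteractive"),
        "load_ms": elapsed("loadEventEnd"),
    }


# Made-up beacon from one sampled page view.
beacon = {
    "navigationStart": 1340700000000,
    "responseStart": 1340700000180,
    "domInteractive": 1340700000950,
    "loadEventEnd": 1340700002400,
}
print(derive_metrics(beacon))
# -> {'ttfb_ms': 180, 'dom_interactive_ms': 950, 'load_ms': 2400}
```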

Because they are in the middle of a Rails –> JVM change, they have a multi-language templating solution

  • Templating on Client: Mustache.js -> Hogan.js
  • Templating on Server: Mustache.java
  • (and another C++ implementation w/ Ruby bindings)

Migrating away from HashBang

Loading JS

  • CommonJS and AMD
  • decouple loading from execution
  • enables multiple loading times: lazy, parallel, etc
  • transparent to JS developers

Layering On pushState support

  • want fast in-app navigation
  • want to avoid full-page refreshes in modern browsers
  • keeps simple index-ability
  • Best of Both Worlds!

In Browser: on click, if the browser has History API support, intercept the click and request the link via XHR

On Server: if the request is via XHR, send just the partial; otherwise send the whole HTML page (decorated)
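The server half of that is simple enough to sketch. This is a minimal illustration under a couple of assumptions: the client marks its XHR calls with an `X-Requested-With: XMLHttpRequest` header (a common convention that many JS libraries set, not something browsers add on their own), and `render_partial`, `decorate` and `respond` are hypothetical helper names, not Twitter's code.

```python
def render_partial(path):
    """Hypothetical view: just the fragment that changes between pages."""
    return f'<div id="content">Content for {path}</div>'


def decorate(fragment):
    """Wrap a fragment in the full page chrome for normal page loads."""
    return (
        "<!doctype html><html><body>"
        "<nav>site nav</nav>" + fragment + "</body></html>"
    )


def respond(path, headers):
    """Send only the partial to XHR callers, the whole page to everyone else."""
    fragment = render_partial(path)
    # Assumes header names were already normalized to this casing.
    is_xhr = headers.get("X-Requested-With") == "XMLHttpRequest"
    return fragment if is_xhr else decorate(fragment)


print(respond("/nelz", {"X-Requested-With": "XMLHttpRequest"}))
print(respond("/nelz", {}))
```

Since both paths render the same fragment, a crawler or an older browser asking for the plain URL gets the same content as an in-app navigation, just with the page chrome around it.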

Cut 95th percentile by 75%
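For anyone fuzzy on what that number means: the 95th percentile is the load time that 95% of measured page views beat, so cutting it by 75% is about the slow tail, not the average. Here is a small sketch of the arithmetic using the nearest-rank method. The sample timings are invented purely so the numbers work out; they are not Twitter's measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Invented load times in milliseconds, before and after a change.
before = [1000 + 100 * i for i in range(20)]
after = [250 + 25 * i for i in range(20)]

p95_before = percentile(before, 95)
p95_after = percentile(after, 95)
reduction = 1 - p95_after / p95_before

print(p95_before, p95_after, f"{reduction:.0%}")
# -> 2800 700 75%
```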

Summary

Overall, it was a pretty good day.

I think the presentations I’ve seen were pretty good. The best and most polished presentations are on all the new front-end and/or client-side tooling and techniques. The (still strong but) weakest presentations seem to be around some of the back-end and server-side techniques. I guess my critique is the tone of these presentations: they seem to be saying “this is the way we’re going”, rather than coming from a position of “we’ve done this and it was successful because of X, Y, Z.”

I actually didn’t hate the exhibit hall for this conference. For the most part the vendors seem to be staffing with actual staff, and only about 30% are resorting to ‘booth babes’. (EdgeCast and Dyn seem to be the worst ‘booth babe’ offenders. I’m considering telling them tomorrow that they lose some credibility that way. I am also considering asking the GoDaddy staff “WTF is wrong with your CEO?”)

I’m looking forward to tomorrow, but I can tell I’m going to be a zombie once the day is over.

PS: Too tired to format this mind dump right now… Will come back and fix it up in the next couple of days.

EDIT: Formatting mostly fixed on 28 June 2300h.