Sunday, July 31, 2011

Stupid SPDY Tricks

‹prev | My Chain | next›

For my last official post of the SPDY chain (*sniff*), I would like to goof around with SPDY server push. I am fairly convinced that SPDY server push, pushing data directly into browser cache, is the killer SPDY feature. I am so enamored of it that I dedicated two chapters to it in the SPDY Book.

One of the many interesting aspects of SPDY push is its ability to defer the push until the server is ready to send it along. What this means in practice is that the server can jam out a bunch of static data to the browser almost immediately. It can then send out dynamic information once it has had a chance to perform any necessary computation.

To see this in action, I propose to send out a static page with very little on it. I will then push out a bunch of static data (jQuery, CSS, images) and a bit of dynamic data.

First, I create a regular express.js application:
➜  samples git:(master) ✗ express goofy-spdy
create : goofy-spdy
create : goofy-spdy/package.json
create : goofy-spdy/app.js
create : goofy-spdy/views
create : goofy-spdy/views/layout.jade
create : goofy-spdy/views/index.jade
create : goofy-spdy/public/stylesheets
create : goofy-spdy/public/stylesheets/style.css
create : goofy-spdy/public/images
create : goofy-spdy/public/javascripts
create : goofy-spdy/logs
create : goofy-spdy/pids
I then install the spdy, express-spdy, and jade npm packages. The spdy package is a dependency of express-spdy, so it would get installed either way, but I have to install it explicitly so that I can get access to the createPushStream() method in the application (ah, the joys of npm).
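The dependencies section of the resulting package.json ends up with all three listed explicitly. A sketch (I leave the version constraints open rather than pinning anything):

  "dependencies": {
    "express-spdy": "*",
    "spdy": "*",
    "jade": "*"
  }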

In the app.js file, I configure my server to be SPDY-ized and to perform push directly into the browser cache:
var express = require('express-spdy')
  , fs = require('fs')
  , createPushStream = require('spdy').createPushStream
  , host = "https://jaynestown.local:3000/";

var app = module.exports = express.createServer({
  key: fs.readFileSync(__dirname + '/keys/jaynestown.key'),
  cert: fs.readFileSync(__dirname + '/keys/jaynestown.crt'),
  ca: fs.readFileSync(__dirname + '/keys/jaynestown.csr'),
  NPNProtocols: ['spdy/2'],
  push: awesome_push
});
Before taking a look at that awesome_push callback, let's first see the home page of this site:



Each letter in "Hello" is a separate image and the page also has some CSS and pulls in jQuery:
<!DOCTYPE html>
<html>
  <head>
    <title>Welcome to this Awesome Site!</title>
    <link rel="stylesheet" href="/stylesheets/style.css"/>
    <script src="/javascripts/jquery-1.6.2.min.js"></script>
  </head>
  <body>

    <h1>Just wanna say...</h1>

    <div id="hello">
      <img src="/images/00-h.jpg">
      <img src="/images/01-e.jpg">
      <img src="/images/02-l.jpg">
      <img src="/images/03-l.jpg">
      <img src="/images/04-o.jpg">
    </div>

  </body>
</html>
To make the page a bit more dynamic, I add the following "profile" section to the HTML:
    <div id="profile" style="display:none">
<p>
Welcome back, <span id="name"></span>.
</p>
<p>
Today is <span id="date"></span>.
</p>
<p>
Your favorite color is: <span id="favorite_color"></span>.
</p>
<p>
I worked really hard to come up with this.
I think your favorite number might be
<span id="random_number"></span>.
</p>
</div>
That profile information will come from Javascript that will be injected into the browser cache:
  <script src="profile.js"></script>
<script language="javascript">
$(function() {
if (profile.name) {
$("#profile").show();
$("#name").text(profile.name);
$("#date").text(profile.date);
$("#favorite_color").text(profile.favorite_color);
$("#random_number").text(profile.random_number);
}
});
</script>
On the file system, profile.js contains nothing more than an empty profile object:
➜  goofy-spdy git:(master) ✗ cat public/profile.js 
var profile = {};
But using dynamic SPDY server push, we can populate the profile Javascript object with something interesting.

So back to the express-spdy backend. The awesome_push() callback will contain:
function awesome_push(pusher) {
  // Only push in response to the first request
  if (pusher.streamID > 1) return;

  // Oh boy, this is going to take a while to compute...
  long_running_push(pusher);

  // Push resources that can be deferred until after the response is
  // sent
  pusher.pushLater([
    local_path_and_url("stylesheets/style.css"),
    local_path_and_url("javascripts/jquery-1.6.2.min.js"),
    local_path_and_url("images/00-h.jpg"),
    local_path_and_url("images/01-e.jpg"),
    local_path_and_url("images/02-l.jpg"),
    local_path_and_url("images/03-l.jpg"),
    local_path_and_url("images/04-o.jpg")
  ]);
}
First up, we start a long-running push operation with a call to long_running_push(). This is node.js, so it will not block. Then we push the stylesheet, JS, and images directly into the browser cache. This is a "later" push in that the data will be sent after the original HTML response is complete.
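For reference, the local_path_and_url() helper just pairs the on-disk path with the URL under which the resource should be pushed, using the host variable defined at the top of app.js:

function local_path_and_url(relative_path) {
  return [
    "public/" + relative_path,
    host + relative_path
  ];
}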

Finally, in the long_running_push() function, push out the data—once it is available:
function long_running_push(pusher) {
  var url = host + "profile.js"
    , push_stream = createPushStream(pusher.cframe, pusher.c, url);

  // Send the push stream headers
  // Ew. Need to expose an API for this...
  push_stream._flushHead();
  push_stream._written = true;

  setTimeout(function () {
    // Write the push stream data
    push_stream.write(
      'var profile = {' +
      ' name: "Bob",' +
      ' favorite_color: "yellow",' +
      ' date: "' + (new Date).toString() + '",' +
      ' random_number: "' + Math.ceil(Math.random() * 10) + '"' +
      '}'
    );
    push_stream.end();
  }, 3*1000);
};
Loading up the app in Chrome, I see the same homepage—for 3 seconds. After 3 seconds have elapsed and the computationally intensive long_running_push() finally sends back its data, I see this in the browser:



Granted, it took a long time for my poor little laptop to calculate my favorite number, but that's not the point.

The point is that the user experience was excellent. The entire page rendered in under a second because the browser had to make only one request to get everything. There was no round trip to request the various Javascript, CSS, and image files. A single request resulted in everything being jammed into browser cache via SPDY server push.

Even more exciting is that dynamic data was injected into the browser cache and (this is the important thing) it did so without blocking any requests. Think about that. In vanilla HTTP, a computationally intensive request can block all requests on an interweb tube. Since browsers only get 6 interweb tubes to each site, a few of these types of requests and the page grinds to a halt.

With SPDY server push, nothing blocked. Everything flew across the wire when it was ready—no waiting on round trip times, blocking resources or anything. Just raw speed.

Awesome sauce. Wanna learn more? Buy my SPDY Book!


Day #99

Saturday, July 30, 2011

SPDY Server Push in node-spdy Revisited

‹prev | My Chain | next›

I just realized that some of the SPDY server push stuff that I write about in the SPDY Book is not actually available in node-spdy. Today, I would like to rectify that situation.

I had been spiking in the node_modules directory of a test application. That was not the brightest thing to do (I should have either installed from my local copy of node-spdy or sym-linked it). Anyhow, I copy my changes into my local copy of the node-spdy repo and am left with changes to two files:
➜  node-spdy git:(server-push-fixes) ✗ gst
# On branch server-push-fixes
# Changes not staged for commit:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: lib/spdy/push_stream.js
# modified: lib/spdy/response.js
#
no changes added to commit (use "git add" and/or "git commit -a")
The only change to the PushStream class is the last-modified header, which is required (at least by Chrome) for pushing CSS into the browser cache. I had hard coded that value:
    this._headers["last-modified"] = "Wed, 20 Jul 2011 01:34:27 GMT";
To get that date format (GMT), I cannot simply call toString() on a new Date object. Rather, I have to call toGMTString():
> (new Date).toString();
'Sat Jul 30 2011 15:15:50 GMT-0400 (EDT)'
> (new Date).toGMTString();
'Sat, 30 Jul 2011 19:15:55 GMT'
I do not use the actual last-modified date here, because there is no cache invalidation in SPDY server push. SPDY server push pushes directly into browser cache regardless of whether or not the browser already has the files. That is not a huge loss since most push occurs after the original request has been fulfilled. In an HTTP world, SPDY server push is all bonus.

My change to PushStream thus becomes no more than the following change to the constructor:
  this._headers = {
    status: 200,
    version: "http/1.1",
    url: url,
    "last-modified": (new Date).toGMTString()
  };
Most of the actual change takes place in the Response class. It is the Response that needs to initiate push streams and send data out before and after the response proper has been sent. The API that I would like to support is a callback to the standard express.js createServer call:
  push: function(pusher) {
    // Only push in response to the first request
    if (pusher.streamID > 1) return;

    var host = "https://jaynestown.local:3000/";

    // Push immediately with pushFile
    pusher.pushFile("public/stylesheets/style.css", host + "/stylesheets/style.css");

    // Push resources that can be deferred until after the response is
    // sent
    pusher.pushLater([
      ["public/one.html", host + "one.html"],
      ["public/two.html", host + "two.html"],
      ["public/three.html", host + "three.html"]
    ]);
  }
In either case, I need to tell the push stream where to find the resource on the filesystem and the URL under which to push it. It would be possible to infer the URL from the filesystem location, but I will not worry about that for now.
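If I did want to infer the URL, it would be little more than stripping the public/ prefix and prepending the host. A sketch (inferPushUrl is a hypothetical helper, not part of node-spdy):

function inferPushUrl(host, filename) {
  // "public/one.html" with host "https://jaynestown.local:3000/"
  // becomes "https://jaynestown.local:3000/one.html"
  return host + filename.replace(/^public\//, "");
}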

In Response, when writing the response data to the browser, I invoke the push callback:

Response.prototype._write = function(data, encoding, fin) {
  if (!this._written) {
    this._flushHead();
    this._push_stream();
  }
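That _push_stream() call is where the push callback supplied to createServer() gets its chance to run. I have not shown the wiring here, but it amounts to something like this sketch (the property name holding the callback is my guess; the key point is that the Response itself plays the role of the "pusher"):

Response.prototype._push_stream = function() {
  // Hand this response, acting as the pusher, to the user-supplied
  // push callback (if one was configured)
  if (this.push_callback) this.push_callback(this);
};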
The pushLater() method from the push callback is responsible for sending out the push stream headers (SPDY headers are sent separately from data) and for remembering the data to be pushed:
Response.prototype.pushLater = function(resources) {
  var that = this;

  this.deferred_streams = [];

  // Send headers for each post-response server push stream, but DO
  // NOT send data yet
  resources.forEach(function(push_contents) {
    var filename = push_contents[0]
      , url = push_contents[1]
      , data = fs.readFileSync(filename)
      , push_stream = createPushStream(that.cframe, that.c, url);

    push_stream._flushHead();
    push_stream._written = true;
    that.deferred_streams.push([push_stream, data]);
  });
};
Then, after the response is written, the deferred data can be pushed:
  this.c.write(dframe);

  // ...

  // Push any deferred data streams
  this._pushLaterData();
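The _pushLaterData() method walks the streams that pushLater() set aside, writes each one's data, and closes the stream. A minimal sketch of that (my reconstruction rather than a copy of the node-spdy source):

Response.prototype._pushLaterData = function() {
  (this.deferred_streams || []).forEach(function(stream_and_data) {
    var push_stream = stream_and_data[0]
      , data = stream_and_data[1];

    push_stream.write(data);
    push_stream.end();
  });

  this.deferred_streams = [];
};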
The only "real" change to the overall behavior is the support of deferred push with the addition of the pushLater method. I have also renamed the old push_file as pushFile to better fit Javascript and node-spdy conventions (I do leave a deprecated push_file() wrapper to retain backwards compatibility).

With that, there is little to do aside from trying it out in the browser. I do it right this time and sym-link my copy of the node-spdy repository into the application's node_modules. Loading it up in the browser and checking Chrome's SPDY tab in about:net-internals, I see that the reply to the web page request is immediately followed by the CSS being pushed directly into browser cache:
t=1312064818809 [st=124]     SPDY_SESSION_SYN_REPLY  
--> flags = 0
--> connection: keep-alive
content-length: 50360
content-type: text/html
status: 200 OK
version: HTTP/1.1
x-powered-by: Express
--> id = 1
t=1312064818812 [st=127] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/stylesheets/style.css
version: http/1.1
--> id = 2
t=1312064818812 [st=127] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 111
--> stream_id = 2
t=1312064818812 [st=127] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 0
--> stream_id = 2
Even the data associated with the CSS is pushed into browser cache as evidenced by the SPDY_SESSION_RECV_DATA events with the stream ID (2) of the CSS push stream.

Once node-spdy has sent out the pushFile() resource, it is time to push the pushLater() resources, but only the headers:
t=1312064818814 [st=129]     SPDY_SESSION_PUSHED_SYN_STREAM  
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/one.html
version: http/1.1
--> id = 4
t=1312064818815 [st=130] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/two.html
version: http/1.1
--> id = 6
t=1312064818817 [st=132] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/three.html
version: http/1.1
--> id = 8
t=1312064818821 [st=136] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 8184
--> stream_id = 1
Once all of the push headers have been sent, then node-spdy begins to send the response to the originally requested resource. The stream ID (1) tells us that this is for the original request and not one of the push streams, all of which have even IDs of 2 or higher.

After all of the response data has been sent out, only then do the push resources begin to go out:
t=1312064818823 [st=138]     SPDY_SESSION_RECV_DATA  
--> flags = 0
--> size = 0
--> stream_id = 1
t=1312064818857 [st=172] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 8184
--> stream_id = 4
That continues all the way through stream ID #8, at which point Chrome acknowledges that we have a legitimate push stream via a SPDY_STREAM_ADOPTED_PUSH_STREAM event:
t=1312064818884 [st=199]     SPDY_SESSION_RECV_DATA  
--> flags = 0
--> size = 0
--> stream_id = 8
t=1312064818929 [st=244] SPDY_STREAM_ADOPTED_PUSH_STREAM
Nice! That's a good stopping point for today. I will push my new branch to the node-spdy github repository and discuss it with Fedor Indutny to make sure it aligns with his thinking.

For now, it's back to slogging through the last edits of SPDY Book!


Day #98

Friday, July 29, 2011

Downgrading Google Chrome (unstable) on Ubuntu Ain't Easy

‹prev | My Chain | next›

Up tonight, I would like to see if it is possible to downgrade my Chrome installation. A few days back, Chrome stopped working with the Speed Tracer extension. At this point in SPDY Book development, I pretty much have everything I need from a research standpoint, but I would hate to find this weekend that I absolutely have to redo a Speed Tracer screen shot and have no way of doing so.

I have no idea if the old google-chrome-unstable package is still on the Google download server, but there is one way to find out. I can never remember how packages map to download URLs in Debian/Ubuntu. Fortunately, there is the --print-uris option to apt-get which can help:
➜  ~  sudo apt-get install -qq --reinstall --print-uris google-chrome-unstable
➜ ~
Or maybe not.

I would think that the --reinstall option would instruct apt-get to, well, re-install. Well, what happens when I pick the stable version?
➜  ~  sudo apt-get install -qq --reinstall --print-uris google-chrome-stable                                        
'http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-stable/google-chrome-stable_12.0.742.124-r92024_amd64.deb' google-chrome-stable_12.0.742.124-r92024_amd64.deb 22241936 MD5Sum:f0c30436363cb2f3965ee0c41a8723cb
Ah better.

I currently have Chrome version 14.0.835.8-r94414 installed. So that ought to translate into http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-unstable/google-chrome-unstable_14.0.835.8-r94414_amd64.deb. And, indeed, it does.
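The mapping is regular enough to script. A quick sketch of the pattern (inferred from the --print-uris output above, so nothing official about it):

function chromeDebUrl(channel, version) {
  var pkg = "google-chrome-" + channel;
  return "http://dl.google.com/linux/chrome/deb/pool/main/g/" +
         pkg + "/" + pkg + "_" + version + "_amd64.deb";
}

// chromeDebUrl("unstable", "14.0.835.8-r94414") produces the URL above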

Looking through /var/log/apt/history.log, I see the following recent upgrades took place:
...
google-chrome-unstable:amd64 (14.0.797.0-r89638, 14.0.803.0-r90483)
google-chrome-unstable:amd64 (14.0.803.0-r90483, 14.0.814.0-r91661)
google-chrome-unstable:amd64 (14.0.814.0-r91661, 14.0.825.0-r92801)
google-chrome-unstable:amd64 (14.0.825.0-r92801, 14.0.835.0-r94025)
google-chrome-unstable:amd64 (14.0.835.0-r94025, 14.0.835.8-r94414)
I reported the Speed Tracer no-worky issue back at revision 14.0.835.0. I think it actually broke in the previous release, so I would like to try to grab two releases before 14.0.835.0, which is 14.0.814.0-r91661. The URL for that would then be: http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-unstable/google-chrome-unstable_14.0.814.0-r91661_amd64.deb.

Unfortunately, that results in a 404:



Dang. Looks like Google removes old builds relatively quickly. I wonder if yesterday's build is still around?

Checking http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-unstable/google-chrome-unstable_14.0.835.0-r94025_amd64.deb, I get yet another 404. Darn it. They are nothing if not efficient.

So, to summarize: old builds disappear from the Google download server almost immediately. It is only at this point that I remember /var/cache/apt:
➜  ~  ls /var/cache/apt/archives/google-chrome-unstable*      
/var/cache/apt/archives/google-chrome-unstable_14.0.803.0-r90483_amd64.deb
/var/cache/apt/archives/google-chrome-unstable_14.0.814.0-r91661_amd64.deb
/var/cache/apt/archives/google-chrome-unstable_14.0.825.0-r92801_amd64.deb
/var/cache/apt/archives/google-chrome-unstable_14.0.835.0-r94025_amd64.deb
/var/cache/apt/archives/google-chrome-unstable_14.0.835.8-r94414_amd64.deb
Yay! Good thing I rarely run apt-get autoremove!

Before downgrading, I close Chrome and back up my config:
➜  ~  tar zcf google-chrome_config.tar.gz .config/google-chrome
I do not think I have anything in there that I cannot lose, but better safe. With that out of the way, I install my target archive:
➜  ~  sudo dpkg -i /var/cache/apt/archives/google-chrome-unstable_14.0.814.0-r91661_amd64.deb 
dpkg: warning: downgrading google-chrome-unstable from 14.0.835.8-r94414 to 14.0.814.0-r91661.
(Reading database ... 327451 files and directories currently installed.)
Preparing to replace google-chrome-unstable 14.0.835.8-r94414 (using .../google-chrome-unstable_14.0.814.0-r91661_amd64.deb) ...
Unpacking replacement google-chrome-unstable ...
Setting up google-chrome-unstable (14.0.814.0-r91661) ...
Processing triggers for python-gmenu ...
Rebuilding /usr/share/applications/desktop.en_US.utf8.cache...
Processing triggers for bamfdaemon ...
Rebuilding /usr/share/applications/bamf.index...
Processing triggers for desktop-file-utils ...
Processing triggers for menu ...
Processing triggers for man-db ...
Processing triggers for python-support ...
Yay! It looks like that worked. When I start up Chrome (with the --enable-extension-timeline-api command line option required by Speed Tracer), I am greeted with:



Meh. I can live with that. If it gives me my precious Speed Tracer back. And it does!



I wind up downgrading one more version to 14.0.803.0-r90483 because of an annoyance with new tabs. The bottom line is that I have Speed Tracer back and life is good.

Now I have no obstacles to finishing off SPDY Book. Except it's a huge book and I only have 3 days to proofread it. Ugh.


Day #86

Thursday, July 28, 2011

I Am Profoundly Sorry to Introduce express-unstable and connect-unstable

‹prev | My Chain | next›

I shift gears today to solve my express-spdy installation problem. With the release of SPDY Book imminent, I need the express-spdy installation to be as easy as possible. Unfortunately, it already needs a compiled, edge-openssl. I would prefer not to require a custom node.js install, but that is what I am faced with.

My dilemma is this. The node-spdy package depends on features in node 0.5.0-pre or later. The express.js and connect packages (on which express-spdy is dependent) only support node before 0.5.

Prior to the official release of 0.5, one could compile node from the github repository and use express-spdy. This worked because, at that point, the version of node was 0.5.0-pre (less than 0.5) which was good enough for express. Once node hit 0.5, however, connect would no longer install.

To work around this, I could tell readers to check out a specific SHA-1 in the git repository, but that feels hackish. Instead, I choose to support express and connect on node 0.5 myself--in the form of express-unstable and connect-unstable packages.

I feel bad about this, I truly do, but I see no better way of making the express-spdy install easy on readers. So...

I follow along with the Github guide to forking a repo and add the real connect as my upstream:
➜  connect git:(master) git remote add upstream git://github.com/senchalabs/connect.git   
➜ connect git:(master) git fetch upstream
remote: Counting objects: 268, done.
remote: Compressing objects: 100% (84/84), done.
remote: Total 222 (delta 160), reused 200 (delta 138)
Receiving objects: 100% (222/222), 31.01 KiB, done.
Resolving deltas: 100% (160/160), completed with 34 local objects.
From git://github.com/senchalabs/connect
* [new branch] 1.x -> upstream/1.x
* [new branch] features/staticProvider-cache -> upstream/features/staticProvider-cache
* [new branch] gh-pages -> upstream/gh-pages
* [new branch] master -> upstream/master
From git://github.com/senchalabs/connect
* [new tag] 1.6.0 -> 1.6.0
I then merge in the changes from upstream since I forked my version of connect:
➜  connect git:(master) git merge upstream/master
Updating 36986a2..93f999f
Fast-forward
History.md | 8 ++
Readme.md | 7 ++-
examples/csrf.js | 36 ++++++++++
lib/connect.js | 95 +++++++++-----------------
lib/https.js | 47 ------------
lib/index.js | 2 +-
lib/middleware/compiler.js | 163 -------------------------------------------
lib/middleware/csrf.js | 105 +++++++++++++++++++++++++++
lib/middleware/directory.js | 1 +
lib/middleware/logger.js | 6 +-
lib/patch.js | 76 ++++++++++++--------
...
30 files changed, 463 insertions(+), 603 deletions(-)
create mode 100644 examples/csrf.js
delete mode 100644 lib/https.js
delete mode 100644 lib/middleware/compiler.js
create mode 100644 lib/middleware/csrf.js
rename lib/{http.js => proto.js} (82%)
create mode 100644 test/common.js
delete mode 100644 test/compiler.test.js
➜ connect git:(master) gp origin master
Total 0 (delta 0), reused 0 (delta 0)
To git@github.com:eee-c/connect.git
36986a2..93f999f master -> master
Now, I update the package.json file that will publish to npm. I am trying to strike a balance here between avoiding the appearance of stealing connect while also ensuring that no one bothers the real connect guys with support requests should they happen across this. Let me be clear:

I am in no way affiliated with connect. They are awesome and this work is entirely their own. Most importantly: DO NOT BOTHER THEM WITH SUPPORT REQUESTS!!!!

Anyhow, in the package.json, I change the name of the npm package:
-  "name": "connect",
+ "name": "connect-unstable",
I change the description to reflect the non-association with the real connect middleware:
-  "description": "High performance middleware framework",
+ "description": "Unstable, tracking fork of the real connect middleware. Only use if you really, *really* need node 0.5+.",
I point the repository to my fork:

- "repository": "git://github.com/senchalabs/connect.git",
+ "repository": "git://github.com/eee-c/connect.git",
I hate to make the next change, but I do not want anyone bothering TJ about this package, so I assume authorship even though I have made no actual code commits myself:
-  "author": "TJ Holowaychuk  (http://tjholowaychuk.com)",
+ "author": "Chris Strom (http://eeecomputes.com)",
And lastly, I make the change that I set out to make in the first place:

- "engines": { "node": ">= 0.4.1 < 0.5.0" }
+ "engines": { "node": ">= 0.4.1" }
With that, I am ready to publish:
➜  connect git:(master) npm publish
npm WARN Sending authorization over insecure channel.
The warning aside, "my" package is now published:



I do the same for express, but do my work in the 2.x branch since it seems that they have already begun to transition to node 0.5+.

After that, I re-publish express-spdy and connect-spdy with dependencies on express-unstable and connect-unstable. I look forward to being able to get off of this in the future.

Now, I install express-spdy under node 0.5.2 on my VM:
cstrom@debian:~/node-spdy-example$ `which node` --version
v0.5.2
cstrom@debian:~/node-spdy-example$ npm install express-spdy
> zlibcontext@1.0.7 install /home/cstrom/node-spdy-example/node_modules/express-spdy/node_modules/spdy/node_modules/zlibcontext
...
'build' finished successfully (0.893s)
express-spdy@0.0.3 ./node_modules/express-spdy
├── express-unstable@2.4.3 (mime@1.2.2 qs@0.3.0 connect-unstable@1.6.0)
├── connect-spdy@0.0.3 (connect-unstable@1.6.0)
└── spdy@0.1.1
cstrom@debian:~/node-spdy-example$ npm ls
/home/cstrom/node-spdy-example
└─┬ express-spdy@0.0.3
├─┬ connect-spdy@0.0.3
│ └─┬ connect-unstable@1.6.0
│ ├── mime@1.2.2
│ └── qs@0.3.0
├─┬ express-unstable@2.4.3
│ ├── connect-unstable@1.6.0
│ ├── mime@1.2.2
│ └── qs@0.3.0
└─┬ spdy@0.1.1
└── zlibcontext@1.0.7
Unfortunately, that is not quite the end of the story. When I actually try to run things, I find that all of the require('express') and require('connect') calls are now broken. After searching and replacing them with require('express-unstable') and require('connect-unstable') in express-unstable, connect-unstable, express-spdy, and connect-spdy, I am finally able to install (without changing the express-spdy installation instructions) and run an express-spdy app on node 0.5.2:



Yay! It was a little touch and go there for a while, but it's good to have that item checked off from my TODO list. For now, it's back to finishing off SPDY Book.


Day #85

Wednesday, July 27, 2011

SPDY and Good SSL

‹prev | My Chain | next›

Now that I am a Certificate Authority (CA), I am ready to revisit some SSL timing woes from a few nights back. Specifically, I was seeing 700ms+ for SSL negotiation. Some of that was due to the imposed 100ms round trip time (RTT) on my little network, but that was still high.

To see if I can eliminate some of that, I am going to replace my faux SSL certificate with a real one signed by my new CA. First, I need to generate my private key and an associated certificate request (which will be sent to my CA):
➜  ~  openssl genrsa -out spdy.key 1024           
Generating RSA private key, 1024 bit long modulus
............................++++++
....................++++++
e is 65537 (0x10001)
➜ ~ openssl req -new -key spdy.key -out spdy.csr
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [US]:
State or Province Name (full name) [Maryland]:
Locality Name (eg, city) []:
Organization Name (eg, company) [EEE Computes, LLC]:
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:spdy.local
Email Address []:spdy.local@eeecomputes.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
Now, with my CA hat on, I sign the certificate request:
➜  ~  openssl ca -out ./CA/newcerts/spdy.crt -in spdy.csr              
Using configuration from /home/cstrom/local/ssl/openssl.cnf
Enter pass phrase for ./CA/private/cakey.pem:
Check that the request matches the signature
Signature ok
Certificate Details:
Serial Number: 2 (0x2)
Validity
Not Before: Jul 28 00:18:46 2011 GMT
Not After : Jul 27 00:18:46 2012 GMT
Subject:
countryName = US
stateOrProvinceName = Maryland
organizationName = EEE Computes, LLC
commonName = spdy.local
emailAddress = spdy.local@eeecomputes.com
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
Netscape Comment:
OpenSSL Generated Certificate
X509v3 Subject Key Identifier:
BF:BB:1D:1F:6A:04:5D:48:79:8D:B6:E1:A3:88:40:8E:4B:DB:9D:77
X509v3 Authority Key Identifier:
keyid:3D:1B:A2:E4:94:D4:0C:D0:3B:D5:BC:78:B9:F7:97:40:73:C8:59:A2

Certificate is to be certified until Jul 27 00:18:46 2012 GMT (365 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
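On the node side, nothing changes except which files the express-spdy server options point at. Something like this sketch (the key paths are assumptions about where I put the new files):

var express = require('express-spdy')
  , fs = require('fs');

var app = express.createServer({
  // The private key and the certificate signed by my new CA
  key: fs.readFileSync(__dirname + '/keys/spdy.key'),
  cert: fs.readFileSync(__dirname + '/keys/spdy.crt'),
  NPNProtocols: ['spdy/2']
});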
After copying my private key and signed certificate to the spdy.local VM, I access the SPDY server to find that the new certificate is, indeed, working:



Looking at the networking tab of Chrome's developer tools (the only thing available to me since Speed Tracer is broken), I see that SSL negotiation is cut down:



Much better. It is still 400ms, but on a 100ms RTT network that may be the best I can hope for.

Before calling it a night, I try out the SPDY server push implementation. Observing in the network tab again, I see:



Hrm... The images are certainly pushed quickly, but for some reason the CSS and JS are taking longer in SPDY server push than in normal requests.

Overall, it saves about 300ms. That is nothing to sneeze at, but what is going on with the CSS and Javascript transfer times? I try to have a look at the SPDY tab in about:net-internals, but about:net-internals seems hopelessly broken in my current Chrome (14.0.835.0 dev):



Ugh. Well, I will leave that as a mystery for tomorrow. For the time being, I am going to get back to finishing off SPDY Book.


Day #84

Tuesday, July 26, 2011

How Root Certificate Authorities Work (more or less)

‹prev | My Chain | next›

Up tonight I do something that I should have done a long time ago: set myself up as an SSL Certificate Authority (CA). Since SPDY is run over SSL, it makes sense to simulate real SSL as closely as possible. For the most part in SPDY Book I can get close enough with localhost certificates. Every now and then, I need more.

I could certainly buy a legitimate certificate, but I would worry that I would miss something and need to buy more and more to cover cases that I had not thought of in the first place. If I set myself up as a CA and tell my browser that I trust myself as a CA, then I can make myself legitimate testing certificates any time I need them.

This primarily involves the use of openssl's ca command. The openssl ca command is a bit ugly and hard to use. I am not making this up, it says so in its own documentation:
The ca command is quirky and at times downright unfriendly.
It was originally intended for reporting with limited ability to perform CA-like functions. As people have "mis-used" it for CA work, features got added. The result is, unfriendly.

I am more or less following along with Marcus Redivo's HOWTO. I think that guide will just work for normal installs, but I am using my locally installed ($HOME/local) openssl with NPN support. In many ways, that makes life easier for me because I can edit $HOME/local/ssl/openssl.cnf without fear of messing up my entire system (Marcus edits local configs and forces ca to use them via command line switches).

In my $HOME/local/ssl/openssl.cnf config file, I change the location where my CA work is going to take place:
#dir  = ./demoCA  # Where everything is kept
dir = ./CA
I am not going to be a demo CA, I am a real one, dammit!

Real CAs, at least those that use openssl ca, need to do a little ground work before issuing commands willy-nilly. The openssl ca command requires a flat file database to exist, along with a pointer to the ID of the next certificate blessed by my awesome CA. The openssl.cnf configuration specifies these two pieces of information with:
database = $dir/index.txt # database index file.
#...
serial = $dir/serial # The current serial number
Remember how ca confessed to being really unfriendly? This is one of the ways: both of those files need to be created before ca will work. No, ca cannot create them itself. The database file simply needs to exist. The serial pointer file needs to contain the ID of the first certificate that we will create. I will do all of this work in my home directory, so I create the CA directory that makes me a real CA:
mkdir CA
And then I initialize the flat file database and record index:
touch CA/index.txt
echo '01' > CA/serial
The last bit of setup involves organization. I need a couple of subdirectories to hold certificates that I, as a CA, sign, and a place to store my super-secret private keys:
mkdir CA/newcerts CA/private
NOTE: Were I a real boy, er... CA, I would need to guard that private directory with my life.

Before moving on, I make my life a bit easier by setting some useful defaults in openssl.cnf:
#countryName_default  = AU
countryName_default = US
#stateOrProvinceName_default = Some-State
stateOrProvinceName_default = Maryland
#0.organizationName_default = Internet Widgits Pty Ltd
0.organizationName_default = EEE Computes, LLC
With that, I can create my CA certificate:
openssl req -new -x509 \
-extensions v3_ca \
-keyout CA/private/cakey.pem \
-out CA/cacert.pem \
-days 3650
The keyout and out options store the private key and my CA certificate in the locations specified in the config file:
certificate = $dir/cacert.pem  # The CA certificate
private_key = $dir/private/cakey.pem# The private key
The openssl req writes the private key, and then requests some CA info:
Generating a 1024 bit RSA private key
........................++++++
....................................................++++++
writing new private key to 'CA/private/cakey.pem'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [US]:
State or Province Name (full name) [Maryland]:
Locality Name (eg, city) []:
Organization Name (eg, company) [EEE Computes, LLC]:
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:EEE Computes Root CA
Email Address []:ca@eeecomputes.com
Most of that comes from the default values I put in the config. The common name is the name that will identify the CA certificate when it is installed on the browser.

To install the new root CA in Chrome, I go to Preferences // Under the Hood. There I Manage Certificates:



On the Root Certificates tab, I import my new, public cacert.pem:



And now I am official!



I am a real live CA now, but my CA still needs to issue a certificate.

My first customer generates a key and a certificate request to send my way:
openssl genrsa -out jaynestown.key 1024
openssl req -new -key jaynestown.key -out jaynestown.csr
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [US]:
State or Province Name (full name) [Maryland]:
Locality Name (eg, city) []:
Organization Name (eg, company) [EEE Computes, LLC]:
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:jaynestown.local
Email Address []:chris@eeecomputes.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
My customer, who wants a certificate for the jaynestown.local server, sends me the jaynestown.csr certificate request. Once I have verified that the check has not bounced, I am ready to sign the jaynestown certificate. Naturally, I use the ugly ca command to do this:
openssl ca -out ./CA/newcerts/jaynestown.crt -in jaynestown.csr
Using configuration from /home/cstrom/local/ssl/openssl.cnf
Enter pass phrase for ./CA/private/cakey.pem:
Check that the request matches the signature
Signature ok
Certificate Details:
Serial Number: 1 (0x1)
Validity
Not Before: Jul 27 01:32:00 2011 GMT
Not After : Jul 26 01:32:00 2012 GMT
Subject:
countryName = US
stateOrProvinceName = Maryland
organizationName = EEE Computes, LLC
commonName = jaynestown.local
emailAddress = chris@eeecomputes.com
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
Netscape Comment:
OpenSSL Generated Certificate
X509v3 Subject Key Identifier:
1C:21:62:29:B2:BB:84:26:4B:69:93:5D:E8:A2:82:A5:0C:EA:0C:00
X509v3 Authority Key Identifier:
keyid:3D:1B:A2:E4:94:D4:0C:D0:3B:D5:BC:78:B9:F7:97:40:73:C8:59:A2

Certificate is to be certified until Jul 26 01:32:00 2012 GMT (365 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
I send the signed jaynestown.crt certificate back to my first customer, who might install it (and the private key) under Apache thusly:
        SSLCertificateFile    /etc/ssl/certs/jaynestown.crt
SSLCertificateKeyFile /etc/ssl/private/jaynestown.key
And, with that, my first customer has a legit certificate:



And the certificate is signed by me:



Nice! I have the feeling my new CA is going to have more than a few certificates to sign in the next couple of days.


Day #84

Monday, July 25, 2011

It is Really Easy to Do Dumb SSL Things

‹prev | My Chain | next›

In an ongoing attempt to figure out why my SSL negotiation is suddenly taking ~700ms instead of 250ms, tonight I try SSL in nginx. It is conceivable that the node.js SSL implementation is somehow causing problems (connections to other HTTPS sites work). So trying the same key + certificate with a reference SSL implementation seems like a good next step.

To enable SSL, I edit /etc/nginx/sites-enabled/default, uncommenting the HTTPS server configuration:
# HTTPS server
#
server {
    listen 443;
    server_name localhost;

    ssl on;
    ssl_certificate cert.pem;
    ssl_certificate_key cert.key;

    ssl_session_timeout 5m;

    ssl_protocols SSLv3 TLSv1;
    #ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv3:+EXP;
    ssl_ciphers DES-CBC3-SHA;
    ssl_prefer_server_ciphers on;

    location / {
        #root html;
        root /var/www;
        index index.html index.htm;
    }
}
I change the root setting to re-use the root directory that already contains static HTML (/var/www). I also stick with a cipher that I know Wireshark can decode.

When I load the site, I find a normal SSL client hello from the browser:



And then another client hello from the browser:



Followed by several connection closures (FINs):



Until finally, I see a new connection open that successfully completes the handshake:



Since I am seeing the same behavior with nginx and my express-spdy implementation, it is time to admit that either there is something wrong with my certificate or my methodology.

Either way it's my mistake. But how to track it down?

Aha! Chrome's about:net-internals page has more than just a SPDY tab. The Events tab keeps track of all sorts of events. Maybe it logs SSL events?

It does!

Filtering events on my VM's www.local hostname, I see:



Hrm... there are three different groups of CONNECT_JOB events. Taking a closer look, I find a common name error:



In fact, that common name error happens twice:
t=1311649121151 [st=17]    -SOCKET_POOL_CONNECT_JOB_CONNECT  
--> net_error = -200 (CERT_COMMON_NAME_INVALID)
Arrgh! Now that I think about it, these SSL woes started right about the time that I started messing about with VMs and differently named hosts.

Chrome remembers that I OK'd the invalid certificate (even after I clear the cache). It seems as though Chrome is still trying to verify the SSL certificate like normal, but eventually giving up and falling back to the session exemption that I have already made for this invalid certificate.

The SSL certificate is for localhost. When I am testing against localhost (as I did for most of my book), I am not incurring this obscene negotiation penalty. When I try to use the localhost SSL certificate against a differently named server, all sorts of badness arises.
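A quick way to double-check which name a certificate actually carries is a couple of lines against node's tls module. A sketch (the host is my VM's; the options shown are from the current tls API, which may differ slightly from what I am running):

var tls = require('tls');

// Connect without rejecting the self-signed certificate, then inspect it
var socket = tls.connect(443, 'www.local', { rejectUnauthorized: false }, function() {
  var cert = socket.getPeerCertificate();
  console.log('Certificate CN:', cert.subject.CN); // "localhost" -- hence the mismatch
  socket.end();
});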

Sigh.

Up tomorrow, I think I go back to generating and signing certificates.


Day #83

Sunday, July 24, 2011

Really Slow SSL

‹prev | My Chain | next›

Tonight, I try to take a closer look at the long delay in SSL connection establishment that I noticed last night. The delay is somewhat to be expected because I am testing with a 50ms delay in each direction, for a round trip time (RTT) of 100ms. I am still perplexed because it is taking a very long time to SSL negotiate (700-800ms). Also, this delay seems to be relatively recent. In previous runs, I saw less than 300ms.

To the Wiresharks!

My VM is restarted so there are no existing connections to the server. I also clear my browser cache (all time), which clears out the certificate as well. When I access the site, I see:



So far, so good. Nice, normal TCP connection establishment. SSL negotiation is off to a good start. It has taken 200ms so far, but if the key exchange / SSL handshake come through quickly things might get done within 400ms.

Unfortunately, what happens next is:



And this I do not understand at all. The TCP/IP connection that was being used for SSL negotiation is closed, a new stream is established and SSL negotiation starts all over!



By the time the first bits of SSL data hit the wire, more than 600ms have elapsed. No wonder I am seeing such a big delay. But why?

My VMs no longer have Apache SSL installed on them. I will have to get that back tomorrow, but first I check out a normal SSL conversation—with github. Loading up my dashboard (after again clearing the cache) looks like:



Whoa. That's just weird. The browser is starting up three interweb tubes. It is as if Chrome knows that github is going to need three interweb tubes and opens them immediately after I request the page. That seems crazy because the SSL connection needs to be negotiated on all three tubes right away. Happily, it does conclude just fine, without any connection resets:



So the extremely slow SSL seems specific to my internal VM. Tomorrow, I will install SSL on an Apache VM to see if I can determine if this is specific to my node.js server or if it is a VM + network issue.

For now, I really need to finish off SPDY Book!

Day #83

Saturday, July 23, 2011

SPDY vs SSL

‹prev | My Chain | next›

Yesterday, I ran a moderately simple SPDY web site over a simulated 100ms round-trip connection.

Today, I would like to see if I can cut down on that time some by using SPDY server push. By pushing resources directly into the browser cache, I should be able to overcome at least some of the RTT that is in place.

But when I first load up the page, pushing all static resources into cache, I find:



Bah! 2+ seconds?! That's terrible. Far worse than the simulated CDN over vanilla HTTP. What gives?

To figure out the delay, I check things out in the SPDY tab of Chrome's about:net-internals:
t=1311473528051 [st=   0]     SPDY_SESSION_SYN_STREAM  
--> flags = 1
--> accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8
host: spdy.local:3000
method: GET
referer: https://spdy.local:3000/
scheme: https
url: /real
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.825.0 Safari/535.1
version: HTTP/1.1
--> id = 1
Off to a reasonable start. Next, I reach the point at which server push should go out:
t=1311473528260 [st= 209]     SPDY_SESSION_PUSHED_SYN_STREAM  
--> associated_stream = 1
--> flags = 2
--> status: 200
url: https://jaynestown.local:3000/stylesheets/style.css
version: http/1.1
--> id = 2
OK. That looks good. Unfortunately, Chrome does not agree. It sends back a reset stream on my server push:
t=1311473528260 [st= 209]     SPDY_SESSION_SEND_RST_STREAM  
--> status = 3
--> stream_id = 2
Hunh?

Oh! This is no longer my local machine (jaynestown). This is a new spdy.local VM on which I am testing RTT, so I need to update the hostname accordingly:
function local_path_and_url(relative_path) {
  return [
    "public/" + relative_path,
    "https://spdy.local:3000/" + relative_path
  ];
}
Now, when I run it, I get... only somewhat better results:



Hrm... That initial connection is a lot longer than last night (only took ~200ms for a SPDY/SSL connection last night). Tonight, the browser is spending a lot of time during SSL negotiation:



Bah. This is one of the hazards of small bites of research. I am doing something very different tonight than I did last night. I am going to need to go back and redo last night's work to make sure I did not skip a step when simulating SPDY over a high latency connection.

As for tonight's results, the remainder of the resources load in less than 600ms. I am still looking at 1.3 seconds total—a far cry from the 700ms download time of the simulated CDN from the other night. I can certainly eke out a bit more performance by gzip'ing the jQuery library, but still, if the initial SSL negotiation takes almost as much time as the entire HTTP + CDN site, SPDY isn't much of a match.

I should note that SPDY is not supposed to mean an end for the CDN, but it would be pretty cool to get response times down to near CDN levels.

Ugh. Mostly I miss the Speed Tracer extension for Chrome. It seemed to have a better SPDY sense (I've been waiting a long time to use that one) than does network manager. I'd really like to get a definitive handle on SPDY and RTT, but it may have to wait until I regain the use of Speed Tracer (it starts locked with Chrome 14.0.825.0 dev). Dang it.


Day #82

Friday, July 22, 2011

SPDY vs RTT

‹prev | My Chain | next›

Last night I got a simplistic website + CDN up and running in VMs. Nothing too fancy -- a single web page served across a 100ms RTT connection and a 30ms RTT CDN. There were only a dozen or so resources served from the CDN, which is a far cry from the typical 53(!) served on most websites. Still, it ought to serve as a useful point of discussion.

In all, the web site + CDN completed all data transfer in 670ms.

Tonight, I am going to put a SPDY server on the 100ms RTT web server and serve all of the content from the SPDY server.

I have my express-spdy server laid out thusly:
app.js
views/real.jade
views/layout.jade
views/index.jade
public/images/hello_world.jpg
public/images/11-bang.jpg
public/images/03-l.jpg
public/images/02-l.jpg
public/images/05-space.jpg
public/images/10-d.jpg
public/images/09-l.jpg
public/images/00-h.jpg
public/images/01-e.jpg
public/images/07-o.jpg
public/images/04-o.jpg
public/images/08-r.jpg
public/images/06-w.jpg
public/javascripts/jquery-1.6.2.min.js
public/stylesheets/style.css
It's not a 100% fair comparison with the static site since we're compiling Jade templates, but hopefully it is not too far off.

I set up my 100ms RTT (a 50ms netem delay on the loopback interface, which gets applied in each direction) and clear any routing caches that might be about:
➜  ~  sudo tc qdisc add dev lo root netem delay 50ms
[sudo] password for cstrom:
➜ ~ ping localhost
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=1 ttl=64 time=100 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=2 ttl=64 time=100 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=3 ttl=64 time=100 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_req=4 ttl=64 time=100 ms
^C
--- localhost.localdomain ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 100.091/100.128/100.156/0.225 ms
➜ ~ sudo ip route flush cache
When I load the site up, I find:



Bleh. That's not all that speedy.


1.11s (onload: 1.12s, DOMContentLoaded: 685ms).


That is all worse than just using the vanilla HTTP. The initial SSL negotiation makes the web page load slower than vanilla HTTP and it is all downhill from there.

My takeaways from this? There is no way to overcome poor RTT. If it takes 50ms for a request to hit the server and another 50ms for the response to reach the browser, there is nothing that can change that. Not even SPDY.


But, as readers of the SPDY Book already know, this is not the end of the story. I will pick back up with that tomorrow.


Day #81

Thursday, July 21, 2011

Simulating a Site + CDN

‹prev | My Chain | next›

Thanks to my work last night, I have two VMs. One to hold my main web server (www.local) and the other to simulate a content distribution network (cdn.local).

I am going to try to serve up the following page:



It includes 12 images, jQuery (minified) and some CSS. The various sizes are:
cstrom@debian:/var/www$ ls -lh
total 196K
-rw-r--r-- 1 cstrom cstrom 8.0K Jul 21 21:34 00-h.jpg
-rw-r--r-- 1 cstrom cstrom 6.5K Jul 21 21:34 01-e.jpg
-rw-r--r-- 1 cstrom cstrom 4.2K Jul 21 21:34 02-l.jpg
-rw-r--r-- 1 cstrom cstrom 5.4K Jul 21 21:34 03-l.jpg
-rw-r--r-- 1 cstrom cstrom 6.4K Jul 21 21:34 04-o.jpg
-rw-r--r-- 1 cstrom cstrom 624 Jul 21 21:34 05-space.jpg
-rw-r--r-- 1 cstrom cstrom 13K Jul 21 21:34 06-w.jpg
-rw-r--r-- 1 cstrom cstrom 6.1K Jul 21 21:34 07-o.jpg
-rw-r--r-- 1 cstrom cstrom 3.6K Jul 21 21:34 08-r.jpg
-rw-r--r-- 1 cstrom cstrom 5.0K Jul 21 21:34 09-l.jpg
-rw-r--r-- 1 cstrom cstrom 7.9K Jul 21 21:34 10-d.jpg
-rw-r--r-- 1 cstrom cstrom 6.9K Jul 21 21:34 11-bang.jpg
-rw-r--r-- 1 cstrom cstrom 90K Jul 21 21:51 jquery-1.6.2.min.js
-rw-r--r-- 1 cstrom cstrom 221 Jul 21 21:52 style.css
I will serve up Javascript and CSS gzip'd from the CDN. In the nginx configuration:
    gzip  on;
    gzip_disable "MSIE [1-6]\.(?!.*SV1)";
    gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
Now, all of this isn't exactly real world. The average site uses 13 domains (not 2), 53 images, 4 CSS files, 14 Javascript files for a grand total of 800kb. By the way, you people are insane.

But I am not going to set up that many servers to test things out. I hope to see useful results with just simple tests.

Anyhow, following along with the idea of making this "real world"-like, I add a minimal RTT to the CDN:
cstrom@debian:/var/www$ sudo tc qdisc add dev eth0 root netem delay 30ms
And a somewhat substantial one to the web server:
cstrom@debian:/var/www$ sudo tc qdisc add dev eth0 root netem delay 100ms
Loading up the site, I find the following in the network tab of Chrome's Developer Tools:



14 requests. 112kb transferred. 669ms (onload: 708ms, DOMContentLoaded: 521ms)

Not too shabby. Lost a little bit of time connecting to the main site due to the 100ms round trip time, but the images and other static content did not have to wait too long because they are served up from my awesome CDN!

Let's see how SPDY fares. Tomorrow.

(How's that for a teaser?)


Day #81

Wednesday, July 20, 2011

VMs to Simulate Internet Traffic

‹prev | My Chain | next›

I have it in my head that I would like to compare the performance of a single SPDY site to various CDN scenarios. First up, creating VMs to hold HTTP sites.

It seems easiest to clone my existing express-spdy VMs (used for testing install instructions). First things first, I clone the VirtualBox image to two separate instances: a www server to host the main site and a cdn server to hold a simulated content delivery network:
➜  VirtualBox VMs  vboxmanage clonehd express-spdy_node05.vdi www.local.vdi
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Clone hard disk created in format 'VDI'. UUID: 28dcc937-9520-43a5-a025-10f6188b773d
➜ VirtualBox VMs vboxmanage clonehd express-spdy_node05.vdi cdn.local.vdi
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Clone hard disk created in format 'VDI'. UUID: 485a1a00-e3d1-4c95-b8c6-48c8466e280f
My thinking right now is to host a web page on the www server that references a bunch of static files on the cdn server.

Now I need to create new VirtualBox VMs. Sadly the vboxmanage clonehd command has not added the new image to the list of disks known to VirtualBox:



So I have to do it the "hard" way. I select the OS type to be the same as the source virtual machine:



Since the cloned hard drives are not known to VirtualBox, I have to use the "Add disk" button:



After locating the cloned hard drive on disk, I am now all set:



All other options are set to the VirtualBox defaults.

Before I start the VM, I need to set it up so that I can connect to it like a real network machine. I opt for "Bridged Adapter":




I opt for this adapter because it is the easiest way to get networking between the host and guest working. I secretly fear that, by re-using the existing network interface (wlan0 on my laptop), the network filtering will lose packets in this configuration. But my desire for expediency wins out.

Anyhow, I set up both the www and cdn servers in the same way.

Normally, I would have to edit /etc/udev/rules.d/70-persistent-net.rules on a cloned Debian/Ubuntu/Linux machine to use a wildcard MAC address, but I already did that in the source machine:
# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:*", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
The only other thing I do tonight is install nginx (sudo apt-get install nginx). After making a dummy HTML page in /var/www/index.html, I can access the web page from my local machine:



With both www.local and cdn.local working, I have reached a fine stopping point for tonight. I will pick back up tomorrow configuring the two to simulate typical traffic and use Speed Tracer to put them through their paces.

Day #80

Tuesday, July 19, 2011

SPDY Server Push and CSS

‹prev | My Chain | next›

In the interest of slimming down my chain posts to focus on writing SPDY Book, I am going to try to solve just one problem today. And it may not even be a SPDY problem.

The problem is that my CSS is not being cached via SPDY server push. It seemingly gets pushed OK:
t=1311041561541 [st=  197]     SPDY_SESSION_PUSHED_SYN_STREAM  
--> associated_stream = 1
--> flags = 2
--> content-type: text/css
status: 200
url: https://localhost:3000/stylesheets/style.css
version: http/1.1
--> id = 2
But a little while later, a request for a second page asks for the stylesheet again:
t=1311041603258 [st=41914]     SPDY_SESSION_SYN_STREAM  
--> flags = 1
--> accept: text/css,*/*;q=0.1
accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8
host: localhost:3000
method: GET
referer: https://localhost:3000/one.html
scheme: https
url: /stylesheets/style.css
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.814.0 Safari/535.1
version: HTTP/1.1
--> id = 5
Interestingly, the referrer, /one.html, was not requested. It was SPDY server pushed into the browser's cache. And apparently stayed there.

Also odd is that the original CSS push did work—insofar as the original request did not make a secondary request for the CSS. Only upon a subsequent request did Chrome encounter a cache-miss.

So SPDY server push is working, but not for second requests of the CSS file. Weird.

Attempting to resolve this, I try just about everything I can think of. I push the CSS before the response. I push it after. I push it before the HTML pages, I push it after. I add all sorts of headers to the push, until...

I finally get it working when I add the "Last-Modified" header to CSS responses. I now have two exceptions in my SPDY server push code—one for pushing HTML (needs a content type) and a new one for CSS:
  if (/\.html?$/.test(url))
    this._headers["content-type"] = "text/html";
  if (/\.css$/.test(url))
    this._headers["last-modified"] = "Wed, 20 Jul 2011 01:34:27 GMT";
Craziness.

I will most likely do away with the conditionals and simply always add the content-type and last-modified. Were this behavior to stick around, there might be some interesting games one could play with first page CSS and subsequent CSS. But that is hardly the kind of behavior that can be counted on (especially since this just happened in a recent dev release of Chrome).
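That simplified version might look something like this sketch (assuming the mime npm package, which connect already pulls in, for the content-type lookup):

var mime = require('mime');

// Default headers for any pushed resource: content type from the URL's
// extension, last-modified set to "now" since push skips cache validation anyway
function defaultPushHeaders(url) {
  return {
    status: 200,
    version: "http/1.1",
    url: url,
    "content-type": mime.lookup(url),
    "last-modified": (new Date).toGMTString()
  };
}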

As I was fiddling with cache, I noticed that SPDY server push resources do not seem to actually show up in about:cache. I try adding all necessary headers for caching:
  if (/\.html?$/.test(url))
    this._headers["content-type"] = "text/html";
  if (/\.css$/.test(url)) {
    this._headers["cache-control"] = "public, max-age=3600";
    this._headers["content-type"] = "text/css; charset=UTF-8";
    this._headers["content-length"] = 111;
    this._headers["etag"] = "111-1311125667000";
    this._headers["last-modified"] = "Wed, 20 Jul 2011 01:34:27 GMT";
  }
But it has no effect. The only cache entries for my local server are for the SSL cert that I am using and an accidental Google search:



Interesting. It seems that the SPDY cache is separate from the regular cache—at least in Chrome.

That is a fine stopping point for tonight. Up tomorrow: I start fiddling around with comparisons between SPDY and CDNs. That should be fun :)

Day #79