japh(r) by Chris Strom: Blurry Benchmarks

Saturday, July 12, 2014

Blurry Benchmarks

tl;dr You may regret reading this post. I am grateful that I wrote it because I wound up correctly a major problem in my thinking because of a dumb mistake, but it was a really dumb mistake....

At the risk of overdoing my benchmark exploration for Design Patterns in Dart, there is one other facet of them that I would like to make sure I understand.

After a bit of code reorganization last night, I found that my three approaches to the Visitor Pattern broken down like:

Classic Visitor Pattern (RunTime): 1374 µs.
--
Nodes iterate w/ single dispatch (RunTime): 1290 µs.
--
Visitor Traverses (RunTime): 1362 µs.

This more or less breaks down like I expected from the outset. There is a slight win for the second approach, which exploits some knowledge of the structure being visited in order to invoke a collection with single dispatch.

What strikes me as odd is that, prior to reorganizing the file locations, I was seeing numbers like:

Classic Visitor Pattern (RunTime): 1345 us.
--
Nodes iterate w/ single dispatch (RunTime): 1344 us.
--
Visitor Traverses(RunTime): 1411 us.

Only the location of the code changed. Not of the actual code changed—well except for one thing...

I have settled on laying out the code for the public facing book repository like this:

lib
├── inventory.dart
├── visitor.dart
└── alt
    ├── single_dispatch_iteration
    │   ├── inventory.dart
    │   └── visitor.dart
    └── visitor_traverse
        ├── inventory.dart
        └── visitor.dart

The preferred solution (either because of suitability or performance) will reside at the top the package's lib directory. Alternate approaches will reside in lib/alt. Before this file reorganization, all implementations resided in the top level of the lib directory. And some shared code.

I am fairly sure that the sharing of the code inadvertently caused the slowness in the pre-file-reorg benchmarks. I would actually be somewhat surprised if that is the case because it would be an issue caused by types which I thought Dart ignored. Anyway, enough speculation.

Previously, the “classic” visitor pattern implementation (e.g. with double dispatch) and the single-dispatch code shared visitor.dart. Once I re-organized files, they are still identical copies:

$ diff -s lib/visitor.dart lib/alt/single_dispatch_iteration/visitor.dart
Files lib/visitor.dart and lib/alt/single_dispatch_iteration/visitor.dart are identical

But previously they were the same file and both benchmarks loaded the same file. To reproduce the prior situation, I change the top-level benchmark to use the single-dispatch's visitor:

#!/usr/bin/env dart

import 'package:benchmark_harness/benchmark_harness.dart';

import 'package:visitor_code/alt/single_dispatch_iteration/visitor.dart';
import 'package:visitor_code/inventory.dart';

main () {
  _setup();

  VisitorBenchmark.main();
  VisitorBenchmark.main();
  VisitorBenchmark.main();
}
//  ...

When I run the benchmarks, however, I find no change. The single dispatch approach still runs the quickest and the double dispatch implementation still runs slightly slower. So it seems like my supposition about types was wrong after all. I had begun to suspect that importing the single dispatch's visitor was in turn pulling in the single dispatch node structure (where the single dispatch actually resides). But that is not the case.

So why did I get nearly identical single and double benchmark numbers? Well, it turns out that the explanation is much simpler. I luckily saved my work in git, so I can git checkout the benchmarks in question:

$ git checkout HEAD~1

The previous double dispatch benchmark looked like:

#!/usr/bin/env dart

import 'package:benchmark_harness/benchmark_harness.dart';

import 'package:visitor_code/visitor.dart';

main () {
  _setup();

  NodesDoubleBenchmark.main();
  NodesDoubleBenchmark.main();
  NodesDoubleBenchmark.main();
}
// ...

So I was relying on visitor.dart to export the relevant inventory. For double dispatch, this should have been the double-dispatching inventory library, but... it was the single dispatching code:

library visitor;

import 'inventory_single_dispatch.dart';
export 'inventory_single_dispatch.dart';
// ...

So of course the two benchmarks were identical—they were explicitly running the same damn code!

Ugh.

I am so ashamed. It turns out that select is not broken. I had readily built up a ridiculous explanation in my head for why two different approaches resulted in the same results. It turns out that the results were the same because I used the same approach both times.

I can take some measure of solace in the fact that I took the time to correct myself. I almost did not investigate the discrepancy in numbers tonight (I was seriously that certain that my rationalization was valid). Luckily I did investigate. And even though I once again proved that I am an idiot, hopefully I am a slightly more knowledgeable idiot.

More importantly, I am far more convinced that I need to organize my implementations in the lib/alt scheme. Before tonight, I thought keeping everything in the top-level was a little messy, but hardly cause for worry. Had I not attempted to shove everything into the top-level as I had, I never would have gotten myself into this situation.

Now if you'll excuse me, I need to go flog myself.

Day #120

‹prev | My Chain | next›

japh(r) by Chris Strom

Saturday, July 12, 2014

Blurry Benchmarks

No comments:

Post a Comment