Friday, July 18, 2014

Close, But Not Quite: Visualizing dart2js Benchmark Comparisons


Thanks to benchmark_harness and gnuplot I can make quick, pretty, and useful visualizations of the performance of different code implementations:



The actual numbers are almost inconsequential at this point. What matters is knowing that I can generate and visualize them, so that I can choose the right solution for inclusion in Design Patterns in Dart. I can investigate the numbers later; for now I only need to know that I can produce them.

And I think that I finally have that sorted out. I think that I have eliminated the inappropriate comparisons, loops, types, and just plain silly mistakes. To be sure, the code could be (and probably will need to be) better. But it works. More importantly, it can all be run through a single tool/benchmark.sh script:
#!/bin/sh

# Initialize artifact directory
mkdir -p tmp
cat /dev/null > tmp/benchmark_loop_runs.tsv
cat /dev/null > tmp/benchmark_summary.tsv

# Individual benchmark runs of different implementations
echo "Running benchmarks..."
for X in 10 100 1000 10000 100000
do
    ./tool/benchmark.dart --loop-size=$X
    ./tool/benchmark_single_dispatch_iteration.dart --loop-size=$X
    ./tool/benchmark_visitor_traverse.dart --loop-size=$X
done
echo "Done. Results stored in tmp/benchmark_loop_runs.tsv."

# Summarize results
echo "Building summary..."
./tool/summarize_results.dart
echo "Done. Results stored in tmp/benchmark_summary.tsv."

# Visualization ready
echo ""
echo "To view in gnuplot, run tool/visualize.sh."
I am a little worried about the need for artifacts, but on some level they are necessary: the VM has to be freshly started with each run for accurate numbers. To retain the data between runs, and between the individual runs and the summary, artifact files need to store that information. I will worry about that another day.

My concern today is that I am benchmarking everything on the Dart VM. For the foreseeable future, however, the Dart VM will not be the primary runtime environment for Dart code. Instead, most Dart code will be compiled to JavaScript via dart2js. I cannot very well recommend a solution that works great in Dart, but fails miserably in JavaScript.

So how do I get these numbers and visualizations in JavaScript?

Well, I already know how to benchmark dart2js: compile to JavaScript and run with Node.js. Easy-peasy, right?
$ dart2js -o tool/benchmark.dart.js \
>            tool/benchmark.dart
tool/src/score_emitters.dart:3:8:
Error: Library not found 'dart:io'.
import 'dart:io';
       ^^^^^^^^^
tool/src/score_emitters.dart:39:18:
Warning: Cannot resolve 'File'.
  var file = new File(LOOP_RESULTS_FILE);
                 ^^^^
tool/src/score_emitters.dart:40:23:
Warning: Cannot resolve 'FileMode'.
  file.openSync(mode: FileMode.APPEND)
                      ^^^^^^^^
Error: Compilation failed.
Arrrgh. All of my careful File code is for naught if I want to try this out in JavaScript. On the bright side, this is why you try a solution in a bunch of different environments before generalizing.

This turns out to have a fairly easy solution thanks to tee. I change the score emitter to simply print to STDOUT in my Dart code:
recordTsvTotal(name, results, loopSize, numberOfRuns) {
  var averageScore = results.fold(0, (prev, element) => prev + element) /
    numberOfRuns;

  var tsv =
    '${name}\t'
    '${averageScore.toStringAsPrecision(4)}\t'
    '${loopSize}\t'
    '${(averageScore/loopSize).toStringAsPrecision(4)}';

  print(tsv);
}
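For a concrete sense of the output format, the same four-column TSV line can be mocked up in the shell. The scores below are made up for illustration; the real ones come from benchmark_harness:

```shell
# Mock up the TSV line that recordTsvTotal() prints, using made-up scores
# for three runs at a loop size of 10: name, average score, loop size,
# and average per-iteration score (4 significant digits, like
# toStringAsPrecision(4)).
printf '6400 6500 6495\n' | awk -v name='Classic Visitor Pattern' -v loop=10 '
  { for (i = 1; i <= NF; i++) sum += $i; n = NF }
  END { avg = sum / n; printf "%s\t%.4g\t%d\t%.4g\n", name, avg, loop, avg / loop }
'
# → Classic Visitor Pattern	6465	10	646.5
```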
Then I make benchmark.sh responsible for writing the tab-separated data to the appropriate file:
RESULTS_FILE=tmp/benchmark_loop_runs.tsv
# ...
dart2js -o tool/benchmark.dart.js \
           tool/benchmark.dart
# ...
for X in 10 100 1000 10000 100000
do
    ./tool/benchmark.dart --loop-size=$X | tee -a $RESULTS_FILE
    # ...
done
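A quick sanity check of the tee -a behavior, with fabricated numbers, confirms that each line is both echoed to the terminal and appended to the results file:

```shell
# Sanity check: tee -a sends each line to STDOUT *and* appends it to the
# results file, so both runs survive in the artifact (numbers made up).
RESULTS_FILE=tmp/benchmark_loop_runs.tsv
mkdir -p tmp
: > "$RESULTS_FILE"
printf 'Classic Visitor Pattern\t6465\t10\t646.5\n' | tee -a "$RESULTS_FILE"
printf 'Visitor Traversal\t1876\t10\t187.6\n' | tee -a "$RESULTS_FILE"
wc -l < "$RESULTS_FILE"   # both lines retained between runs
```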
That works great—for the default case:
$ node tool/benchmark.dart.js
Classic Visitor Pattern 6465    10      646.5
Unfortunately, the args package does not work under Node.js. Supplying a different loop size still produces results for a loop size of 10:
$ node tool/benchmark.dart.js --loop-size=100
Classic Visitor Pattern 6543    10      654.3
My initial attempt at solving this is String.fromEnvironment():
const String LOOP_SIZE = const String.fromEnvironment("LOOP_SIZE", defaultValue: "10");

class Config {
  int loopSize, numberOfRuns;
  Config(List<String> args) {
    var conf = _parser.parse(args);
    loopSize = int.parse(LOOP_SIZE);
    // ...
  }
  // ...
}
But that does not work. When I run the script, I still get a loop size of 10:
$ LOOP_SIZE=100 node tool/benchmark.dart.js
Classic Visitor Pattern 6160    10      616.0
Stumped there, I call it a night.
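One likely explanation, which I still need to verify: dart2js resolves String.fromEnvironment() at compile time, from -D defines passed to the compiler, not from the shell environment at run time. If that is right, each loop size would need its own compile, something like:

```shell
# Assumption to verify: String.fromEnvironment() values are baked in when
# dart2js runs, via -D defines, so a runtime environment variable cannot
# change them. Each loop size would need a recompile:
dart2js -DLOOP_SIZE=100 \
        -o tool/benchmark.dart.js \
           tool/benchmark.dart
node tool/benchmark.dart.js
```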

I would have preferred to get all the way to a proper visualization, but the switch to tee for building the tab-separated artifact is already a win. The results filename now lives in a single place (the shell script) instead of two (the shell script and the Dart code). Hopefully I can figure out some way to read Node.js command-line arguments from compiled Dart. Tomorrow.


Day #126

2 comments:

  1. Could you not run this via a web server, like a Node.js one, or a pure Dart server, controlled via a browser UI?

    1. I could — and might not even need a web server (http://japhr.blogspot.com/2014/06/benchmarking-dart-and-dart2js-code-in.html). For some benchmarks in the book, that might even be preferable. But I would still like to have a solid approach to pure CLI benchmarks when they are most appropriate.
