Monday, July 21, 2014

Refactoring Bash Scripts


I'll be honest here: I'm a pretty terrible Bash script coder. I find the man page too overwhelming to really get better at it. If I need to do something in Bash—even something simple like conditional statements—I grep through /etc/init.d scripts or fall back to the Google machine.

But tonight, there is no hiding. I have two Bash scripts (actually, just shell scripts at this point) that do nearly the same thing: benchmark.sh and benchmark_js.sh. Both perform a series of benchmarking runs of code for Design Patterns in Dart, the idea being that it might be useful to have actual numbers to back up some of the approaches that I include in the book. But, since this is Dart, it makes sense to benchmark both on the Dart VM and on a JavaScript VM (the latter because most Dart code will be compiled with dart2js). The two benchmark shell scripts are therefore responsible for running and generating summary results for Dart and JavaScript.

The problem with two scripts is twofold. First, I need to keep them in sync—any change made to one needs to go into the other. Second, if I want to generalize this for any design pattern, I have hard-coded way too much in both scripts. To the refactoring machine, Robin!

To get an idea of where to start, I diff the two scripts. I have been working fairly hard to keep the two scripts in sync, so there are only two places that differ. The JavaScript version includes a section that compiles the Dart benchmarks into JavaScript:
$ diff -u1 tool/benchmark.sh tool/benchmark_js.sh 
--- tool/benchmark.sh   2014-07-21 22:32:44.047778498 -0400
+++ tool/benchmark_js.sh        2014-07-21 20:54:20.803634500 -0400
@@ -11,2 +11,16 @@
 
+# Compile
+wrapper='function dartMainRunner(main, args) { main(process.argv.slice(2)); }';
+dart2js -o tool/benchmark.dart.js \
+           tool/benchmark.dart
+echo $wrapper >> tool/benchmark.dart.js
+
+dart2js -o tool/benchmark_single_dispatch_iteration.dart.js \
+           tool/benchmark_single_dispatch_iteration.dart
+echo $wrapper >> tool/benchmark_single_dispatch_iteration.dart.js
+
+dart2js -o tool/benchmark_visitor_traverse.dart.js \
+           tool/benchmark_visitor_traverse.dart
+echo $wrapper >> tool/benchmark_visitor_traverse.dart.js
+
...
The other difference is actually running the benchmarks—the JavaScript version needs to run through node.js:
$ diff -u1 tool/benchmark.sh tool/benchmark_js.sh 
--- tool/benchmark.sh   2014-07-21 22:32:44.047778498 -0400
+++ tool/benchmark_js.sh        2014-07-21 20:54:20.803634500 -0400
...
@@ -15,7 +29,7 @@
 do
-    ./tool/benchmark.dart --loop-size=$X \
+    node ./tool/benchmark.dart.js --loop-size=$X \
         | tee -a $RESULTS_FILE
-    ./tool/benchmark_single_dispatch_iteration.dart --loop-size=$X \
+    node ./tool/benchmark_single_dispatch_iteration.dart.js --loop-size=$X \
         | tee -a $RESULTS_FILE
-    ./tool/benchmark_visitor_traverse.dart --loop-size=$X \
+    node ./tool/benchmark_visitor_traverse.dart.js --loop-size=$X \
         | tee -a $RESULTS_FILE
For refactoring purposes, I start with the latter difference. The former is a specialization that can be handled by a single conditional. The latter involves code at the heart of both uses of the script.

That will suffice as an initial strategy, but what about tactics? Glancing at the Dart benchmark.sh script, I see that I have a structure that looks like:
#!/bin/bash

RESULTS_FILE=tmp/benchmark_loop_runs.tsv
SUMMARY_FILE=tmp/benchmark_summary.tsv
LOOP_SIZES="10 100 1000 10000 100000"

# Initialize artifact directory
...

# Individual benchmark runs of different implementations
...

# Summarize results
...

# Visualization ready
...
Apparently I have been quite fastidious about commenting the code because those code section comments are actually there. They look like a nice first pass at a series of functions. Also of note here is that the script starts by setting some global settings, which seems like a good idea even after refactoring—I can use these and others to specify output filenames, benchmark scripts, and whether or not to use the JavaScript VM.

But first things first: extract the code in each of those comment sections out into functions. I make a top-level function that will invoke all four functions-from-comment-sections:
_run_benchmarks () {
    initialize
    run_benchmarks
    summarize
    all_done
}
Then I create each function as:
# Initialize artifact directory
initialize () {
  # ...
}

# Individual benchmark runs of different implementations
run_benchmarks () {
  # ...
}

# Summarize results
summarize () {
  # ...
}

# Visualization ready
all_done () {
  # ...
}
Since each of those sections relies on top-level global variables, this just works™ without any additional work from me.
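That is thanks to Bash scoping: variables assigned at the top level of a script are global, so functions can read (and write) them without any parameter passing. A quick sketch of the rule at work:
#!/bin/bash

GREETING="hello"

show_greeting () {
    echo $GREETING       # reads the top-level global
    GREETING="goodbye"   # even assignment hits the global
}

show_greeting        # prints "hello"
echo $GREETING       # prints "goodbye"
(Assignments stay inside a function only if made with the local keyword.)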

Now for some actual refactoring. One of the goals here is to be able to use this same script not only for JavaScript and Dart benchmarking of the same pattern, but also for different patterns. That means I need to stop hard-coding the script names inside the new run_benchmarks function:
run_benchmarks () {
    echo "Running benchmarks..."
    for X in 10 100 # 1000 10000 100000
    do
      ./tool/benchmark.dart --loop-size=$X \
          | tee -a $RESULTS_FILE
      ./tool/benchmark_single_dispatch_iteration.dart --loop-size=$X \
          | tee -a $RESULTS_FILE
      ./tool/benchmark_visitor_traverse.dart --loop-size=$X \
          | tee -a $RESULTS_FILE
    done
    echo "Done. Results stored in $results_file."
}
The only thing that differs between those three implementation benchmarks is the name of the benchmark file. So my next step is a global variable holding the list of files to loop over:
BENCHMARK_SCRIPTS=(
    tool/benchmark.dart
    tool/benchmark_single_dispatch_iteration.dart
    tool/benchmark_visitor_traverse.dart
)
Up until this point, I think I could get away with regular Bourne shell scripting, but arrays like this are only available in Bash. With that, I can change run_benchmarks to:
run_benchmarks () {
    echo "Running benchmarks..."
    for X in 10 100 1000 10000 100000
    do
        for script in "${BENCHMARK_SCRIPTS[@]}"
        do
            ./$script --loop-size=$X | tee -a $RESULTS_FILE
        done
    done
    echo "Done. Results stored in $RESULTS_FILE."
}
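One subtlety worth noting: quoting the expansion as "${BENCHMARK_SCRIPTS[@]}" keeps each array element intact, whereas an unquoted [*] would word-split any element containing spaces. None of these paths have spaces, but the quoted form costs nothing. A small sketch of the difference:
scripts=("tool/a b.dart" "tool/c.dart")

for s in "${scripts[@]}"; do echo "$s"; done
# tool/a b.dart
# tool/c.dart

for s in ${scripts[*]}; do echo "$s"; done
# tool/a
# b.dart
# tool/c.dart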
At this point, I would like to get a feel for what the common part of the script is and what specialized changes are needed for each new pattern benchmark. So I move all of my new functions out into a _benchmark.sh script that can be "sourced" by the specialized code:
#!/bin/bash

source ./tool/_benchmark.sh

BENCHMARK_SCRIPTS=(
    tool/benchmark.dart
    tool/benchmark_single_dispatch_iteration.dart
    tool/benchmark_visitor_traverse.dart
)

RESULTS_FILE=benchmark_loop_runs.tsv
SUMMARY_FILE=benchmark_summary.tsv

_run_benchmarks
That is pretty nice. I can easily see how I would use this for other patterns—for each implementation being benchmarked, I would simply add it to the list of BENCHMARK_SCRIPTS.
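For instance, a benchmark of some other pattern (file names hypothetical) would need nothing more than:
#!/bin/bash

source ./tool/_benchmark.sh

# Hypothetical implementations of another pattern:
BENCHMARK_SCRIPTS=(
    tool/benchmark_factory_method.dart
    tool/benchmark_factory_constructor.dart
)

_run_benchmarks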

Now that I see what is left of the code, I realize that the RESULTS_FILE and SUMMARY_FILE variables only varied to keep the artifacts of the different VM runs from overwriting each other. The basename for each remains the same between the two original scripts—the JavaScript artifacts just include an additional _js suffix. That is the kind of thing that can be moved into my new common _benchmark.sh file:
#...
results_basename="benchmark_loop_runs"
summary_basename="benchmark_summary"
results_file=""
summary_file=""
type="js"
# ...
initialize () {
    if [ "$type" = "js" ]; then
        results_file=$tmpdir/${results_basename}_js.tsv
        summary_file=$tmpdir/${summary_basename}_js.tsv
    else
        results_file=$tmpdir/${results_basename}.tsv
        summary_file=$tmpdir/${summary_basename}.tsv
    fi
    # ...
}
If _run_benchmarks is invoked when type is set to "js", then the results_file variable will include _js in the basename. Otherwise, the old basename is used.
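The elided portion of initialize is the artifact directory setup from the original script. A minimal sketch of what goes there, assuming tmpdir points at the original tmp/ directory and that stale results should be cleared before each run (tee -a appends, so old runs would otherwise accumulate):
tmpdir="tmp"
# ...
initialize () {
    # ... results_file / summary_file assignments from above ...

    # Ensure the artifact directory exists and start from empty files
    mkdir -p $tmpdir
    : > $results_file
    : > $summary_file
}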

To get the type setting from the command line, I read the first argument supplied when the script is called ($1) and send it to _run_benchmarks:
_run_benchmarks $1
I can then add a new (very simple) parse_options function to see if this script is running the Dart or JavaScript benchmarks:
_run_benchmarks () {
    parse_options $1
    initialize
    run_benchmarks
    summarize
    all_done
}

parse_options () {
  if [ "$1" = "-js" -o "$1" = "--javascript" ]
  then
      type="js"
  fi
}
# ...
The $1 inside _run_benchmarks is not the same as the $1 from the command-line. Inside a function, $1 refers to the first argument supplied to it.
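A tiny demonstration of that rule:
#!/bin/bash

inner () {
    echo "function got: $1"
}

echo "script got: $1"
inner "something else entirely"
Saved as demo.sh and called with ./demo.sh foo, this prints "script got: foo" followed by "function got: something else entirely". That is why the command-line argument has to be explicitly handed to _run_benchmarks.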

I can also make use of this to pull in the last piece of refactoring—the compiling that I decided to save until later. Well, now it is later, and I can compile as needed when type has been parsed from the command line as "js":
_run_benchmarks () {
    parse_options $1
    initialize
    if [ "$type" = "dart" ]; then
        run_benchmarks
    else
        compile
        run_benchmarks_js
    fi
    summarize
    all_done
}
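Neither compile nor run_benchmarks_js appears above, but both fall out of the same BENCHMARK_SCRIPTS loop. A sketch of what they might look like, reusing the dart2js invocation and dartMainRunner wrapper from the original benchmark_js.sh:
# Compile each Dart benchmark to JavaScript, appending the wrapper
# that passes command-line arguments through to main() under node.
compile () {
    wrapper='function dartMainRunner(main, args) { main(process.argv.slice(2)); }'
    for script in "${BENCHMARK_SCRIPTS[@]}"
    do
        dart2js -o $script.js $script
        echo $wrapper >> $script.js
    done
}

# Same loop as run_benchmarks, but through the compiled .js files.
run_benchmarks_js () {
    echo "Running benchmarks..."
    for X in 10 100 1000 10000 100000
    do
        for script in "${BENCHMARK_SCRIPTS[@]}"
        do
            node ./$script.js --loop-size=$X | tee -a $results_file
        done
    done
    echo "Done. Results stored in $results_file."
}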
With that, I have everything I need to keep the specialized versions of the benchmarking script small:
#!/bin/bash

source ./tool/_benchmark.sh

BENCHMARK_SCRIPTS=(
    tool/benchmark.dart
    tool/benchmark_single_dispatch_iteration.dart
    tool/benchmark_visitor_traverse.dart
)

_run_benchmarks $1
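Assuming the combined script is saved back as tool/benchmark.sh, both VMs are now driven from the same place:
# Benchmark on the Dart VM:
$ ./tool/benchmark.sh

# Compile with dart2js, then benchmark on node:
$ ./tool/benchmark.sh --javascript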
No doubt there is much upon which I could improve if I wanted to get really proficient at Bash scripting, but I am reasonably happy with this. It ought to suit me nicely when I switch to other patterns. Which I will try next. Tomorrow.



Day #129
