Sunday, September 19, 2010

Picking Up After a Long Time Away


Since I have been at Ruby DCamp this weekend, I thought it would make sense to actually write some Ruby code. Way back when, I started this chain with the idea that I would come up with a mechanism to update EEE Cooks, whose data resides in a CouchDB database. One way that I might accomplish that goal is via my couch_docs gem.

So I pick back up with couch_docs. It has been a while, but I remember that I was partway through adding attachment support. How to pick it back up quickly?

Happily, I left myself some clues:



Actually, I left myself a complete plan of what I need to accomplish:



Pending specs are incredibly powerful for this kind of thing. I use them even when I am not abandoning gems. If I am in the middle of working a problem at the end of the day, I will brain dump the remaining work in the form of pending specs.

The first pending spec that I will work today is:
      it "should guess the mime type"
That is nice and compact: no block or any other ceremony. Just a description of what I need to implement.

Unfortunately, when I look at the other specs in here, there is a ton of setup involved:
    it "should connect attachments by sub-directory name (foo.json => foo/)" do
      everything = []
      @it.each_document do |name, contents|
        everything << [name, contents]
      end

      everything.
        should include(['baz_with_attachments',
                        {'baz' => '3',
                         "_attachments" => { "spacer.gif" => {"data" => @spacer_b64} } }])
    end
That feels like a lot of work (especially considering there is additional setup in before(:each) blocks). Instead of rewriting immediately, I try to work with the same stuff for my should-infer-mime-type pending spec:
    it "should guess the mime type" do
      JSON.stub!(:parse).
        and_return({ "baz" => "3",
                     "_attachments" => {
                       "spacer.gif" => "asdf",
                     }
                   })

      everything = []
      @it.each_document do |name, contents|
        everything << [name, contents]
      end

      everything.
        should include(['baz_with_attachments',
                        { 'baz' => '3',
                          "_attachments" => {
                            "spacer.gif" => { "data" => @spacer_b64},
                            "baz.jpg" => "asdf",
                            "content_type" => "image/gif"
                          }
                        }])
    end
Holy wow! That is a crazy amount of code just to test that content type. It passes, but clearly it is time to refactor.

With all tests passing, I can focus on the main culprit of my testing complexity, the each_document method:
    def each_document
      Dir["#{couch_doc_dir}/*.json"].each do |filename|
        id = File.basename(filename, '.json')
        json = JSON.parse(File.new(filename).read)

        if File.directory? "#{couch_doc_dir}/#{id}"
          json["_attachments"] ||= { }
          Dir["#{couch_doc_dir}/#{id}/*"].each do |attachment|
            next unless File.file? attachment

            attachment_name = File.basename(attachment)
            type = mime_type(File.extname(attachment))
            data = File.read(attachment)
            json["_attachments"][attachment_name] =
              {
                "data" => Base64.encode64(data).gsub(/\n/, '')
              }
            if type
              json["_attachments"][attachment_name].
                merge!({"content_type" => type})
            end
          end
        end

        yield [ id, json ]
      end
    end
Yikes! I choose to believe that I knew there was way too much complexity the last time I worked this and that I intended to refactor. That is just silly.

Not coincidentally, the behavior for which I wrote my test is part of the complexity in each_document. Specifically, converting the file on the filesystem (including knowing the file type) can be factored out into a smaller method:
    def file_as_attachment(file)
      type = mime_type(File.extname(file))
      data = File.read(file)

      attachment = {
        "data" => Base64.encode64(data).gsub(/\n/, '')
      }
      if type
        attachment.merge!({"content_type" => type})
      end

      attachment
    end
That is still a largish method, but there is no looping and only a single conditional. That is pretty good. And much easier to test for the content type:
    it "should guess the mime type" do
      File.stub!(:read).and_return("asdf")
      Base64.stub!(:encode64).and_return("asdf")

      @it.file_as_attachment("spacer.gif").
        should == {
          "data" => "asdf",
          "content_type" => "image/gif"
        }
    end
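
With file_as_attachment pulled out, each_document could plausibly shrink along the lines below. This is only a sketch, not the code that is in couch_docs: the class name DocumentDirectorySketch, the attachments_for helper, and the one-entry mime_type table are assumptions made so that the example runs by itself. It also swaps in [data].pack("m0") for Base64.encode64(data).gsub(/\n/, ''), which yields the same newline-free Base64 string without needing the base64 library.

```ruby
require 'json'

# Hypothetical stand-in for the couch_docs class that owns each_document.
class DocumentDirectorySketch
  attr_reader :couch_doc_dir

  def initialize(couch_doc_dir)
    @couch_doc_dir = couch_doc_dir
  end

  # Yields [id, json] for each *.json file. Files found in a sub-directory
  # of the same name (foo.json => foo/) are added under "_attachments".
  def each_document
    Dir["#{couch_doc_dir}/*.json"].sort.each do |filename|
      id = File.basename(filename, '.json')
      json = JSON.parse(File.read(filename))

      attachments = attachments_for(id)
      if attachments
        json["_attachments"] = (json["_attachments"] || {}).merge(attachments)
      end

      yield [id, json]
    end
  end

  # Hypothetical helper: a hash of attachment name => attachment data,
  # or nil when the document has no attachment directory.
  def attachments_for(id)
    dir = "#{couch_doc_dir}/#{id}"
    return nil unless File.directory?(dir)

    files = Dir["#{dir}/*"].select { |path| File.file?(path) }
    files.each_with_object({}) do |path, found|
      found[File.basename(path)] = file_as_attachment(path)
    end
  end

  def file_as_attachment(file)
    type = mime_type(File.extname(file))
    data = File.read(file)

    # pack("m0") is Base64 without line breaks (an assumed substitution).
    attachment = { "data" => [data].pack("m0") }
    attachment["content_type"] = type if type
    attachment
  end

  # Placeholder table so the sketch runs; not the gem's real mime_type.
  def mime_type(extension)
    { ".gif" => "image/gif" }[extension.downcase]
  end
end
```

Each method now has one job, so a spec for attachment discovery would not need to know anything about Base64 or content types.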
Testing the small method directly also makes for quicker BDDing of other mime types. Once I have everything working to my satisfaction, I load my fixtures into a test DB via the command line:
cstrom@whitefall:~/repos/couch_docs$ ./bin/couch-docs push http://localhost:5984/couch_docs_test ./fixtures/
Updating documents on CouchDB Server...
...and, checking in the DB:



Yay! The algorithm being used for mime inference is extraordinarily rudimentary, but it is well isolated in the mime_type method for future improvement. My main takeaway from today was that I was able to pick up the code very quickly, even after a long time away, thanks to pending specs. Those same specs are also pretty good for refactoring.
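
For a sense of what a rudimentary, well-isolated mime_type might look like, here is a minimal sketch based on an extension lookup. The table name MIME_TYPES_BY_EXTENSION and its entries are assumptions for illustration, not the contents of the gem. The only behavior taken from the post is that a ".gif" extension maps to "image/gif" and that an unknown extension yields nil, so that no content_type is added.

```ruby
# Hypothetical extension table; the entries are illustrative only.
MIME_TYPES_BY_EXTENSION = {
  ".gif"  => "image/gif",
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png"  => "image/png",
  ".css"  => "text/css",
  ".html" => "text/html",
  ".txt"  => "text/plain"
}.freeze

# Returns the mime type for an extension such as ".gif",
# or nil when the extension is not in the table.
def mime_type(extension)
  MIME_TYPES_BY_EXTENSION[extension.to_s.downcase]
end
```

Because callers only see a string or nil, the lookup could later be replaced by something smarter without touching file_as_attachment.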


Day #231
