Dumping and restoring CouchDB is almost mine. The only things known to be lacking in couch_docs are:
- omitting the revision number from the dumped files (CouchDB tries to resolve a non-existent conflict with the revision number present)
- attachments—restoring documents with stubs in them does not work
it "should strip revision numbers" do
  @store.stub!(:map).
    and_return([{'_id' => 'foo', '_rev' => '1-1234'}])
  @dir.
    should_receive(:store_document).
    with({'_id' => 'foo'})
  CouchDocs.dump("uri", "fixtures")
end

When that example is run, it fails because the revision number is still included:
1)
Spec::Mocks::MockExpectationError in 'CouchDocs dumping CouchDB documents to a directory should strip revision numbers'
Mock 'Document Directory' expected :store_document with ({"_id"=>"foo"}) but received it with ({"_rev"=>"1-1234", "_id"=>"foo"})
/home/cstrom/repos/couch_docs/lib/couch_docs.rb:58:in `dump'
/home/cstrom/repos/couch_docs/lib/couch_docs.rb:55:in `each'
/home/cstrom/repos/couch_docs/lib/couch_docs.rb:55:in `dump'
./spec/couch_docs_spec.rb:79:

As always, failure is a good thing; it means my example is testing what I think it ought to be testing. I can make that pass with a simple delete in the dump
method:

def self.dump(db_uri, dir)
  store = Store.new(db_uri)
  dir = DocumentDirectory.new(dir)
  store.
    map.
    reject { |doc| doc['_id'] =~ /^_design/ }.
    each { |doc| doc.delete('_rev'); dir.store_document(doc) }
end

Maybe I am letting Erlang influence me too much here, but that side-effect really bothers me. I let it slide here because there is no idiomatic way in Ruby to pass a copy of a modified hash. I could dup the hash, but nothing is driving me to do that. So I leave it, even though it bothers me.

As for the attachments, I only need to alter the
get of each document from the database to include ?attachments=true. There is no need for a new spec, just a slight change to an existing example:

it "should be able to load each document" do
  Store.stub!(:get).
    with("uri/_all_docs").
    and_return({ "total_rows" => 2,
                 "offset" => 0,
                 "rows" => [{"id"=>"1", "value"=>{}, "key"=>"1"},
                            {"id"=>"2", "value"=>{}, "key"=>"2"}]})
  Store.stub!(:get).with("uri/1?attachments=true")
  Store.should_receive(:get).with("uri/2?attachments=true")
  @it.each { }
end

Similarly, to get that example passing, only a slight change is needed in the code that iterates over each document in the CouchDB store:
def each
  Store.get("#{url}/_all_docs")['rows'].each do |rec|
    yield Store.get("#{url}/#{rec['id']}?attachments=true")
  end
end

That should do it.
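
To see the two changes working together without a running CouchDB, here is a minimal sketch of the dump pipeline. FakeStore and dump_docs are hypothetical names invented for this illustration and are not part of couch_docs; FakeStore only mimics the two-step fetch (list via _all_docs, then get each document with ?attachments=true). The sketch also strips _rev with Hash#reject, which returns a new hash, as one possible way to avoid the in-place delete that bothered me above. That is an alternative I am assuming would work here, not what the gem currently does.

```ruby
# Hypothetical in-memory stand-in for the HTTP-backed Store. It records
# every path requested so we can see what would have been sent to CouchDB.
class FakeStore
  include Enumerable

  attr_reader :requests

  def initialize(url, docs)
    @url = url
    @docs = docs # document id => document hash
    @requests = []
  end

  # Pretend to GET a path: answer _all_docs with a row per document,
  # anything else with the matching document.
  def get(path)
    @requests << path
    if path.end_with?("/_all_docs")
      { "rows" => @docs.keys.map { |id| { "id" => id, "key" => id, "value" => {} } } }
    else
      id = path.sub("#{@url}/", "").sub("?attachments=true", "")
      @docs.fetch(id)
    end
  end

  # Same two-step shape as the each method in the post.
  def each
    get("#{@url}/_all_docs")["rows"].each do |rec|
      yield get("#{@url}/#{rec['id']}?attachments=true")
    end
  end
end

# Skip design documents and return copies of the rest without _rev.
# Hash#reject builds a new hash, so the documents in the store are untouched.
def dump_docs(store)
  store.
    reject { |doc| doc["_id"] =~ /^_design/ }.
    map { |doc| doc.reject { |key, _| key == "_rev" } }
end

docs = {
  "foo"         => { "_id" => "foo", "_rev" => "1-1234", "title" => "Foo" },
  "_design/bar" => { "_id" => "_design/bar", "_rev" => "1-5678" }
}
store  = FakeStore.new("uri", docs)
dumped = dump_docs(store)

puts dumped.inspect
puts store.requests.inspect
```

The dumped copy of foo has no _rev, while the hash held by the fake store still does, so nothing downstream sees a mutated document.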
It could be argued that I ought to create a spec that exercises the full CouchDB stack at this point: an example that starts with a JSON document in a seed directory, uses CouchDocs.upload_dir to store that document in a test CouchDB database, uses CouchDocs.dump to write it to a separate directory, and finally compares the original document with the dumped copy to ensure that they are the same. I must confess laziness here. I use examples to drive clean implementation. That they provide some measure of regression testing is pure bonus for me. That said, I will be sure to add such a regression test the first time I introduce a bug in the future.

Before claiming completeness, I try the couch-docs scripts on my 1,000+ document database:

cstrom@jaynestown:~/repos/eee-code$ time couch-docs dump http://localhost:5984/eee couch/seed/

real 0m56.536s
user 0m7.048s
sys 0m0.516s

Wow, that is a significant increase over the 5 seconds it took to dump the documents without the attachments. I certainly expected an increase, but maybe not that much. I make a mental note of that, but optimization will come later (if it becomes necessary).
Before restoring, I need a target database:
Now to test a CouchDB restore (again with timing):
cstrom@jaynestown:~/repos/eee-code$ time couch-docs load couch/seed/ http://localhost:5984/couch-docs-test

real 0m52.946s
user 0m5.068s
sys 0m0.476s

Well, it seems the 50 seconds for 1,000 documents is going to be typical.
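
As a rough back-of-the-envelope check on those timings, the per-document cost can be worked out from the "real" times. This sketch assumes exactly 1,000 documents (the database actually holds "1,000+", so the true per-document figures are somewhat lower) and takes the earlier attachment-free dump as a flat 5 seconds.

```ruby
# Wall-clock ("real") times reported by `time` in the runs above, in seconds.
DUMP_WITH_ATTACHMENTS    = 56.536
LOAD_WITH_ATTACHMENTS    = 52.946
DUMP_WITHOUT_ATTACHMENTS = 5.0 # the "5 seconds" quoted in the text

# Assumption: exactly 1,000 documents; the post only says "1,000+".
DOC_COUNT = 1_000

# Average milliseconds spent per document, rounded to one decimal place.
def ms_per_doc(seconds, count)
  (seconds / count * 1000).round(1)
end

dump_ms  = ms_per_doc(DUMP_WITH_ATTACHMENTS, DOC_COUNT)
load_ms  = ms_per_doc(LOAD_WITH_ATTACHMENTS, DOC_COUNT)
plain_ms = ms_per_doc(DUMP_WITHOUT_ATTACHMENTS, DOC_COUNT)
slowdown = (DUMP_WITH_ATTACHMENTS / DUMP_WITHOUT_ATTACHMENTS).round(1)

puts "dump with attachments:    #{dump_ms} ms/doc"
puts "load with attachments:    #{load_ms} ms/doc"
puts "dump without attachments: #{plain_ms} ms/doc"
puts "slowdown from attachments: #{slowdown}x"
```

Under those assumptions, fetching attachments makes the dump roughly eleven times slower, at something over 50 ms per document in either direction.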
Checking the database in the browser, I see that there are, indeed, documents:
And, clicking through to one document's attachments:
I update the README, History and the VERSION number in couch_docs.rb. Prior to publishing the code to GitHub, I update the gemspec, using the rake task from Bones:

cstrom@jaynestown:~/repos/couch_docs$ rake gem:spec # Write the gemspec

Also from Bones, I create a tag for this version of the code:

cstrom@jaynestown:~/repos/couch_docs$ rake git:create_tag VERSION=1.0.0 # Create a new tag in the Git repository
(in /home/cstrom/repos/couch_docs)
Creating Git tag 'couch_docs-1.0.0'
Counting objects: 1, done.
Writing objects: 100% (1/1), 180 bytes, done.
Total 1 (delta 0), reused 0 (delta 0)
To git@github.com:eee-c/couch_docs.git
* [new tag] couch_docs-1.0.0 -> couch_docs-1.0.0

GitHub uses tags to create download files; they are not too useful for gems, but still nice to have.
With that, I am now able to load all of my seed data onto my new server in less than a minute. That will make it much easier to get started than populating this from my legacy Rails app.
(commit)