My daily activities as Developer Founder of Firmhouse. Productivity hints, leading a team, working on code.

I love learning new things. Yesterday I started picking up guitar practice again. When I was in my teenage years (true, only four years ago) I played the piano a lot and DJ'd around as a hobby, so I have always been interested in playing (musical) instruments.

What I like about practicing the guitar is that, for me as a beginner, so much depends on pressing the strings against the frets just right. Every minor change in my palm, my wrist, and the angle at which I hold my fingertips down affects the vibrations when I pick the notes of the chord I am practicing.

No one can tell me exactly how to hold my hand or how much pressure I should use to press down on the strings. Only I can feel and hear when I've got the right vibration. By picking the same chord 50 times in a row, with a minor change of my palm and fingers each time, I can improve my playing.

I think a lot of what goes into learning an instrument can be applied to software design and development as well. Or to any craft, for that matter. You can only know what you need to change once you've tried it, and you can only get better by practicing over and over again.

How are you getting better at your craft? Whatever it is.

Today, I wanted to test a synchronization script we have running between two apps that uses ActiveResource.

It's a script that runs in a DelayedJob worker that gets queued when a user pushes a button. The job synchronizes new products from one backend application to a web shop frontend.

In the test I was going to write, I didn't want to make a call to some test server, so I was looking for a way to mock ActiveResource.

Two resources to get started

Luckily, you get that out of the box, as described in the post ActiveResource and Testing by ThoughtBot. There is also some info about it in the API docs for ActiveResource::HttpMock.

Testing my new feature

Slightly simplified, the job does the following:

  • Fetch all the products using the ProductResource from /products.xml
  • Walk through them and check whether we have them in the local web shop database.
  • If not, create it. Otherwise, update current info and descriptions.
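
The three steps above can be sketched in plain Ruby, roughly as follows. This is a simplified illustration and not the actual job: the remote products are plain hashes standing in for what ProductResource would return, the local database is an in-memory hash keyed by EAN, and the names sync_products, :name and :description are made up for the example.

```ruby
# Minimal sketch of the sync loop, using in-memory data instead of
# ActiveResource and ActiveRecord. All names here are hypothetical.
def sync_products(remote_products, local_products)
  remote_products.each do |remote|
    key = remote[:ean8]
    info = { :name => remote[:name], :description => remote[:description] }
    if local_products.key?(key)
      # Already in the shop: refresh the info and description.
      local_products[key].merge!(info)
    else
      # Not in the shop yet: create it.
      local_products[key] = info
    end
  end
  local_products
end

remote = [
  { :ean8 => "12345678", :name => "Mug", :description => "A blue mug" },
  { :ean8 => "11111111", :name => "Cup", :description => "A red cup" }
]
shop = {
  "12345678" => { :name => "Mug", :description => "Old text", :state => "available" }
}

sync_products(remote, shop)
puts shop.inspect
```

Attributes that the sync does not know about, like :state in this sketch, are left untouched on existing products.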

The new thing that I am adding today is a way to check if current products have been archived (deleted) in the backend warehousing system and to disable and hide them in the shop frontend with the synchronization job.

For this, I created the following test case. Be sure to explicitly require active_resource/http_mock since it isn't loaded by default.

require 'test_helper'
require 'active_resource/http_mock'

class SyncTest < ActiveSupport::TestCase

  def setup

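    # :skip_instruct leaves out the <?xml ...?> declaration, so the fragments
    # can be embedded inside the <products> element below.
    archived_product = { :id => 1, :archived => true, :ean8 => "12345678" }.to_xml(:root => "product", :skip_instruct => true)
    regular_product = { :id => 2, :ean8 => "11111111" }.to_xml(:root => "product", :skip_instruct => true)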

    ActiveResource::HttpMock.respond_to do |mock|
      mock.get "/products.xml", {}, "<products type='array'>#{archived_product}#{regular_product}</products>"
    end
  end

  test "an archived product should be set to deleted" do

    product = Factory(:product, :external_id => "12345678", :state => "available")

    Sync.perform

    assert_equal "deleted", product.reload.state

  end

  test "an available product should not be set to deleted" do

    product = Factory(:product, :external_id => "11111111", :state => "available")

    Sync.perform

    assert_equal "available", product.reload.state

  end

  test "a currently deleted shop article should be set to inactive when not archived anymore" do

    product = Factory(:product, :external_id => "11111111", :state => 'deleted')

    Sync.perform

    assert_equal "inactive", product.reload.state
  end

end

And here is the code for the actual job. To keep the presentation simple, I have left out all code that does not relate to these test cases:

class Sync

  def self.perform
    products = ProductResource.find(:all)

    products.each do |product_from_api|
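      # Only touch products that already exist in the shop database.
      if product = Product.find_by_external_id(product_from_api.ean8)
        if product_from_api.archived?
          # Archived in the backend: mark it as deleted in the shop.
          product.state = 'deleted'
        elsif product.state == "deleted"
          # No longer archived: bring it back, but keep it inactive.
          product.state = "inactive"
        end
        product.save!
      end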
    end
  end

end

Need help with testing your app?

If you want some help testing stuff like this in your own app, please let me know. I would be very happy to help, so leave a comment on this post or just get in touch!

This short post is inspired by a @coreyhaines tweet: One mark of a good developer is the ability to conform to the style of an existing codebase.

I agree with this. One of the most important skills of every programmer is the willingness and ability to conform to an existing codebase.

If you want to be even more awesome as a developer, I think it is also important that you are always looking to improve the current codebase and the state of the product.

Whenever you notice a little detail that can be improved, you should improve it, even if it's not officially part of the task you are working on right now.

Some simple examples:

  • Fix some spacing in CSS.
  • Refactor code or move code over to better methods.
  • Rename some variables in a method so they are easier to read.
  • Look for opportunities to create indexes in the database if you feel a query is just too slow.
  • Write some new tests or get rid of old or obsolete ones.

37signals posted a few blog posts about how they made their upcoming Basecamp (Next) product so blazingly fast.

Amongst other things, an important factor is using their caches to THE MAX (as @dhh puts it). This basically means: cache every representation of some state (a to-do item, a to-do list, a project page, etc.).

The technique they are using is key-based cache expiration. This means that you cache every fragment under a key that includes the updated_at timestamp of the object that might change.

With this comes an advantage: you never have to expire caches yourself, since your views always look for the fragment with the most recent updated_at timestamp of the underlying model. Your memcached server then takes care of expiring by evicting the least recently used keys when memory runs out.
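
To make the idea concrete, here is a minimal sketch in plain Ruby. It is not how Rails or Basecamp implement it: the store is a plain Hash standing in for memcached, the post is a plain hash, and fragment_key and cached_fragment are hypothetical helper names.

```ruby
# Toy cache store: a plain Hash standing in for memcached (an assumption
# made for this example).
store = {}

# Build a key that changes whenever the post changes.
def fragment_key(post)
  "posts/#{post[:id]}-#{post[:updated_at].to_i}"
end

# Return the cached fragment, or render it and store it under the current key.
def cached_fragment(store, post)
  store[fragment_key(post)] ||= "<h1>#{post[:title]}</h1>"
end

post = { :id => 3, :title => "Hello wrld", :updated_at => Time.utc(2012, 2, 1, 12, 0, 0) }
first = cached_fragment(store, post)

# Editing the post bumps updated_at, so the next lookup uses a new key.
post[:title] = "Hello world"
post[:updated_at] = Time.utc(2012, 2, 2, 9, 30, 0)
second = cached_fragment(store, post)

puts first
puts second
puts store.size
```

Note that the stale entry is never deleted by the application. It simply stops being requested, and a real cache server would evict it at some point when it needs the memory.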

It's really that simple

So, just to illustrate how simple it is, I wanted to show you how I implemented this today in the custom-built blog engine we're using to host some of our own and other people's blogs. This is an app that runs on Heroku, and you are looking at it right now.

Every post you see on this blog is a fragment that gets cached using the free 5 MB memcached addon from Heroku.

To install the addon, run the following command in your app:

heroku addons:add memcache

The Dalli gem is the recommended way to use the memcache server from your app. So add the gem to your Gemfile:

gem 'dalli'

And add this to your config/environments/production.rb so Rails knows you're using Dalli for the cache:

config.cache_store = :dalli_store

For all the posts on the blog I have a _posts.html.erb partial that renders a single post. So, in that partial I can include the line:

<%= cache post do %>
  <h1><%= post.title %></h1>
  ... etc ...
<% end %>

The cache method will use a key like "posts/3-20071224150000" for the object, based on its updated_at attribute. Under the hood, the cache_key method is used.
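
As a rough illustration of how a key of that shape is put together, here is a small plain-Ruby sketch. The helper rails_like_cache_key is hypothetical and only mimics the format of the example key. It is not the Rails implementation.

```ruby
# Rebuilds a key in the same shape as the example above: plural model name,
# id, and updated_at formatted as YYYYMMDDHHMMSS.
# rails_like_cache_key is a hypothetical helper, not a Rails method.
def rails_like_cache_key(model_name, id, updated_at)
  "#{model_name}/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
end

key = rails_like_cache_key("posts", 3, Time.utc(2007, 12, 24, 15, 0, 0))
puts key  # posts/3-20071224150000
```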

So, when I fix a typo in a post, its updated_at column changes. Hence the cache key changes too, and the post partial is rendered again and cached under the new key.

But what if I change the contents of the partial?

If you change the actual code in the partial, the posts that are already in the database won't get a new updated_at, so they will keep using the old cached version of the partial.

I am currently solving this by adding a prefix to the cache key like so:

<%= cache ["v2", post] do %>
  <h2><%= post.title %></h2>
<% end %>

This way the next time someone opens a page with posts, the new partial is going to be used since the key that is now requested includes "v2".

I find this method a bit tedious, since it is easy to forget to update the version prefix in the cache key when the partial changes. It would be great to have some automated way of doing this, based on whether the partial has changed.

But for now, manually updating the prefix seems easiest.
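
One possible direction for automating this, sketched here purely as an idea and not something the blog engine does: derive the prefix from a digest of the partial's source, so it changes whenever the file changes. The helper name template_version is made up, and a temporary file stands in for the real partial.

```ruby
require 'digest/md5'
require 'tempfile'

# Hypothetical helper: derive a short version prefix from the template
# source, so the prefix changes whenever the partial's code changes.
def template_version(path)
  Digest::MD5.hexdigest(File.read(path))[0, 8]
end

# A temporary file stands in for the partial in this example.
partial = Tempfile.new(['_post', '.html.erb'])
File.write(partial.path, "<h1><%= post.title %></h1>\n")
before = template_version(partial.path)

# Changing the markup changes the digest, and with it the prefix.
File.write(partial.path, "<h2><%= post.title %></h2>\n")
after = template_version(partial.path)

puts before == after  # false
partial.close!
```

The returned string could then take the place of "v2" in the cache key array. Reading and hashing the file on every request has a cost of its own, so in a real app the value would probably be computed once at boot.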

Questions or comments?

I hope this post illustrates how easy it is to use this method on Heroku for some substantial speed improvements. If you need help implementing this for your app or in non-Heroku environments, please let me know and I'll see if I can help or answer questions!

How we do database backups

February 23, 2012 09:09

We have several apps running in our EC2 hosting environment. Inspired by Engine Yard, I decided to make our database backups a bit more accessible.

Previously, we trusted AWS RDS to take care of database backups, with its promise of being able to roll back when something goes wrong. Of course, this doesn't help you much if you need to easily get some data from a few weeks ago without booting a new instance. Also, backing up the databases in an AWS RDS instance offsite is impossible without some conversion step (like a SQL dump) in place.

Since I want to sleep at night I decided to create a periodic snapshotter utility script that would be able to upload a SQL dump to an S3 bucket in Europe.

You can find the code for it in our GitHub organization.

Because our DB instance is in its own security group with the rest of our apps, I wanted the script to be able to tunnel the dump over SSH. But if you have direct shell access to your MySQL server, the script can also run without the ssh option.
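
To give an impression of what the two modes could look like, here is a small sketch that only builds the shell command. It is not the code from the repository: dump_command is a hypothetical helper, the host, user and database names are placeholders, and credentials are left out on purpose.

```ruby
require 'shellwords'

# Hypothetical helper: build the shell command for a dump, optionally
# wrapped in ssh. Hosts, users and database names are placeholders.
def dump_command(database, db_host, db_user, ssh_target = nil)
  dump = ["mysqldump", "-h", db_host, "-u", db_user, database].shelljoin
  # Without an SSH target, run mysqldump directly from this machine.
  return dump unless ssh_target
  # With an SSH target, run mysqldump on the remote side; its output
  # streams back over the SSH connection.
  ["ssh", ssh_target, dump].shelljoin
end

direct = dump_command("shop_production", "db.internal", "backup")
tunneled = dump_command("shop_production", "db.internal", "backup", "deploy@app.example.com")

puts direct
puts tunneled
```

Building the command from an array and escaping it with Shellwords avoids quoting problems when a name contains spaces or other shell characters.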

Automatic 60 days bucket retention

The script creates an S3 bucket with a configured file retention of 60 days. Every file older than those 60 days is removed. This is great, because this way our S3 bucket with the database backups will not grow to an enormous size without us knowing about it.
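
The retention rule itself is simple enough to express in a few lines. The sketch below only illustrates the rule. With a retention setting on the bucket, the removal happens on the S3 side rather than in code like this. The helper expired_backups and the file names are made up for the example.

```ruby
# Hypothetical helper: select the backups that fall outside the retention
# window. Each backup is a hash with a :key and a :last_modified time.
def expired_backups(backups, now, retention_days = 60)
  cutoff = now - retention_days * 24 * 60 * 60
  backups.select { |backup| backup[:last_modified] < cutoff }
end

now = Time.utc(2012, 2, 23)
backups = [
  { :key => "shop-2011-12-01.sql.gz", :last_modified => Time.utc(2011, 12, 1) },
  { :key => "shop-2012-02-22.sql.gz", :last_modified => Time.utc(2012, 2, 22) }
]

expired = expired_backups(backups, now).map { |backup| backup[:key] }
puts expired.inspect
```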

Server-side encryption

S3 supports server-side encryption nowadays. This might seem a lot cooler than it is, because it isn't that good of a solution for backup security. What it does is encrypt all the data that is stored on S3's internal file systems with a key that is stored offsite in some other part of the S3 infrastructure.

So, in the unlikely case that bad people break into AWS' Ireland bunker, steal some hard drives and are able to put all the chunks together into actual files, the data is at least encrypted.

Accessible backups and no single point of failure

A cron job runs the backup script for the configured databases every night and stores the dumps in our S3 bucket.

This way, all members of our team will be able to access the backups when something might go wrong with the data or in the case that we need to move to another hosting provider and I'm on vacation.

This is also quite handy for testing new features and updating our staging database and local development databases.