Le blog d'Archiloque

You should watch what your ORM and your SQL queries are doing, Sinatra + Sequel edition

SQL: a fine tool of the trade

For structured data I love SQL: it’s not perfect but good enough: the model is easy yet expressive and the syntax is plain.

A good ORM is a nice thing

My criteria to choose an ORM are:

  • It should provide a good API to code simple queries with a readable syntax.

  • It shouldn’t prevent me from writing my own SQL when I need it, because sometimes it’s the best solution (2 lines of SQL vs. 5 lines of “SQL-masqueraded as chained method invocations” horrors). Bonus point if it provides some help to read the query result.

In Ruby Sequel is my tool of choice:

  • It matches my two prerequisites.

  • The documentation is good enough for most cases.

  • The code is clear enough for other cases.

But you can’t trust any of them

  • With SQL it’s really easy to write a query with a bad join and have results that scan too much data.

  • With an ORM it’s easy to hit the N + 1 select problem.

So I watch them carefully

The bad thing is that both problems are unfit for unit tests:

  • You can’t easily test that a query is using the right index.

  • Testing that some code is performing the correct number of queries takes too much work and maintenance.

So an acceptable solution is to enable metrics for both of them, and then to use those metrics as much as you can.

Just a small hack and we’re there

The best way to enable metrics is to provide them bundled with the operation they measure.

For a start, you can use custom HTTP headers.

It’s done in two steps:

  • Count the queries and execution time in the ORM.

  • Send the data back in the service result.

An example with the Sinatra web framework with the Sequel ORM:

  • I need to hack a bit into Sequel to retrieve the data.

  • I store the data in Thread local variables so I can store the values somewhere and read them somewhere else.

app.rb
require 'sinatra/base'
require 'sequel'
require 'logger'

# Add hooks in sequel logging
# since it's called for each query
module Sequel
  class Database

    # Alias a method so we can override it
    # but still call the original
    alias :log_yield_old :log_yield

    # This method is called for each query
    # so we can use it to count
    def log_yield(sql, args=nil, &block)
      Thread.current[:queries_count] += 1
      log_yield_old(sql, args, &block)
    end

    # Alias a method so we can override it
    # but still call the original
    alias :log_duration_old :log_duration

    # This method is called to measure duration
    # so we can use it to measure duration
    def log_duration(duration, message)
      Thread.current[:queries_duration] += duration
      log_duration_old(duration, message)
    end

  end
end

class App < Sinatra::Base

  DB = Sequel.sqlite(
    '',
    :loggers => [Logger.new(STDOUT)]
  )

  # Hook called before each call
  before do
    # Initialize the thread local variables
    Thread.current[:queries_count] = 0
    Thread.current[:queries_duration] = 0
  end

  # Hook called after each call
  after do
    # Set the headers
    current_thread = Thread.current
    headers['X-QUERIES-COUNT'] =
      current_thread[:queries_count].to_s
    headers['X-QUERIES-DURATION'] =
      current_thread[:queries_duration].to_s
    # Clean the thread local variables
    current_thread[:queries_count] = nil
    current_thread[:queries_duration] = nil
  end

  # This is my service, you'll be impressed
  get '/' do
    # Simple query that doesn't need any table
    DB.run "SELECT 'OK'"
    'OK'
  end

end

And here we are:

curl -v "http://localhost:9292"
* Rebuilt URL to: http://localhost:9292/
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9292 (#0)
> GET / HTTP/1.1
> Host: localhost:9292
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/html;charset=utf-8
< X-Queries-Count: 3
< X-Queries-Duration: 0.00047699999999999994
< Content-Length: 2
< Date: Wed, 10 Feb 2016 21:07:55 GMT
< Connection: Keep-Alive
<
* Connection #0 to host localhost left intact
OK

Warning: you may want to hide the metrics in production environments.

And now?

Now that we have the basic structure, you can do plenty of things:

  • Obviously you can have a look at them when an HTTP request takes too long, it’s a great help for testers and developers of client applications.

  • You can add triggers alerts when thresholds are exceeded, like making the call fail so somebody will notice it.