Le blog d'Archiloque

Minimal access management when sharing your database in Rails

At work, the business intelligence system (BI) is fetching data from our Rails application by connecting to our PostgreSQL database.

It means our database schema is used as an API so it’s not an ideal solution, but in practice the coupling is manageable and it keeps things simple for the data team in charge of the BI.

One of my current point of interest is reducing the databases accesses to protect against accidents and malevolent actions.

For the BI it means connecting to the database with a read-only account, but also preventing them from accessing the table and columns they don’t need.

PostgreSQL provide the required capabilities, so the problem is to find a way to manage them.

A possible solution is to create a new configuration file with an ad-hoc syntax that list the tables and columns that should be accessible, to put it somewhere in the project then write a script that read the file’s content and run the right queries.

Something like:

access-rights.json
{
  "contracts": [
    "status",
    "created_at"
  ],
  "elder_gods": [
    "surname",
    "followers"
  ]
}

It should work, but I dislike having a separate source of information for this: people must learn where it is stored, must understand the structure (even if it’s obvious), and remember to update it when the schema change, for example when a column is renamed.

Meanwhile, Rails has already lots of database information in the db/schema.rb file:

db/schema.rb
ActiveRecord::Schema[7.0].define(version: 2022_05_01_102734) do
    # …
    create_table "contracts", force: :cascade do |t|
        t.enum "status", null: false, enum_type: "contract_status"
        t.datetime "created_at", precision: nil, null: false
    end

    create_table "elder_gods", force: :cascade do |t|
        t.string "surname", null: false
        t.bigint "followers", null: false
    end
    # …
end

If I could put the access rights data in this file, devs would know how to find it, and as a bonus when creating a new columns, I could use the migrations system to add the access right info.

As far as I know there’s no out of the box mechanism to add custom fields to the columns description. But there is an existing field that we don’t use: the comment field.

Many SQL database systems provide a way to add a string comment to a column and to other objects. I’ve seen it used by DBAs to store documentation about the stored data, since they often lack a better tool for it, and using the comments enable the info to be read by the developers who use the database.

It’s a bit hackish, but it enable us to leverage the existing tool which is strong win for me.

Let’s say that the comment of a column should contain a comma-separated list of used that should access the column.

db/migrate/20220518125955_add_access_rights.rb
ActiveRecord::Schema[7.0].define(version: 2022_05_18_125955) do
class AddAccessRights < ActiveRecord::Migration[7.0]
  def change
    change_column :contracts, :status, :contract_status, comment: "data"
    change_column :contracts, :created_at, :datetime, comment: "data"

    change_column :elder_gods, :surname, :string, comment: "data, other_app"
    change_column :elder_gods, :followers, :bigint, comment: "data"
  end
end

After running the migration, the data is in the db/schema.rb file (and in the database):

db/schema.rb
ActiveRecord::Schema[7.0].define(version: 2022_05_18_125955) do
    # …
    create_table "contracts", force: :cascade do |t|
        t.enum "status", null: false, enum_type: "contract_status", comment: "data"
        t.datetime "created_at", precision: nil, null: false, comment: "data"
    end

    create_table "elder_gods", force: :cascade do |t|
        t.string "surname", null: false, comment: "data, other_app"
        t.bigint "followers", null: false, comment: "data"
    end
    # …
end

In a single file I have all the information related to the database structure: nice.

Now it’s time to rely to manage the access right.

I’ll do it with a rake task that will be run after the code is deployed on the server. This way it will be run just after the schema update which means any new columns will be made available (or unavailable) ASAP.

lib/tasks/access_rights.rake
require 'set'

namespace :access_rights do

  MANAGED_USERS = [
    'data'
  ]

  # @param [ConnectionAdapters::Column] column
  # @return [Array<String>]
  def allowed_users(column)
    comment = column.comment
    comment.blank? ? [] : comment.split(', ').map { |e| e.strip }
  end

  # @param [String] sql
  def run_query(sql)
    ActiveRecord::Base.connection.exec_update(sql, "Access right", [])
  end

  task update: :environment do
    ActiveRecord::Base.connection.tables.each do |table_name|
      # Ignore the schema migration table and tables without models
      if (table_name != 'schema_migrations') &&
        ActiveRecord.const_defined?(table_name.classify)
        pp "Processing table [#{table_name}]"
        users_allowed_for_table = Set.new

        table_name.classify.constantize.columns.each do |column|
          users_allowed_for_column = allowed_users(column)

          users_allowed_for_column.each do |user|
            if MANAGED_USERS.include?(user)
              pp "User #{user} is allowed for [#{table_name}.#{column.name}]"
              run_query(
                "GRANT SELECT (#{column.name}) ON #{table_name} TO #{user}"
              )
              users_allowed_for_table << user
            else
              warn("Unknown user [#{user}] for [#{table_name}.#{column.name}]")
            end
          end

          (MANAGED_USERS - users_allowed_for_column).each do |user|
            pp "User #{user} is not allowed for [#{table_name}.#{column.name}]"
            run_query(
              "REVOKE SELECT (#{column.name}) ON #{table_name} FROM #{user}"
            )
          end
        end

        # Table level-access
        MANAGED_USERS.each do |user|
          if users_allowed_for_table.include?(user)
            pp "User #{user} is allowed for [#{table_name}]"
            run_query(
              "GRANT SELECT ON #{table_name} TO #{user}"
            )
          else
            pp "User #{user} is not allowed for [#{table_name}]"
            run_query(
              "REVOKE SELECT ON #{table_name} FROM #{user}"
            )
          end
        end
      end
    end
  end
end

Note : with grant and revoke queries variable binding doesn’t seem to work, if you know how to do it please tell me.

Then we can run it:

> bundle exec rake access_rights:update
…
"Processing table [contracts]"
"User data is allowed for [contracts.status]"
"User data is not allowed for [contracts.email]"

It’s currently just an idea so it hasn’t been used on a real application, but at least it seems to work.

I’ve kept the code the simplest I can so you can adapt it to your use cases (but if you begin to store JSON or YAML data in the comment field, please check ).

I hope you enjoyed the idea and that maybe it will inspire you to try other nefarious things with this approach.