Tag: database

Book Review: Beginning Hibernate by Jeff Linwood, Dave Minter

Begining HibernateSweet introduction to Hibernate. I can’t believe it doesn’t cover migrations (and I’m sure some people will point out migrations it’s not part of Hibernate, but without it, there’s no good way to maintain a production database).

The book is on the short side for a computer book and that’s a huge plus. Even then, if you are going to do Spring Boot, like I am, there’s quite a few sections of the book you can skip.


Buy Beginning Hibernate: For Hibernate 5 in USA
Buy Beginning Hibernate: For Hibernate 5 in United Kingdom
Buy Beginning Hibernate: For Hibernate 5 in Canada

Setting (database) credentials on a Spring Boot project, the right way

Searching online for how to set up the credentials to access the database (or any other service) while in development leads to a lot of articles that propose something that works, but it’s wrong: putting your credentials in the application.properties file that you then commit to the repository.

The source code repository should not have any credentials, ever:

  • You should be able to make your project open source without your security being compromised.
  • You should be able to add another developer to your team without them knowing any credentials to your own development machine.
  • You should be able to hire a company that does a security analysis of your application, give them access to your source code and they shouldn’t gain access to your database.
  • You should be able to use a continuous integration service offered by a third party without that third party learning your database credentials.

If you want to see what happens when you commit your credentials to your repo, check out these news articles:

That’s probably enough. I hope I convinced you.

In an effort to find a solution for this, I asked in Stack Overflow and I got pointed in the right direction.

Leave application.properties where it is, in your resources of code folder, commit it to the repository. Instead, create a new file in ${PROJECT_ROOT}/config/application.properties and also add it to your version control ignore file (.gitignore, .hgignore, etc). That file will contain the credentials and other sensitive data:

# This should be used only for credentials and other local-only config.
spring.datasource.url = jdbc:postgresql://localhost/database
spring.datasource.username = username
spring.datasource.password = password

Then, to help onboard new developers on your project (or yourself in a new computer), add a template for that file, next to it. Something like ${PROJECT_ROOT}/config/application.template.properties that will contain:

# TODO: copy this to application.properties and set the credentials for your database.
# This should be used only for credentials and other local-only config.
spring.datasource.url = jdbc:postgresql://localhost/database
spring.datasource.username = 
spring.datasource.password = 

And voila! No credentials on the repo  but enough information to set them up quickly.

Disclaimer: I’m new to Spring Boot, I only started working with it a few days ago, so, I may be missing something big here. If I learn something new that invalidates this post, I’ll update it accordingly. One thing I’m not entirely sure about is how customary it would be to have ${PROJECT_ROOT}/config/application.properties on the ignore list. Please, leave a comment with any opinions or commentary.

How to use Lobos with Heroku

Lobos is a Clojure library to create and alter tables which also supports migrations similar to what Rails can do. I like where Lobos is going but it’s a work in progress, so the information here might be out of date soon, beware!

Let’s imagine a project called px (for Project X of course) with the usual Leiningen structure. In the src directory you you need to create a lobos directory and inside there let’s get started with config.clj which contains the credentials and other database information:

(ns lobos.config)

(def db
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :subname "//localhost:5432/px"})

then we create a simple migration in lobos/migrations.clj that creates the users table:

(ns lobos.migrations
  (:refer-clojure :exclude [alter defonce drop bigint boolean char double float time])
  (:use (lobos [migration :only [defmigration]] core schema) lobos.config))

(defmigration create-users
  (up [] (create (table :users
                   (integer :id :primary-key)
                   (varchar :email 256 :unique))))
  (down [] (drop (table :users))))

You run a REPL, load the migrations and run them (using the joyful Clojure example code convention):

(require 'lobos.migrations)
;=> nil
;=> java.lang.Exception: No such global connection currently open: :default-connection, only got [] (NO_SOURCE_FILE:0)

and you get an error because you didn’t open the connection yet, so, let’s do that:

(require 'lobos.connectivity)
;=> nil
(lobos.connectivity/open-global lobos.config/db)
;=> {:default-connection {:connection #<Jdbc4Connection org.postgresql.jdbc4.Jdbc4Connection@2ab600af>, :db-spec {:classname "org.postgresql.Driver", :subprotocol "postgresql", :subname "//localhost:5432/px"}}}

and now it works:

; create-users
;=> nil

and you can also rollback:

; create-users
;=> nil

You might be tempted to open the global connection in your config.clj and that might be fine for some, but I found it problematic that the second time I load the file, I get an error: “java.lang.Exception: A global connection by that name already exists (:default-connection) (NO_SOURCE_FILE:0)”.

My solution was to write a function called open-global-when-necessary that will open a global connection only when there’s none or when the database specification changed, and will close the previous connection in that case, leaving a config.clj that looks like:

(ns lobos.config
  (:require lobos.connectivity))

(defn open-global-when-necessary
  "Open a global connection only when necessary, that is, when no previous
  connection exist or when db-spec is different to the current global
  ;; If the connection credentials has changed, close the connection.
  (when (and (@lobos.connectivity/global-connections :default-connection)
             (not= (:db-spec (@lobos.connectivity/global-connections :default-connection)) db-spec))
  ;; Open a new connection or return the existing one.
  (if (nil? (@lobos.connectivity/global-connections :default-connection))
    ((lobos.connectivity/open-global db-spec) :default-connection)
    (@lobos.connectivity/global-connections :default-connection)))

(def db
  {:classname "org.postgresql.Driver"
   :subprotocol "postgresql"
   :subname "//localhost:5432/px"})

(open-global-when-necessary db)

That works fine locally, so let’s move to Heroku. To get started with Clojure on Heroku I recommend you read:

  1. Getting Started With Clojure on Heroku/Cedar
  2. Building a Database-Backed Clojure Web Application

I took the code used to extract the database specification from DATABASE_URL but I modified it so I don’t depend on that environment variable existing on my local computer and I ended up with the following config.clj:

(ns lobos.config
  (:require [clojure.string :as str] lobos.connectivity)
  (:import (java.net URI)))

(defn heroku-db
  "Generate the db map according to Heroku environment when available."
  (when (System/getenv "DATABASE_URL")
    (let [url (URI. (System/getenv "DATABASE_URL"))
          host (.getHost url)
          port (if (pos? (.getPort url)) (.getPort url) 5432)
          path (.getPath url)]
       {:subname (str "//" host ":" port path)}
       (when-let [user-info (.getUserInfo url)]
         {:user (first (str/split user-info #":"))
          :password (second (str/split user-info #":"))})))))

(defn open-global-when-necessary
  "Open a global connection only when necessary, that is, when no previous
  connection exist or when db-spec is different to the current global
  ;; If the connection credentials has changed, close the connection.
  (when (and (@lobos.connectivity/global-connections :default-connection)
             (not= (:db-spec (@lobos.connectivity/global-connections :default-connection)) db-spec))
  ;; Open a new connection or return the existing one.
  (if (nil? (@lobos.connectivity/global-connections :default-connection))
    ((lobos.connectivity/open-global db-spec) :default-connection)
    (@lobos.connectivity/global-connections :default-connection)))

(def db
  (merge {:classname "org.postgresql.Driver"
          :subprotocol "postgresql"
          :subname "//localhost:5432/px"}

(open-global-when-necessary db)

After you push to Heroku, you can run heroku run lein repl, load lobos.config and run the migrations just as if they were local.

Thanks to Daniel Magliola and Nicolas Buduroi for reading drafts of this.

Sharing my code

I’ve recently wrote several posts that contiained code to copy and paste:

I don’t like copying and pasting code and since I was already doing it between several of my projects, I took those pieces of code and package them as gems. If you want, you can use them too:

I find it really awesome how many times some of my gems were downloaded:

Update: fofof is actually useless.

Generating sample data

As I’ve said in previous posts, I like being able to generate sample data for the projects I’m working on very quickly. It allows new developers to get up to speed fast, and new developers to move faster.

When I don’t have a sample data generation method, I’m always scare to try whether, for example, deleting a project deletes all the todos in a project tracking system. Simply because I’ll have to generate that project and all todos by hand. Many times I end up not testing those destructive actions as often as I should.

The other reason while having a stable set of sample data is that you start to know it: “Hey! the users Paul and John are supposed to be on the same team, why I am not seeing them together? Something is broken”. To help with that I also use data that we already know. If I have teams with members I would create one team with John, Paul, George and Ringo called Beatles and another with Freddie, Brian, Roger and John called Queen. If you see Paul next to Freddie, something is broken.

To generate the sample data I use factories; which I also use to test instead of fixtures. If you are not familiar with factories, please, stop reading and go to check factory girl. I don’t care if you never come back to this blog if you start using factories instead of fixtures. Factories is so much better! But that’s probably repeated a thousand times over the web, so I’m not going to go into details.

In lib/tasks/data.rake I end up creating:

namespace :db do
  desc "Generate sample data for developing"
  task :sample_data => :environment do

    puts "==  Data: generating sample data ".ljust(79, "=")

    beatles = Factory.create :team, :name => "The Beatles"
    Factory.create :user, :name => "John Lennon", :team => beatles
    Factory.create :user, :name => "Paul McCartney", :team => beatles
    Factory.create :user, :name => "George Harrison", :team => beatles
    Factory.create :user, :name => "Ringo Starr", :team => beatles

    queen = Factory.create :team, :name => "Queen"
    Factory.create :user, :name => "Freddie Mercury", :team => queen
    Factory.create :user, :name => "Brian May", :team => queen
    Factory.create :user, :name => "John Deacon", :team => queen
    Factory.create :user, :name => "Roger Taylor", :team => queen

    puts "==  Data: generating sample data (done) ".ljust(79, "=") + "\n\n"

For the implementation of destroy_data look at Deleting all records in a Rails project.

The problem with doing that with factories is that it is too silent. I like knowing what’s going on and for new developers it’s good to get a glimpse of the data. All users have the same password so after rake db:sample_data finishes, a new developer already know what email and password to use to log in. If you want to make it even easier, you can print out the password doing sample data generation.

The password is of course defined in the user factory:

Factory.define :user do |user|
  user.email { Factory.next :email }
  user.password "testing"
  user.password_confirmation "testing"

To be able to make factories verbose I created VFactory (for Verbose Factory of course) that you use just like Factory, but it prints out everything. This is its code:

# Verbose factory.
module VFactory
  def self.create *args
    human_factory_name = args.first.to_s.gsub("_", " ")
    if args.size > 1
      human_arguments = args.second.map { |name, value| "#{name}=>#{value.is_a?(Array) ? value.join(", ") : value}" }.to_sentence
      puts "-- creating #{human_factory_name} with #{human_arguments}."
      puts "-- creating #{human_factory_name}."
    Factory.create(*args).tap do |obj|
      puts "   -> done: #{obj}"

The output of this is more or less like this:

==  Data: generating sample data ==============================================
-- creating team with name=>The Beatles.
   -> done: #
-- creating user with name=>John Lennon and team=>#.
   -> done: #
-- creating user with name=>Paul McCartney and team=>#.
   -> done: #
-- creating user with name=>George Harrison and team=>#.
   -> done: #
-- creating user with name=>Ringo Starr and team=>#.
   -> done: #
-- creating team with name=>Queen.
   -> done: #
-- creating user with with name=>Freddie Mercury.
   -> done: #
-- creating user with with name=>Brian May.
   -> done: #
-- creating user with name=>John Deacon.
   -> done: #
-- creating user with name=>Roger Taylor
   -> done: #
==  Data: generating sample data (done) =======================================

If you are wondering why my objects look so pretty when printed, that’s because I always define a to_s for all models that contain the id and other important data. In this case it would be:

def to_s

That’s very useful for debugging. I also try to always have a name method in my models that give me something that represents the object and that I can show to the users.

The next step in data awesomeness would be, with one command, being able to download and import all production data. This really helps reproducing and debugging reported issues; specially when those issues are related to destructive changes.

Update: this is now a gem.

Deleting all records in a Rails project

During the initial phase of development of a Rails application I don’t use migrations as migrations but as table definitions. Until I deploy I feel free to modify the migration files as much as I want and I have one per table.

The downside of that is that the only way to apply the changes is to destroy all the tables and re-build them. I’ve explained how to do that in my post really resetting the database. The nice side effect of doing this is that you end up with a task that sets sample data to work with.

Being able to quickly set up sample data or download production data is very important. It helps new developers getting started with the project but it also allows you to play much more freely with the project, do destructive actions and then in a quick command have system reset to a known state. Once you have sample data you’ll probably become as addictive as I am to reseting.

But the truth is that 90% of the time you reset your data, you don’t need to nuke the database and re-create all records, you just need to delete all records and this is the code I use to do that:

def destroy_data
  puts "==  Data: Destroying all data ".ljust(79, "=")
  sql = ActiveRecord::Base.connection()

  sql.execute "SET autocommit=0"

  if sql.adapter_name == "MySQL"
    sql.execute("/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */")

  tables = sql.tables - ["schema_migrations"]

  tables.each do |table|
    puts "-- Deleting all for #{table}."
    # So far, disabling and enabling keys was not needed.
    #sql.execute("/*!40000 ALTER TABLE `#{table}` DISABLE KEYS */") if sql.adapter_name == "MySQL"
    record_count = sql.delete("DELETE FROM `#{table}`")
    #sql.execute("/*!40000 ALTER TABLE `#{table}` ENABLE KEYS */") if sql.adapter_name == "MySQL"
    puts "   -> done: #{record_count} reconds"

  if sql.adapter_name == "MySQL"
    sql.execute("/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */")
    sql.execute("/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */")


  puts "==  Data: Destroying all data (done) ".ljust(79, "=") + "\n\n"

Note that I’m not deleting anything in the table schema_migration. The output is more or less like this:

==  Data: Destroying all data =================================================
-- Deleting all for blogs.
   -> done: 4 reconds
-- Deleting all for posts.
   -> done: 10 reconds
-- Deleting all for users.
   -> done: 11 reconds
==  Data: Destroying all data (done) ==========================================

I also have some nice code to generate sample data, but that is for another post.

Really resetting the database

When I start coding a Ruby on Rails project, I find myself modifying the migration files over and over. I know this is not the way they were intended to use, but to avoid upfront design, I only ad fields when I need them. If I respected the way migrations were intended I would end up with hundred of migrations the first day and I would waste half my day just creating migrations.

After a project is deployed or a second developer is working on it, I revert to the way migrations are intended and I create a new one every time there’s a change I need in the database.

As migrations are intended to run only once, if you modify them, they won’t get run; and if you force them to run them, they’ll fail, because the database already contains such a table. So I’ve found myself doing this quite often:

rake db:drop && rake db:create && rake db:migrate && rake db:seed && rake db:data

db:data is a task I created to generate some sample data. Good known data that I can use to test the site locally. I’m using Factory Girl to create it, which I also use for the tests so I can re-use as much data creating logic as possible. It’s very good to get to a known state of the project you are developing and to get other developers started right away. I really recommend everyone doing it.

The problem is that I also need to reset my test data, so I end up having this other command and it gets horrible:

RAILS_ENV=test rake db:drop && RAILS_ENV=test rake db:create && RAILS_ENV=test rake db:migrate && RAILS_ENV=test rake db:seed

Note: no db:data this time.

I’ve got tired of re-writing these commands or trying to find them in my bash history, so I decided to write a Ruby task that will do it for me and here it is in case you want to use it too:

namespace :db do
  desc "Crush and burn the database"
  task :hard_reset => :environment do
    if !Rails.env.test?

  desc "Generate sample data for developing"
  task :data => :environment do
    # Create the sample data in here


Update: To delete all records without resetting the whole database, check my post Deleting all records in a Rails project.