Everything You Always Wanted to Know About Writing Good Rake Tasks * But Were Afraid to Ask

Rake tasks are a very important component of our Rails apps, because we usually use them for maintenance or data migration jobs over a collection of data.

One of the guys at the office asked me: what should I keep in mind when writing a rake task, and how do I know if my rake task is well written?

The answer is not simple, because most of the time it depends on the task you need to accomplish. Still, I have a few rules I follow to make a rake task what I consider a good rake task.

What Makes a Rake Task a Good Task? #

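I think a rake task is good if it has a meaningful description, is grouped with related tasks under a namespace, lives in an organized file structure, keeps its logic isolated in a class, displays its progress without being too verbose, and writes to a log file.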

Writing Meaningful Descriptions #

Bad #

# lib/tasks/import_topics.rake
task import_topics: :environment do
   ...
end

Writing a description is useful because it gives us some details without reading the code. It is also useful when you want to inspect the list of available rake tasks using rake -T, which only lists tasks that have a description. Without one, all we know is that this task imports topics, nothing else.

Good #

Adding a good description to the previous task, like “Migrate topics from legacy database to new database”, gives us more details about what the task does or should do. Now we know that these topics are imported from the legacy database into our current app database.

# lib/tasks/migrate_topics.rake
desc 'Migrate topics from legacy database to new database'
task migrate_topics: :environment do
   ...
end

Note: If you can’t explain your rake task in one sentence, your rake task is probably doing more than one job and you should consider splitting it.
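To see why the description matters for rake -T, here is a minimal sketch in plain Ruby that defines one task with a description and one without, then lists only the described ones, which is roughly what rake -T does. The :environment dependency is left out so it runs outside Rails, and the module name DescriptionDemo and the helper documented_tasks are made up for this illustration.

```ruby
require 'rake'

# Rake only records descriptions when asked to (the command line turns this
# on for `rake -T`), so this sketch enables it by hand.
Rake::TaskManager.record_task_metadata = true

module DescriptionDemo
  extend Rake::DSL

  # No description: this task will not show up in the listing.
  task :import_topics do
  end

  desc 'Migrate topics from legacy database to new database'
  task :migrate_topics do
  end
end

# Hypothetical helper: keep only the tasks that carry a description.
def documented_tasks
  Rake::Task.tasks.select(&:comment).map { |t| [t.name, t.comment] }
end

documented_tasks.each { |name, comment| puts "rake #{name}  # #{comment}" }
```

Only migrate_topics is printed; import_topics still exists and can be run, but nobody browsing the task list will find it.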

Group Your Tasks Using Namespaces #

Bad #

# lib/tasks/migrate_topics.rake
desc 'Migrate topics from legacy database to new database'
task migrate_topics: :environment do
   ...
end
# lib/tasks/migrate_users.rake
desc 'Migrate users from legacy database to new database'
task migrate_users: :environment do
   ...
end
# lib/tasks/migrate_questions.rake
desc 'Migrate questions from legacy database to new database'
task migrate_questions: :environment do
   ...
end

All these tasks have one thing in common: they are used to migrate information. So you should group these similar tasks under a common namespace, migrate.

Good #

# lib/tasks/migrate/topics.rake
namespace :migrate do
  desc 'Migrate topics from legacy database to new database'
  task topics: :environment do
     ...
  end
end
# lib/tasks/migrate/users.rake
namespace :migrate do
  desc 'Migrate users from legacy database to new database'
  task users: :environment do
    ...
  end
end
# lib/tasks/migrate/questions.rake
namespace :migrate do
  desc 'Migrate questions from legacy database to new database'
  task questions: :environment do
    ...
  end
end

Now we know all these rake tasks are related, and each one is invoked with its namespace prefix, for example rake migrate:topics. Using namespaces will help you keep your code organized, clean and understandable.
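A small sketch of how this behaves in plain Rake (no Rails, so the :environment dependency is omitted): reopening the same namespace, as the three separate files do, simply adds tasks to it. The migrate:all task and the RAN array are hypothetical additions for this illustration; they show that a grouped set of tasks can also be run together.

```ruby
require 'rake'

module NamespaceDemo
  extend Rake::DSL

  # Records which tasks ran, so the result can be inspected afterwards.
  RAN = []

  # Each block below stands in for one file under lib/tasks/migrate/.
  namespace :migrate do
    task(:topics) { RAN << 'migrate:topics' }
  end

  namespace :migrate do
    task(:users) { RAN << 'migrate:users' }
  end

  namespace :migrate do
    # Hypothetical aggregate task: prerequisites are resolved inside the
    # namespace, so :topics means migrate:topics.
    task all: %i[topics users]
  end
end

# Same as running `rake migrate:all` from the command line.
Rake::Task['migrate:all'].invoke
puts NamespaceDemo::RAN.inspect
```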

Rake File Structure #

Bad #

# lib/tasks/migrate_topics.rake
desc 'Migrate topics from legacy database to new database'
task migrate_topics: :environment do
   ...
end
# lib/tasks/migrate_users.rake
desc 'Migrate users from legacy database to new database'
task migrate_users: :environment do
   ...
end
# lib/tasks/migrate_questions.rake
desc 'Migrate questions from legacy database to new database'
task migrate_questions: :environment do
   ...
end

File Structure:

lib
 └── tasks
     ├── recalculate_badges_for_users.rake
     ├── migrate_users.rake
     ├── migrate_topics.rake
     ├── migrate_questions.rake
     ├── migrate_answers.rake
     ├── recalculate_best_answer.rake
     ├── topic_accessible_by_url.rake
     ├── invalid_questions.rake
     ├── remove_duplicated_topics.rake
     ├── calculate_last_activity_for_question.rake
     ├── ...
     ├── clean_votes.rake
     └── cache_visits.rake

Although each file has an intention-revealing name, having 30 or more rake files in the same folder makes it hard to quickly identify which rake task belongs to which resource.

Good #

Every rake task performs operations on one or several resources (but it always works around a main resource). Identifying this main resource helps us build a good file structure and group our rake tasks under namespaces and folders.

# lib/tasks/migrate/topics.rake
namespace :migrate do
  desc 'Migrate topics from legacy database to new database'
  task topics: :environment do
     ...
  end
end
# lib/tasks/migrate/users.rake
namespace :migrate do
  desc 'Migrate users from legacy database to new database'
  task users: :environment do
     ...
  end
end
# lib/tasks/migrate/questions.rake
namespace :migrate do
  desc 'Migrate questions from legacy database to new database'
  task questions: :environment do
     ...
  end
end

The main resource here is the migration, so we use the migrate namespace and create a folder under the tasks folder named after the namespace.

And so on with the remaining tasks.

File Structure:

This is how our rake file structure will look after applying this rule to all the rake tasks.

lib
 └── tasks
     ├── migrate
     │   ├── users.rake
     │   ├── topics.rake
     │   ├── questions.rake
     │   └── answers.rake
     ├── users
     │   ├── recalculate_badges.rake
     │   └── cache_visits.rake
     ├── ...
     ├── questions
     │   ├── recalculate_best_answer.rake
     │   ├── topic_accessible_by_url.rake
     │   ├── clean_votes.rake
     │   ├── log_invalid.rake
     │   └── calculate_last_activity.rake
     └── topics
         └── remove_duplicated.rake

Now our files are more organized and it’s easier to find a specific task.
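One detail worth keeping in mind with nested folders: as far as I know, Rails picks up task files under lib/tasks by matching **/*.rake, so subfolders are fine but a file with another extension is skipped. The sketch below imitates that lookup on a temporary directory; the helper tasks_by_folder and the sample file names are assumptions made for the illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: groups every .rake file under `root` by its folder,
# e.g. { "migrate" => ["topics", "users"] }. Other extensions are ignored.
def tasks_by_folder(root)
  Dir.glob('**/*.rake', base: root).sort
     .group_by { |path| File.dirname(path) }
     .transform_values { |paths| paths.map { |path| File.basename(path, '.rake') } }
end

Dir.mktmpdir do |root|
  # Build a tiny copy of the structure above, plus one file that is not .rake.
  %w[migrate/users.rake migrate/topics.rake users/recalculate_badges.rake topics/notes.rb].each do |path|
    FileUtils.mkdir_p(File.join(root, File.dirname(path)))
    FileUtils.touch(File.join(root, path))
  end

  tasks_by_folder(root).each do |folder, names|
    puts "#{folder}: #{names.join(', ')}"
  end
end
```

The topics folder does not appear in the output because its only file does not end in .rake.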

Isolating Your Task Using a Class #

I’ll start with an example because it makes the concept easier to understand.

We have an app similar to StackOverflow: our users can ask questions, answer them, leave comments and so on, and we already have a lot of information in our database. Suddenly we decide to implement a Badge System in our app.

Now that we have developed the Badge System, we only need to recalculate the badges for every user. This is the time to use a rake task.

Bad #

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    User.find_each do |user|

      # Grants teacher badge
      if user.answers.with_votes_count_greater_than(5).count >= 1
        user.grant_badge('teacher')
      end

      ...

      # Grants favorite question badge
      user.questions.find_each do |question|
        if question.followers_count >= 25
          user.grant_badge('favorite question') && break
        end
      end

      # Grants stellar question badge
      user.questions.find_each do |question|
        if question.followers_count >= 100
          user.grant_badge('stellar question') && break
        end
      end

    end
  end
end

This task may seem simple to understand but it has a lot of problems:

Good #

Now that we have pointed out everything that’s wrong with this task, let’s fix it. We’ll extract all the logic and move it to a Service Object.

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    User.find_each do |user|

      RecalculateBadges.new(user).all

    end
  end
end
# app/services/recalculate_badges.rb
class RecalculateBadges

  attr_reader :user, :questions, :answers

  def initialize(user)
    @user      = user
    @questions = user.questions
    @answers   = user.answers
  end

  def all
    teacher
    favorite_question
    stellar_question
  end

  def teacher
    ...
    grant_badge('teacher')
  end

  def favorite_question
    question_followers_count_badge(25, 'favorite question')
  end

  def stellar_question
    question_followers_count_badge(100, 'stellar question')
  end

  private

    def grant_badge(badge_name)
      return unless badge_name
      user.grant_badge(badge_name)
    end

    def question_followers_count_badge(followers_count, badge_name)
      ...
      grant_badge(badge_name)
    end

end

Now that we have extracted all the logic into a specific class, you will notice the following benefits:
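One benefit that is easy to demonstrate is that the class can be exercised on its own, without Rake or a database. The sketch below is a cut-down version under stated assumptions: FakeUser and Question are hypothetical stand-ins for the real models, the teacher badge is left out, and the body of question_followers_count_badge is my guess based on the checks in the original task, not necessarily how the real service is written.

```ruby
# Hypothetical stand-ins for the ActiveRecord models.
Question = Struct.new(:followers_count)

class FakeUser
  attr_reader :questions, :badges

  def initialize(questions)
    @questions = questions
    @badges = []
  end

  # Grants each badge at most once.
  def grant_badge(name)
    @badges << name unless @badges.include?(name)
  end
end

class RecalculateBadges
  attr_reader :user, :questions

  def initialize(user)
    @user      = user
    @questions = user.questions
  end

  def all
    favorite_question
    stellar_question
  end

  def favorite_question
    question_followers_count_badge(25, 'favorite question')
  end

  def stellar_question
    question_followers_count_badge(100, 'stellar question')
  end

  private

  # Assumed rule: one question reaching the threshold is enough for the badge.
  def question_followers_count_badge(followers_count, badge_name)
    return unless questions.any? { |question| question.followers_count >= followers_count }

    user.grant_badge(badge_name)
  end
end

user = FakeUser.new([Question.new(30), Question.new(3)])
RecalculateBadges.new(user).all
puts user.badges.inspect
```

With this shape, each badge rule can be checked with a handful of in-memory objects, and the rake task stays a thin loop over the users.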

There are some important concepts I would like to highlight:

Display Details About Task Progress Without Being Too Verbose #

Bad #

One of the things that I find very annoying is a task that displays irrelevant information while running. This makes it harder to monitor the progress and only pollutes your terminal.

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    User.find_each do |user|

      puts "#{user.first_name} #{user.last_name} - #{user.email}"
      RecalculateBadges.new(user).all

    end
  end
end

Terminal Output:

Mario Krols - mkrols@gmail.com
Kristen Delt - kdelt@gmail.com
Monica Lewinsky - mlewinsky@clinton.com
...
Fake User - fuser@outlook.com

As you can see, this task displays the user’s first name, last name and email, but it does not notify you about any errors or how many users have been processed so far. Also, in this particular case we don’t need to display the user’s first name or last name; they are polluting our terminal.

Good #

Now let’s say that the RecalculateBadges#all method returns true if the badges were recalculated successfully; if it fails, it returns false and saves the errors in an instance variable @errors. We also have a new instance method, errors, which returns the value of @errors as a string.

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    users_count = User.count

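    # with_index(1) starts the counter at 1, so the last line reads 100/100
    User.find_each.with_index(1) do |user, index|
      recalculate_badges = RecalculateBadges.new(user)

      # String#green and String#red are not core Ruby; they assume a gem such as colorize
      if recalculate_badges.all
        puts "#{index}/#{users_count} - #{user.email}".green
      else
        puts "#{index}/#{users_count} - #{user.email} - #{recalculate_badges.errors}".red
      end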

    end
  end
end
# app/services/recalculate_badges.rb
class RecalculateBadges

  attr_reader :user, :questions, :answers, :errors

  def initialize(user)
    @user      = user
    @questions = user.questions
    @answers   = user.answers
  end

  def all
    if user.can_receive_badges?
      teacher
      favorite_question
      stellar_question
      true
    else
      @errors = user.badges_validation_messages
      false
    end
  end

  def teacher
    ...
    grant_badge('teacher')
  end

  def favorite_question
    question_followers_count_badge(25, 'favorite question')
  end

  def stellar_question
    question_followers_count_badge(100, 'stellar question')
  end

  private

    def grant_badge(badge_name)
      return unless badge_name
      user.grant_badge(badge_name)
    end

    def question_followers_count_badge(followers_count, badge_name)
      ...
      grant_badge(badge_name)
    end

end

Terminal Output:

1/100 - mkrols@gmail.com
2/100 - kdelt@gmail.com
3/100 - mlewinsky@clinton.com - This user can't receive any badge because it's blocked
...
100/100 - fuser@outlook.com

This task displays all the necessary details without being too verbose. Now you can see:

Remember to display only the information you need; anything else will end up being too verbose and will only make the output harder to read and understand.

Note: If you are not interested in displaying detailed information, a useful alternative is printing a green dot for every element processed successfully and a red X for every element that fails. This way you’ll know that the task is still running.

.................X...........X.......XX........X.
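The dot-style output can be sketched in a few lines of plain Ruby. The helper name each_with_dots is made up for this example, the output is left uncolored to avoid depending on a color gem, and the block simply returns true or false the way RecalculateBadges#all is described to.

```ruby
# Hypothetical helper: runs the block for each item, printing "." when it
# returns a truthy value and "X" otherwise. Returns the items that failed.
# `io` can be swapped out so the output is easy to capture.
def each_with_dots(items, io: $stdout)
  failed = items.reject do |item|
    ok = yield(item)
    io.print(ok ? '.' : 'X')
    ok
  end
  io.puts
  failed
end

# Stand-in workload: every fourth element "fails".
failed = each_with_dots(1..10) { |n| n % 4 != 0 }
puts "#{failed.size} failed: #{failed.inspect}"
```

Returning the failed items means the task can still report them at the end, even though the running output stays on a single line.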

Always Use a Log File #

Bad #

Imagine now that our application has 100,000 users and we need to recalculate badges for everyone.

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    users_count = User.count

    User.find_each.with_index(1) do |user, index|
      recalculate_badges = RecalculateBadges.new(user)

      if recalculate_badges.all
        puts "#{index}/#{users_count} - #{user.email}".green
      else
        puts "#{index}/#{users_count} - #{user.email} - #{recalculate_badges.errors}".red
      end

    end
  end
end

The main problem here is that you would need infinite scroll on your terminal to keep track. Also, processing 100,000 users will take some time and you probably won’t be paying attention the whole time. And if you are running this task remotely or in the background, you won’t be able to see any output at all.

Good #

I think having a log file is a MUST when writing rake tasks. It helps you keep track of every task run, consult it whenever you want and share it with anyone easily.

# lib/tasks/users/recalculate_badges.rake
namespace :users do
  desc 'Recalculates Badges for All Users'
  task recalculate_badges: :environment do
    log = ActiveSupport::Logger.new('log/users_recalculate_badges.log')
    start_time = Time.now
    users_count = User.count

    log.info "Task started at #{start_time}"

    User.find_each.with_index(1) do |user, index|
      recalculate_badges = RecalculateBadges.new(user)

      if recalculate_badges.all
        log.info "#{index}/#{users_count} - #{user.email}"
      else
        log.info "#{index}/#{users_count} - #{user.email} - #{recalculate_badges.errors}"
      end

    end

    end_time = Time.now
    # end minus start, otherwise the duration comes out negative
    duration = (end_time - start_time) / 1.minute
    log.info "Task finished at #{end_time} and lasted #{duration} minutes."
    log.close
  end
end

Tracking start time, end time and duration is very important. For example, if you have a rake task that runs every hour and it takes an hour and a half to complete, each run will overlap with the next one and you may end up running out of memory on your server.
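Knowing the duration tells you that runs may overlap; one possible way to actually prevent it, which is not part of the original task, is to take a non-blocking lock on a file before starting. This is only a sketch: run_exclusively and the lock file name are hypothetical, and it relies on File#flock, whose behavior can differ on some platforms.

```ruby
require 'tmpdir'

# Hypothetical guard: runs the block only if no other run holds the lock.
# Returns true if the block ran, false if a run was already in progress.
def run_exclusively(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless lock.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
  # The lock is released when the file is closed at the end of the block.
end

Dir.mktmpdir do |dir|
  lock_path = File.join(dir, 'users_recalculate_badges.lock')

  ran = run_exclusively(lock_path) { puts 'recalculating badges...' }
  puts "ran: #{ran}"
end
```

Inside a rake task, a false return value would be a good moment to write a line to the log saying the previous run is still going, and exit.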

Note: Besides using a log file, you should also print some output; that way you’ll know that your rake task is working. I didn’t add any output to this example because I wanted to keep it as simple as possible.
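For completeness, here is a rough sketch of both together, using Ruby's standard Logger instead of ActiveSupport::Logger so it runs outside Rails. The helper with_task_log, the task name string and the sample emails are assumptions for the illustration; the details go to the log file while the terminal only gets one dot per element.

```ruby
require 'logger'
require 'tmpdir'

# Hypothetical helper: logs start, finish and duration around the block.
# Returns the elapsed time in seconds.
def with_task_log(log, name)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  log.info "#{name} started"
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  log.info format('%s finished in %.2f seconds', name, elapsed)
  elapsed
end

Dir.mktmpdir do |dir|
  log_path = File.join(dir, 'users_recalculate_badges.log')
  log = Logger.new(log_path)
  emails = %w[first@example.com second@example.com]

  with_task_log(log, 'users:recalculate_badges') do
    emails.each.with_index(1) do |email, index|
      log.info "#{index}/#{emails.size} - #{email}" # details go to the file
      print '.'                                     # the terminal only shows progress
    end
    puts
  end

  log.close
  puts File.read(log_path)
end
```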

 