Rails: Switching from "gettext_i18n_rails" to vanilla translations

It's all about switching from [`Gettext_i18n_rails`](https://github.com/grosser/gettext_i18n_rails) to YAML translations in a Rails engine.

Gettext_i18n_rails is a great gem for translating Rails apps with po and mo files. For the ExperimentsLabs engine, I finally decided to switch back to the vanilla translation system (YAML files).

This article explains how I did it, step by step.

A merge request is currently open, if you have any thoughts on this.

Use i18n-tasks

The i18n-tasks gem provides a lot of commands to check the validity of the YAML translations:

~ i18n-tasks

Usage: i18n-tasks [command] [options]
    -v, --version                    Print the version
    -h, --help                       Show this message

Available commands:

    health                           is everything OK?
    missing                          show missing translations
    add-missing                      add missing keys to locale data
    find                             show where keys are used in the code
    unused                           show unused translations
    remove-unused                    remove unused keys
    normalize                        normalize translation data: sort and move to the right files
    check-normalized                 verify that all translation data is normalized
    [...]

See `i18n-tasks <command> --help` for more information on a specific command.

Add it to your project and follow the installation instructions on the project’s page.

Generate model attributes

One of the reasons I’m leaving gettext_i18n_rails is that it can’t generate model attributes in an engine: when models are loaded for a Rails > 3 application, GettextI18nRails searches for direct descendants of ::ActiveRecord::Base (in model_attributes_finder.rb:69). But in the context of an engine, models are loaded from the dummy application. In our case, only the User model is loaded, as it’s present in the dummy app.

I could have searched for a workaround, but @neomilium helped me with a simpler script, which gets the model attributes from the model files (with a list of excluded models).

This script is available in the ExperimentsLabs sources and a rake task was created.

Run it to generate a file containing all the references in app/<engine_name>/i18n/model_attributes.i18n:

rake app:i18n:add-models-attributes

Replace translation methods

Next comes the least tedious part of the process: replacing the translation method _() from gettext_i18n_rails with t() from Rails. This can be done with your favourite IDE by replacing every _\( with t( in app/, or with sed or a similar command: find app/ -name '*.rb' -print0 | xargs -0 sed -i 's/_(/t(/g'. Do the same with the haml files and you’re done (but check for false positives, e.g. some_method_(), before committing).
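The same replacement can be sketched in Ruby, assuming you would rather skip identifiers that merely end with an underscore. The `replace_gettext_calls` name is hypothetical. The negative lookbehind only rewrites `_(` when it is not glued to a letter, a digit or another underscore, so `some_method_(` and gettext's plural helper `n_(` are left untouched, unlike with the sed command above.

```ruby
# Hypothetical helper: replaces gettext's _( calls with Rails' t( calls.
def replace_gettext_calls(source)
  # (?<![A-Za-z0-9_]) rejects any "_(" that is the tail of an identifier
  source.gsub(/(?<![A-Za-z0-9_])_\(/, 't(')
end

puts replace_gettext_calls("= _('Hello') + some_method_(1)")
```

A manual review is still a good idea: the sketch knows nothing about strings or comments that happen to contain `_(`.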

Replace interpolations strings

With gettext_i18n_rails, all interpolations were made using format(): format(_('%<what>s is great'), what: 'Rails'). But t() comes with interpolation abilities: t('%{what} is fun', what: 'Ruby').

Again, replacing things globally and then checking that nothing is wrong is a good idea: replace %<([a-zA-Z_]*)>[a-z] with %\{$1\} in app/**/*.rb and app/**/*.haml.
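As a minimal sketch, the same search and replace can be written in Ruby with the regular expression given above. The `convert_interpolations` name is hypothetical.

```ruby
# Hypothetical helper: turns format-style references (%<name>s) into
# I18n-style placeholders (%{name}).
def convert_interpolations(source)
  # \1 is the name captured between < and >; the type letter is dropped
  source.gsub(/%<([a-zA-Z_]*)>[a-z]/, '%{\1}')
end

puts convert_interpolations("format(_('%<what>s is great'), what: 'Rails')")
```

References carrying flags or a precision, such as `%<price>.2f`, do not match this expression and are left as they are, so they need a manual pass.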

Replace format() method usages

This one had to be done manually: find every line where format() is used, remove the format( call and remove the closing parenthesis right after the string to translate:

- format(t('%{what} is fun'), what: 'Ruby')
+ t('%{what} is fun', what: 'Ruby')

Note: In the above example, t() is already used as the previous _() has been replaced.
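This step was done by hand, but the simplest one-line shape could be handled by a sketch like the following. The `unwrap_format` name is hypothetical, and the expression is deliberately conservative: calls whose arguments contain parentheses are left alone and still need manual editing.

```ruby
# Hypothetical helper: unwraps format(t('...'), args) into t('...', args).
def unwrap_format(source)
  # group 1: t('...' without its closing parenthesis
  # group 2: the remaining arguments, with no nested parentheses
  source.gsub(/\bformat\((t\('[^']*')\),\s*([^()]*)\)/, '\1, \2)')
end

puts unwrap_format("format(t('%{what} is fun'), what: 'Ruby')")
```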

Generate keys in app files

For this step, I created a small runner which will transform all the strings to possible keys:

-# from
= t('Hello world, my name is %{name}', name: @user.name)
-# to
= t('.Hello_world__my_name_is___name_', name: @user.name)

The runner is as follows:

[
  'app/**/*.rb',
  'app/**/*.haml'
].each do |glob|
  Dir.glob(glob) do |node|
    next if File.directory? node

    key_prefix = '.'
    # Check if file "supports" relative keys in translations
    relative_match = node.match(/app\/(model|validator)/)
    # I.e., for a model: "elabs.model.user."
    key_prefix = "elabs.#{relative_match[1]}.#{File.basename(node, '.*')}." if relative_match

    File.open(node) do |source_file|
      contents = source_file.read
      # Finds strings starting with "t('"        => group 1
      # then anything that's not a quote         => group 2
      # then a quote followed by "," or ")"      => group 3
      # and replaces group 2 with a sanitized key, prefixed by key_prefix
      contents.gsub!(/(t\(')([^']*)('[),])/) do |_|
        match = Regexp.last_match
        "#{match[1]}#{key_prefix}#{match[2].gsub(/[^0-9a-zA-Z]/, '_')}#{match[3]}"
      end
      File.open(node, 'w+') { |f| f.write(contents) }
    end
  end
end
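To see what the substitution does to a single line, here is the same regular expression and sanitizing rule applied to a string instead of files. The `keyize_calls` name is hypothetical. Note that the rule keeps the original case and maps each non-alphanumeric character to its own underscore.

```ruby
# Same expression as in the runner, applied to a string.
def keyize_calls(contents, key_prefix = '.')
  contents.gsub(/(t\(')([^']*)('[),])/) do
    match = Regexp.last_match
    "#{match[1]}#{key_prefix}#{match[2].gsub(/[^0-9a-zA-Z]/, '_')}#{match[3]}"
  end
end

line = "= t('Hello world, my name is %{name}', name: @user.name)"
puts keyize_calls(line)
# Prefix as computed by the runner for app/models/elabs/user.rb
puts keyize_calls("errors.add(:name, t('Is invalid'))", 'elabs.model.user.')
```

Because the expression has no word boundary before `t(`, any call whose name ends in "t" and takes a single-quoted string first (for example `format('...')`) is rewritten too, which is one more reason to review the diff.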

At this point, you can find the missing keys to update the locales:

rake app:i18n:add-missing

Use po files to automatically translate files

Opening the translation files will give you something like:

---
en:
  # [...]
  elabs:
    layouts:
      content_associations:
        Albums: TRANSLATE_ME Albums
        Articles: TRANSLATE_ME Articles
        Last___amount__albums_: 'TRANSLATE_ME Last   amount  albums '
        Last___amount__articles_: 'TRANSLATE_ME Last   amount  articles '
        Last___amount__notes_: 'TRANSLATE_ME Last   amount  notes '
        Last___amount__projects_: 'TRANSLATE_ME Last   amount  projects '
        Last___amount__uploads_: 'TRANSLATE_ME Last   amount  uploads '

The keys are ugly, the strings are ugly. But we have a tree, and we also have our old po files.

msgid "%<model_name>s was successfully created." # key and english translation
msgstr "%<model_name>s mis à jour avec succès."  # french translation

msgid "%<model_name>s was successfully destroyed."
msgstr "%<model_name>s supprimé avec succès."

msgid "%<model_name>s was successfully locked."
msgstr "%<model_name>s verrouillé avec succès."

msgid "%<nb>i note"         # English singular
msgid_plural "%<nb>i notes" # English plural
msgstr[0] "%<nb>i note"     # French singular
msgstr[1] "%<nb>i notes"    # French plural

To get a nice file quickly, we remove the headers and convert all the old interpolation strings (replace %<([a-zA-Z_]*)>[a-z] with %\{$1\}).

Here is the runner created to process a clean po file:

require 'yaml'

EN_FILE    = 'config/locales/en.yml'
FR_FILE    = 'config/locales/fr.yml'
locales_en = YAML.load_file(EN_FILE)
locales_fr = YAML.load_file(FR_FILE)

def keyize(string)
  # Same method to transform strings to keys as in previous runner
  string.gsub(/[^0-9a-zA-Z]/, '_')
end

def unprotect_quotes(string)
  # Strings are double-quoted in po files, so double quotes are escaped
  string.gsub(/\\"/, '"')
end

def handle_simple_string(string, locales_en, locales_fr)
  en_string = unprotect_quotes string[0].sub(/msgid "(.*)"/, '\1')
  fr_string = unprotect_quotes string[1].sub(/msgstr "(.*)"/, '\1')
  key       = keyize en_string

  assign_value(locales_en, key, en_string)
  assign_value(locales_fr, key, fr_string)
end

def handle_plural_string(string, locales_en, locales_fr)
  en_singular_string = unprotect_quotes string[0].sub(/msgid "(.*)"/, '\1')
  en_plural_string   = unprotect_quotes string[1].sub(/msgid_plural "(.*)"/, '\1')
  fr_singular_string = unprotect_quotes string[2].sub(/msgstr\[0\] "(.*)"/, '\1')
  fr_plural_string   = unprotect_quotes string[3].sub(/msgstr\[1\] "(.*)"/, '\1')
  key                = keyize en_singular_string

  assign_value(locales_en, key, [en_singular_string, en_plural_string])
  assign_value(locales_fr, key, [fr_singular_string, fr_plural_string])
end

def assign_value(locale_hash, key, value)
  locale_hash.each_key do |hash_key|
    if locale_hash[hash_key].is_a?(Hash)
      # Go deeper
      assign_value locale_hash[hash_key], key, value
    elsif hash_key == key
      # Replace value (string keys, so that to_yaml writes "one:" and not ":one:")
      locale_hash[hash_key] = value.is_a?(Array) ? { 'one' => value[0], 'other' => value[1] } : value
    end
  end
end

File.open('locale/fr/app_clean.po') do |source_file|
  source_file.read.split("\n\n").each do |group|
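    # There are no multiline strings
    string = group.split("\n")

    # 2 lines: msgid/msgstr, 4 lines: singular/plural pair
    case string.size
    when 2 then handle_simple_string string, locales_en, locales_fr
    when 4 then handle_plural_string string, locales_en, locales_fr
    else warn "Skipped unexpected group: #{group.inspect}"
    end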
  end
end

File.open(EN_FILE, 'w+') do |file|
  file.write(locales_en.to_yaml)
end
File.open(FR_FILE, 'w+') do |file|
  file.write(locales_fr.to_yaml)
end
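A small worked example of the traversal, run against made-up data shaped like the generated locale file. The function is the runner's `assign_value`, written here with string keys for the plural forms (an assumption on my side, so that the YAML output reads `one:` rather than `:one:`).

```ruby
# Replaces every occurrence of `key`, at any depth, and turns
# [singular, plural] pairs into one/other hashes.
def assign_value(locale_hash, key, value)
  locale_hash.each_key do |hash_key|
    if locale_hash[hash_key].is_a?(Hash)
      assign_value(locale_hash[hash_key], key, value)
    elsif hash_key == key
      locale_hash[hash_key] = value.is_a?(Array) ? { 'one' => value[0], 'other' => value[1] } : value
    end
  end
end

# Made-up sample data
locales_fr = {
  'fr' => {
    'elabs' => {
      'notes' => {
        'Locked' => 'TRANSLATE_ME Locked',
        '__nb__note' => 'TRANSLATE_ME  nb  note'
      }
    }
  }
}

assign_value(locales_fr, 'Locked', 'verrouillé')
assign_value(locales_fr, '__nb__note', ['%{nb} note', '%{nb} notes'])
p locales_fr
```

Since keys are matched by name only, a string used in several views receives the same translation everywhere, which is exactly what the old po files expressed.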

And the French translation file now looks like this:

---
fr:
  Album: TRANSLATE_ME Album
  Albums: Albums
  Article: TRANSLATE_ME Article
  Articles: Articles
  # [...]
  elabs:
    acts_helper:
      act_action:
        created: créé
        locked: verrouillé
        published: publié
        removed_from_publication: retiré de la publication
        unlocked: déverrouillé
        updated: mis à jour

Keys are still ugly, but we’ll see that later.

Some translations are still missing, along with the singular and plural strings that had no matching key.

Replace nt usages (pluralisations)

This step is entirely manual. But hey, for this app, we’re talking about 30 to 40 strings…

Gettext_i18n_rails uses the n_(<singular_string>, <plural_string>, <amount>) method to translate plurals; after the earlier global replacement of _( by t(, these calls now read nt(...). Rails’ i18n system uses t(<key>, count: <amount>). This ends up as something like this in the translation files:

string:
  one: There's one thing
  other: There are %{count} things
  zero: There's nothing
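How a sub-key is picked from `count` can be sketched in plain Ruby. This is a simplified imitation of the default rule, not the i18n gem's code, and `pick_plural` is a hypothetical name; real lookups simply go through `t(<key>, count: <amount>)`.

```ruby
# :zero when count is 0 and the entry defines it, :one for 1,
# :other for everything else.
def pick_plural(entry, count)
  key = :zero if count.zero? && entry.key?(:zero)
  key ||= count == 1 ? :one : :other
  entry.fetch(key).gsub('%{count}', count.to_s)
end

entry = {
  one: "There's one thing",
  other: 'There are %{count} things',
  zero: "There's nothing"
}

[0, 1, 3].each { |count| puts pick_plural(entry, count) }
```

Languages with more plural forms need their own rules, which this sketch does not cover.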

The idea is to do something like:

--- a/app/views/elabs/admin/announcements/_form.html.haml
+++ b/app/views/elabs/admin/announcements/_form.html.haml
-      %h2= format(nt('An error prevented this announcement from being saved:',
-                      '%{nb} errors prevented this announcement from being saved:',
-                      announcement.errors.count), nb: announcement.errors.count)
+      %h2= t('.save_error', count: announcement.errors.count)

…and in the translation files:

en:
  elabs:
    admin:
      announcements:
        form:
          save_error:
            one: 'An error prevented this announcement from being saved:'
            other: "%{count} errors prevented this announcement from being saved:"

So that is what I did, for every pluralized string.

Manually rename keys?

This step is semi-manual: a runner creates a YAML file with the current translation keys as keys and a “proposition” for the new key as value.

Once the list is ready, all there is to do is to manually assign new keys to the old ones, and write a runner to change the keys in both the YAML translation files and the source code.

# Runner to extract unique keys
# (humanize, parameterize and underscore come from ActiveSupport)
def extract_keys(locale_hash)
  keys = {}
  locale_hash.each_key do |hash_key|
    if locale_hash[hash_key].is_a?(Hash)
      # Go deeper
      keys.merge! extract_keys(locale_hash[hash_key])
    else
      # Suggest a new key
      keys[hash_key] = hash_key.humanize.parameterize.underscore
    end
  end
  keys
end

IN_FILE  = 'config/locales/en.yml'.freeze
OUT_FILE = 'tmp/temp_keys.yml'.freeze
locales  = YAML.load_file(IN_FILE)

keys = extract_keys(locales).to_yaml

File.open(OUT_FILE, 'w+') do |file|
  file.write(keys)
end
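To get a feel for the suggestions outside of a Rails runner, here is a stdlib-only approximation. It does not call ActiveSupport, so its output may differ from the runner's on some inputs; `suggest_key` and `suggestions` are hypothetical names.

```ruby
# Lowercases the key, squeezes every run of other characters into one
# underscore and trims underscores at both ends.
def suggest_key(old_key)
  old_key.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')
end

# Walks a nested locale hash and maps each leaf key to its suggestion.
def suggestions(locale_hash)
  locale_hash.each_with_object({}) do |(key, value), keys|
    if value.is_a?(Hash)
      keys.merge!(suggestions(value))
    else
      keys[key] = suggest_key(key)
    end
  end
end

puts suggest_key('Last___amount__albums_')
```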

# Runner to replace keys
NEW_KEYS_FILE = 'tmp/temp_keys.yml'
new_keys      = YAML.load_file(NEW_KEYS_FILE)

def keyize(string)
  # Same method to transform strings to keys as in previous runners
  string.gsub(/[^0-9a-zA-Z]/, '_')
end

def change_keys(hash, new_keys)
  keys = {}
  hash.each_key do |hash_key|
    new_hash_key = new_keys[hash_key] || hash_key

    keys[new_hash_key] = if hash[hash_key].is_a?(Hash)
                           # Go deeper
                           change_keys(hash[hash_key], new_keys)
                         else
                           # Replace value
                           hash[hash_key]
                         end
  end

  keys
end

[
  'app/**/*.rb',
  'app/**/*.haml',
  'app/**/*.erb'
].each do |glob|
  Dir.glob(glob) do |node|
    next if File.directory? node

    File.open(node) do |source_file|
      contents = source_file.read
      contents.gsub!(/(t\(')([^']*)('[),])/) do |_|
        match   = Regexp.last_match
        # Keep the key as it is unless a replacement is found
        new_key = match[2]

        new_keys.each do |old_key, replacement|
          # Only whole keys are replaced, with an optional leading dot
          next unless match[2].match?(/^\.?#{Regexp.escape(old_key)}$/)

          new_key = match[2].sub(old_key, replacement)
          puts "#{match[2]} => #{new_key}"
          break
        end

        "#{match[1]}#{new_key}#{match[3]}"
      end

      File.open(node, 'w+') { |f| f.write(contents) }
    end
  end
end

Dir.glob('config/locales/*.yml') do |file|
  hash = YAML.load_file file

  File.open(file, 'w+') do |f|
    f.write change_keys(hash, new_keys).to_yaml
  end
end
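The subtle part of the source code pass is that a key must only be renamed when it matches an old key as a whole. A minimal sketch of that rule on a string, using a hash lookup rather than a loop over all keys (`rename_keys` is a hypothetical name):

```ruby
# Renames translation keys inside t('...') calls. A key is renamed only
# when it equals an old key (ignoring one leading dot), never when the
# old key is merely a substring of it.
def rename_keys(contents, new_keys)
  contents.gsub(/(t\(')([^']*)('[),])/) do
    match   = Regexp.last_match
    old_key = match[2].sub(/\A\./, '')
    new_key = new_keys.key?(old_key) ? match[2].sub(old_key, new_keys[old_key]) : match[2]
    "#{match[1]}#{new_key}#{match[3]}"
  end
end

new_keys = { 'Albums' => 'albums' }
puts rename_keys("= t('.Albums')", new_keys)
puts rename_keys("= t('.Last_Albums', count: 1)", new_keys)
```

Like the runner, this only handles relative keys and bare keys; full keys such as `elabs.model.user.Some_key` are left untouched and have to be fixed by hand or with i18n-tasks.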

Do some cleanup:

# Check for status
i18n-tasks health

# Normalize first
i18n-tasks normalize

# Check for missed translations and fix manually (2 strings in my case)
i18n-tasks add-missing

# Remove unused ones, if any
i18n-tasks remove-unused

You should have some strings left to translate: ActiveRecord strings, some plurals in languages other than English, etc. (about 20 strings for me). It’s quicker to search for their values in the po files.

Review, test, validate

Check that everything is fine by running RSpec, Cucumber, Rubocop, etc.

If you are lucky enough to work with other people on your project, ask for as many reviews as you can: it was a lot of work, and even if a large part of it was automated, it’s still error-prone.

Running RSpec, I got the following errors:

undefined method `t' for #<Elabs::AssociatedAuthorValidator:0x000055bc7871ec58>

RuntimeError:
  Cannot use t(".created") shortcut because path is not available
# ./app/helpers/elabs/acts_helper.rb:5:in `act_action'

Running Cucumber, some translations were not working well. As some controllers and models are metaprogrammed, Rails searches for the strings in the wrong places:

translation missing: en.elabs.albums.create_comment.create_success_robot

The translation was defined under en.elabs.content_application.create_comment.create_success_robot

To avoid this, I specified the full path instead of a relative one, and had to update every metaprogrammed controller and model with full keys.
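Why the relative keys end up in the wrong place can be illustrated with a simplified imitation of how Rails expands them (this is not Rails' code, and `view_key` and `controller_key` are hypothetical names). In a view the prefix is derived from the template path; in a controller it comes from the controller path and the action name, whatever file the action was actually written in.

```ruby
# View: "elabs/admin/announcements/_form" + ".save_error"
def view_key(template_path, key)
  template_path.gsub(%r{/_?}, '.') + key
end

# Controller: controller path + action name + relative key
def controller_key(controller_path, action_name, key)
  "#{controller_path.tr('/', '.')}.#{action_name}#{key}"
end

puts view_key('elabs/admin/announcements/_form', '.save_error')
puts controller_key('elabs/albums', 'create_comment', '.create_success_robot')
```

The second line shows the key from the error above (without the locale): the shared action runs inside the albums controller, so that is where the translation is looked up. Outside a view or a controller there is no such prefix at all, which is consistent with the two RSpec errors: a full key is the way out there too.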

As a hack, to finally find all the translations missing because of metaprogramming, I temporarily created a navigation step that visits a page and checks it, and updated every step visiting something to use it:

When('I visit {string} and check it') do |url|
  visit url
  # Raw text
  expect(page).not_to have_content('translation missing:')
  # Escaped html
  expect(page).not_to have_content('translation_missing')
  # Classes
  expect(find_all('.translation_missing').count).to eq(0)
end

# Instead of visit 'xyz', in other steps:
step "I visit \"xyz\" and check it"

Side effects

Going further

For Experiments Labs, the visitor’s locale is stored in the session, whereas the recommended way is to have the locale in the URL (subdomain or query string), which is nice.

This means changing the routes or adding a ?locale=<locale> chunk to the URL.
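A rough sketch of the query string variant, in plain Ruby so that it runs on its own. Both helper names and the list of available locales are assumptions; in a Rails controller, the first helper would typically feed `I18n.with_locale` from an `around_action`, and the second is what `default_url_options` takes care of.

```ruby
# Hypothetical helper: picks the locale from the request parameters and
# falls back to a default when it is missing or unsupported.
def locale_from_params(params, available: %w[en fr], default: 'en')
  requested = params['locale'].to_s
  available.include?(requested) ? requested : default
end

# Hypothetical helper: appends the ?locale=<locale> chunk to a path.
def localized_path(path, locale)
  separator = path.include?('?') ? '&' : '?'
  "#{path}#{separator}locale=#{locale}"
end

puts locale_from_params({ 'locale' => 'fr' })
puts localized_path('/albums?page=2', 'fr')
```

Checking the requested value against a known list avoids passing arbitrary user input to the translation system.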

Conclusion

Switching from gettext_i18n_rails to the vanilla translation system in an engine is something I won’t do every day. Even if a lot of the tedious parts of the work have been automated with runners, there were still a lot of workarounds and manual reviews.

Additionally, metaprogrammed things make translations hard to find for Rails and/or for i18n-tasks, which leads to full keys in some controllers/models.

At the time of writing this article, I still don’t know if I’m merging the changes or not.

Edit, 2019/04/29: Changes were merged after a discussion about standards with a friend. It’s always good to stick to them, even more when the target application is meant to be publicly available.


Thanks to Luc and Olivier for pointing out the typos!