16

Some of our devs (me included) don't always take putting text into a localization file seriously, and the result is a lot of hardcoded text scattered across many views. Does anyone have an idea for automating the search for hardcoded text in views, or a tool or approach for checking this? I was thinking a nifty bash script might do the job, but I'm a bit lost about where to start. Any help is much appreciated.

Edit: Not 100% accurate but works best for me so I accepted Andi's answer.

supersize

6 Answers

2

I think you can get very far by just using grep:

find . -name '*.html.erb' -exec cat {} + | grep -v '[=<>{}$/;]' | grep '\w \w'

This finds text based on the idea that some characters are not typical of prose, so lines containing them are dropped:

grep -v '[=<>{}$/;]'

and that real text contains at least one space with a word character on each side of it:

grep '\w \w'

This might not be a hundred percent accurate, but it is a fast and easy way to check for hardcoded text.
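As a minimal sketch, the same two filters can be wrapped in a function that keeps the file name and line number of each hit. The function name `find_hardcoded` and the default directory `app/views` are assumptions for illustration and are not part of the answer; `[[:alnum:]_]` is used as a portable spelling of `\w`.

```shell
# Hypothetical helper: list lines in ERB views that look like hardcoded text.
# Usage: find_hardcoded [directory]   (defaults to app/views, an assumed layout)
find_hardcoded() {
  local dir="${1:-app/views}"
  # Keep lines with a space between two word characters.
  # -H and -n prefix every hit with "file:line:".
  find "$dir" -name '*.html.erb' \
    -exec grep -Hn '[[:alnum:]_] [[:alnum:]_]' {} + |
    # Drop hits whose content has characters typical of markup or code.
    # The pattern skips the "file:line:" prefix first, so a "/" in the
    # path does not filter out every hit.
    grep -v '^[^:]*:[0-9]*:.*[=<>{}$/;]'
}
```

Calling `find_hardcoded .` from the views directory should print hits as `file:line:text`. The prefix handling assumes file paths contain no colon.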

Andi
  • This doesn't work for me. It found one of the pieces of hard-coded text in my project, but it mostly found Ruby code where there was a space between the method name and the arguments. – Sekalf Nroc Feb 26 '17 at 17:51
  • Perhaps you can extend it to require at least two words that are each followed by whitespace: `grep '\(\w\+\s\)\{2,\}'`. It was only meant as a good starting point that can be tuned. Try changing `2` to `4` and see what happens, for example. – Andi Feb 27 '17 at 12:52
  • @Andi this is working quite well so far. I need to check more in depth to see if it is reliable! – supersize Mar 06 '17 at 12:59
  • @supersize good to hear that I could help. Did you extend any of the `grep` commands? Could you post your final solution so I can edit the answer and add it? – Andi Mar 24 '17 at 10:34
  • @Andi a late reply, but I was involved in something else until now. I noticed that excluding lines with a preceding `if` or `unless` would get closer to the desired result. I get a lot of matches on Ruby method calls, but only because of conditions. – supersize Jul 08 '17 at 17:26
  • @Andi consider this as the best answer! – supersize Sep 01 '17 at 10:00
1

If most lines of code are short and the hard-coded text is long, you can use strings -n [number] to find any run of printable text that is at least that many characters long.

  <html>                                  |
   <head>                                 |
     <meta http-equiv="content-type" content="text/html; charset=utf-8" />
                                          |
     <title>Example Page</title>          |
                                          |
   </head>                                |
                                          |
   <body>                                 |
     <h1><%= @page.name %></h1>           |
     <p>                                  |
       This is a piece of hard coded text which must be found.
     </p>                                 |
   </body>                                |
  </html>                                 | 40 characters

If you set the length to 40...

$ find . -name '*.html.erb' -exec cat {} + | strings -n 40
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
  This is a piece of hard coded text which must be found.

It should be mostly accurate in finding hard-coded text.
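A variation on the same length heuristic, sketched with `awk` instead of `strings` so that each hit also reports its file and line number. This is a swapped-in technique, not what the answer uses, and the function name `long_lines` is hypothetical.

```shell
# Hypothetical helper: report lines of at least N characters as "file:line:text".
# Usage: long_lines N file...
long_lines() {
  local min="$1"
  shift
  # length($0) counts the whole line, including leading indentation.
  awk -v min="$min" 'length($0) >= min { print FILENAME ":" FNR ":" $0 }' "$@"
}
```

For example, `long_lines 40 path/to/view.html.erb` lists every line of that file with 40 or more characters. Like `strings -n 40`, it will also report long markup lines such as the `<meta>` tag above.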

Sekalf Nroc
  • neat idea but in reality this spits out loads of long lines which are CSS classes in HAML too – supersize Feb 21 '17 at 09:06
  • You could probably combine this method with nokogiri and parse the views/partials. You'd have to skip the preprocessing that erb runs, so that the erb tags (<%= %>) show up as text and give you something to filter on. You might have to script copying the erb files and removing the .erb extension, but I would think you could write something fairly quickly to do this. – engineerDave Feb 23 '17 at 22:23
1

You could use a regular expression that matches only lines containing no angle-bracketed tag (which filters out most HTML and ERB) and no style, script or title element.

^(?!.*(<(style|script|title).*?<\/\2>|<.*?>)).*$

If you discover that any other tags are getting through, just add them to the list of exceptions.
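The pattern relies on lookahead, so it needs a PCRE-style engine. As a rough sketch of the same idea with plain `grep`, assuming it is acceptable to skip every line that contains any tag at all, you can drop lines with something shaped like `<...>` and keep the rest. The function name `lines_without_tags` is hypothetical, and unlike the regex this version also skips lines without any letters.

```shell
# Hypothetical helper: print "line:text" for lines of one file that have no tag.
# Usage: lines_without_tags file
lines_without_tags() {
  # The first grep numbers the lines and keeps those with at least one letter;
  # the second drops every line containing a "<" followed later by a ">".
  grep -n '[[:alpha:]]' "$1" | grep -v '<[^>]*>'
}
```

Tags that span several lines are not detected by this line-based check, so their inner lines can still show up as hits.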

Sekalf Nroc
  • I notice that someone voted down this answer. Please, could you explain? I'm happy to improve it according to your advice. – Sekalf Nroc Feb 25 '17 at 20:09
1

I was inspired by Andi's answer, but I also wanted an easy way to jump straight to the file and line, and to search instead for words that start with a capital letter:

grep -r -n ".\+[ >^=]\([A-Z][a-z]\+\b\)" .

This command recursively greps all files in a folder and puts the filename and line number in each result, like so:

./interviews/show.html.erb:17:              Your interview has been scheduled
./interviews/show.html.erb:49:              Click the button below to add this event to your calendar.
Tim Krins
1

There are some "extractors" mentioned in the custom tasks page for i18n-tasks.

They offer to automatically extract hardcoded text from your views into your YAML files, or even into the database (Lost In Translation).

Most seem to offer an interactive mode so you could use them just to identify hardcoded text, even if you don't want to auto-extract it.

I haven't tried any of them, so I can't comment on their effectiveness.

TomG
-2

Why not raise an exception in the development and test environments when a translation is missing? In those environments you can add this:

Rails.application.configure do
  config.action_view.raise_on_missing_translations = true
end

This should help. For more details read this.


Also, if you just want to find all the missing translations, the i18n-tasks gem looks promising. I haven't used it personally, but it seems like a better way to find missing translations than writing a script myself:

i18n-tasks missing

The gem also has a task to find all unused translations.

TarunJadhwani
  • This question is not about missing translations; it is about finding hardcoded text so it can be moved into the translation file. – supersize Mar 06 '17 at 12:57