I want to automate text language detection in LibreOffice Calc.

I have only four languages, and each has its own character set. Languages are rarely, if ever, mixed within a cell.

Languages are: English, Hebrew, Arabic, Russian.

As depicted in the picture below: [image]

I want to write a formula in each column C cell that indicates the language of the text in the corresponding column A cell.

I failed to identify any style indicator I can use.

I looked around and found a solution for Microsoft Office VBA.

I hope I do not need to write a macro using the API function getStringType(...).

Thanks.

Dudi Boy

1 Answer

Assuming all the text in a given cell is using the same script and that all text starts with a letter, testing the first character should be enough. This can be done with:

=UNICODE(A2)

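If the number returned is between 65 and 122, the text is in English. Strictly, A-Z is 65 to 90 and a-z is 97 to 122; codes 91 to 96 are punctuation ([ \ ] ^ _ `), which does not matter if the text starts with a letter. This range would need to be extended if you need to include characters with diacritical marks (e.g. é, à, ñ, ø).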

The same can be done for the other alphabets. A Unicode character list can be used to determine the relevant ranges. Here is one, though you can easily find others that may better suit your purpose.
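
As a rough sketch of how the pieces could be combined for column C, a nested IF over the first character's code point might look like the formula below. The ranges are my assumption that only the basic Unicode blocks matter: Cyrillic at 1024 to 1279 (U+0400 to U+04FF), Hebrew at 1424 to 1535 (U+0590 to U+05FF), and Arabic at 1536 to 1791 (U+0600 to U+06FF). Supplementary blocks and presentation forms are not covered, and the "Unknown" label is just a placeholder.

```
=IF(A2="";"";IF(AND(UNICODE(A2)>=65;UNICODE(A2)<=122);"English";IF(AND(UNICODE(A2)>=1024;UNICODE(A2)<=1279);"Russian";IF(AND(UNICODE(A2)>=1424;UNICODE(A2)<=1535);"Hebrew";IF(AND(UNICODE(A2)>=1536;UNICODE(A2)<=1791);"Arabic";"Unknown")))))
```

Put this in C2 and fill down. The leading A2="" test is there because UNICODE returns an error on an empty cell. Depending on your separator settings, Calc may show the ; argument separators as commas. A cell that starts with a digit, space, or punctuation mark will fall through to "Unknown", which is the limitation of testing only the first character.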

cybernetic.nomad