Questions tagged [hunspell]

An open source spell checker, used by OpenOffice, Firefox, Google Chrome and Mac OS X.

Hunspell is a spell checker and morphological analyzer for languages with rich morphology, complex word compounding, or complex character encoding. It was originally designed for the Hungarian language.

http://en.wikipedia.org/wiki/Hunspell

238 questions
2 votes, 0 answers

Are all C/C++ Hunspell APIs thread-unsafe?

Are all C/C++ APIs in the Hunspell library (for example spell(), suggest(), analyze()) thread-unsafe? When I use the suggest() API with a lock, I see on average 50-100 suggestion requests processed per second. Did anyone try to do benchmarks…
s.s
2 votes, 1 answer

Capitalized words with UTF-8 in Hunspell

I'm trying to use Hunspell for spellchecking in Polish. I converted the encoding of a dictionary from http://wiki.openoffice.org/wiki/Dictionaries to UTF-8. However, capitalized words (i.e. with the first letter in uppercase) with a non-Latin character…
Piotr Migdal
2 votes, 3 answers

Custom Dictionary Implementation: Can I Create My Custom affix file?

I am using this Hunspell iOS implementation, and I want to create my own custom dictionary and affix file with my choice of words. I know how to create .dic files, but I have no idea how to generate the affix file with the .aff extension for that respective…
n.by.n
2 votes, 0 answers

Apache Solr 3.5 - 4.0 HunspellStemFilter returns different values than Hunspell in the console / command line with the same dictionaries

When I use HunspellStemFilter to stem Czech text, it returns bad results. For example, the word "praha" returns "praha" and "prahnout", which is not correct. So I tried the same in my console (Hunspell command line) with exactly the same…
jrx
2 votes, 1 answer

Call a program via shell_exec with utf-8 text input

Prerequisites: hunspell and php5. Test code from bash: user@host ~/ $ echo 'sagadījās' | hunspell -d lv_LV,en_US Hunspell 1.2.14 + sagadīties - works properly. Test code (test.php): $encoding = "lv_LV.utf-8"; setlocale(LC_CTYPE, $encoding); //…
Kristaps Karlsons
1 vote, 2 answers

Hunspell: Any solid example?

I have downloaded and compiled hunspell fine. Now I want to make a test app on wxWidgets, and I started looking for an example or tutorial. So far I have found none. I can find an "example" executable but no code (maybe it is hidden somewhere I haven't looked?).…
Stefano Mtangoo
1 vote, 1 answer

COMPOUNDRULE inside SFX

I don't understand why my example doesn't work (maybe it's my mistake?). I have this aff file: FULLSTRIP COMPOUNDMIN 1 COMPOUNDRULE 1 COMPOUNDRULE AC SFX B Y 1 SFX B a b/A a. This dic file: 2 a/AB c/C. And my tests are: ac bc. The result by running…
MauroT
1 vote, 1 answer

Consider split words to be correct words

Let's assume I have 2 words in the dic file: "this" and "that". If I type the word "thisthat", it will suggest "this that" because the split suggestion option is enabled by default. If I disable that option, then hunspell will suggest some other word. I have understood…
shantanuo
1 vote, 2 answers

Using C++ / Objective-C package (Hunspell) function in Swift with char*** argument

I'm a recent Swift learner and now I'm trying to use the following package: https://github.com/aaronSig/Hunspell-iOS The package itself is written in C++ and has some headers for Objective-C. I've added a *-Bridging-Header.h file and have been able to…
naffiq
1 vote, 0 answers

Using Personal Dictionary files

The man page mentions the "Personal dictionary file" ... https://manpages.ubuntu.com/manpages/trusty/en/man4/hunspell.4.html Personal dictionaries are simple word lists. Asterisk at the first character position signs prohibition. A second…
shantanuo
1 vote, 0 answers

Detailed documentation on Hunspell is badly needed

Who has detailed documentation on the Hunspell format? Unfortunately, I found out that it is difficult to find detailed documentation for this format. This is all the more strange because Hunspell is considered to be a very popular format. Maybe…
Dziglo Dz
1 vote, 0 answers

Compiling a Hunspell dictionary

If a word ends in two identical letters (kk, pp, tt), then the word is declined one way, and if the last two letters are different, then another way. How can this be taken into account within one group (flag)? SFX m Y 6 SFX m…
Dziglo Dz
1 vote, 0 answers

Understanding hunspell stemming, why aren't plural and singular stemmed the same?

We are using hunspell in elasticsearch to help us stem irregular nouns, but it doesn't really give us the expected result. For example, "gulerod" (carrot) vs "gulerødder" (carrots) are stemmed to "gulerod" (word root) and "gulerødder" respectively. I have…
1 vote, 0 answers

Elasticsearch. Full text search for the Russian language

Right now I am using a hunspell dictionary as my search engine in ES. It behaves strangely and I don't understand why. For example, I have several entries in my index with the word "перец" in different forms: 1 ч. л. смеси перцев горошком; 2–3 колечка…
1 vote, 1 answer

R Hunspell autocorrect and stemming in pipe function for 2 columns tribble / with unnest_tokens

I am trying, so far unsuccessfully, to apply autocorrection and stemming to my data using Hunspell. The data in question is a tribble of sentences, each with an author, which are then to be evaluated via a more complex unnest function. This…
Alex_