Questions tagged [linguistics]

Linguistics is the scientific study of language and its structure, including the study of morphology, syntax, phonetics, and semantics.

Specific branches of linguistics include sociolinguistics, dialectology, psycholinguistics, computational linguistics, historical-comparative linguistics, and applied linguistics.

323 questions
9
votes
5 answers

Build a natural language model that fixes misspellings

What are books about how to build a natural language parsing program like this: input: "I got to TALL you", output: "I got to TELL you"; input: "Big RAT box", output: "Big RED box"; in: "hoo un thum zend three", out: "one thousand three". It must have the…
EugeneP
  • 11,783
  • 32
  • 96
  • 142
9
votes
4 answers

LSA - Latent Semantic Analysis - How to code it in PHP?

I would like to implement Latent Semantic Analysis (LSA) in PHP in order to find out topics/tags for texts. Here is what I think I have to do. Is this correct? How can I code it in PHP? How do I determine which words to choose? I don't want to use…
caw
  • 30,999
  • 61
  • 181
  • 291
8
votes
2 answers

Linguistic tagger incorrectly tagging as 'OtherWord'

I've been using NSLinguisticTagger with sentences and have been encountering a strange issue with sentences such as 'I am hungry' or 'I am drunk'. Whilst one would expect 'I' to be tagged as a pronoun, 'am' as a verb and 'hungry' as an adjective,…
Joshua
  • 15,200
  • 21
  • 100
  • 172
8
votes
3 answers

Word Stemming in iOS - Not working for single word

I am using NSLinguisticTagger for word stemming. I am able to get the stems of the words in a sentence, but not the stem of a single word. The following is the code I am using: NSString *stmnt = @"i waited"; …
Ab'initio
  • 5,368
  • 4
  • 28
  • 40
8
votes
9 answers

Is there a fairly simple way for a script to tell (from context) whether "her" is a possessive pronoun?

I am writing a script to reverse all genders in a piece of text, so all gendered words are swapped - "man" is swapped with "woman", "she" is swapped with "he", etc. But there is an ambiguity as to whether "her" should be replaced with "him" or…
katie
  • 91
  • 2
7
votes
1 answer

Understanding semcor corpus structure

I'm learning NLP. I am currently playing with Word Sense Disambiguation. I'm planning to use the semcor corpus as training data, but I have trouble understanding the XML structure. I tried googling but did not find any resource describing the content…
Sharmila
  • 1,637
  • 2
  • 23
  • 30
7
votes
1 answer

How can I use Python NLTK to identify collocations among single characters?

I want to use NLTK to identify collocations among particular kanji characters in Japanese and hanzi characters in Chinese. As with word collocations, some sequences of Chinese characters are far more likely than others. Example: Many words in…
WordBrewery
  • 197
  • 9
7
votes
4 answers

Justadistraction: tokenizing English without whitespaces. Murakami SheepMan

I wondered how you would go about tokenizing strings in English (or other Western languages) if whitespace were removed. The inspiration for the question is the Sheep Man character in the Murakami novel 'Dance Dance Dance'. In the novel, the Sheep…
craigs
  • 123
  • 5
7
votes
2 answers

An algorithm for declension of nouns of Polish/Slavic languages

Attention! It will help a lot to know Polish or any other natural language with strong inflection, preferably with a case system (like German, for instance), to answer this question. In particular, the Polish declension system is very similar to systems of…
GA1
  • 1,568
  • 2
  • 19
  • 30
7
votes
5 answers

Checking if a string contains an English sentence

As of right now, I decided to take a dictionary and iterate through the entire thing. Every time I see a newline, I make a string containing from that newline to the next newline, then I do string.find() to see if that English word is somewhere in…
Nicholas Pipitone
  • 4,002
  • 4
  • 24
  • 39
7
votes
2 answers

Is there software that outputs speech-to-text at the Phonological level?

Is there any software out there capable of taking audio files and outputting phonological (IPA) text? I understand much of the software out there takes it straight to a language, but is there one that is 'teachable'?
6
votes
1 answer

Converting adjectives and adverbs to their noun forms

I am experimenting with word sense disambiguation using WordNet for my project. As a part of the project, I would like to convert a derived adjective or adverb form to its root noun form. For example beautiful ==> beauty wonderful ==>…
Sharmila
  • 1,637
  • 2
  • 23
  • 30
6
votes
2 answers

Implementing Read typeclass where parsing strings includes "$"

I've been playing with Haskell for about a month. For my first "real" Haskell project I'm writing a part-of-speech tagger. As part of this project I have a type called Tag that represents a part-of-speech tag, implemented as follows: data Tag = CC…
svoisen
  • 3,292
  • 1
  • 18
  • 8
6
votes
1 answer

PHP implementation of Bayes classifier: Assign topics to texts

In my news page project, I have a database table news with the following structure: - id: [integer] unique number identifying the news entry, e.g.: *1983* - title: [string] title of the text, e.g.: *New Life in America No Longer Means a New Name* …
caw
  • 30,999
  • 61
  • 181
  • 291
6
votes
2 answers

Natural language grammar and user-entered names

Some languages, particularly Slavic languages, change the endings of people's names according to the grammatical context. (For those of you who know grammar or studied languages that do this to words, such as German or Russian, and to help with…
Owen Blacker
  • 4,117
  • 2
  • 33
  • 70