A word is a single distinct meaningful element of data. Programming questions about Microsoft Word should NOT use this tag; use [ms-word] instead. Questions about general usage of Microsoft Word are off-topic for Stack Overflow and should be asked on Super User.
Questions tagged [words]
692 questions
15
votes
4 answers
Split sentence into words but having trouble with the punctuation in C#
I have seen a few similar questions but I am trying to achieve this.
Given a string, str="The moon is our natural satellite, i.e. it rotates around the Earth!"
I want to extract the words and store them in an array.
The expected array elements…

Richard N
- 895
- 9
- 19
- 36
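One possible approach to the splitting question above, sketched in Python rather than C#. The expected array is cut off in the excerpt, so keeping "i.e" together as one token is an assumption, and `split_words` is a hypothetical helper name.

```python
import re

def split_words(sentence):
    # Match runs of word characters, allowing internal dots so that an
    # abbreviation such as "i.e." stays in one piece (an assumption).
    # Trailing punctuation is not part of any match.
    return re.findall(r"\w+(?:\.\w+)*", sentence)

text = "The moon is our natural satellite, i.e. it rotates around the Earth!"
words = split_words(text)
print(words)
```

If abbreviations should be split as well, the simpler pattern `\w+` would do that instead.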
14
votes
6 answers
Generating random words in Java?
I wrote up a program that can sort words and determine any anagrams. I want to generate an array of random strings so that I can test my method's runtime.
public static String[] generateRandomWords(int numberOfWords){
String[] randomStrings = new…

Mr_CryptoPrime
- 628
- 2
- 11
- 25
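A minimal sketch of generating random test strings, written in Python rather than Java. The length bounds, the lowercase-only alphabet, and the `seed` parameter are assumptions made for illustration; the original method body is truncated in the excerpt and is not reproduced here.

```python
import random
import string

def generate_random_words(number_of_words, min_len=3, max_len=10, seed=None):
    # A dedicated Random instance keeps runs reproducible when a seed is given.
    rng = random.Random(seed)
    words = []
    for _ in range(number_of_words):
        length = rng.randint(min_len, max_len)  # inclusive on both ends
        letters = rng.choices(string.ascii_lowercase, k=length)
        words.append("".join(letters))
    return words

sample = generate_random_words(5, seed=42)
print(sample)
```

Fixing the seed makes runtime comparisons repeatable, which tends to matter when timing an anagram sorter.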
13
votes
3 answers
Finding the most popular words in a list
I have a list of words:
words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
And I want to get a list of tuples:
[(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]
where each tuple is...
(number_of_occurrences, word)
The list should…

Maciej Ziarko
- 11,494
- 13
- 48
- 69
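For the word-count question above, a short sketch using `collections.Counter` from the standard library. The rule for ordering tied words is cut off in the excerpt, so this sketch makes no attempt to reproduce a particular tie order.

```python
from collections import Counter

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']

counts = Counter(words)
# most_common() yields (word, count) pairs, highest count first;
# swap each pair to get the (number_of_occurrences, word) shape.
popular = [(count, word) for word, count in counts.most_common()]
print(popular)
```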
13
votes
2 answers
Calculating a relative Levenshtein distance - make sense?
I am using both Daitch-Mokotoff soundexing and Damerau-Levenshtein to find out if a user entry and a value in the application are "the same".
Is Levenshtein distance supposed to be used as an absolute value? If I have a 20 letter word, a distance of…

Joseph Tura
- 6,290
- 8
- 47
- 73
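To make the idea of a "relative" distance concrete, here is a sketch that divides the edit distance by the length of the longer string. It uses plain Levenshtein distance, not the Damerau-Levenshtein variant mentioned in the question, and the normalization by maximum length is one assumed choice among several.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance, keeping one row at a time.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

def relative_distance(a, b):
    # 0.0 means identical, 1.0 means nothing in common (assumed scale).
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest

print(relative_distance("kitten", "sitting"))
```

With this scaling, a distance of 3 counts for more on a 7-letter word than on a 20-letter one, which is the effect the question is asking about.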
12
votes
4 answers
Vim: Invert string (by words)
This is my string:
"this is my sentence"
I would like to have this output:
"sentence my is this"
I would like to select a few words on a line (in a buffer) and reverse it word by word.
Can anyone help me?

Reman
- 7,931
- 11
- 55
- 97
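The transformation itself (split on whitespace, reverse, rejoin) can be sketched in Python. This is only an illustration of the logic, not a Vim command, and `reverse_words` is a hypothetical name.

```python
def reverse_words(line):
    # split() with no arguments collapses runs of whitespace,
    # so the result is rejoined with single spaces.
    return " ".join(reversed(line.split()))

print(reverse_words("this is my sentence"))
```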
12
votes
3 answers
How to generate a list of (unique) words from a text file in Ubuntu?
I have an ASCII text file. I want to generate a list of all "words" from that file using one or more Ubuntu commands. A word is defined as an alpha-num sequence between delimiters. Delimiters are by default whitespaces but I also want to experiment…

I Z
- 5,719
- 19
- 53
- 100
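The question asks for Ubuntu commands; as an alternative, the same definition of a word (an alphanumeric sequence between delimiters) can be sketched in Python. The `pattern` parameter and the sample text are assumptions for illustration, and `unique_words` is a hypothetical helper.

```python
import re

def unique_words(text, pattern=r"[A-Za-z0-9]+"):
    # Anything outside the pattern acts as a delimiter; changing the
    # pattern is one way to experiment with different delimiters.
    return sorted(set(re.findall(pattern, text)))

sample = "the cat, the dog; cat42 and the bird"
print(unique_words(sample))
```

For a real file, the text could be read with `open(path).read()` and passed to the same function.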
11
votes
20 answers
Calculating frequency of each word in a sentence in Java
I am writing a very basic Java program that calculates the frequency of each word in a sentence. So far I have managed to do this much:
import java.io.*;
class Linked {
public static void main(String args[]) throws IOException {
BufferedReader…

Sigma
- 742
- 2
- 9
- 24
10
votes
4 answers
Can I tell if a std::string represents a number using stringstream?
Apparently this is supposed to work in showing if a string is numerical, for example "12.5" == yes, "abc" == no. However, I get a no regardless of the input.
std::stringstream ss("2");
double d; ss >> d;
if(ss.good())…

alan2here
- 3,223
- 6
- 37
- 62
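One likely cause in the C++ excerpt: extracting "2" reads to the end of the stream, which sets the EOF flag, so `good()` reports false even though the conversion succeeded. The same yes/no check can be sketched in Python by attempting the conversion and catching the failure. This is an analogue, not a fix for the C++ code, and `is_number` is a hypothetical name.

```python
def is_number(text):
    # float() rejects trailing junk such as "12.5abc", but note that it
    # also accepts surrounding whitespace and values like "inf" and "nan".
    try:
        float(text)
    except ValueError:
        return False
    return True

for candidate in ["12.5", "2", "abc", "12.5abc"]:
    print(candidate, is_number(candidate))
```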
9
votes
1 answer
Postgres word_similarity not comparing words
"Returns a number that indicates how similar the first string to the most similar word of the second string. The function searches in the second string a most similar word not a most similar substring. The range of the result is zero (indicating…

Cristiano Coelho
- 1,675
- 4
- 27
- 50
9
votes
1 answer
Python regex for finding all words in a string
Hello, I am new to regex and I'm starting out with Python.
I'm stuck at extracting all words from an English sentence.
So far I have:
import re
shop="hello seattle what have you got"
regex = r'(\w*) '
list1=re.findall(regex,shop)
print list1
This…

TNT
- 480
- 1
- 4
- 11
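The pattern in the excerpt requires a trailing space after each word, so the last word is never matched. A sketch in Python 3 syntax that matches runs of word characters directly:

```python
import re

shop = "hello seattle what have you got"

# \w+ matches one or more word characters and needs no trailing space,
# so the final word is included.
words = re.findall(r"\w+", shop)
print(words)
```

Using `+` instead of `*` also avoids empty matches.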
8
votes
3 answers
Extract Images and Words with coordinates and sizes from PDF
I've read a lot about PDF extraction and libraries (such as iText), but I just haven't found a solution to extract images and text (with coordinates) from a PDF.
The task is to scan a PDF containing a product catalog and extract each image. There is an image…

Alex
- 1,237
- 3
- 18
- 29
8
votes
5 answers
Where can I find a good wordlist?
I'm looking for a wordlist file that is organized by type of word. For example, something in this format:
Nouns: {
bus
car
deck
elephant
...
}
Adjectives: {
awful
bashful
...
}
Adverbs: {
...
}
Any ideas?

qwertymk
- 34,200
- 28
- 121
- 184
7
votes
2 answers
Regular Expression - Exclude list of words for a name
I'm trying to make a regular expression that accepts this:
Only a-z, 0-9, _ chars, with a minimum length of 3
admin, static, my and www are rejected.
For the first part, I already managed to do it with:
^[a-zA-Z0-9\\_]{3,}$
But I don't know how…

Cyril N.
- 38,875
- 36
- 142
- 243
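One common way to reject specific whole words is a negative lookahead anchored at the start of the pattern. The sketch below uses Python's `re` module and keeps the character class from the excerpt; `NAME_RE` and `is_valid_name` are hypothetical names, and the regex dialect the asker uses is not stated, so lookahead support is an assumption.

```python
import re

# (?!...) fails the match if the whole input is one of the reserved words.
# "my" is listed for completeness, although the {3,} length rule
# already rejects it.
NAME_RE = re.compile(r"^(?!(?:admin|static|my|www)$)[a-zA-Z0-9_]{3,}$")

def is_valid_name(name):
    return NAME_RE.match(name) is not None

for candidate in ["alice", "admin", "administrator", "www", "ab"]:
    print(candidate, is_valid_name(candidate))
```

Because the lookahead ends with `$`, only exact matches are rejected, so a name such as "administrator" is still accepted.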
7
votes
7 answers
What's a good measure for classifying text documents?
I have written an application that measures text importance. It takes a text article, splits it into words, drops stopwords, performs stemming, and counts word-frequency and document-frequency. Word-frequency is a measure that counts how many times…

bodacydo
- 75,521
- 93
- 229
- 319
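The excerpt mentions word-frequency and document-frequency counts. One widely used way to combine the two is TF-IDF, sketched below as an illustration. It is swapped in here and not taken from the question, whose text is truncated. The toy documents and the `tf_idf` helper are assumptions, and the tokens are presumed to be already split, stopword-filtered, and stemmed.

```python
import math
from collections import Counter

def tf_idf(documents):
    # documents: a list of token lists, one list per document.
    n_docs = len(documents)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in documents:
        term_freq = Counter(tokens)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return scores

docs = [["cat", "sat", "mat"], ["cat", "ate", "fish"], ["dog", "ate", "bone"]]
weights = tf_idf(docs)
print(weights[0])
```

Terms that occur in every document get a weight of zero under this formula, which is why a term shared across documents scores lower than one unique to a single document.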
7
votes
4 answers
How do I count the total number of words in a Pandas dataframe cell and add those to a new column?
A common task in sentiment analysis is to obtain the count of words within a Pandas data frame cell and create a new column based on that count. How do I do this?

muninn
- 473
- 1
- 4
- 12
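A minimal sketch of one way to do this with pandas string methods. The column names `text` and `word_count` and the sample rows are assumptions for illustration, and words are taken to be whitespace-separated.

```python
import pandas as pd

df = pd.DataFrame({
    "text": [
        "I love this",
        "not good",
        "absolutely wonderful experience overall",
    ]
})

# str.split() turns each cell into a list of whitespace-separated tokens;
# str.len() then counts the items in each list.
df["word_count"] = df["text"].str.split().str.len()
print(df)
```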