Questions tagged [mining]

235 questions
0
votes
0 answers

R tm package - Removing whole paragraph

I am trying to remove a whole paragraph that keeps repeating in different documents. It is a disclaimer that goes at the end of an email, e.g.: "any review, retransmission dissemination or other use of this…
Sir Oliver
  • 57
  • 8
0
votes
1 answer

How to Produce a file with Results with Text Mining using K Means Clustering in R

I have a data set with a text field, and I am trying to automatically label each record as relevant or not based on that field. I have already manually labelled the data but am trying to compare the automatic labels to the manual labels to…
harriet
  • 11
  • 1
0
votes
1 answer

How to Calculate Values from Tables, Based On the RowNames in Matlab

In Matlab, I have 2 tables; one table contains all of the other table's values. The first table is named T1: freq = [2;3;4;5;6;54;3;4]; words = {'finn';'jake';'iceking';'marceline';'shelby';'bmo';'naptr';'simon'}; T1 = table(freq,... …
0
votes
0 answers

R text mining association from CSV

I am using R for text mining and I have a question. I am importing a CSV file with 4 columns. Two of the columns contain strings: one is the user input and the other is an official response, and both read like sentences. Each row is a certain…
0
votes
0 answers

TypeError: processing xml format wikipedia to text format

I am learning Python, mainly for text mining, following the guidance of (http://textminingonline.com/training-word2vec-model-on-english-wikipedia-by-gensim). I want to extract English Wikipedia text from the XML returned by the API. However, an error appears: …
Ryan
  • 1
0
votes
0 answers

How do I make a shell script start at boot in OSX through command line only?

So I have a very brief .sh script in OSX designed to start the system mining at boot (not login!). #! /bin/sh DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/" CMD=$DIR"m-minerd -o url -u user:worker -p pass" $CMD I know this works from…
velvetxcat
  • 13
  • 5
0
votes
1 answer

Rapid Miner Not Saving Crawl Web Results

I am trying to crawl the reviews for a particular movie from the IMDB website. For this I am using Crawl Web, which I have embedded inside a loop as there are 74 pages. Attached are images of the configuration. Please help; I am badly stuck on this. URL…
Kartik Solanki
  • 161
  • 1
  • 10
0
votes
1 answer

Twitter TypeError: 'int' object has no attribute '__getitem__'

I am trying to follow a tutorial on Twitter data mining; the steps I emulated are as follows: tweets_data_path = '/home/ambijat/ipythonnbs/twitter/twitter_data.txt' tweet_data = [] tweets_file = open(tweets_data_path, "r") for line in…
ambrish dhaka
  • 689
  • 7
  • 27
0
votes
0 answers

Naive Bayes Model Making No Predictions (Specificity is 0); Sentiment Analysis in R

I have been attempting to analyze a dataset (about 7000 entries) for Twitter sentiment analysis. I've been trying to use a Naive Bayes model to predict whether a tweet is negative or not. The confusion matrix shows no predictions, just the base…
gamelanguage
  • 103
  • 10
0
votes
0 answers

multiple neural network or neural network with output layer

If the outputs are correlated, which is better: one neural network with an output layer, or multiple neural networks? Take disease detection as an example: the correlation between diabetes mellitus and obesity is high, but the correlation between obesity and eye disease is low.
0
votes
2 answers

What are these mystery characters

This might not be a programming question, but I could not find any answer for it on Google. I have a text mining task and am doing data cleaning at the moment. I have come across some mystery characters far too often, which are not in readable…
Keval Shah
  • 393
  • 1
  • 4
  • 14
0
votes
0 answers

Mining multiple hashtags simultaneously in R

I'm a bit new to R, so this question might seem very basic. I would like to produce lists based on more than one hashtag in R. I designed an application which mines the tweets for #AT&T when I put it in the search box or #Verizon when I put that…
oliver_48
  • 129
  • 1
  • 4
0
votes
1 answer

JUNG : How can we do Graph Clustering based on some properties of vertex?

I have a graph database with 500+ vertices and 700+ edges. Each vertex in my graph represents an object of the class 'Paper', which has members like ID, title, year, publisher, publisherID, author, authorID etc. I want to cluster the subgraphs based on…
Parvez Kazi
  • 660
  • 5
  • 13
0
votes
1 answer

C++ BOOST library and bundled properties

I am trying to make a graph mining program using Boost, so I started with the graph structure. Here is the code that I wrote: #include #include using namespace std; using namespace boost; //vertex struct…
0
votes
0 answers

Text mining algorithm similar text

Hi, I am writing a small app using Facebook to group people by social networks. The main problem I face is grouping similar texts together. Some people list their education as "Anna University, Guindy" while others put it as "Anna University". How do I…
Niranjan Sonachalam
  • 1,545
  • 1
  • 20
  • 29