Questions tagged [data-mining]

Data mining is the process of analyzing large amounts of data in order to find patterns and commonalities.

Data mining, also known as knowledge discovery, is the process of digging through and analyzing very large data sets to extract meaning from them. Data mining tools, such as SQL Server Analysis Services, predict behaviors and future trends, allowing businesses to make proactive, knowledge-driven decisions. They can answer business questions that were traditionally too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations. The inputs to data mining algorithms are variously called cases, samples, examples, instances, events, or observations.

3094 questions
1
vote
1 answer

Storing textual information for query by location in document

I am currently attempting to store a document in a database to be able to quickly pull up what words are in a certain location. Example query: /doc1?start=2,end=5 This would retrieve the second to fifth word in that document. I am open to using any…
technoSpino
  • 510
  • 4
  • 12
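
The positional lookup described in this excerpt can be sketched in a few lines. This is only a minimal illustration, not the asker's implementation: it assumes whitespace tokenization and 1-based inclusive positions, and the helper name `words_in_range` and the sample text are hypothetical.

```python
def words_in_range(text, start, end):
    """Return the words at 1-based positions start..end (inclusive).

    Assumption: words are separated by whitespace.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    words = text.split()
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return words[start - 1:end]


# Hypothetical document content standing in for "doc1".
doc1 = "the quick brown fox jumps over the lazy dog"
print(words_in_range(doc1, 2, 5))  # ['quick', 'brown', 'fox', 'jumps']
```

A database-backed variant might store one row per (document, position, word) and index on document and position, so that a range query plays the role of the slice above.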
1
vote
3 answers

Difference between GSP and the General Apriori method

The GSP algorithm is an Apriori-based method with some enhancements. After reading several descriptions, I still could not figure out what enhancements GSP brings over the general Apriori algorithm. Is it the itemset order that is taken into…
Omar Jaafor
  • 161
  • 3
  • 12
1
vote
4 answers

DecisionTree Predict

I'm fairly new to data mining algorithms in R and I need to develop a script that helps me predict an event. So I've chosen a decision tree model to help with this task. My dataset has this structure: _____________________________ ATTR1 | ATTR2 |…
cmnlima
  • 53
  • 2
  • 9
1
vote
2 answers

ELKI tool - outlier detection results for ABOD

I am trying to run ELKI for outlier detection using the ABOD method. I see the various visualizations as results, but not the outlier scores or rankings. What should I do to get, say, the top 10 outliers using ELKI?
1
vote
1 answer

Using k-means clustering on web log data

I have a data set from a web access log file in which I'm interested in finding similar clusters. (I'm an absolute beginner at data mining.) So far I have referred to many research papers on the same problem domain. An Efficient Approach for Clustering…
Nilani Algiriyage
  • 32,876
  • 32
  • 87
  • 121
1
vote
2 answers

How can R automate regression using all covariate combinations in a set?

Suppose we have a dataset with an outcome variable y and 5 covariates. Suppose we want to fit a regression model where y is regressed on each possible combination of covariates. So, since we have 5 covariates we have 5! = 120 regression equations.…
hubert_farnsworth
  • 797
  • 2
  • 9
  • 21
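
Enumerating the candidate models in this excerpt can be sketched as follows. This is an illustration in Python rather than R, with hypothetical covariate names `x1` to `x5`; it only builds R-style formula strings and fits nothing. Because the order of covariates does not matter in a regression, the number of non-empty subsets of 5 covariates is 2^5 - 1 = 31 (32 if the intercept-only model is counted), not 5! = 120.

```python
from itertools import combinations


def covariate_subsets(covariates):
    """Yield every non-empty subset of the covariates, smallest first."""
    for size in range(1, len(covariates) + 1):
        for subset in combinations(covariates, size):
            yield subset


# Hypothetical covariate names; the outcome variable is assumed to be "y".
covariates = ["x1", "x2", "x3", "x4", "x5"]
formulas = ["y ~ " + " + ".join(s) for s in covariate_subsets(covariates)]

print(len(formulas))   # 31
print(formulas[0])     # y ~ x1
print(formulas[-1])    # y ~ x1 + x2 + x3 + x4 + x5
```

Each generated string could then be handed to whatever model-fitting routine is in use, one fit per formula.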
1
vote
2 answers

How do I update a trained model (weka.classifiers.functions.MultilayerPerceptron) with new training data in Weka?

I would like to load a model I trained earlier and then update it with new training data, but I have found this hard to accomplish. I have learnt from the Weka Wiki that classifiers implementing the weka.classifiers.UpdateableClassifier…
1
vote
2 answers

How to use decision tree classification in Matlab?

I have data in the form of rows and columns, where each row represents a record and each column represents one of its attributes. I also have the labels (classes) for those records. I know the concept of decision trees and I would like to use Matlab for classification of…
Kedar Joshi
  • 1,182
  • 1
  • 20
  • 27
1
vote
1 answer

How to let J48 fit data

I have a small question about J48 from Weka. I run this algorithm from R, using RWeka. There is probably an easy solution, but I can't seem to find it on the web. A very small example: require(RWeka) Attr1 <- as.factor(c('0302','0302','0320')) Attr2 <-…
Freddy
  • 419
  • 8
  • 16
1
vote
1 answer

Hierarchical clustering in Orange tool for data mining

I am a beginner in Python and the Orange tool for data mining. I have been trying out a few examples, which worked as expected. KMeans clustering also works fine. But when I tried the standard example of hierarchical clustering given in the documentation…
Anu145
  • 115
  • 1
  • 10
1
vote
1 answer

Words comparison in different phrases

Is there a way to tell if two words mean the same thing in two different phrases? For instance, "fat" is equivalent to "weight" in these two phrases: "I want to lose fat" and "I want to lose weight".
1
vote
3 answers

Web Scraping, data mining, data extraction

I am tasked with creating web scraping software, and I don't know where to even begin. Any help would be appreciated; even just telling me how this data is organized, or what "type" of data layout the website is using, would help, because I would…
Jay
  • 15
  • 1
  • 7
1
vote
2 answers

Error in accessing the Rapid Miner API from a Java program

I have a demo data set that I need to cluster. The utility is supposed to send the data to a Rapid Miner algorithm and then retrieve the result. I used the Rapid Miner API to access Rapid Miner's existing algorithms. However, I am facing the problem using…
Rajeev Singh
  • 504
  • 2
  • 7
  • 21
1
vote
2 answers

ad hoc query tool patterns

I'm looking for common patterns for implementing ad hoc querying capabilities graphically. I've looked at SQL query builders in Access and TOAD, but I'm interested in whether anyone is aware of products that have built such a tool against a domain-specific…
wsb3383
  • 3,841
  • 12
  • 44
  • 59
1
vote
3 answers

Remove noisy and redundant features

I have extracted features from a video sequence based on facial markers as means and standard deviations of those markers over a video sequence. They need to be classified into four different classes based on those markers. In all I have a feature…