Questions tagged [data-mining]

Data mining is the process of analyzing large amounts of data in order to find patterns and commonalities.

Data mining, also known as knowledge discovery, is the process of digging through and analyzing enormous sets of data and then extracting meaning from them. Data mining tools, such as SQL Server Analysis Services, predict behaviors and future trends, allowing businesses to make proactive, knowledge-driven decisions. They can answer business questions that were traditionally too time-consuming to resolve, scouring databases for hidden patterns and finding predictive information that experts may miss because it lies outside their expectations. The inputs to data mining algorithms are variously called cases, samples, examples, instances, events, or observations.

3094 questions
1
vote
0 answers

Error occurs while using SPADE method in R

I'm currently mining sequence patterns using the SPADE algorithm in R. SPADE is included in the arulesSequences package for R. I'm running R on CentOS 6.3 64-bit. As an exercise, I tried an example presented in…
Yuwon Lee
1
vote
1 answer

Open source concept mining tools?

Are there any open source concept mining tools available today? I have only come across tools like Leximancer, which, although it seems to fit the role, is not open source and is quite expensive for an undergraduate student. I have been unsuccessful so far…
1
vote
1 answer

Cluto sparse matrix clustering

I downloaded CLUTO and want to pass it a text file containing sparse data as input and get the clustered data as output. For example, 4 3 9 1 0.4 2 0.4 1 0.4 2 0.4 2 1.2 3 1.2 1 0.4 2 0.4 3 0.4 is my input, and I want to get the…
JoshuaJeanThree
1
vote
1 answer

assigning new observation to a cluster

Suppose I have a user/item feature matrix in Mahout, have derived the users' log-likelihood similarity, and have identified three user clusters. Now I have a new user with a set of items (same format and same set of items). How can I assign the…
user1848018
1
vote
2 answers

How to compare similarity between two concept maps

I am working on a CV shortlisting project for a company. I have a concept map for the company's documents as a whole. I have also extracted the data from each CV, so I now have a concept map for each CV. I want to compare each CV with the company concept map for…
user1743483
1
vote
2 answers

MapReduce project with data mining

I am planning to do a MapReduce project that uses Hadoop libraries and to test it on big data uploaded to AWS. I have not finalized an idea yet, but I am sure it will involve some kind of data processing, MapReduce design patterns and possibly Graph…
user1655719
1
vote
2 answers

Data mining/BI/Analytics/ML: Can a mathematically challenged person move into this field?

I have recently become interested in the field(s) of data mining and machine learning. The idea of going through huge datasets and trying to correlate hidden patterns and trends is fascinating. So far I have done the following: used Weka to load…
user88595
1
vote
0 answers

Topic Identification with WEKA

I am completely new to the field of data mining and to the WEKA tool (I just installed it today). I need to do topic identification based on short text sentences. Let's say I have several categories: politics, sports, and other. I am thinking of doing the…
Naz
1
vote
1 answer

What kind of stats and info can I get (mine) from time series data?

I have a database with time series data from different solar power plants: how strong the sun was and how much power each plant generated. The data is in 15-minute increments. I would like to use data mining to get new insights and to then…
duality_
1
vote
1 answer

Timeline Detection

I am trying to solve a timeline detection problem using text classification. As a newbie, I am confused about how to go about this. Is this a classification problem? That is, can I use the years (timelines) as outcomes and solve this as a…
stackuser
1
vote
2 answers

Designing a clustering process using RapidMiner

I haven't had much experience with machine learning or clustering, so I'm at a bit of a loss as to how to approach this problem. My data of interest consists of 4 columns, one of which is just an id. The other 3 contain numerical data, values >=…
1
vote
1 answer

Best practices for handling non-decimal variables [ACM KDD Cup 2009]

For practice, I decided to use a neural network to solve the two-class classification problem posed by the ACM Special Interest Group on Knowledge Discovery and Data Mining for its 2009 cup. The problem I have found is that the data set contains a lot of…
1
vote
2 answers

Algorithms for Mining Tuples of Data on huge sample space

I read that the Apriori algorithm is used to extract association rules from a dataset such as a set of tuples. It helps us find the most frequent 1-itemsets, 2-itemsets, and so on. My problem is a bit different. I have a dataset, which is a set of tuples,…
1
vote
1 answer

Similarity measure to identify similar log files

I want to implement a similarity function that can accurately identify similar log files. So far, I have been unable to find a suitable similarity metric for my problem. I have log files generated from several PCs (around 300 PCs), where each file…
Maggie
1
vote
5 answers

fetching information from data - data mining practical techniques

I am developing an online book store using PHP and MySQL. Now I want to implement some data mining techniques, such as recommending related books. I want to know the best resources for learning useful practical techniques to implement…
rahim asgari