Questions tagged [weka]

Weka (Waikato Environment for Knowledge Analysis) is an open source machine learning library written in Java.

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Weka is open source software issued under the GNU General Public License.

Weka's main user interface is the Explorer, but essentially the same functionality can be accessed through the component-based Knowledge Flow interface and from the command line. There is also the Experimenter, which allows the systematic comparison of the predictive performance of Weka's machine learning algorithms on a collection of datasets.

The Explorer interface features several panels providing access to the main components of the workbench:

  • The Preprocess panel has facilities for importing data from a database, a CSV file, etc., and for preprocessing this data using a so-called filtering algorithm. These filters can be used to transform the data (e.g., turning numeric attributes into discrete ones) and make it possible to delete instances and attributes according to specific criteria.
  • The Classify panel enables the user to apply classification and regression algorithms (both simply called classifiers in Weka) to the resulting dataset, to estimate the accuracy of the resulting predictive model, and to visualize erroneous predictions, ROC curves, etc., or the model itself (if the model is amenable to visualization, as a decision tree is).
  • The Associate panel provides access to association rule learners that attempt to identify all important interrelationships between attributes in the data.
  • The Cluster panel gives access to the clustering techniques in Weka, e.g., the simple k-means algorithm. There is also an implementation of the expectation maximization algorithm for learning a mixture of normal distributions.
  • The Select attributes panel provides algorithms for identifying the most predictive attributes in a dataset.
  • The Visualize panel shows a scatter plot matrix, where individual scatter plots can be selected and enlarged, and analyzed further using various selection operators.
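
As a rough illustration of the kind of transformation the Preprocess panel's filters perform (turning a numeric attribute into a discrete one), here is a minimal plain-Java sketch of equal-width binning. It does not use the Weka API and is not Weka's implementation; the class name `EqualWidthBinning`, the method `discretize`, and the sample temperature values are assumptions made up for this example.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: equal-width discretization of one numeric attribute.
class EqualWidthBinning {

    /**
     * Maps each value to a bin index in [0, bins - 1], using intervals of equal
     * width that span the observed minimum and maximum of the input.
     */
    static int[] discretize(double[] values, int bins) {
        if (bins < 1) {
            throw new IllegalArgumentException("bins must be >= 1");
        }
        int[] result = new int[values.length];
        if (values.length == 0) {
            return result;
        }

        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double width = (max - min) / bins;

        for (int i = 0; i < values.length; i++) {
            if (width == 0.0) {
                // All values are identical, so everything falls into the first bin.
                result[i] = 0;
                continue;
            }
            int bin = (int) ((values[i] - min) / width);
            // The maximum value lands exactly on the upper edge; clamp it into the last bin.
            result[i] = Math.min(bin, bins - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values for a numeric "temperature" attribute.
        double[] temperature = {64, 68, 72, 75, 81, 85};
        System.out.println(Arrays.toString(discretize(temperature, 3)));
    }
}
```

Each returned index could then be treated as a nominal label (for example "low", "medium", "high"). Inside Weka itself the same job would normally be done with one of its filters rather than hand-written code.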

Online Resources:

Use Weka in your Java Code

Weka on Sourceforge

Weka on GitHub

3033 questions
14 votes, 2 answers

Java Weka: How to specify split percentage?

I have written the code to create the model and save it. It works fine. My understanding is that data, by default, is split into 10 folds. I want the data to be split into two sets (training and testing) when I create the model. In the Weka UI, I can do it by…
rishi
13 votes, 2 answers

Weka simple K-means clustering assignments

I have what feels like a simple problem, but I can't seem to find an answer. I'm pretty new to Weka, but I feel like I've done a bit of research on this (at least read through the first couple of pages of Google results) and come up dry. I am using…
machine yearning
13 votes, 5 answers

.arff files with scikit-learn?

I would like to use an Attribute-Relation File Format file with scikit-learn for an NLP task. Is this possible? How can I use an .arff file with scikit-learn?
tumbleweed
13 votes, 3 answers

WEKA Tutorials / Examples for a Newbie

In a follow-up to this answer, I want to ask if any of you know any good (and, more importantly, easy to understand) tutorials and/or examples of data mining with the Weka toolkit. I've been very interested in data mining ever since I first heard…
Alix Axel
13 votes, 3 answers

weka.core.UnassignedDatasetException when creating an unlabeled instance

I trained an IBk classifier with some training data that I created manually as follows: ArrayList atts = new ArrayList(); ArrayList classVal = new…
TeFa
12 votes, 5 answers

How to use LibSVM with Weka in my Java code?

I want to use LibSVM classifier with Weka in my application. How can I (or where can I find good examples to) do this?
ruwanego
12 votes, 2 answers

Choose the right classification algorithm. Linear or non-linear?

I find this question a little tricky. Maybe someone knows an approach to answering it. Imagine that you have a dataset (training data) whose subject matter you don't know. Which features of the training data would you look at in order to infer…
Ahmet Keskin
12 votes, 2 answers

How to use Weka for predicting results

I'm new to Weka and I'm confused by the tool. I have a data set about fruit prices and related attributes. I'm trying to predict a specific fruit's price using the data set. Since I'm new to Weka, I couldn't figure out how to do this task. Please…
12 votes, 3 answers

Disabling Eclipse code formatting for part of a javadoc

I have a Java class for which part of the javadoc is actually generated as part of the build process: the return value of a method (a static String value) is inserted into the source file, much like $Revision: $ tags work in some version control…
Kristóf Marussy
12 votes, 3 answers

What is pruned and unpruned tree in Weka?

In the J48 decision tree example, when we say a tree is pruned or unpruned, what is the difference?
London guy
11 votes, 2 answers

ERROR While using WEKA API in java code: Class Attribute Not Set?

I'm trying to use the Weka API in my Java code. I use J48 tree classification to classify my dataset in a MySQL database, but I get this error: Trying to add database driver (JDBC): RmiJdbc.RJDriver - Error, not in CLASSPATH? Trying to add database…
Angga Raditya
11 votes, 2 answers

Web/browser-oriented open source machine learning projects?

Applying machine learning techniques, more specifically text mining techniques, in a browser environment (mainly JavaScript) or as a web application is not a very widely discussed topic. I want to build my own web application / browser extension that…
Flake
11 votes, 3 answers

Weka only changing numeric to nominal

I have a CSV file that I am importing into Weka. All variables are imported as numeric. I need to change 3 of them to nominal. However, when I apply the NumericToNominal filter, all variables change. I only want to change 3. 1) Is there a way to…
mpg
11 votes, 1 answer

How to interpret Weka Logistic Regression output?

Please help interpret results of logistic regression produced by weka.classifiers.functions.Logistic from Weka library. I use numeric data from Weka examples: @relation weather @attribute outlook {sunny, overcast, rainy} @attribute temperature…
Anton Ashanin
11 votes, 3 answers

Production architecture for big data real time machine learning application?

I'm starting to learn about big data with a strong focus on predictive analysis, and for that I have a case study I would like to implement: I have a dataset of server health information that is polled every 5 seconds. I want to show the data…
AlfaTeK