You can reproduce some WEKA GUI steps with RWeka, but the Weka command line tools are far more extensive than the functions available in RWeka. You can therefore extend RWeka with command line calls through the system command in R. Luckily, the parameters in the WEKA GUI and on the WEKA command line are the same. I recommend extracting weka-src.jar with
jar xf weka-src.jar
to read the source.
There is more than one MultiFilter class:
java weka.filters.MultiFilter --help
java weka.filters.unsupervised.attribute.PartitionedMultiFilter --help
where the second allows you to specify the attribute range. Otherwise, they seem to be identical.
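As a rough sketch of combining several filters in a single MultiFilter call: MultiFilter takes a repeated -F option, each one holding a quoted filter specification, and applies them in order. The helper name apply_two_filters, the file names, and the Resample settings are illustrative assumptions, and weka.jar is assumed to be on the CLASSPATH.

```shell
# Hypothetical helper: apply two filters in sequence with one MultiFilter call.
# Assumes weka.jar is on the CLASSPATH.
# $1 = input ARFF file, $2 = output ARFF file
apply_two_filters() {
  java weka.filters.MultiFilter \
    -F "weka.filters.unsupervised.attribute.Discretize -F -B 20 -M -1.0 -R 27" \
    -F "weka.filters.unsupervised.instance.Resample -S 1 -Z 50" \
    -i "$1" -o "$2"
}

# Usage (not executed here):
#   apply_two_filters yourFile.arff filtered.arff
```

Each -F value is the same string the GUI shows for that filter, so it can usually be copied over as-is, provided it stays inside one pair of quotes.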
Then you can run your first discretize filter with
java weka.filters.unsupervised.attribute.Discretize -F -B 20 -M -1.0 -R 27 -i yourFile.arff
and then direct its output to the next Discretize, and eventually to NumericTransform and Resample. The command line documents the options of each command, for example:
java weka.filters.unsupervised.attribute.NumericTransform --help
java weka.filters.unsupervised.attribute.Remove --help
java weka.filters.unsupervised.instance.Resample --help
java weka.filters.supervised.instance.Resample --help
You can look up these class names in the directory structure of the extracted source or in the index.
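The chain described above (Discretize, a second Discretize, NumericTransform, Resample) could be sketched as a shell pipeline. This relies on the general filter behaviour that a filter started without -i reads ARFF from stdin and without -o writes ARFF to stdout. The helper name filter_chain and every option value after the first Discretize are placeholders, not settings taken from a real run, and weka.jar is assumed to be on the CLASSPATH.

```shell
# Sketch: Discretize -> Discretize -> NumericTransform -> Resample.
# Assumes weka.jar is on the CLASSPATH.
# Only the first stage names an input file; the later stages read the
# previous stage's ARFF output from stdin.
# $1 = input ARFF file; the final ARFF goes to stdout.
filter_chain() {
  java weka.filters.unsupervised.attribute.Discretize -F -B 20 -M -1.0 -R 27 -i "$1" |
    java weka.filters.unsupervised.attribute.Discretize -B 10 -R first |
    java weka.filters.unsupervised.attribute.NumericTransform -R 2 -C java.lang.Math -M abs |
    java weka.filters.unsupervised.instance.Resample -S 1 -Z 50
}

# Usage (not executed here):
#   filter_chain yourFile.arff > result.arff
```

Piping avoids intermediate files, at the cost of starting one JVM per stage. Run each filter with --help first to check which options suit your data.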
RWeka
The RWeka package provides the functions
- Discretize()
- Normalize()
- make_Weka_filter() to create R interfaces to Weka filters
but there are no built-in NumericTransform or Remove functions. You have to pass the options as function arguments, so you cannot simply copy-paste the command from the WEKA GUI. One workaround is to use the system command and execute the Java command line with it, without having to learn RWeka itself. There seems to be some gap between the WEKA GUI and the R package.
Running Weka on the Command Line
Even though these commands are missing from the RWeka interface, you can still run them through the system command in R. For example, you can run the Remove command
java weka.filters.unsupervised.attribute.Remove -i yourfile.arff
from R like this:
system("java weka.filters.unsupervised.attribute.Remove -i yourfile.arff")
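In practice Remove usually needs -R to name the attribute indices to drop, and -o to write the result to a file instead of stdout. A minimal sketch, assuming weka.jar is on the CLASSPATH; the helper name remove_attributes and the file names are made up for illustration:

```shell
# Hypothetical helper: drop attributes by index and write the result to a file.
# Assumes weka.jar is on the CLASSPATH.
# $1 = input ARFF, $2 = output ARFF, $3 = attribute range to remove (e.g. "1,3-4")
remove_attributes() {
  java weka.filters.unsupervised.attribute.Remove -R "$3" -i "$1" -o "$2"
}

# Usage (not executed here):
#   remove_attributes yourfile.arff reduced.arff 1,3-4
```

The same java command line can be handed to system() in R as a single string.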
With the following setup, Discretize can be run as shown.
$ cat $WEKAINSTALL/data/iris.arff |tail
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
%
%
%
$ java weka.filters.unsupervised.attribute.Discretize -i $WEKAINSTALL/data/iris.arff |tail
'\'(6.46-6.82]\'','\'(2.96-3.2]\'','\'(5.13-5.72]\'','\'(2.26-inf)\'',Iris-virginica
'\'(6.82-7.18]\'','\'(2.96-3.2]\'','\'(4.54-5.13]\'','\'(2.26-inf)\'',Iris-virginica
'\'(5.74-6.1]\'','\'(2.48-2.72]\'','\'(4.54-5.13]\'','\'(1.78-2.02]\'',Iris-virginica
'\'(6.46-6.82]\'','\'(2.96-3.2]\'','\'(5.72-6.31]\'','\'(2.26-inf)\'',Iris-virginica
'\'(6.46-6.82]\'','\'(3.2-3.44]\'','\'(5.13-5.72]\'','\'(2.26-inf)\'',Iris-virginica
'\'(6.46-6.82]\'','\'(2.96-3.2]\'','\'(5.13-5.72]\'','\'(2.26-inf)\'',Iris-virginica
'\'(6.1-6.46]\'','\'(2.48-2.72]\'','\'(4.54-5.13]\'','\'(1.78-2.02]\'',Iris-virginica
'\'(6.46-6.82]\'','\'(2.96-3.2]\'','\'(5.13-5.72]\'','\'(1.78-2.02]\'',Iris-virginica
'\'(6.1-6.46]\'','\'(3.2-3.44]\'','\'(5.13-5.72]\'','\'(2.26-inf)\'',Iris-virginica
'\'(5.74-6.1]\'','\'(2.96-3.2]\'','\'(4.54-5.13]\'','\'(1.78-2.02]\'',Iris-virginica
$
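The session above only works if Java can find the Weka classes. One way to arrange that, sketched under the assumption that weka.jar sits directly under $WEKAINSTALL (the default path below is a placeholder, not the setup actually used here):

```shell
# Placeholder location; point WEKAINSTALL at your own unpacked Weka directory.
WEKAINSTALL="${WEKAINSTALL:-/opt/weka}"

# Put weka.jar first on the classpath and keep any entries already present.
CLASSPATH="$WEKAINSTALL/weka.jar${CLASSPATH:+:$CLASSPATH}"
export WEKAINSTALL CLASSPATH

# After this, commands such as
#   java weka.filters.unsupervised.attribute.Discretize --help
# should resolve the Weka classes.
```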
Some useful information
Use Weka in your Java code
Download the Linux developer version, unzip it, and read the README, which has many good examples of using WEKA, particularly on the command line.
Wiki here
Maybe irrelevant: Generating source code from WEKA classes