Numerical classifier (?) Weka / R

Question

I have a numeric dataset (as a database table) with an "n to m" relation. For example:

A | B
-----
1 | 1
1 | 2
1 | 9
4 | 2
7 | 8
7 | 11

I would like to "train" a classifier (using Weka?) to tell me which A is the most likely for a given set of B's. For example, [1,2,8] should give me something along the lines of {1:2, 4:1, 7:1}, that is: "The set was found in A N times".

I can of course implement this in SQL and my favourite scripting language. However, I wanted to know whether there is a more, I guess, "formal" way to do it. I have Weka running and my database connected, but I am lost as to which classifier to pick (many refuse to work). I would also greatly appreciate some links to theoretical background (for instance, what the thing I want to do is called and what improvements exist).

I would also appreciate the "R" way if you are more versed in R. (However, I am interested not only in solving my problem but also in understanding what kind of problem this is, which I hope a Weka solution would show me.) I am sorry if this is a duplicate question, but sadly I lack the vocabulary to specify what I am looking for here. Visualization and other output to learn from and study would be great, though.

I thank you kindly in advance, just for reading and hope you can help.

I may be able to reduce the "m to n" nature of my data by removing duplicate B's; however, this should be seen as optional. I am of course also able to reorganize my data. – Jonny H. Nov 18 '12 at 23:04

score 1 · Accepted Answer · answered Nov 18 '12 at 23:05


In R you can do as follows:

foo <- data.frame(A = c(1, 1, 1, 4, 7, 7), B = c(1, 2, 9, 2, 8, 11))
foo
#   A  B
# 1 1  1
# 2 1  2
# 3 1  9
# 4 4  2
# 5 7  8
# 6 7 11

# Keep the rows whose B is in the query set, then count how often each A occurs
table(foo[foo$B %in% c(1,2,8),]$A)

# 1 4 7 
# 2 1 1
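
If the table can contain duplicate (A, B) rows, as mentioned in the question, the plain count above would count them more than once. A minimal sketch of one way around that follows; the helper name `count_matches` is made up for illustration and is not part of R or any package. It drops duplicate rows before counting and sorts the result so the most frequent A comes first.

```r
# Toy data from the question: each row links one A to one B.
foo <- data.frame(A = c(1, 1, 1, 4, 7, 7), B = c(1, 2, 9, 2, 8, 11))

# Hypothetical helper (an illustration, not a library function):
# for a query set of B values, count how many distinct matching B's each A has.
count_matches <- function(df, query) {
  # Keep matching rows and drop duplicate (A, B) pairs
  hits <- unique(df[df$B %in% query, c("A", "B")])
  # Tabulate per A, largest count first
  sort(table(hits$A), decreasing = TRUE)
}

res <- count_matches(foo, c(1, 2, 8))
print(res)
```

For the query c(1, 2, 8) this gives a count of 2 for A = 1 and a count of 1 each for A = 4 and A = 7, matching the {1:2, 4:1, 7:1} result asked for in the question.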

Do you need something more?


Ali


Thanks for your answer! Could you tell me what this is called? I was especially interested in Weka in order to find out what one would call this, that is, what kind of "classification" it is (I should have made this clearer). Otherwise, thanks for your R solution; I will accept it as the solution if nobody can answer my question in Weka or provide more insight into the topic. – Jonny H. Nov 18 '12 at 23:24
These are just very basic operators/functions of R. I have not used a real "classifier", nor does your question require a classifier (something like an SVM) to answer. I think, but cannot guarantee, that R is more flexible than Weka because it provides numerous packages that a developer can use, although Weka is really great. – Ali Nov 18 '12 at 23:26
Yes, I suspected this, thank you. That's why a Weka solution (if it exists) would tell me more about the nature of the problem and whether it is a classification problem or not. But I do appreciate your R solution and will probably use it if I cannot find out more! – Jonny H. Nov 18 '12 at 23:34
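
On the question of what kind of problem this is: one way to look at the R answer, offered here only as a sketch and not as anything the answer itself states, is as a cross-tabulation of A against B followed by a sum over the queried B columns. The variable names below (`inc`, `query`, `cols`, `scores`) are chosen for illustration.

```r
# Toy data from the question.
foo <- data.frame(A = c(1, 1, 1, 4, 7, 7), B = c(1, 2, 9, 2, 8, 11))

# Cross-tabulate A against B: one row per A, one column per B,
# each cell holding how often that (A, B) pair occurs.
inc <- table(A = foo$A, B = foo$B)

# Query set of B values; keep only those that actually occur as columns.
query <- c(1, 2, 8)
cols <- intersect(colnames(inc), as.character(query))

# Summing the selected columns per row gives the match count for each A.
scores <- rowSums(inc[, cols, drop = FALSE])
print(scores)
```

This yields the same counts as the one-liner in the answer (2 for A = 1, 1 for A = 4, 1 for A = 7), which suggests the task is a lookup-and-count over a table rather than something that needs a trained model.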
