I have a numeric dataset (stored as a database table) with an n-to-m relation between two columns. For example:
A | B
-----
1 | 1
1 | 2
1 | 9
4 | 2
7 | 8
7 | 11
I would like to "train" a classifier (using Weka?) to tell me which A is the most likely for a given set of B's. As an example: [1,2,8] should tell me something along the lines of {1:2, 4:1, 7:1}, that is, for each A, "N values from the set were found with this A".
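To make the expected result concrete, here is a minimal sketch of the plain counting approach, assuming Python and the example table hard-coded as a list of (A, B) pairs. The function name `count_matches` is only illustrative, and this is not a trained classifier, just the baseline I would like to improve on:

```python
from collections import Counter

# Rows of the (A, B) example table, hard-coded for illustration.
rows = [(1, 1), (1, 2), (1, 9), (4, 2), (7, 8), (7, 11)]

def count_matches(rows, query):
    """Count, for each A, how many B values from `query` it is paired with."""
    wanted = set(query)
    return Counter(a for a, b in rows if b in wanted)

result = count_matches(rows, [1, 2, 8])
print(dict(result))           # {1: 2, 4: 1, 7: 1}
print(result.most_common(1))  # [(1, 2)] -> A=1 is the most likely
```

Here A=1 wins because it is paired with two of the three queried B values (1 and 2), while A=4 and A=7 each match only one.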
I can of course implement this in SQL and my favourite scripting language. However, I wanted to know if there is a more (I guess) "formal" way to do it. I have Weka running and my database connected, but I am lost as to which classifier to choose (many refuse to run), and I would also greatly appreciate some links to theoretical background (for instance, what the thing I am trying to do is called and what improvements exist).
I would also appreciate the "R" way if you are better versed in R. (However, I am interested not only in solving my problem but also in understanding what kind of problem this is, which I would get from a Weka solution.) I am sorry if this is in any way a duplicate question, but sadly I lack the information needed to specify what I am looking for here. Visualization and other output to learn from and study would be great, though.
Thank you kindly in advance just for reading; I hope you can help.