
How would one design a neural network for a recommendation engine? I assume each user would require their own network, but how would you design the inputs and outputs for recommending an item in a database? Are there any good tutorials?

Edit: I was thinking more about how one would design the network, as in how many input neurons to use and how the output neurons point to a record in a database. Would you have, say, 6 output neurons, convert their outputs to an integer (anything from 0 to 63), and treat that as the ID of the record in the database? Is that how people do it?

Louis
  • Concerning your edit: No, you're missing the point. The point of an NN is classification based on statistical properties. They are NOT Bayesian in nature, but you can think of it that way if it helps you: "when I have input A of a certain value, input B of a certain value, input C of a certain value... what is the likelihood that this specific input set belongs to a certain group?" (more accurately, you ask to which group it belongs). That is the purpose of an NN. You can flex this model to be used for more than simple classification, but at its heart, that's what it does. – San Jacinto Feb 23 '10 at 12:58
  • Thanks, your answer explains clearly how to input the data but not what the outputs should/would look like and how they mean anything. That is where I'm confused. – Louis Feb 23 '10 at 13:13
  • The outputs are going to be numeric, but those numbers must have meaning. "What they mean" is up to YOU, the designer of the network, but it's not going to work to simply map them to a database ID (which is a label with no mathematical meaning) in the way you want. The outputs cannot be directly converted to a label. The outputs are the values of a statistical model. It would be like trying to predict a quarterback's performance from the number on his jersey. Unless the jersey number is assigned on specific, observable criteria about the player, any such prediction would be meaningless. – San Jacinto Feb 23 '10 at 15:51
  • Yeah I thought my guess was way off. – Louis Feb 23 '10 at 22:11

2 Answers


I would suggest looking into neural networks that use unsupervised learning, such as self-organising maps. It's very difficult to use ordinary supervised neural networks to do what you want unless you can label the training data very precisely. Self-organising maps don't have this problem because the network learns the classification groups on its own.
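To make the idea concrete, here is a minimal sketch of a self-organising map in plain Python. It is only an illustration, not the method from any particular paper: the song names, the three features per song (say tempo, acousticness, vocals), the grid size, and the learning-rate schedule are all assumptions made up for the example.

```python
import math
import random

def best_matching_unit(weights, sample):
    """Return the grid cell whose weight vector is closest to the sample."""
    return min(weights, key=lambda cell: sum(
        (a - b) ** 2 for a, b in zip(weights[cell], sample)))

def train_som(samples, grid_w=3, grid_h=3, epochs=200, lr0=0.5, seed=0):
    """Train a tiny self-organising map; returns {(x, y): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # One weight vector per grid cell, initialised randomly in [0, 1).
    weights = {(x, y): [rng.random() for _ in range(dim)]
               for x in range(grid_w) for y in range(grid_h)}
    sigma0 = max(grid_w, grid_h) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink over time.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.5)
        for sample in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, sample)
            for cell, w in weights.items():
                # Cells near the winner on the grid are pulled harder.
                grid_dist_sq = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist_sq / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (sample[i] - w[i])
    return weights

# Hypothetical items, each described by three made-up features in [0, 1].
songs = {
    "song_a": [0.90, 0.10, 0.80],
    "song_b": [0.85, 0.15, 0.90],
    "song_c": [0.10, 0.90, 0.20],
    "song_d": [0.15, 0.80, 0.10],
}
som = train_som(list(songs.values()))
# No labels were supplied; each song simply lands on some grid cell.
cells = {name: best_matching_unit(som, vec) for name, vec in songs.items()}
print(cells)
```

Items that end up on the same or neighbouring cells can be treated as similar, so a recommendation could be "other items near the cells of things this user already likes". How you turn cell proximity into a recommendation is a design choice this sketch does not settle.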

Have a look at this paper, which describes a music recommendation system: http://www.springerlink.com/content/xhcyn5rj35cvncvf/

Many more papers have been written on the topic, which you can find via Google Scholar: http://www.google.com.au/search?q=%09+A+Self-Organizing+Map+Based+Knowledge+Discovery+for+Music+Recommendation+Systems+&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a&safe=active

Charles Ma

First you have to decide what exactly you are recommending and under what circumstances. There are many things to take into account. Are you going to consider "other users who bought X also bought Y"? Are you going to recommend only items that are similar in nature to each other? Are you recommending items that have a this-one-is-more-useful-with-that-one type of relationship?
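As an illustration of the first of those questions, "other users who bought X also bought Y" can be answered with plain counting before any neural network is involved. The sketch below is not a neural network; the baskets and item names are invented for the example.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        # set() ignores duplicate items inside a single basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts

def also_bought(counts, item, top_n=3):
    """Return the items most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]

# Hypothetical purchase history, one list per order.
baskets = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "sleeping_bag"],
    ["stove", "lantern"],
]
counts = co_purchase_counts(baskets)
print(also_bought(counts, "tent", top_n=1))  # ['stove']
```

Counts like these could also serve as inputs to a network later, but whether that helps depends on which kind of recommendation you settle on.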

I'm sure there are many more decisions, and each of them has its own goal in mind. It would be very difficult to train one giant network to handle all of the above.

Neural networks all boil down to the same thing. You have a given set of inputs. You have a network topology. You have an activation function. You have weights on the nodes' inputs. You have outputs, and you have a means to measure and correct error. Each type of neural network might have its own way of doing each of those things, but they are present all the time (to my limited knowledge). Then, you train the network by feeding in a series of input sets that have known output results. You run this training set as many times as you'd like without over- or under-training (how much is enough is as much your guess as it is the next guy's), and then you're ready to roll.
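Those pieces can be seen in the smallest possible case: a single sigmoid neuron trained by gradient descent. This is a sketch under assumptions, not a recommended architecture. The three input features (same genre as the user's favourites, normalised price, popularity), the training examples, and the candidate items are all hypothetical. Note that the output is a score between 0 and 1 for one (user, item) pair, not a database ID; you score each candidate item and sort.

```python
import math
import random

def sigmoid(z):
    """Activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Output of the neuron: weighted sum of inputs passed through sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train(examples, epochs=2000, lr=0.5, seed=0):
    """Fit weights on (inputs, target) pairs with known outputs."""
    rng = random.Random(seed)
    n = len(examples[0][0])
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Measure the error, then nudge each weight to reduce it.
            error = predict(weights, bias, inputs) - target
            for i in range(n):
                weights[i] -= lr * error * inputs[i]
            bias -= lr * error
    return weights, bias

# Hypothetical training set: [same_genre, price, popularity] -> liked (1) or not (0).
examples = [
    ([1.0, 0.2, 0.5], 1),
    ([1.0, 0.9, 0.1], 1),
    ([0.0, 0.3, 0.9], 0),
    ([0.0, 0.8, 0.4], 0),
]
weights, bias = train(examples)

# Score unseen candidate items and rank them; the IDs are only dictionary keys.
candidates = {"item_x": [1.0, 0.5, 0.5], "item_y": [0.0, 0.5, 0.5]}
ranked = sorted(candidates,
                key=lambda name: predict(weights, bias, candidates[name]),
                reverse=True)
print(ranked)
```

A real network would have more layers and far more data, but the roles stay the same: inputs, weights, an activation function, an output with a meaning you chose, and an error signal used to correct the weights.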

Essentially, your input set can be described as a certain set of qualities that you believe have relevance to the underlying function at hand (for instance: precipitation, humidity, temperature, illness, age, location, cost, skill, time of day, day of week, work status, and gender may all have an important role in deciding whether or not a person will go golfing on a given day). You must therefore decide what exactly you are trying to recommend and under what conditions. Your network inputs can be boolean in nature (0.0 being false and 1.0 being true, for instance) or mapped in a pseudo-continuous space (where 0.0 may mean not at all, .45 means somewhat, .8 means likely, and 1.0 means yes). This second option may give you the tools to map a confidence level for a certain input, or simply a mathematical calculation you believe is relevant.
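For the golfing example, turning a raw record into network inputs might look like the sketch below. The field names and the value ranges used for scaling are assumptions chosen for illustration; pick whatever ranges fit your own data.

```python
def scale(value, low, high):
    """Map value linearly from [low, high] to [0.0, 1.0], clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))

def encode_golf_inputs(record):
    """Turn one raw record into a list of floats in [0.0, 1.0]."""
    return [
        1.0 if record["raining"] else 0.0,        # boolean input
        scale(record["humidity_pct"], 0, 100),    # pseudo-continuous input
        scale(record["temperature_c"], -10, 40),  # assumed plausible range
        1.0 if record["is_weekend"] else 0.0,     # boolean input
        scale(record["skill_rating"], 0, 10),     # assumed 0-10 rating
    ]

# Hypothetical record for one person on one day.
record = {
    "raining": False,
    "humidity_pct": 45,
    "temperature_c": 15,
    "is_weekend": True,
    "skill_rating": 8,
}
print(encode_golf_inputs(record))  # [0.0, 0.45, 0.5, 1.0, 0.8]
```

Keeping every input in the same 0.0 to 1.0 range stops one feature from dominating the others simply because its raw numbers are larger.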

Hope this helped. You didn't give much to go on :)

San Jacinto