
My aim is to write an intelligent chatbot. It should store what it knows in a way similar to the human brain.

That is why I am looking for a file type that stores data as a net of connected keywords. What file type or database system could do this?

Further information:

The input will come from Wikipedia, Google searches, and facts taught by a human during a conversation.

I could give more specific details about my requirements and wishes, but I don't know whether any approach to this even exists. Maybe there are specifications more useful than my own thoughts.

Just one example: the connections should have weights. Querying the information net should increase the weights of the connections that were used.

What I expect is that the chatbot could form real associations (or ideas) using the data net.

danijar
  • The idea of a "net of connected keywords" seems to map pretty directly to the computing concept of a graph. Graphs' edges often have weights associated with them. There are a few popular ways to implement them, all of which can be easily found online. Is there something I'm missing here? – prelic Jan 19 '12 at 18:33
  • @prelic: Thanks a lot. That moved my research forward! But what about speed? I want to store tons of data. – danijar Jan 19 '12 at 18:36
  • It could indeed! For a given set of nodes of a graph, each member of that set may have an edge to each other node of that set, including itself, if you so allow. Both adjacency list and adjacency matrix representations will allow a node to connect to all other nodes, including itself. Of course, constraints can be added to limit valid edges according to a set of rules. – prelic Jan 19 '12 at 18:41
  • Thanks. But what about speed? I want to store tons of data. BTW: Post this as an answer and it will be accepted. – danijar Jan 19 '12 at 18:42

1 Answer


As an extension to my comments above:

A graph is definitely the way to go in terms of data representation; it maps perfectly to your problem description.
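To make that concrete, here is a minimal Python sketch of a weighted keyword graph kept as an adjacency list, where every lookup reinforces the edges it used (the behaviour asked for in the question). The class name `AssociationGraph`, the method names, the sample keywords, and the `boost` value are all made up for illustration; this is one possible shape under those assumptions, not a recommended design.

```python
from collections import defaultdict


class AssociationGraph:
    """Weighted, undirected keyword graph stored as an adjacency list.

    Hypothetical illustration: names and default values are assumptions.
    """

    def __init__(self):
        # keyword -> {neighbouring keyword -> weight}
        self.edges = defaultdict(dict)

    def connect(self, a, b, weight=1.0):
        # Store the edge in both directions so lookups work from either end.
        self.edges[a][b] = weight
        self.edges[b][a] = weight

    def associations(self, keyword, limit=3, boost=0.1):
        """Return the strongest neighbours of `keyword` and reinforce those edges."""
        neighbours = self.edges.get(keyword, {})
        ranked = sorted(neighbours, key=neighbours.get, reverse=True)[:limit]
        for other in ranked:
            # Using a connection makes it stronger, as the question requires.
            self.edges[keyword][other] += boost
            self.edges[other][keyword] += boost
        return ranked


graph = AssociationGraph()
graph.connect("apple", "fruit", 2.0)
graph.connect("apple", "computer", 1.0)
graph.connect("fruit", "banana", 1.5)

# Ask for the single strongest association of "apple".
top = graph.associations("apple", limit=1)
```

An adjacency list is used here rather than an adjacency matrix because a keyword net is likely to be sparse: most keywords connect to only a small fraction of all the others.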

What you seem to be asking is how to store this information persistently on disk (rather than in memory). That depends entirely on your constraints. A graph database is geared more toward storing graphs than, say, relational or hierarchical databases, and would perform far better than pushing your adjacency matrix or list to a flat file. Here's the Wikipedia entry:

http://en.wikipedia.org/wiki/Graph_database

Now, there is the issue of what happens when you have so many nodes and edges that you can't load them all into memory at once. Unfortunately, if you have nodes that are connected to every other node, that can be a problem, because you won't be able to load the complete, valid graph. I can't answer that right now, but I'm sure there are paradigms that address this problem. I will update my answer after some digging.

Edit: You'll probably have to consult someone who knows more about graph databases. It's possible that there are ways to load chunks of the graph from the database without loading the whole thing. If that is your issue, you may want to rephrase it as a more specific question about working with large graphs stored in graph databases and post it again, tagged with graphs, databases, algorithms, and the like.
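To illustrate what "loading chunks" could look like, here is a rough Python sketch that keeps the edges on disk and fetches only the neighbourhood of one node per query. Note that it uses a plain SQLite edge table from the standard library, not a graph database, purely because it runs without extra software; the table name `edge`, its columns, the helper functions, and the sample data are assumptions made for this example.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edge (src TEXT, dst TEXT, weight REAL, PRIMARY KEY (src, dst))"
)
conn.executemany(
    "INSERT INTO edge VALUES (?, ?, ?)",
    [("apple", "fruit", 2.0), ("apple", "computer", 1.0), ("fruit", "banana", 1.5)],
)
conn.commit()


def neighbours(conn, keyword, limit=10):
    """Load only the strongest outgoing edges of one node, not the whole graph."""
    rows = conn.execute(
        "SELECT dst, weight FROM edge WHERE src = ? ORDER BY weight DESC LIMIT ?",
        (keyword, limit),
    )
    return rows.fetchall()


def reinforce(conn, src, dst, boost=0.1):
    """Increase the weight of one edge in place on disk (hypothetical helper)."""
    conn.execute(
        "UPDATE edge SET weight = weight + ? WHERE src = ? AND dst = ?",
        (boost, src, dst),
    )
    conn.commit()


result = neighbours(conn, "apple", limit=1)
reinforce(conn, "apple", "fruit")
```

Edges are stored as directed rows here, so an undirected connection would need two rows. The `LIMIT` clause is what caps memory use for a node that is connected to very many others: only the strongest few edges are ever read.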

prelic