
I'm trying to infer a Markov chain from a process that I can only simulate. The final graph will contain a very large number of states (vertices), but I do not know that number in advance.

Right now I have the following:

  • My simulation outputs a boost::dynamic_bitset containing 112 bits every timestep.
  • I use the bitset as a key in a Google Sparse Hash to map to an integer value that can be used as an index to the adjacency matrix I want to construct.
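The bitset-to-index step described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `std::bitset<112>` stands in for `boost::dynamic_bitset`, `std::unordered_map` stands in for Google Sparse Hash, and `StateIndex`, `id_of`, and `state_of` are hypothetical names that do not come from either library. It needs C++17.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Assumption: a fixed-width std::bitset replaces boost::dynamic_bitset here,
// because the question says every state has exactly 112 bits.
using State = std::bitset<112>;

// Hypothetical helper: hands out consecutive integer indices to states
// in the order they are first seen.
class StateIndex {
public:
    // Return the index of `s`, assigning the next free index on first sight.
    std::size_t id_of(const State& s) {
        auto [it, inserted] = ids_.try_emplace(s, states_.size());
        if (inserted) states_.push_back(s);
        return it->second;
    }

    // Reverse lookup: which state does a given row/column number stand for?
    const State& state_of(std::size_t id) const {
        assert(id < states_.size());
        return states_[id];
    }

    // Number of distinct states seen so far.
    std::size_t size() const { return states_.size(); }

private:
    std::unordered_map<State, std::size_t> ids_;  // state -> index
    std::vector<State> states_;                   // index -> state
};
```

Because indices are handed out consecutively, `size()` is always the current number of rows and columns the matrix needs, even though the final count is unknown up front.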

Now I need a good/fast matrix or two-dimensional array to store integers. It should:

  • Use the integer values I stored in the Google Sparse Hash as row/column numbers. (E.g., I want to access or change a stored integer by writing something like matrix(3,4) = 3.)
  • Add rows and columns on the fly, because I do not know in advance how many I will need.
  • Probably be a sparse implementation, since most values will be 0.
  • Be very fast, since the number of rows and columns will be very large.
  • Be simple to use. I don't need a lot of mathematical operations; it should just be a fast and simple way to store and access integers.

I hope my question is clear enough.

jlmr

1 Answer


I'd recommend the Boost uBLAS sparse matrices (http://www.boost.org/doc/libs/1_54_0/libs/numeric/ublas/doc/matrix_sparse.htm). There are several different sparse storage implementations, so reading the documentation can help you choose the type that's right for your purpose. (TL;DR: a given sparse matrix type tends to offer either fast retrieval or fast insertion.)
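To make the insertion-versus-retrieval trade-off concrete, here is a minimal standard-library sketch. It does not use uBLAS. The ordered map plays the role of an insertion-friendly storage and the compressed sparse row (CSR) arrays play the role of a retrieval-friendly one, which is roughly the distinction between uBLAS's `mapped_matrix` and `compressed_matrix`. The names `CoordMap`, `Csr`, and `compress` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Insertion-friendly form: an ordered map keyed by (row, col).
// New entries can be added in any order at O(log n) cost each.
using CoordMap = std::map<std::pair<std::size_t, std::size_t>, int>;

// Retrieval-friendly form: compressed sparse row (CSR) arrays.
// Lookups are a binary search within one row, but inserting a new
// entry would mean shifting the arrays.
struct Csr {
    std::vector<std::size_t> row_start;  // rows + 1 offsets into cols/vals
    std::vector<std::size_t> cols;       // column of each stored value
    std::vector<int> vals;               // the stored values, row by row

    // Return the stored value, or 0 for an entry that is not stored.
    int at(std::size_t row, std::size_t col) const {
        assert(row + 1 < row_start.size());
        auto first = cols.begin() + row_start[row];
        auto last = cols.begin() + row_start[row + 1];
        auto it = std::lower_bound(first, last, col);
        if (it == last || *it != col) return 0;
        return vals[it - cols.begin()];
    }
};

// Convert once the set of entries is final. std::map iterates in
// (row, col) order, so the output arrays come out already sorted.
Csr compress(const CoordMap& m, std::size_t rows) {
    Csr out;
    out.row_start.assign(rows + 1, 0);
    for (const auto& [key, value] : m) {
        assert(key.first < rows);
        ++out.row_start[key.first + 1];  // count the entries in each row
        out.cols.push_back(key.second);
        out.vals.push_back(value);
    }
    // Prefix sum turns per-row counts into start offsets.
    for (std::size_t r = 0; r < rows; ++r) {
        out.row_start[r + 1] += out.row_start[r];
    }
    return out;
}
```

For a simulation that keeps discovering new states, the map form fits the collection phase; a compressed form only pays off once the entries stop changing.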

Sven
  • I tried the Boost sparse matrix. However, it seems that `resize()` with data preservation hasn't been implemented yet in the Boost library. (Or I can't get it to work, which is always a possibility to consider.) Without a data-preserving `resize()` I'm in trouble, because the number of entries I need is unpredictable. – jlmr Oct 29 '13 at 14:56
  • Ask on the mailing list. The developers can and do answer questions; they're a really polite group of people. Algorithmically, you could use an unordered_map of unordered_maps to store your data: a top-level unordered_map to model the first dimension, and a nested unordered_map per row to store your integers. (YMMV, spontaneous thoughts... ;) – Sven Oct 29 '13 at 15:07
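The unordered_map of unordered_maps idea from the last comment could look roughly like the sketch below. It is an untuned illustration, not a benchmarked solution; `SparseCounts`, `get`, and `stored` are hypothetical names. There is no `resize()` at all: rows and columns come into existence the first time they are written.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical sparse integer matrix: outer map = rows, inner map = columns.
class SparseCounts {
public:
    // Read/write access in the style of matrix(3,4) = 3.
    // Note: this creates a zero entry if (row, col) was not stored yet.
    int& operator()(std::size_t row, std::size_t col) {
        return rows_[row][col];
    }

    // Read-only lookup that never inserts; missing entries read as 0.
    int get(std::size_t row, std::size_t col) const {
        auto r = rows_.find(row);
        if (r == rows_.end()) return 0;
        auto c = r->second.find(col);
        return c == r->second.end() ? 0 : c->second;
    }

    // Number of entries actually stored (including explicit zeros).
    std::size_t stored() const {
        std::size_t n = 0;
        for (const auto& row : rows_) n += row.second.size();
        return n;
    }

private:
    std::unordered_map<std::size_t, std::unordered_map<std::size_t, int>> rows_;
};
```

Counting a transition from state index `i` to state index `j` is then `++m(i, j);`. Use `get` for pure reads so that probing for absent transitions does not fill the maps with zeros.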