Is there a Java parser for SVN dump files, similar to svndumpfilter? I know that SVNKit has such a tool, but I did not find any API documentation or an example for actually reading the contents of the dump file.

I am writing a Java app that has to analyze SVN dumps. Ideally, I would like to traverse the dump file entry-by-entry (it's too large to be read in whole). Is there an off-the-shelf tool to use, or should I implement the parser myself, based on the dump file grammar?

Little Bobby Tables

1 Answer

The first question that comes to mind is: why are you trying to read the dump file and the repository? Beyond that, documentation of this kind can be found in the Javadocs offered on svnkit.com, and there are also examples of command-line handling that implement a full `svnadmin load`.

khmarbaise
  • I am not trying to read the repository, just the dump file, for data mining purposes. Is there an example of loading a repository from a dump using svnkit? The documentation is pretty terse. – Little Bobby Tables Dec 26 '12 at 08:07
  • I ended up finding an internal class called `SVNDumpStreamParser` that does more-or-less what I wanted, just not documented. So SvnKit has the solution, it's just very well hidden :( – Little Bobby Tables Dec 26 '12 at 09:53
  • Hm... for data mining I would extract the information from the repository and not from the dump file, because the dump usually does not exist, whereas the repository always exists and can be read easily. But that might be another discussion. – khmarbaise Dec 26 '12 at 10:05
  • Tried that. SVN log/diff operations take a **long** time, whereas in our setup the dump is created regularly for backups, so parsing the dump is cheaper, time-wise. – Little Bobby Tables Dec 27 '12 at 10:01
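
For anyone who would rather hand-roll the entry-by-entry traversal the question describes, here is a minimal sketch in plain Java. It does not use SVNKit. The class name `DumpRecordReader` is hypothetical, and the code assumes the usual dump layout: each record is a block of `Key: value` header lines ended by a blank line, optionally followed by a body whose size in bytes is given by `Content-length`. Bodies are skipped rather than loaded, so memory use should stay small even for very large dumps.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not an SVNKit class): streams record headers from an SVN dump. */
final class DumpRecordReader {
    private final InputStream in;

    DumpRecordReader(InputStream in) {
        // Buffer the stream because headers are read one byte at a time.
        this.in = new BufferedInputStream(in);
    }

    /** Returns the headers of the next record, or null at end of stream. The body is skipped. */
    Map<String, String> next() throws IOException {
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while ((line = readLine()) != null) {
            if (line.isEmpty()) {
                if (headers.isEmpty()) {
                    continue; // blank padding between records
                }
                break; // a blank line ends the header block
            }
            int colon = line.indexOf(": ");
            if (colon < 0) {
                throw new IOException("Malformed header line: " + line);
            }
            headers.put(line.substring(0, colon), line.substring(colon + 2));
        }
        if (headers.isEmpty()) {
            return null;
        }
        // Assumption: a record with a body always declares Content-length.
        skipFully(Long.parseLong(headers.getOrDefault("Content-length", "0")));
        return headers;
    }

    /** Reads bytes up to the next '\n'; returns null only at end of stream. */
    private String readLine() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            buf.write(b);
        }
        if (b == -1 && buf.size() == 0) {
            return null;
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Skips exactly n bytes of body, failing if the stream ends early. */
    private void skipFully(long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() == -1) {
                    throw new EOFException("Dump truncated inside a record body");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

To use it, wrap a `FileInputStream` over the dump and call `next()` in a loop until it returns null, checking for keys such as `Revision-number` or `Node-path` to tell revision records from node records. Reading raw bytes instead of using a `Reader` is deliberate: the lengths in the headers are byte counts and bodies may be binary.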