3

I'm attempting to develop a simple online editor that allows for real-time collaboration (written in Java). In this editor, I want clients to be able to edit the source code at arbitrary points (e.g. add the letter 'd' to the source code file at row 11, column 20). I'm not sure how to design these source code file objects in an efficient way, while still allowing for letter-by-letter client-server synchronization (similar to how Google Docs works).

I considered using a RandomAccessFile, but after reading this post, I don't think that would be an efficient approach. Inserting a letter near the beginning of the file would involve changing everything after it.

My current plan is to represent the source files on both the server and the client with a StringBuilder object, using its insert/delete/append methods. On the server side, this StringBuilder would be written out to an actual file as necessary.
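As a rough sketch of that plan, assuming 0-based rows and columns and '\n' as the only line separator (the class name TextBuffer and its methods are made up for illustration):

```java
/** Hypothetical in-memory document backed by a StringBuilder. */
class TextBuffer {
    private final StringBuilder text;

    TextBuffer(String initial) {
        this.text = new StringBuilder(initial);
    }

    /** Converts a 0-based (row, column) pair into an offset into the buffer. */
    int offsetOf(int row, int column) {
        if (row < 0) {
            throw new IndexOutOfBoundsException("negative row: " + row);
        }
        int offset = 0;
        // Skip one newline per row to find where the requested row starts.
        for (int r = 0; r < row; r++) {
            int newline = text.indexOf("\n", offset);
            if (newline < 0) {
                throw new IndexOutOfBoundsException("row " + row + " does not exist");
            }
            offset = newline + 1;
        }
        int lineEnd = text.indexOf("\n", offset);
        if (lineEnd < 0) {
            lineEnd = text.length();
        }
        if (column < 0 || offset + column > lineEnd) {
            throw new IndexOutOfBoundsException("column " + column + " is outside row " + row);
        }
        return offset + column;
    }

    /** Inserts s at the given position; everything after it is shifted right. */
    void insert(int row, int column, String s) {
        text.insert(offsetOf(row, column), s);
    }

    /** Deletes up to length characters starting at the given position. */
    void delete(int row, int column, int length) {
        int start = offsetOf(row, column);
        text.delete(start, Math.min(start + length, text.length()));
    }

    @Override
    public String toString() {
        return text.toString();
    }
}
```

Each insert here scans for the row and then shifts the tail of the buffer, so the cost grows linearly with the file size. That is probably fine for small source files, but it is the part I'm unsure about for larger ones.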

I'm curious as to whether there might be a better approach for solving this problem. Any ideas?

Community
  • 1
  • 1
jtan
  • You can keep the file's contents in an in-memory buffer (backed up so you don't lose data) and write it back to a file every now and then (maybe with revision extensions for rollbacks). – IllegalArgumentException Oct 21 '12 at 03:27

2 Answers

4

You will want something like a rope as the fundamental data structure. A balanced rope supports edits, inserts, appends, and concatenation in O(log n), so you don't need to worry about edits in the middle of a large data structure.
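To make the idea concrete, here is a minimal sketch of an immutable rope. It is not taken from any particular library, and all names are illustrative. It also skips rebalancing, so the O(log n) bound only holds while the tree happens to stay reasonably balanced; a real implementation would rebalance and keep leaves short.

```java
/** Minimal immutable rope sketch: a binary tree whose leaves hold strings. */
abstract class Rope {
    abstract int length();
    abstract void appendTo(StringBuilder out);
    /** Splits into two ropes covering [0, index) and [index, length). */
    abstract Rope[] split(int index);

    static Rope of(String s) {
        return new Leaf(s);
    }

    /** Joins two ropes without copying their text. */
    static Rope concat(Rope left, Rope right) {
        if (left.length() == 0) return right;
        if (right.length() == 0) return left;
        return new Concat(left, right);
    }

    /** Insert = split at index, then concatenate head + s + tail. */
    Rope insert(int index, String s) {
        Rope[] parts = split(index);
        return concat(concat(parts[0], of(s)), parts[1]);
    }

    /** Delete [start, end) = keep what is before start and after end. */
    Rope delete(int start, int end) {
        Rope[] head = split(start);
        Rope[] tail = head[1].split(end - start);
        return concat(head[0], tail[1]);
    }

    @Override
    public String toString() {
        StringBuilder out = new StringBuilder(length());
        appendTo(out);
        return out.toString();
    }

    private static final class Leaf extends Rope {
        private final String text;

        Leaf(String text) { this.text = text; }

        int length() { return text.length(); }

        void appendTo(StringBuilder out) { out.append(text); }

        Rope[] split(int index) {
            // Copies at most one leaf's worth of text, so leaves should stay short.
            return new Rope[] { new Leaf(text.substring(0, index)), new Leaf(text.substring(index)) };
        }
    }

    private static final class Concat extends Rope {
        private final Rope left;
        private final Rope right;
        private final int length; // cached so length() is O(1)

        Concat(Rope left, Rope right) {
            this.left = left;
            this.right = right;
            this.length = left.length() + right.length();
        }

        int length() { return length; }

        void appendTo(StringBuilder out) {
            left.appendTo(out);
            right.appendTo(out);
        }

        Rope[] split(int index) {
            // Only the subtree containing the split point is descended into.
            if (index <= left.length()) {
                Rope[] parts = left.split(index);
                return new Rope[] { parts[0], concat(parts[1], right) };
            }
            Rope[] parts = right.split(index - left.length());
            return new Rope[] { concat(left, parts[0]), parts[1] };
        }
    }
}
```

Because every operation returns a new rope and shares the untouched subtrees, older versions stay valid, which can be handy for undo or for handing a consistent snapshot to another thread.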

Two open source libraries to consider:

On top of this you will need to build logic for merging and publishing synchronous changes. This is actually the tricky part: you'll need to decide how conflicts are resolved and how "deltas" are transmitted to the clients.
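One common way to approach this (among others, such as CRDTs) is operational transformation: each delta is adjusted against the concurrent deltas it did not know about. The sketch below covers only concurrent inserts and is an illustration under assumptions, not a full algorithm; the InsertOp class and the clientId tie-break are hypothetical choices.

```java
/** Hypothetical delta: insert text at a character offset, tagged with the sending client. */
final class InsertOp {
    final int position;
    final String text;
    final int clientId;

    InsertOp(int position, String text, int clientId) {
        this.position = position;
        this.text = text;
        this.clientId = clientId;
    }

    /** Applies this insert to a document and returns the new document. */
    String applyTo(String doc) {
        return doc.substring(0, position) + text + doc.substring(position);
    }

    /**
     * Rewrites this op so it can be applied after other, where both ops
     * were created concurrently against the same base text.
     */
    InsertOp transformAgainst(InsertOp other) {
        // If the other insert lands before ours, our position shifts right.
        // Equal positions are ordered by clientId so every site makes the same choice.
        boolean otherGoesFirst = other.position < position
                || (other.position == position && other.clientId < clientId);
        return otherGoesFirst
                ? new InsertOp(position + other.text.length(), text, clientId)
                : this;
    }
}
```

With this rule, applying a and then b transformed against a gives the same text as applying b and then a transformed against b, which is the convergence property you want. Deletes and overlapping ranges need additional cases that are not shown here.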

I would treat persistence / copying to permanent storage as a separate problem. It's best to get everything working well with in-memory data structures first; then, at periodic points, you can flush the data out to persistent storage. I'd suggest something like Git, or if you are particularly adventurous you could try something like Datomic (which is essentially a database that works like Git, and keeps a history of all updates).
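A minimal sketch of the periodic flush, assuming the in-memory document can hand out a consistent snapshot as a String. PeriodicFlusher and its methods are hypothetical names, and committing the written file to Git would be a separate step that is not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper that periodically writes an in-memory document to disk. */
class PeriodicFlusher implements AutoCloseable {
    private final Supplier<String> snapshot; // must be safe to call from the scheduler thread
    private final Path target;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicFlusher(Supplier<String> snapshot, Path target) {
        this.snapshot = snapshot;
        this.target = target;
    }

    /** Schedules a flush every period, starting one period from now. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::flushQuietly, period, period, unit);
    }

    /** Writes to a sibling temp file first, then replaces the target with it. */
    void flush() throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, snapshot.get().getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    private void flushQuietly() {
        // An exception escaping a scheduled task cancels all later runs, so log it instead.
        try {
            flush();
        } catch (IOException | RuntimeException e) {
            System.err.println("flush failed: " + e);
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

Usage would look like `new PeriodicFlusher(doc::toString, path).start(30, TimeUnit.SECONDS)`, where the 30-second interval is an arbitrary example and `doc` is whatever structure holds the text.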

mikera
  • Sounds like exactly what I was looking for. Never knew about ropes/cords before, so thanks. – jtan Oct 22 '12 at 01:23
  • The plan right now for managing changes is to have all the tasks from the clients (insert/delete/etc) go through a single-thread ExecutorService on the server. After the server handles the task, it'll pipe out the results in parallel (using a multi-thread ExecutorService) to the various clients. I'm hoping that the design will be efficient enough to prevent noticeable lag for the clients. Also, we actually were planning on using Git for persistent storage (though Datomic sounds really cool). – jtan Oct 22 '12 at 01:35
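The design described in the comment above could be sketched roughly as follows. EditServer, submitInsert, the plain-text delta format, and the pool size of 4 are all assumptions made for illustration; clients are modeled as simple callbacks rather than network connections.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical server: edits are serialized on one thread, results fan out on a pool. */
class EditServer implements AutoCloseable {
    // Only ever touched from editThread, so it needs no extra locking.
    private final StringBuilder document = new StringBuilder();
    private final List<Consumer<String>> clients = new CopyOnWriteArrayList<>();
    private final ExecutorService editThread = Executors.newSingleThreadExecutor();
    private final ExecutorService broadcastPool = Executors.newFixedThreadPool(4);

    void addClient(Consumer<String> client) {
        clients.add(client);
    }

    /** Queues an insert; edits are applied strictly in submission order. */
    Future<?> submitInsert(int offset, String text) {
        return editThread.submit(() -> {
            document.insert(offset, text);
            String delta = "insert " + offset + " " + text;
            // Fan the result out in parallel so one slow client does not block edits.
            for (Consumer<String> client : clients) {
                broadcastPool.execute(() -> client.accept(delta));
            }
        });
    }

    /** Reads the document on the edit thread to get a consistent view. */
    Future<String> snapshot() {
        Callable<String> read = document::toString;
        return editThread.submit(read);
    }

    @Override
    public void close() {
        editThread.shutdown();
        broadcastPool.shutdown();
    }
}
```

One caveat with this sketch: because broadcasts run on a pool, two deltas can reach the same client out of order, so a real version would need sequence numbers or a per-client queue.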
0

Maybe a better approach would be to use distributed version control such as Git. Each user keeps a local copy of the repository: pulling from the remote merges changes locally, committing updates the local repository, and pushing updates the remote. This means users will need permission to save documents on their local machines.

  • Git is great, but this usage pattern wouldn't really fit the requirement for an "online editor" as posed in the question. – mikera Oct 21 '12 at 03:41