I was reading the paper published on the Google File System and found that GFS supports appends and updates at arbitrary locations in an existing file.

As far as I know, HDFS does not support update operations because it is designed for write-once, read-many access. HDFS does support append nowadays, although dfs.support.append is set to false by default in recent releases.

So my question is: is there any way to perform some kind of update operation? I have tried looking, but all I could figure out is that HDFS does not support updates.

Hope to hear from you soon.

FYI: I have read many posts on Cloudera and other sites about this. Some blogs by Hadoop contributors suggest that HDFS may support an update operation, but none of them states exactly how the update would be done.

user1188611

1 Answer

The current major release, Apache Hadoop 2.0, offers several significant HDFS improvements, including a new append pipeline. You can find detailed information in the append design document. These are the related tickets:

HADOOP-1700

HDFS-265

BTW, I have tried append on 1.x as well. It works, but it isn't safe.

Tariq
  • Thanks for replying. Do you know of any tickets or requests for an update operation on HDFS, and is there really a plan to include it in upcoming Hadoop releases? If so, a lot would change in the way operations are currently performed in Hadoop. – user1188611 Sep 26 '13 at 21:49
  • What do you mean by update? Append is an update as well. Yes, 2.x comes with this feature. And things will definitely change once the update feature is operational. – Tariq Sep 26 '13 at 21:54
  • By update I mean: once I have written something to a file in HDFS, can I overwrite the content of that file? (Append only allows adding content at the end of the file, not overwriting existing content.) Are you saying that 2.x allows overwriting the content of a file in HDFS with new content? – user1188611 Sep 26 '13 at 21:59
  • AFAIK, HDFS does not have random-write support, i.e. you cannot jump to a selected offset in the file and write there. The append feature only allows us to add data at the end of a previously closed file. – Tariq Sep 26 '13 at 22:21
  • OK, thanks. Let me know if you find anything else. – user1188611 Sep 26 '13 at 22:30
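
Since the thread ends without a way to overwrite in place, one common workaround on a store that only offers create, append and rename is to rewrite the file with the change applied and then rename the new copy over the old one. The sketch below is not from the thread and does not use the Hadoop API: it uses the local filesystem through java.nio as a stand-in, and the class name RewriteUpdate and the method replaceRange are hypothetical names chosen for illustration. On HDFS the same steps would presumably map onto the Hadoop FileSystem calls for create and rename, but that mapping is an assumption and is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: emulates an "update" without random writes.
// The local filesystem stands in for HDFS here (assumption).
class RewriteUpdate {

    /**
     * Streams the old file into a temporary sibling, substituting
     * `replacement` for the bytes that start at `offset`, then renames
     * the temporary file over the original.
     */
    public static void replaceRange(Path file, long offset, byte[] replacement)
            throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = Files.newOutputStream(tmp)) {
            // Copy the untouched prefix [0, offset).
            copyExactly(in, out, offset);
            // Write the new bytes, then discard the old bytes they replace.
            out.write(replacement);
            in.readNBytes(replacement.length);
            // Copy whatever follows the replaced range.
            in.transferTo(out);
        }
        // Swap the rewritten copy into place.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Copies exactly `count` bytes, failing if the input ends early.
    private static void copyExactly(InputStream in, OutputStream out, long count)
            throws IOException {
        byte[] buf = new byte[8192];
        long left = count;
        while (left > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, left));
            if (n < 0) {
                throw new IOException("offset is past end of file");
            }
            out.write(buf, 0, n);
            left -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hdfs-demo", ".txt");
        Files.write(file, "hello world".getBytes(StandardCharsets.UTF_8));
        replaceRange(file, 6, "HDFS!".getBytes(StandardCharsets.UTF_8));
        // prints "hello HDFS!"
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

The cost is that every update rewrites the whole file, which is why this only makes sense for small or rarely changed files. It also needs Java 11 or later for readNBytes(int).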