
I have a small program that writes 10 records to a block compressed SequenceFile on HDFS every second, and then calls sync() every 5 minutes to ensure that everything older than 5 minutes is available for processing.

As my code is quite a few lines, I have only extracted the important bits:

// initialize

Configuration hdfsConfig = new Configuration();

CompressionCodecFactory codecFactory = new CompressionCodecFactory(hdfsConfig);
CompressionCodec compressionCodec = codecFactory.getCodecByName("default");

SequenceFile.Writer writer = SequenceFile.createWriter(
    hdfsConfig,
    SequenceFile.Writer.file(path),
    SequenceFile.Writer.keyClass(LongWritable.class),
    SequenceFile.Writer.valueClass(Text.class),
    SequenceFile.Writer.compression(SequenceFile.CompressionType.BLOCK, compressionCodec)
);

// ...


// append

LongWritable key = new LongWritable(new Date().getTime());
Text val = new Text("Some value");
writer.append(key, val);

// ...

// then every 5 minutes...

logger.info("about to sync...");
writer.hsync();
logger.info("synced!");

From the logs alone, the sync operation appears to work just as expected; however, the file on HDFS remains small. After a while some headers and some events may get added, but not even close to the frequency at which I call hsync(). Once the file is closed, everything is flushed at once.

After each expected sync I have also tried to manually check the content of the file to see if the data is there, but the file appears empty here as well: hdfs dfs -text filename

Are there any known reasons why writer.hsync() does not work, and if so, are there any workarounds for this?

Further test case for this issue:

import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WriteTest {
    private static final Logger LOG = LoggerFactory.getLogger(WriteTest.class);

    public static void main(String[] args) throws Exception {

        SequenceFile.CompressionType compressionType = SequenceFile.CompressionType.RECORD;
        CompressionCodec compressionCodec;
        String compressionCodecStr = "default";
        CompressionCodecFactory codecFactory;
        Configuration hdfsConfig = new Configuration();

        codecFactory = new CompressionCodecFactory(hdfsConfig);
        compressionCodec = codecFactory.getCodecByName(compressionCodecStr);

        String hdfsURL = "hdfs://10.0.0.1/writetest/";

        Date date = new Date();

        Path path = new Path(
            hdfsURL,
            "testfile" + date.getTime()
        );

        SequenceFile.Writer writer = SequenceFile.createWriter(
            hdfsConfig,
            SequenceFile.Writer.keyClass(LongWritable.class),
            SequenceFile.Writer.valueClass(Text.class),
            SequenceFile.Writer.compression(compressionType, compressionCodec),
            SequenceFile.Writer.file(path)
        );

        for(int i=0;i<10000000;i++) {

            Text value = new Text("New value!");
            LongWritable key = new LongWritable(date.getTime());

            writer.append(key, value);
            writer.hsync();

            Thread.sleep(1000);
        }

        writer.close();
    }
}

The result is that there is one fsync at the beginning, writing the SequenceFile headers, and then no further fsyncs. Content is written to disc once the file is closed.

agnsaft

1 Answer


There are multiple issues here.

  1. Block Compression

When you use block compression with sequence files, that means a number of entries will be buffered in memory and then written in block compressed form when a limit is reached or sync is called manually.

When you call hsync on the writer, it calls hsync on its underlying FSDataOutputStream. However, that will not write out the data sitting in the compression buffer in memory. So to get that data to the Datanode reliably, you have to call sync first and then call hsync (a sketch follows below).

Note that doing that means the block compressed portion sent to the Datanode contains fewer entries than it usually would have. That has a negative impact on compression quality and will probably lead to more disc usage. (I guess that's why hsync does not call sync internally.)
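
For illustration, a minimal sketch of that ordering, applied to the append step from the question (writer, key and val as created there):

// Sketch only: sync() writes out the buffered, block-compressed data into the
// output stream; hsync() then pushes the stream contents to the Datanodes.
writer.append(key, val);

writer.sync();   // flush the in-memory compression buffer as a compressed block
writer.hsync();  // persist what is now in the stream on the Datanodes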

  2. File Size reported to Namenode

Calling hsync sends the data to the Datanodes, but does not report the new file size to the Namenode. Technical discussions of this can be found here and here. Apparently it would be bad for performance to update the length every time. There is a special version of hsync which allows updating the Namenode info, but it is not exposed by SequenceFile.Writer.

    /**
     * @param syncFlags
     *          Indicate the semantic of the sync. Currently used to specify
     *          whether or not to update the block length in NameNode.
     */
    public void hsync(EnumSet<SyncFlag> syncFlags) throws IOException {
        flushOrSync(true, syncFlags);
    }

On the one hand, the size issue means that even though some tools report an unchanged file size, the data has nevertheless safely reached the Datanodes and can be read when opening an InputStream on them.

On the other hand, there is a bug in SequenceFile.Reader for the compression types Record and None. With these compression types, the Reader uses the length info to determine how far to read. Since this length info is not updated by hsync, it will incorrectly stop reading even though the data is actually available. Block compressed reading apparently does not use the length info and does not suffer from this bug.
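
As a rough read-side illustration of that last point (a sketch, assuming block compression and the same path and hdfsConfig as in the question), the hsync'ed records can be read back while the writer is still open:

// Sketch: read back records while the writer is still open. For BLOCK
// compression the Reader does not rely on the stale length reported by the
// Namenode, so the already hsync'ed records should show up here.
SequenceFile.Reader reader = new SequenceFile.Reader(
    hdfsConfig,
    SequenceFile.Reader.file(path)
);

LongWritable key = new LongWritable();
Text value = new Text();
while (reader.next(key, value)) {
    System.out.println(key.get() + " -> " + value);
}
reader.close();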

Joe23