
We are using Java 8. Here is the context of my issue:

We have a map in our program like this:

<Key, object containing (record-offset, record-length)>

To calculate the record-offset of each record in a file, we have to calculate the length of each record, and that length must include the line-separator characters. For example:

record-offset of 1st record in the file  = 0
record-offset of 2nd record in the file  = 
                                record-offset of 1st record in the file 
                                + record length of 1st record

and so on...

In a later process we use this record-offset and record-length information to read each record from the file with RandomAccessFile.

This process is fast and saves memory during run time for us.
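
For reference, the read side described above could look roughly like the sketch below. The names `RecordReader` and `readRecord` are made up for illustration and are not from our program, and the charset parameter is an assumption. Offsets and lengths are in bytes, because that is the unit RandomAccessFile.seek() works with.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original program.
final class RecordReader {

    // Reads `length` bytes starting at byte `offset` and decodes them.
    // The returned text still contains the line separator if `length` covers it.
    static String readRecord(RandomAccessFile file, long offset, int length, Charset charset)
            throws IOException {
        byte[] buffer = new byte[length];
        file.seek(offset);       // jump straight to the start of the record
        file.readFully(buffer);  // throws EOFException if the file is shorter than expected
        return new String(buffer, charset);
    }
}
```

If a stored offset is off by even one byte, every later record is read from the wrong position, which is why the separator length matters.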

Now the problem is:

The record-offset calculation is wrong because I was using BufferedReader.readLine() to read each record in the file and was calculating the record-length and record-offset from the length of the returned string. BufferedReader strips out the line-separator characters, and the line separator is \r\n for DOS files and \n for Unix/Mac files. Hence, my later process of reading the file with RandomAccessFile is broken because of the wrong offsets.

It looks like to fix that I have to calculate the offsets, starting from the 2nd record, this way:

line-separator-length  = 2; // for DOS, or 1 for a Unix-type file
record-offset of 2nd record in the file  = 
                 record-offset of 1st record in the file 
                 + record length of 1st record 
                 + line-separator-length
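
The arithmetic above can be sketched as a small running sum. `OffsetCalculator` and `recordOffsets` are hypothetical names used only for this illustration. The sketch assumes every record in the file ends with the same separator and that each character occupies exactly one byte.

```java
// Hypothetical helper illustrating the corrected offset arithmetic.
final class OffsetCalculator {

    // recordLengths[i] is the length of record i WITHOUT its line separator
    // (what readLine() gives you); separatorLength is 2 for DOS, 1 for Unix.
    static long[] recordOffsets(int[] recordLengths, int separatorLength) {
        long[] offsets = new long[recordLengths.length];
        long next = 0; // the first record always starts at offset 0
        for (int i = 0; i < recordLengths.length; i++) {
            offsets[i] = next;
            next += recordLengths[i] + separatorLength;
        }
        return offsets;
    }
}
```

For a DOS file with records of 3, 5 and 2 characters this gives offsets 0, 5 and 12. For the same records in a Unix file it gives 0, 4 and 10.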

Hence, my question is:

  • Is there any way to read each line from a file including its line-separator characters (so that I do not have to worry about the type of the file)?

OR

  • Is there any way to figure out from Java what kind of file it is (DOS/Unix/Mac)?

OR

  • Is there any way I can check what are the line separator characters in a file?

Thanks in advance.

VictorGram

2 Answers


Is there any way to read each line from a file that includes line-separator characters?

Sure. Extend the abstract class Reader using BufferedReader as a model. Include the line separator characters.
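
As a rough sketch of the same idea, the code below scans raw bytes instead of subclassing Reader, because RandomAccessFile offsets are byte offsets. That is a swapped-in technique, not the Reader subclass described above. `RecordIndexer` and `index` are hypothetical names, and the sketch assumes an ASCII-compatible encoding (ASCII, ISO-8859-1, UTF-8) in which the bytes 0x0D and 0x0A only ever appear as line separators.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: indexes records by scanning raw bytes.
final class RecordIndexer {

    // Returns one {offset, length} pair per record; length includes the separator bytes.
    static List<long[]> index(InputStream source) throws IOException {
        List<long[]> records = new ArrayList<>();
        InputStream in = new BufferedInputStream(source);
        long position = 0;    // number of bytes consumed so far
        long recordStart = 0; // offset of the record currently being scanned
        int b = in.read();
        while (b != -1) {
            position++;
            int next = in.read(); // one byte of lookahead to recognise "\r\n"
            if (b == '\r' && next == '\n') {
                position++;       // the '\n' belongs to this record too
                next = in.read();
                records.add(new long[] {recordStart, position - recordStart});
                recordStart = position;
            } else if (b == '\n' || b == '\r') {
                records.add(new long[] {recordStart, position - recordStart});
                recordStart = position;
            }
            b = next;
        }
        if (position > recordStart) {
            // last record has no trailing separator
            records.add(new long[] {recordStart, position - recordStart});
        }
        return records;
    }
}
```

Because the separator is counted per record, the file type never has to be known up front, and files with mixed separators are handled as well.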

Is there any way to figure out what kind of file it is?

Sure. Unix files end each line with a line feed (\n), Windows files with a carriage return followed by a line feed (\r\n), and Mac files (OS X and later) with a line feed (\n).

Files from older Macs end each line with a carriage return (\r).

Is there any way I can check what are the line separator characters in a file?

Your Reader class will return the line separator characters in the last one or two positions of the String.

Gilbert Le Blanc

This is how I resolved my issue, thanks to the discussion in "How to find out which line separator BufferedReader#readLine() used to split the line?":

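import java.io.BufferedReader;
import java.io.File;

// FileUtilities and FileUtilitiesException are classes from our own code base.
public int getLineTerminatorLength( String filePath ) throws FileUtilitiesException
{
    try ( BufferedReader tempreader = FileUtilities.getBufferedReader( new File( filePath ) ) )
    {
        // Keep the result of read() as an int: casting it to char first turns the
        // end-of-stream marker -1 into '\uFFFF', so the loop would never terminate.
        int termChar;

        while ( ( termChar = tempreader.read() ) != -1 )
        {
            if ( termChar == '\n' )
                return 1; // Unix / Mac OS X

            if ( termChar == '\r' )
            {
                // "\r\n" is the DOS terminator; a lone '\r' is the old Mac terminator.
                // Checking only for '\n' here avoids counting "\r\r" (two empty lines) as 2.
                return ( tempreader.read() == '\n' ) ? 2 : 1;
            }
        }
    }
    catch ( Exception e )
    {
        String errMsg = "Error reading file " + filePath;
        throw new FileUtilitiesException( errMsg );
    }

    // Reached only if the file is empty or contains no line terminator
    return 0;
}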
VictorGram