I'm writing a PeekReader class that I plan to share as a publicly available library. The question is how to handle line separators correctly.
I plan to create a filter that transforms the various line separators into a new line ("\n").
/* Not actual code, just to let people visualize the problem,
doesn't need to be corrected */
String seq = readEOL();
return switch(seq) {
case "\n" -> "\n"; // standard new line
case "\r" -> "\n"; // Carriage return
case "\f" -> "\n"; // Form feed
case "\u000B" -> "\n"; // Vertical tabulation (Java has no \v escape)
case "\r\n" -> "\n"; // New line in Microsoft
case "\n\r" -> "\n"; // New line in RISC OS
case "\u0015" -> "\n"; // NL used by IBM mainframes
case "\u001E" -> "\n"; // RS used by QNX as line separator
case "\u0085" -> "\n"; // Unicode next line
case "\u2028" -> "\n"; // Unicode line separator
case "\u2029" -> "\n"; // Unicode paragraph separator
default -> seq;
};
The question is: how should I handle the different line separators and similar characters? Is it correct to filter them and turn them into new lines, or should I leave them as they are and just handle them when readLine() is called?
I'm not asking for opinions. I need to know the correct practice for staying compliant when subclassing the Reader class.
I listed all the characters/sequences I know of in order to understand which ones I actually need to treat as new lines and which ones I should change into "\n" (if any at all).
I already have my own private version of this, so I don't need suggestions on how to write it. What I'm asking is how to write a function that meets the standards expected of a publicly shared library.
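For reference, here is a small sketch of how I understand the standard java.io.BufferedReader to behave, which corresponds to the second option: read() passes characters through unchanged, while readLine() treats only "\n", "\r" and "\r\n" as line terminators. The class and method names (LineSeparatorDemo, readLines, firstChar) are made up for this illustration and are not part of my library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, only meant to show the JDK's existing behaviour.
final class LineSeparatorDemo {

    // Collects the lines that BufferedReader.readLine() reports for the given text.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Returns the first character delivered by read(), to show it is not rewritten.
    static int firstChar(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // "\r", "\r\n" and "\n" each end a line in readLine().
        System.out.println(readLines("a\rb\r\nc\nd"));      // [a, b, c, d]
        // U+2028 is not treated as a terminator, so this stays one line.
        System.out.println(readLines("a\u2028b").size());   // 1
        // read() hands back the raw '\r'; nothing is normalised to '\n'.
        System.out.println(firstChar("\r\nx") == '\r');     // true
    }
}
```

If this is the behaviour a Reader subclass is expected to match, the filter above would only be relevant inside readLine(), and only for the first three cases; that is exactly the point I would like confirmed.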
Also, if you have some links about building libraries to share with the public, please feel free to post them. I'm really interested.
Addendum: PeekReader extends Reader and is meant to improve on or replace BufferedReader, especially when tokenizing or parsing streams. It provides a peek() function that lets the caller preview the characters in the buffer without consuming them.
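To make the idea of peek() concrete, here is a minimal sketch. It is not my actual implementation: the class name PeekReaderSketch is hypothetical, it only looks ahead by a single character, and it is not thread-safe. Note that it does no line separator translation at all, which is the part I am unsure about.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical, simplified stand-in for PeekReader: one character of lookahead.
final class PeekReaderSketch extends Reader {
    private static final int NONE = -2;   // marker: nothing has been peeked yet
    private final Reader in;
    private int peeked = NONE;

    PeekReaderSketch(Reader in) {
        this.in = in;
    }

    /** Returns the next character without consuming it, or -1 at end of stream. */
    public int peek() throws IOException {
        if (peeked == NONE) {
            peeked = in.read();
        }
        return peeked;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (peeked == NONE) {
            return in.read(cbuf, off, len);   // no lookahead pending: delegate
        }
        if (peeked == -1) {
            return -1;                        // end of stream was already seen by peek()
        }
        cbuf[off] = (char) peeked;            // hand out the peeked character first
        peeked = NONE;
        return 1;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        try (PeekReaderSketch r = new PeekReaderSketch(new StringReader("ab"))) {
            System.out.println((char) r.peek());  // a
            System.out.println((char) r.read());  // a
            System.out.println((char) r.read());  // b
            System.out.println(r.peek());         // -1
        }
    }
}
```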