
I'm looking for a good way to split a line of text into evenly sized chunks on whitespace. Specifically, I want to build a function that takes a string and a number of chunks. The goal is for each line of the split to have the same number of characters, or as close as possible (the character delta between all lines should be as close to 0 as possible). For example, if the string is:

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum augue sapien, varius a leo vel, tincidunt lobortis ipsum. Vivamus ex lectus, efficitur nec lorem id, elementum volutpat libero."

chunksSize = 2, would be:

Line 1: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum augue sapien, varius a leo vel,

Line 2: tincidunt lobortis ipsum. Vivamus ex lectus, efficitur nec lorem id, elementum volutpat libero.

chunksSize = 3, would be:

Line 1: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum

Line 2: augue sapien, varius a leo vel, tincidunt lobortis ipsum. Vivamus

Line 3: ex lectus, efficitur nec lorem id, elementum volutpat libero.

chunksSize = 4, would be:

Line 1: Lorem ipsum dolor sit amet, consectetur adipiscing

Line 2: elit. Vestibulum augue sapien, varius a leo vel,

Line 3: tincidunt lobortis ipsum. Vivamus ex lectus,

Line 4: efficitur nec lorem id, elementum volutpat libero.

These splits might not be exactly accurate as I did them by eye. Anyone want to try their hand at this in Java?

cdubbs

1 Answer

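import java.util.ArrayList;
import java.util.List;

int noChunks = 3; // get this from the user
String line = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum augue sapien, varius a leo vel, tincidunt lobortis ipsum. Vivamus ex lectus, efficitur nec lorem id, elementum volutpat libero.";

// Split on runs of whitespace so repeated spaces don't produce empty words.
String[] arr = line.trim().split("\\s+");

// Words per chunk. The first `remainder` chunks take one extra word,
// so no words are dropped when the count doesn't divide evenly.
int chunkSize = arr.length / noChunks;
int remainder = arr.length % noChunks;

List<String> chunks = new ArrayList<>();

int index = 0;

for (int i = 0; i < noChunks; i++) {
    // A fresh builder per chunk, so chunks don't accumulate into each other.
    StringBuilder builder = new StringBuilder();
    int wordsInChunk = chunkSize + (i < remainder ? 1 : 0);
    for (int x = 0; x < wordsInChunk; x++) {
        if (x > 0) {
            builder.append(' '); // separator only between words, no trailing space
        }
        builder.append(arr[index]);
        index++;
    }
    chunks.add(builder.toString());
}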
Max08
  • Thanks Ankit - I did make one correction, moving the StringBuilder declaration into the outer for loop; otherwise all the chunks get appended together. That being said, the difficulty here is when one of the lines has a really long word in it, for example: Line1: "Lorem ipsum dolor sit amet, consectetursssssssssssssssssssssssssssss adipiscing elit. Vestibulum" Line2: "augue sapien, varius a leo vel, tincidunt lobortis ipsum." Line3: "Vivamus ex lectus, efficitur nec lorem id, elementum volutpat" In this case the lines do not come out evenly. – cdubbs Jun 16 '17 at 20:23
  • Instead of using word count, use character count. Append the words until you reach (surpass) line.length/noChunks, then step to the next item. This way you also eliminate the problem of missing words at the end. – tevemadar Jun 16 '17 at 20:28
  • @cdubbs From your question it seemed you wanted to split on whitespace, i.e. each chunk should have an equal number of words. The length of a word should not be an issue for that use case. As tevemadar said, you can do a character count. If you think you have your answer, please mark this as correct. – Max08 Jun 16 '17 at 21:18
  • @ANKIT GAUR I did specify a minimum character delta in the original post. Doing this by character count is why I posted here in the first place, since that is the much more difficult problem. – cdubbs Jun 17 '17 at 14:12
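
For anyone landing here later, tevemadar's character-count suggestion from the comments could be sketched roughly as below. This is only a minimal greedy sketch, not a guaranteed minimum-delta split: it keeps appending words to the current chunk until the running character count reaches that chunk's share of the total length, then moves on. The class name `EvenChunks` and the method `splitEvenly` are made up for illustration, and the code assumes `noChunks` is at least 1 and that words are separated by single spaces in the output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name and method are illustrative only.
class EvenChunks {

    /**
     * Greedily splits text on whitespace into noChunks lines of roughly
     * equal character length. Assumes noChunks >= 1.
     */
    static List<String> splitEvenly(String text, int noChunks) {
        String[] words = text.trim().split("\\s+");

        // Length of the text with words joined by single spaces.
        int total = words.length - 1;
        for (String word : words) {
            total += word.length();
        }

        List<String> chunks = new ArrayList<>();
        int index = 0;    // next word to place
        int consumed = 0; // characters placed so far, one separator counted per word

        for (int i = 0; i < noChunks; i++) {
            // Cumulative target: the position where chunk i should ideally end.
            long target = (long) total * (i + 1) / noChunks;
            StringBuilder builder = new StringBuilder();
            // Take at least one word if any remain, then keep going until
            // the running count reaches the target for this chunk.
            while (index < words.length
                    && (builder.length() == 0 || consumed < target)) {
                if (builder.length() > 0) {
                    builder.append(' ');
                }
                builder.append(words[index]);
                consumed += words[index].length() + 1;
                index++;
            }
            chunks.add(builder.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String line = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum augue sapien, varius a leo vel, tincidunt lobortis ipsum. Vivamus ex lectus, efficitur nec lorem id, elementum volutpat libero.";
        // Print each chunk with its length to eyeball the character delta.
        for (String chunk : splitEvenly(line, 3)) {
            System.out.println(chunk.length() + ": " + chunk);
        }
    }
}
```

Using cumulative targets rather than a fixed per-chunk length keeps an overlong chunk from pushing every later chunk off balance, and the last chunk always picks up whatever words remain, so nothing is dropped. If there are fewer words than chunks, the trailing chunks come back as empty strings. A very long word will still skew the line it lands on; getting the smallest possible delta would probably need an exhaustive or dynamic-programming search over the break points instead of a greedy pass.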