2

Background: I'm taking large strings (500-5000 characters) and putting them into a PPTX file. Each slide only has room for around 1000 characters. I would like to split the text once a slide is full and continue on the next slide. As far as I know (and I'm pretty sure), there is no way to check whether the text already overflows the slide/textbox (I'm using the "OpenXML SDK" in combination with "Open-XML PowerTools").

Process so far: I managed to write a method that splits the string after exactly 1000 characters without throwing an error when fewer than maxLength characters remain (as happened with Substring(startIndex, maxLength)). This is my Truncate method:

public static string Truncate(string text, int startIndex, int maxLength)
{
    if (string.IsNullOrEmpty(text)) return text;
    if (text.Length > startIndex + maxLength)
    {
        return text.Substring(startIndex, maxLength) + "-";
    }
    return text.Substring(startIndex);
}

Problem: Some strings have a lot of newlines and others have few or none. As a result, some strings take up a lot of height and are too large to fit on the slide.

Possible idea: I thought about estimating the string height by counting the newlines and adding 30 characters to the character count for each newline. This gives a closer approximation. (A full line usually contains about 50 characters, but sometimes the newline falls in the middle of the line, so I thought +30 characters would be a good guess.) This is my method so far:

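public int CountCharacters(string text)
{
    // Start from the raw character count.
    var estimate = text.Length;
    // Split yields one more element than there are newlines, so subtract 1.
    estimate += (text.Split('\n').Length - 1) * 30;
    return estimate;
}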

Resulting question: How do I combine these approaches to split the string once the estimated count reaches 1000, taking the newlines into account? Or is this the wrong approach altogether?

Tobias Mayr
  • 1
    I'm not sure if this would help you, but there's an option in the `Split` method to specify how many substrings to return: [Split - MSDN](https://msdn.microsoft.com/en-us/library/c1bs0eda(v=vs.110).aspx) – Michael Armes Nov 02 '16 at 15:20
  • 2
    I wonder if using [Graphics.MeasureString](http://stackoverflow.com/questions/721168/how-to-determine-the-size-of-a-string-given-a-font) would help? I'm not sure if it handles newlines though. – stuartd Nov 02 '16 at 15:31
  • 1
    @Meghan I would need to define a character as the point where the split happens, right? For example, a whitespace. Then I could divide the return value of my CountCharacters method by 1000 to find the number of splits that I need. But then I don't know how to define exactly which whitespace will be used for the split. I imagine it only takes the first ones. Let's find out, I will do my research. Thanks for the contribution! – Tobias Mayr Nov 02 '16 at 15:36

3 Answers

1

You can measure a string to determine the space it needs. Keep in mind that the font is a factor, so if you use multiple fonts you will need to do multiple calculations.

https://msdn.microsoft.com/en-us/library/6xe5hazb(v=vs.110).aspx
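As a rough illustration of the measuring idea, here is a minimal sketch that uses `Graphics.MeasureString` with a maximum width so that the text wraps and the returned height can be compared against the box. The class name `TextFit`, the method `FitsInBox`, and the box dimensions are hypothetical, and the units are those of the `Graphics` object (pixels by default), not PowerPoint's own units. GDI+ and PowerPoint may lay out text differently, so this should be treated as an estimate with some safety margin.

```csharp
using System;
using System.Drawing;

public static class TextFit
{
    // Hypothetical helper: returns true if the text, wrapped at boxWidth,
    // is no taller than boxHeight. Units follow the Graphics object
    // (pixels by default).
    public static bool FitsInBox(string text, Font font, int boxWidth, float boxHeight)
    {
        // A throwaway 1x1 bitmap only supplies a Graphics context to measure with.
        using (var bitmap = new Bitmap(1, 1))
        using (var graphics = Graphics.FromImage(bitmap))
        {
            // The width overload wraps the text; embedded newlines add lines too.
            SizeF size = graphics.MeasureString(text, font, boxWidth);
            return size.Height <= boxHeight;
        }
    }
}
```

A call might look like `TextFit.FitsInBox(text, new Font("Arial", 12f), 800, 400)`, where the font and the 800 by 400 box are placeholder values to be replaced with those of the actual slide.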

  • This sounds like a very elegant and exact solution. However, I lack knowledge of that matter. Do you think I can compare the width and height of my pptx slide and use them to calculate when my textbox would be full (given the font size etc.)? I would have to create a rectangle internally just to measure it, right? Maybe you could tip me off! Thanks! – Tobias Mayr Nov 03 '16 at 14:17
  • I know this can work, I used it to draw list box controls. The height is fixed (account for letters such as j and g that drop below the baseline). You can feed it a set of characters to get a good average letter width, or measure when you think you are close to the limit. –  Nov 03 '16 at 14:57
1

How about something like this:

        //requires: using System; using System.Linq;
        string test = "...";  //your input string
        int j;  //variable that holds the slide content length
        int max_lines = 10;  //desired maximum amount of lines per slide
        int max_characters = 1000; //desired maximum amount of characters
        char[] input = test.ToCharArray();  //convert input string into an array of characters
        string slide_content;    //variable that will hold the output of each single slide

        while (input.Length > 0)
        {
            //reset slide content and length
            slide_content = string.Empty;
            j = 0;

            //take characters one at a time until the line limit, the character
            //limit, or the end of the input is reached
            while ((slide_content.Split('\n').Length < max_lines) && (j < max_characters) && (j < input.Length))
            {
                j = j + 1;
                slide_content = new string(input.Take(j).ToArray());
            }

            //Output slide content
            Console.WriteLine(slide_content);
            Console.WriteLine("=================== END OF THE SLIDE =====================");

            //Remove the previous slide content from the input string
            input = input.Skip(j).ToArray();
        }

        Console.Read();
Innat3
  • Wouldn't the while condition be something like `(slide_content.Split('\n').Length + j/50 < max_lines)` that way I always will end up with max_lines as the amount of lines. _(Given that a full line consists of ~50 Characters)_? But thanks a lot! I feel like I will include your approach in my project. I will mark it as the answer as soon as it works like intended! – Tobias Mayr Nov 03 '16 at 13:25
  • As a bonus, I edited the while-loop so that it only splits the current text at whitespace, or at a newline if one occurs within the last two lines. It slows the program down a bit, but it puts the result in a very nice format. `while (slide_content.Split('\n').Length + j / 50 < max_lines || input[j] != ' ') { j = j + 1; slide_content = new string(input.Take(j).ToArray()); if (j == input.Length || (slide_content.Split('\n').Length + j / 50 > max_lines - 2 && input[j] == '\n')) { break; } }` – Tobias Mayr Nov 03 '16 at 15:24
  • @Tobias Mayr, yes, I agree some editing may be in order, I was certain I wasn't giving you a working solution to your problem, but instead a different approach from which you could get ideas or adapt to your will – Innat3 Nov 03 '16 at 15:45
1

I think you're better off just counting the newlines manually in your Truncate function. Those Split calls are not cheap, and all you are using them for is to count newlines. Here's my attempt (CHARS_PER_NEWLINE is a global constant corresponding to the 30 you suggested):

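private const int CHARS_PER_NEWLINE = 30; // the 30 suggested in the question

public static string Truncate(string text, int startIndex, int maxLength)
{
    if (string.IsNullOrEmpty(text)) return text;
    // Nothing left to return once startIndex is past the end.
    if (startIndex >= text.Length) return string.Empty;

    int remaining = maxLength;
    int index = startIndex;
    while (remaining > 0 && index < text.Length)
    {
        if (text[index] == '\n')
        {
            // A newline uses up as much room as CHARS_PER_NEWLINE characters.
            remaining -= CHARS_PER_NEWLINE;
        }
        else
        {
            remaining--;
        }
        index++;
    }

    string chunk = text.Substring(startIndex, index - startIndex);
    // Mark the cut with '-' only when text remains after this chunk.
    return index < text.Length ? chunk + "-" : chunk;
}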
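
To show how the newline-weighted counting could drive the whole split, here is a minimal sketch that walks the text once and returns one chunk per slide. The class `SlideSplitter`, the method `SplitIntoSlides`, and the constant `CharsPerNewline` are hypothetical names, and the weight of 30 is the guess from the question. Unlike the Truncate method above, this version stops before the budget is exceeded, prefers to cut after the last whitespace, and does not append a '-'.

```csharp
using System;
using System.Collections.Generic;

public static class SlideSplitter
{
    // Assumed weight of one newline, in characters (the 30 from the question).
    private const int CharsPerNewline = 30;

    // Splits text into chunks whose weighted length stays within budget.
    public static List<string> SplitIntoSlides(string text, int budget)
    {
        var slides = new List<string>();
        if (string.IsNullOrEmpty(text)) return slides;

        int start = 0;
        while (start < text.Length)
        {
            int remaining = budget;
            int index = start;
            int lastBreak = -1; // position just after the last whitespace seen

            while (index < text.Length)
            {
                int cost = text[index] == '\n' ? CharsPerNewline : 1;
                // Stop before overshooting, but always take at least one character.
                if (cost > remaining && index > start) break;
                remaining -= cost;
                index++;
                if (char.IsWhiteSpace(text[index - 1])) lastBreak = index;
            }

            // Prefer a cut after whitespace unless the chunk reaches the end.
            if (index < text.Length && lastBreak > start) index = lastBreak;

            slides.Add(text.Substring(start, index - start));
            start = index;
        }
        return slides;
    }
}
```

Calling `SlideSplitter.SplitIntoSlides(text, 1000)` would give one string per slide, and concatenating the chunks reproduces the original text, since nothing is added or dropped at the cut points.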