7

I'd like to be able to cleanly cut a paragraph that is longer than a certain number of characters, without cutting a word in the middle.

So for example this:

It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English.

Should become:

It is a long established fact that a reader will be distracted by the readable content ...

Here is the function that I came up with:

func truncateText(s string, max int) string {
    if len(s) > max {
        r := 0
        for i := range s {
            r++
            if r > max {
                return s[:i]
            }
        }
    }
    return s
}

But it just brutally cuts the text. How can I modify it (or replace it with a better solution) so that it cuts at a word boundary and ends the text with an ellipsis?

Smn

6 Answers

9

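Slicing strings can be problematic because slicing works with bytes, not runes. A range loop over a string, however, iterates rune by rune and yields the byte offset of each rune:

// Needs: import "unicode"
func truncateText(str string, max int) string {
    lastSpaceIx := -1
    count := 0 // runes seen so far (named count so the builtin len is not shadowed)
    for i, r := range str {
        if unicode.IsSpace(r) {
            lastSpaceIx = i
        }
        count++
        if count >= max {
            if lastSpaceIx != -1 {
                return str[:lastSpaceIx] + "..."
            }
            // If here, string is longer than max, but has no spaces.
            // A hard cut at the current rune boundary is one way to handle it.
            return str[:i] + "..."
        }
    }
    // If here, string is shorter than max
    return str
}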
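
To make the byte-versus-rune point concrete, here is a minimal sketch. The helper name firstRunes is made up for this illustration; it only shows that the index yielded by range is a byte offset that always lands on a rune boundary, while a plain slice can split a character.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes (hypothetical helper) returns the first n runes of s.
// It cuts at a byte offset yielded by range, so no rune is ever split.
func firstRunes(s string, n int) string {
	count := 0
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8

	fmt.Println(len(s), utf8.RuneCountInString(s)) // 6 5
	fmt.Printf("%q\n", s[:2])                      // "h\xc3" (the slice splits 'é')
	fmt.Printf("%q\n", firstRunes(s, 2))           // "hé"
}
```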
Burak Serdar
6

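The range loop is unnecessary as written. It only matters for multi-byte characters, because it counts runes while a slice counts bytes. For ASCII text, your whole function could just be:

func truncateText(s string, max int) string {
    if max >= len(s) {
        return s // nothing to cut; s[:max] would panic past the end of s
    }
    return s[:max]
}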

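Which is so simple it shouldn't even be a function; but of course it will also cut off words, which you said you don't want. So instead you could cut at the last space before the limit:

func truncateText(s string, max int) string {
    if max >= len(s) {
        return s
    }
    // Cut at the last space within the first max bytes.
    if i := strings.LastIndex(s[:max], " "); i >= 0 {
        return s[:i]
    }
    // No space found: s[:-1] would panic, so fall back to a hard cut.
    return s[:max]
}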

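Or, if you want to treat several characters as word boundaries rather than just spaces:

func truncateText(s string, max int) string {
    if max >= len(s) {
        return s
    }
    // Cut at the last boundary character within the first max bytes.
    if i := strings.LastIndexAny(s[:max], " .,:;-"); i >= 0 {
        return s[:i]
    }
    // No boundary found: fall back to a hard cut instead of slicing with -1.
    return s[:max]
}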
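
As a rough sketch of how the boundary search could be combined with the ellipsis the question asks for: the name truncateAtBoundary and its boundaries parameter are assumptions made for this example, not part of the functions above. It returns the text untouched when no boundary falls inside the limit, which is only one possible choice.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtBoundary (hypothetical helper) cuts s at the last boundary
// character found within the first max bytes and appends "...".
func truncateAtBoundary(s string, max int, boundaries string) string {
	if max < 0 {
		max = 0
	}
	if max >= len(s) {
		return s
	}
	i := strings.LastIndexAny(s[:max], boundaries)
	if i < 0 {
		return s // no boundary inside the limit: leave the text as it is
	}
	// Drop boundary characters left at the end, so "hello," becomes "hello".
	return strings.TrimRight(s[:i], boundaries) + "..."
}

func main() {
	fmt.Println(truncateAtBoundary("hello, world again", 10, " ,")) // hello...
	fmt.Println(truncateAtBoundary("short", 10, " ,"))              // short
}
```

Note that max counts bytes here, as in the functions above, and the ellipsis is not included in the limit.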
Adrian
1

I improved on the answer from Burak. This implementation returns the exact input string if len(text)=maxLen instead of adding an ellipsis, and if text has no spaces, it just does a hard truncation at maxLen.

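func EllipticalTruncate(text string, maxLen int) string {
    lastSpaceIx := -1
    runeCount := 0 // named runeCount so the builtin len is not shadowed
    for i, r := range text {
        if unicode.IsSpace(r) {
            lastSpaceIx = i
        }
        runeCount++
        if runeCount > maxLen {
            if lastSpaceIx < 0 {
                // No space seen: hard cut at i, which is always a rune boundary
                lastSpaceIx = i
            }
            return text[:lastSpaceIx] + "..."
        }
    }
    // If here, string is shorter or equal to maxLen
    return text
}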

Test Case

func TestEllipticalTruncate(t *testing.T) {
    assert.Equal(t, "...", EllipticalTruncate("1 2 3", 0))
    assert.Equal(t, "1...", EllipticalTruncate("1 2 3", 1))
    assert.Equal(t, "1...", EllipticalTruncate("1 2 3", 2))
    assert.Equal(t, "1 2...", EllipticalTruncate("1 2 3", 3))
    assert.Equal(t, "1 2 3", EllipticalTruncate("1 2 3", 5))
}
Mark Lakata
0

The following solution avoids range but still takes multibyte runes into account:

func ellipsis(s string, maxLen int) string {
    runes := []rune(s)
    if len(runes) <= maxLen {
        return s
    }
    if maxLen < 3 {
        maxLen = 3
    }
    return string(runes[0:maxLen-3]) + "..."
}

See https://go.dev/play/p/ibj6aK7N0rc
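
The function above can still cut in the middle of a word. A possible variation on the same []rune approach is sketched below, walking back from the limit to the nearest whitespace. The name ellipsisWords is made up for this example, and unlike ellipsis it does not count the three dots against maxLen.

```go
package main

import (
	"fmt"
	"unicode"
)

// ellipsisWords (hypothetical helper) keeps at most maxLen runes of s,
// backing up to the previous whitespace when there is one.
func ellipsisWords(s string, maxLen int) string {
	runes := []rune(s)
	if maxLen < 0 {
		maxLen = 0
	}
	if len(runes) <= maxLen {
		return s
	}
	cut := maxLen // default: hard cut when no whitespace is found
	for j := maxLen; j > 0; j-- {
		if unicode.IsSpace(runes[j]) {
			cut = j
			break
		}
	}
	// Drop whitespace left at the end of the kept part.
	for cut > 0 && unicode.IsSpace(runes[cut-1]) {
		cut--
	}
	return string(runes[:cut]) + "..."
}

func main() {
	fmt.Println(ellipsisWords("héllo wörld", 7)) // héllo...
	fmt.Println(ellipsisWords("héllo wörld", 3)) // hél...
}
```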

Cyril C
0

I offer a very simple variant.

https://go.dev/play/p/Pbk5DchjReT

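// Needs: import "unicode/utf8"
func ShortText(s string, i int) string {
    if len(s) <= i {
        return s
    }

    // If byte i falls inside a multi-byte rune, move forward to the start
    // of the next rune so that the cut never splits a character.
    for i < len(s) && !utf8.RuneStart(s[i]) {
        i++
    }
    return s[:i]
}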
-2

To split on any kind of whitespace, you can use a regular expression:

func splitString(str string) []string {
    // \s already matches spaces, tabs, \n and \r. Edit this regex for other conditions.
    re := regexp.MustCompile(`\s+`)

    split := re.Split(str, -1)
    return split
}

func main() {
    var s = "It is a long\nestablished fact that a reader\nwill\nbe distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English."
    var maxLen = 40

    arr := splitString(s)

    totalLen := 0
    finalStr := ``
    for _, each := range arr {
        if (totalLen + len(each) > maxLen) {
            fmt.Print(strings.TrimSpace(finalStr) + `...`)
            break
        }
        totalLen += len(each)
        finalStr += each + ` `

    }
}
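
The same word-by-word idea can be sketched as a reusable function. This version is only an illustration under a few assumptions: the name truncateWords is made up, it uses strings.Fields (which splits on any run of Unicode whitespace) instead of a regex, it counts runes rather than bytes, and it counts the joining spaces toward the limit.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// truncateWords (hypothetical helper) keeps whole words while the result
// stays within maxLen runes, then appends "..." if anything was dropped.
// Runs of whitespace, including newlines, collapse to a single space.
func truncateWords(s string, maxLen int) string {
	var b strings.Builder
	used := 0
	for _, w := range strings.Fields(s) {
		n := utf8.RuneCountInString(w)
		if used > 0 {
			n++ // the space that joins this word to the previous one
		}
		if used+n > maxLen {
			return b.String() + "..."
		}
		if used > 0 {
			b.WriteByte(' ')
		}
		b.WriteString(w)
		used += n
	}
	return b.String()
}

func main() {
	s := "It is a long\nestablished fact that a reader\nwill\nbe distracted"
	fmt.Println(truncateWords(s, 30)) // It is a long established fact...
}
```

If the first word alone is longer than maxLen, this sketch returns just "..."; a hard cut could be added for that case.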

//Old 2

You can do something like this: split your string into a slice of words, then loop through the slice until adding the next word would push the total length past the maximum allowed length.

    var s = "It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English."
    var maxLen = 30

    arr := strings.SplitAfter(s, ` `)

    totalLen := 0
    finalStr := ``
    for _, each := range arr {
        if (totalLen + len(each) > maxLen) {
            fmt.Print(strings.TrimSpace(finalStr) + `...`)
            break
        }
        totalLen += len(each)
        finalStr += each

    }

It is a long established fact...


//old bad answer
You have to work with strings and slices:

    var s = "It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English."

    newS := s[:30 - 3] 
    newS += `...`
    fmt.Print(newS)

Result: It is a long established fa...

TBouder
  • First line of the question: "I'd like to be able to cleanly cut a paragraph larger than certain number of characters without cutting a word in the middle." – Adrian Jan 28 '20 at 19:00
  • My bad, I updated my answer with a working solution – TBouder Jan 28 '20 at 19:09
  • Easy to understand, but does not take care of situation where there is newline in the text. So not robust enough. – Smn Jan 28 '20 at 19:56
  • If you want to split on whitespace, newline, tab, etc. you need to split the original string after the wanted values. I updated my original answer. – TBouder Jan 28 '20 at 20:41