
To improve my skills, I am trying to solve this HackerRank challenge:

There is a string, s, of lowercase English letters that is repeated infinitely many times. Given an integer, n, find and print the number of letter a's in the first n letters of the infinite string.

1 <= len(s) <= 100 and 1 <= n <= 10^12

Very naively, I thought this code would be fine:

fs := strings.Repeat(s, int(n)) // full string
ss := fs[:n]                    // sub string
fmt.Println(strings.Count(ss, "a"))

Obviously, this exhausted the memory and I got an "out of memory" error.

I have never faced this kind of issue, and I'm clueless about how to handle it. How can I work with a very long string without running out of memory?

Ben
  • Do not materialise the string of characters as a Go string variable. – Volker Jun 07 '22 at 04:59
  • I wish to understand better: if I don't have my string as a string variable, how can I get the first n letters of my string? – Ben Jun 07 '22 at 05:09

2 Answers


I hope this helps. You don't have to actually count by running through the whole string; that is the naive approach. Some basic arithmetic gives the answer without running out of memory. I hope the comments help.

var answer int64

// First, figure out how many a's are present in s.
aCount := int64(strings.Count(s, "a"))

// How many times s repeats in its entirety within the first n letters,
// and how many letters are left over after those full repeats.
repeats := n / int64(len(s))
remainder := n % int64(len(s))

// If n is not evenly divisible by len(s), there is a remainder.
// If s is of length 5 and n = 22, the first 2 characters of s appear one extra time.
if remainder > 0 {
    aCountInRemainder := strings.Count(s[:remainder], "a")
    answer = aCount*repeats + int64(aCountInRemainder)
} else {
    answer = aCount * repeats
}

return answer

There might be other methods but this is what came to my mind.

Dhiwakar Ravikumar
  • Thanks for the tips. I'm actually trying to use this example to find a general approach for when I face this kind of issue – Ben Jun 07 '22 at 05:47

As you found out, if you actually generate the string you will end up having that huge memory block in RAM.

One common way to represent a "big sequence of incoming bytes" is to implement it as an io.Reader (which you can view as a stream of bytes), and have your code run a r.Read(buff) loop.


Given the specifics of the exercise you mention (a fixed string repeated n times), the number of occurrences of a specific letter can also be computed straight from the number of occurrences of that letter in s, plus something more (I'll let you figure out what multiplications and counting should be done).


How do you implement a Reader that repeats the string without allocating the string 10^12 times?

Note that, when implementing the .Read() method, the caller has already allocated its buffer. You don't need to repeat your string in memory, you just need to fill the buffer with the correct values -- for example by copying your data byte by byte into the buffer.

Here is one way to do it:

type RepeatReader struct {
    str   string
    count int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }

    // at each iteration, pos will hold the number of bytes copied so far
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // to copy slices over, you can use the built-in 'copy' method
        // at each iteration, you need to write bytes *after* the ones you have already copied,
        // hence the "p[pos:]"
        n := copy(p[pos:], r.str)
        // update the amount of copied bytes
        pos += n

        // bad computation for this first example :
        // I decrement one complete count, even if str was only partially copied
        r.count--
    }

    return pos, nil
}

https://go.dev/play/p/QyFQ-3NzUDV

To have a complete, correct implementation, you also need to keep track of the offset to start from the next time .Read() is called:

type RepeatReader struct {
    str    string
    count  int
    offset int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }

    var pos = 0
    for r.count > 0 && pos < len(p) {
        // when copying over to p, you should start at r.offset :
        n := copy(p[pos:], r.str[r.offset:])
        pos += n

        // update r.offset :
        r.offset += n
        // if one full copy of str has been issued, decrement 'count' and reset 'offset' to 0
        if r.offset == len(r.str) {
            r.count--
            r.offset = 0
        }
    }

    return pos, nil
}

https://go.dev/play/p/YapRuioQcOz


You can now count the a's while iterating through this Reader.

LeGEC
  • Thanks, I'll try with io.Reader. I guess that's what I was looking for. When I find the solution, I'll post it. – Ben Jun 07 '22 at 05:45
  • io.Reader is a good approach but I believe it doesn't work for this case... The problem is that I can't even repeat the string n times without getting an _out of memory_ error. As a result I can't pass my big string to io.Reader. But thanks again, at least I practiced with r.Read(buf) – Ben Jun 07 '22 at 07:42
  • @Ben : "I can't even repeat the string n times without generating an *out of memory*" here is a question for you : what is n in your case ? – LeGEC Jun 07 '22 at 08:01
  • When n is small, I can use a Reader just fine, but when n = 10^12... I can't figure out how to do it – Ben Jun 07 '22 at 08:04
  • you are still creating that `strings.Repeat(...)` string, aren't you ? ;) Look at the signature of your `.Read(...)` function, and think for a moment about what you actually *need* to allocate. – LeGEC Jun 07 '22 at 08:07
  • Indeed, I realized belatedly that it was very naive of me to continue with this approach... I just wanted to find a way to neatly split this huge string with a Reader... but I was wrong from the very beginning – Ben Jun 07 '22 at 08:11
  • Have you found a way to copy the same string over and over into the read buffer? Do you need assistance with that? – LeGEC Jun 07 '22 at 11:56
  • No. When I realized that I can't use strings.Repeat, I started looking for a way with a _for_ loop, but I was going in circles with this big n (10^12). I would indeed appreciate seeing how you can repeat a string 10^12 times and cut it down to the first 10^12 characters – Ben Jun 07 '22 at 23:59
  • Thanks a lot @LeGec for your time and explanation. It is much appreciated and helped me a lot. That's indeed the solution I was looking for. Of course the time complexity is still very high when n=10^12, but I no longer get the _out of memory_ error message. – Ben Jun 13 '22 at 04:02