How selecting M < n and gcd(M,p) > 1 increases collision?
In your hash function formula, M might reasonably be used to restrict the hash result to a specific bit-width: e.g. M=216 for a 16-bit hash, M=232 for a 32-bit hash, M=2^64 for a 64-bit hash. Usually, a mod/% operation is not actually needed in an implementation, as using the desired size of unsigned integer for the hash calculation inherently performs that function.
I don't recommend it, but sometimes you do see people describing hash functions that are so exclusively coupled to the size of a specific hash table that they mod the results directly to the table size.
The text you quote from says:
A good requirement for a hash function on strings is that it should be difficult to find a pair of different strings, preferably of the same length n, that have equal fingerprints. This excludes the choice of M < n.
This seems a little silly in three separate regards. Firstly, it implies that hashing a long passage of text requires a massively long hash value, when practically it's the number of distinct passages of text you need to hash that's best considered when selecting M.
More specifically, if you have V distinct values to hash with a good general purpose hash function, you'll get dramatically less collisions of the hash values if your hash function produces at least V2 distinct hash values. For example, if you are hashing 1000 values (~210), you want M to be at least 1 million (i.e. at least 2*10 = 20-bit hash values, which is fine to round up to 32-bit but ideally don't settle for 16-bit). Read up on the Birthday Problem for related insights.
Secondly, given n is the number of characters, the number of potential values (i.e. distinct inputs) is the number of distinct values any specific character can take, raised to the power n. The former is likely somewhere from 26 to 256 values, depending on whether the hash supports only letters, or say alphanumeric input, or standard- vs. extended-ASCII and control characters etc., or even more for Unicode. The way "excludes the choice of M < n" implies any relevant linear relationship between M and n is bogus; if anything, it's as M drops below the number of distinct potential input values that it increasingly promotes collisions, but again it's the actual number of distinct inputs that tends to matter much, much more.
Thirdly, "preferably of the same length n" - why's that important? As far as I can see, it's not.
I've nothing to add to templatetypedef's discussion on gcd.