
Given a URL, I want to count the characters in each run between special characters, writing `s` after the count for non-digit characters and `d` for digits. For example, for a URL like this:

url="https://imag9045.pexels1.pre45p.com/photos/414612/pexels-photo-414612.jpeg"

I want the output to be: '4s.4d.6s.1d.4s.2d.com/6s/6d/6s-5s-6d.'

The code I have below only generates the desired result before the domain (before '.com'). I am having issues generating the rest.

How can I change it to get the desired output (`'4s.4d.6s.1d.4s.2d.com/6s/6d/6s-5s-6d.'`)?
user872009

1 Answer


You will need to loop over every character, for example:

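import string

def mysplit(path):
    # s counts non-digit characters, d counts digits in the current segment
    s = d = 0
    out = ''
    for c in path:
        if c in string.punctuation:
            # a special character ends the segment: emit the counts, then the character
            if s:
                out += f'{s}s'
                s = 0
            if d:
                out += f'{d}d'
                d = 0
            out += c
        elif c in string.digits:
            d += 1
        else:
            s += 1
    # emit whatever is left after the last special character
    if s:
        out += f'{s}s'
    if d:
        out += f'{d}d'
    return out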

>>> mysplit('/photos/414612/pexels-photo-414612.jpeg')
'/6s/6d/6s-5s-6d.4s'
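
Note that `mysplit` tallies letters and digits per segment, so the letter count is always written before the digit count even when the digits come first or are interleaved. If the order of runs matters, one possible variant is sketched below using `itertools.groupby`. The names `classify` and `mysplit_ordered` are made up for this illustration, and this changes the output for mixed segments, so it is only useful if that is what you want.

```python
import string
from itertools import groupby


def classify(c):
    # 'p' for special characters, 'd' for digits, 's' for everything else
    if c in string.punctuation:
        return 'p'
    if c in string.digits:
        return 'd'
    return 's'


def mysplit_ordered(path):
    out = []
    # groupby yields consecutive runs of characters of the same kind
    for kind, run in groupby(path, key=classify):
        chars = ''.join(run)
        # special characters are copied as-is; other runs become '<length><kind>'
        out.append(chars if kind == 'p' else f'{len(chars)}{kind}')
    return ''.join(out)
```

For a segment such as `'pre45p'` this gives `'3s2d1s'`, where `mysplit` gives `'4s2d'`.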

Apart from the handling of the top-level domain name, `mysplit` can be used for the first part of the URL as well:

>>> mysplit('https://imag9045.pexels1.pre45p.com')
'5s://4s4d.6s1d.4s2d.3s'
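
To get closer to the exact string in the question, here is a rough sketch that combines the two halves. It makes several assumptions read off the desired output rather than stated in the question: the scheme is dropped, the top-level domain is kept literally, letter and digit counts in the host are joined with a `.`, and the file extension after the last `.` in the path is discarded. The names `count_runs` and `summarise_url` are invented for this example; `urlsplit` is the standard-library URL parser.

```python
import string
from urllib.parse import urlsplit


def count_runs(text, joiner=''):
    # Same idea as mysplit, but the letter and digit counts of one
    # segment can be joined with an optional separator.
    out = []
    s = d = 0

    def flush():
        nonlocal s, d
        parts = []
        if s:
            parts.append(f'{s}s')
        if d:
            parts.append(f'{d}d')
        out.append(joiner.join(parts))
        s = d = 0

    for c in text:
        if c in string.punctuation:
            flush()
            out.append(c)
        elif c in string.digits:
            d += 1
        else:
            s += 1
    flush()
    return ''.join(out)


def summarise_url(url):
    parts = urlsplit(url)
    # Assumption: everything after the last dot of the host is the TLD.
    host, dot, tld = parts.netloc.rpartition('.')
    domain = count_runs(host, joiner='.') + dot + tld
    # Assumption: the last dot in the path starts a file extension to drop.
    stem, ext_dot, _ext = parts.path.rpartition('.')
    if ext_dot:
        path = count_runs(stem) + '.'
    else:
        path = count_runs(parts.path)
    return domain + path
```

>>> summarise_url('https://imag9045.pexels1.pre45p.com/photos/414612/pexels-photo-414612.jpeg')
'4s.4d.6s.1d.4s.2d.com/6s/6d/6s-5s-6d.'

Multi-part suffixes such as `.co.uk`, ports, and dots inside directory names are not handled by this sketch.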
gimix