How can I write a function to create a list of indexes of selected characters in a string?

Question

I've been given a string, and I need to write a function that returns a list containing the indexes of the spaces between the words.

What have you tried so far? Please provide some code so that we can see what the problem is. — , Oct 24 '21 at 15:11
1. This is not a place where we solve your homework. 2. You should show us what you have already tried so that we can help you solve your problem. That said, you can iterate through the string as you would in C (using indexes) and check whether the character at each index is a space: if it is, add the index to the list; if not, continue. — StyleZ, Oct 24 '21 at 15:11
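
The index-based loop described in the comment above could look roughly like this. It is only a minimal sketch: the function name `space_indexes` is made up for illustration and does not come from the question.

```python
def space_indexes(text):
    """Return the indexes of every space character in text."""
    indexes = []
    for i in range(len(text)):  # walk the string by index, as one would in C
        if text[i] == " ":      # is the character at this index a space?
            indexes.append(i)
    return indexes


print(space_indexes("my string is here"))  # [2, 9, 12]
```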

mysterymachines · Answer 1 · 2021-10-24T16:22:05.363

To do this efficiently, you can use the regular expression library (re):

import re
[match.start() for match in re.finditer(' ', 'my string is here')]
# [2, 9, 12]

re.finditer finds all occurrences of its first parameter (here a space) in the string given as its second parameter.
You can then get the start index of each match from the match object with match.start().

This finds every space, so it will not work if you want only the first index when there are several spaces between two words. You may want to clarify your question.
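
If only the first index of each run of spaces is wanted, one possible adjustment is to match one or more spaces with the pattern ' +'. This is a sketch under that assumption; the sample string is made up for illustration.

```python
import re

text = "my  string   is here"  # hypothetical input with repeated spaces

# ' +' matches a whole run of consecutive spaces, so each run yields one match
first_of_each_run = [m.start() for m in re.finditer(" +", text)]
print(first_of_each_run)  # [2, 10, 15]
```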

To compare timings with other solutions:

import re
from time import time

text = "long text my test text was about 20000 chars"
start = time()
indices = [match.start() for match in re.finditer(' ', text)]
print(f'regex time :: {time() - start}')

start = time()
indices = []
text_length = len(text)
index = 0
while index < text_length:
    index = text.find(' ', index)
    if index == -1:
        break
    indices.append(index)
    index += 1
print(f'str.find :: {time() - start}')

start = time()
indices = [i for i, c in enumerate(text) if c == " "]
print(f'enumerate :: {time() - start}')

# regex time :: 0.0004987716674804688
# str.find :: 0.0010783672332763672
# enumerate :: 0.0011782646179199219

In this example (a text of about 20,000 characters), the first solution is about twice as fast as the other two.
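
A single time() measurement can be noisy, so the comparison could also be repeated with the standard timeit module. The following is only a sketch: the synthetic text and the function names are assumptions, not the original test setup, and the timings will vary by machine and Python version.

```python
import re
import timeit

# Synthetic text of 20,000 characters (an assumption, not the original test text)
text = "word " * 4000


def with_regex():
    return [m.start() for m in re.finditer(" ", text)]


def with_find():
    indices = []
    index = text.find(" ")
    while index != -1:  # str.find returns -1 once no further space exists
        indices.append(index)
        index = text.find(" ", index + 1)
    return indices


def with_enumerate():
    return [i for i, c in enumerate(text) if c == " "]


# All three approaches must agree before their timings are worth comparing
assert with_regex() == with_find() == with_enumerate()

for func in (with_regex, with_find, with_enumerate):
    # Best of 5 repeats, 20 calls per repeat
    best = min(timeit.repeat(func, number=20, repeat=5))
    print(f"{func.__name__} :: {best / 20:.6f} s per call")
```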

I am not quite sure whether it would be "efficient" ... as far as I know, regex is very often slow. — StyleZ, Oct 24 '21 at 15:22
If you have a faster solution, I would love to see it! I'm editing my answer to add a speed comparison. — mysterymachines, Oct 24 '21 at 15:48
You are right, I was unable to make it more efficient :) ... Good job, and thank you for the clarification. — StyleZ, Oct 24 '21 at 16:44

score 0 · Answer 2 · answered Oct 24 '21 at 15:11

Your question is not fully clear.

Assuming you have a string "abcdefgh" and a list [2,3,6] and you want to insert spaces before each position in the list, you could do:

s = 'abcdefgh'
idx = [2,3,6]
start = 0
out = []
for stop in idx+[len(s)]:
    out.append(s[start:stop])
    start = stop
' '.join(out)

output: 'ab c def gh'

score 0 · Answer 3 · answered Oct 24 '21 at 16:17

You could use enumerate in a list comprehension:

s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"

[i for i,c in enumerate(s) if c==" "]

[5, 11, 17, 21, 27, 39, 50]
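
Since the question asks for a function and the title mentions "selected characters", the comprehension could be wrapped as follows. This is a minimal sketch; the name `indexes_of` and its `chars` parameter are hypothetical.

```python
def indexes_of(text, chars=" "):
    """Return the indexes in text whose character is one of chars."""
    wanted = set(chars)  # set membership keeps the per-character test cheap
    return [i for i, c in enumerate(text) if c in wanted]


s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"
print(indexes_of(s))        # [5, 11, 17, 21, 27, 39, 50]
print(indexes_of(s, ", "))  # [5, 11, 17, 21, 26, 27, 39, 50]
```

With the default argument it returns the space indexes shown above; passing several characters selects any of them.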


Alain T.
