I want to create an empty list in Python so that I can add items to it later through a function. But when I tried to add items to it, I got "TypeError: Can't convert 'tuple' object to str implicitly". Why am I getting this?

page = "There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, " \
       "or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure there isn't " \
       "anything embarrassing hidden in the middle of text. All the Lorem Ipsum generators on the Internet tend to repeat predefined chunks as necessary, " \
       "making this the first true generator on the Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence " \
       "structures, to generate Lorem Ipsum which looks reasonable. The generated Lorem Ipsum is therefore always free from repetition, injected humour, " \
       "or non-characteristic words etc."

find_word = "the"
word_positions = []
pos = 0

while page.find(find_word) != -1:
        word_positions.append(page.find((find_word, pos)))
        pos = pos + len(find_word)

print(word_positions)

2 Answers

In the expression word_positions.append(page.find((find_word, pos))), page.find((find_word, pos)) passes a tuple to page.find, but page.find is expecting the first argument to be a string (the word to find). You want:

page.find(find_word, pos)

(notice that I dropped one set of parentheses)
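To see the difference in isolation, here is a minimal sketch using a short made-up string (the names text, raised and second are only for illustration and are not from the question):

```python
text = "the cat and the hat"

# A tuple as the first argument is not a string, so str.find raises TypeError.
try:
    text.find(("the", 4))
    raised = False
except TypeError:
    raised = True

# The word and the start offset passed as two separate arguments works.
second = text.find("the", 4)

print(raised, second)
```

The second call starts searching at index 4, so it skips the match at index 0 and reports the next one.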


There are some other logic errors in your code as well. First, your loop might go on forever because page.find(find_word) will always find something if it found something the first time. Change it to:

while page.find(find_word, pos) != -1:

Second, you'll end up with duplicates in your list from:

pos = pos + len(find_word)

The length of the search word has nothing to do with the position at which you expect to find the next match. You probably want:

pos = word_positions[-1] + 1

since you want to continue looking immediately after the last found item.
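Putting the three fixes together, the loop might look something like the sketch below. The function name find_all_positions and the short sample string are my own choices for illustration; the sketch also calls find once per iteration and reuses the result, rather than calling it in both the loop condition and the body.

```python
def find_all_positions(text, word):
    """Return the start index of every occurrence of word in text."""
    positions = []
    pos = 0
    while True:
        # Pass the word and the offset as two separate arguments.
        index = text.find(word, pos)
        if index == -1:
            break
        positions.append(index)
        # Resume searching just after the start of the last match.
        pos = index + 1
    return positions


sample = "the cat and the hat"
print(find_all_positions(sample, "the"))
```

Because the search resumes one character after the start of the previous match, overlapping occurrences are reported too.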


Finally, this task can also be accomplished almost trivially using re. (You don't even have to write a regular expression since you're looking for a literal word!):

import re
word_positions = []
for match in re.finditer(find_word, page):
    word_positions.append(match.start())

print(word_positions)

Note that this can also be written in one line as a list comprehension:

word_positions = [m.start() for m in re.finditer(find_word, page)]
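One caveat: passing the word straight to re.finditer only behaves like a literal search if the word has no regex metacharacters. For a word that might contain characters such as "." or "*", wrapping it in re.escape is the usual precaution. A small sketch with made-up sample data (text and word here are hypothetical, not the values from the question):

```python
import re

text = "a.b axb a.b"
word = "a.b"

# Unescaped, "." matches any character, so "axb" is reported as well.
unescaped = [m.start() for m in re.finditer(word, text)]

# re.escape makes every character in the word match literally.
escaped = [m.start() for m in re.finditer(re.escape(word), text)]

print(unescaped, escaped)
```

For a plain word like "the", both forms give the same result.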
mgilson

How about:

import re

page = "There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, " \
       "or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure there isn't " \
       "anything embarrassing hidden in the middle of text. All the Lorem Ipsum generators on the Internet tend to repeat predefined chunks as necessary, " \
       "making this the first true generator on the Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence " \
       "structures, to generate Lorem Ipsum which looks reasonable. The generated Lorem Ipsum is therefore always free from repetition, injected humour, " \
       "or non-characteristic words etc."

find_word = "the"
word_positions = []

# Record each match as a (word, start index) tuple.
for match in re.finditer(find_word, page):
    word_positions.append((find_word, match.start()))

print(word_positions)

It outputs:

[('the', 68), ('the', 273), ('the', 317), ('the', 341), ('the', 371), ('the', 443), ('the', 471), ('the', 662)]
advance512