
I want to collect each word from a list that appears in a string in Python. I found some solutions, but so far I have:

data = "Today I gave my dog some carrots to eat in the car"
tweet = data.lower()                             #convert to lower case
split = tweet.split()

matchers = ['dog','car','sushi']
matching = [s for s in split if any(xs in s for xs in matchers)]
print(matching)

The result is

['dog', 'carrots', 'car']

How do I change this so the result is only dog and car, without adding spaces to my matchers?

Also, how would I remove any $ signs (for example) from the data string, but not other special characters like @?

appletree3
  • Also can anybody explain how the line "matching = [s for s in split if any(xs in s for xs in matchers)]" works? What is the meaning of s and xs? – appletree3 Aug 16 '22 at 14:23
  • Do an equality test: `xs == s` instead of `xs in s`? – jarmod Aug 16 '22 at 14:25
  • _"can anybody explain how the line "matching = ..." works:"_ Look up [list comprehensions](https://www.google.com/search?q=python+list+comprehension) – Pranav Hosangadi Aug 16 '22 at 14:27
  • _"how would I remove any $ signs:"_ Look up [how to replace characters in a string](https://www.google.com/search?q=how+to+replace+characters+in+a+string+python) – Pranav Hosangadi Aug 16 '22 at 14:28
  • [How much research effort is expected of Stack Overflow users?](//meta.stackoverflow.com/a/261593/843953) Please take the [tour], and read [what's on-topic here](/help/on-topic), [ask], and the [question checklist](//meta.stackoverflow.com/q/260648/843953). Please restrict yourself to one question per post. Welcome to Stack Overflow! – Pranav Hosangadi Aug 16 '22 at 14:29

1 Answer

How do I change this so the result is only dog and car, without adding spaces to my matchers?

To do this with your current code, replace this line:

matching = [s for s in split if any(xs in s for xs in matchers)]

With this:

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
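
As for why the original line also returned 'carrots': in that list comprehension, s is each word from split and xs is each entry in matchers, and `xs in s` between two strings is a substring test, so 'car' in 'carrots' is True. Here is a minimal sketch comparing the two behaviours. The names words, substring_hits, matcher_set and exact_hits are only illustrative and are not from the question:

```python
data = "Today I gave my dog some carrots to eat in the car"
words = data.lower().split()
matchers = ['dog', 'car', 'sushi']

# Original approach: `xs in s` compares two strings, so it is a
# substring test and 'car' is found inside 'carrots'.
substring_hits = [s for s in words if any(xs in s for xs in matchers)]

# Whole-word approach: `s in matcher_set` checks whether the entire
# word equals one of the matchers.
matcher_set = set(matchers)
exact_hits = [s for s in words if s in matcher_set]

print(substring_hits)  # ['dog', 'carrots', 'car']
print(exact_hits)      # ['dog', 'car']
```

Note that this version keeps the order in which words appear in the string, while the loop above keeps the order of matchers.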

You also mention this:

Also, how would I remove any $ signs (for example) from the data string, but not other special characters like @?

To do this, I would create a list that contains characters you want to remove, like so:

things_to_remove = ['$', '*', '#']  # this can be anything you want to take out

Then remove each of those characters from the tweet string before you split it.

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
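
If you would rather avoid the loop, one alternative (not required, just a sketch) is to build a translation table with str.maketrans and apply it once with str.translate. The names table and cleaned are only illustrative:

```python
things_to_remove = ['$', '*', '#']

# The third argument of str.maketrans lists the characters to delete.
table = str.maketrans('', '', ''.join(things_to_remove))

tweet = "today i@@ gave my dog## some carrots to eat in the$ car"
cleaned = tweet.translate(table)

print(cleaned)  # today i@@ gave my dog some carrots to eat in the car
```

The @ characters are left alone because they are not in things_to_remove.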

Here is a final code block that puts all of this together:

data = "Today I@@ gave my dog## some carrots to eat in the$ car"
tweet = data.lower()                             #convert to lower case

things_to_remove = ['$', '*', '#']

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
print("After removing characters I don't want:")
print(tweet)

split = tweet.split()

matchers = ['dog','car','sushi']

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
print(matching)
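
One caveat with the split-based check: it only finds a word when the whole whitespace-separated token equals the matcher, so "car," or "car." would be missed unless you remove that punctuation first. If that matters for your data, a different technique is a regular expression with word boundaries. This is a rough sketch under that assumption, and find_whole_words is a hypothetical helper name, not something from the question:

```python
import re

def find_whole_words(text, matchers):
    """Return the matchers that occur as whole words in text (case-insensitive)."""
    lowered = text.lower()
    found = []
    for word in matchers:
        # \b marks a word boundary; re.escape protects any special
        # regex characters that a matcher might contain.
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, lowered):
            found.append(word)
    return found

result = find_whole_words("My dog sat in the car, eating carrots.", ['dog', 'car', 'sushi'])
print(result)  # ['dog', 'car']
```

Here 'car' is found despite the trailing comma, and 'carrots' still does not count as a match for 'car'.
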
Mitchnoff