1

Imagine I type the following code into the interpreter:

var1 = 'zuuzuu'

Now suppose I type:

var1.find('a')

The interpreter returns -1, which I understand because the substring was not found. But please help me understand this:

var1.find('a' or 'z') #case 1

returns -1

but

var1.find('a' and 'z') #case 2

returns 0

According to the logic in my head, the interpreter should return -1 for case 2 because the substrings 'a' AND 'z' are not both located in the string, while in case 1, 0 should be returned since 'z' is a substring.

thanks

3 Answers

8

The expression 'a' or 'z' always yields 'a', and the expression 'a' and 'z' always yields 'z'. This is not some kind of DSL for making queries into containers; it is an ordinary boolean expression, and find is called with its result. If you want to say "is there 'a' or 'z' in the string", you need to do

var1.find('a') != -1 or var1.find('z') != -1

And for the second one (both 'a' and 'z' in the string):

var1.find('a') != -1 and var1.find('z') != -1
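A small sketch of what the interpreter actually evaluates, assuming the same var1 as in the question. The use of the in operator and the helper names contains_any and contains_all are illustrative additions, not part of the answer above.

```python
var1 = 'zuuzuu'

# The argument expression is evaluated first, so find() only ever sees one string.
print('a' or 'z')              # 'a' is truthy, so `or` stops there -> a
print('a' and 'z')             # both are truthy, so `and` yields the last one -> z
print(var1.find('a' or 'z'))   # same as var1.find('a') -> -1
print(var1.find('a' and 'z'))  # same as var1.find('z') -> 0


def contains_any(text, substrings):
    """Return True if at least one of the substrings occurs in text (hypothetical helper)."""
    return any(s in text for s in substrings)


def contains_all(text, substrings):
    """Return True only if every substring occurs in text (hypothetical helper)."""
    return all(s in text for s in substrings)


print(contains_any(var1, ['a', 'z']))  # True: 'z' is present
print(contains_all(var1, ['a', 'z']))  # False: 'a' is missing
```

When only a yes/no answer is needed, the in operator tends to read more clearly than comparing find() against -1.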
Cat Plus Plus
2

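This is because the find method does not in fact support the or and and operators; it only searches for a single string.

So, what is really going on? Well, it turns out that or and and are ordinary operators that also work on strings. With two operands, or returns the first one if it is truthy, and and returns the second one if the first is truthy. Every non-empty string is truthy.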

'a' and 'z' --> 'z'
'a' or 'z'  --> 'a'

So there you have it: case 1 is just searching for 'a', and case 2 is just searching for 'z', as normal.
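To make the operator behaviour concrete, here is a minimal sketch that evaluates both operators over a few string pairs, including empty strings (which are falsy). The names examples and results are made up for this illustration.

```python
# `or` returns the first truthy operand, or the last operand if none is truthy.
# `and` returns the first falsy operand, or the last operand if all are truthy.
examples = [('a', 'z'), ('', 'z'), ('a', ''), ('', '')]

results = {}
for left, right in examples:
    results[(left, right)] = (left or right, left and right)
    print(repr(left), repr(right),
          '-> or:', repr(left or right),
          'and:', repr(left and right))
```

Only the first pair matches the question: both strings are non-empty, so or picks 'a' and and picks 'z'.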

Lasse V. Karlsen
0
def count_tokens(text):
    """Tokenize the given text and return a dictionary with the count of each distinct token."""

    # First, split the text into individual words
    words = text.split()

    # Next, create an empty dictionary to hold the token counts
    token_counts = {}

    # Loop over the words and count how many times each one appears
    for word in words:
        if word in token_counts:
            token_counts[word] += 1
        else:
            token_counts[word] = 1

    # Finally, return the token counts dictionary
    return token_counts

text = "This is a clock. This is only a clock."
counts = count_tokens(text)
print(counts)


### stopword function
# Requires the NLTK stopwords corpus; download it once with nltk.download('stopwords')
import nltk
from nltk.corpus import stopwords

def count_tokens(text):
    """Tokenize the given text, remove stopwords, and return a dictionary with the count of each distinct token."""

    # First, split the text into individual words
    words = text.split()

    # Next, remove stopwords from the words
    stop_words = set(stopwords.words('english'))
    words = [word for word in words if word.lower() not in stop_words]

    # Next, create an empty dictionary to hold the token counts
    token_counts = {}

    # Loop over the words and count how many times each one appears
    for word in words:
        if word in token_counts:
            token_counts[word] += 1
        else:
            token_counts[word] = 1

    # Finally, return the token counts dictionary
    return token_counts

text = "This is a clock. This is only a clock."
counts = count_tokens(text)
print(counts)
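For comparison, the manual counting loop above can be sketched more compactly with collections.Counter from the standard library. This is an alternative, not the answer's own method, and the function name count_tokens_with_counter is hypothetical.

```python
from collections import Counter


def count_tokens_with_counter(text):
    """Split the text on whitespace and count each distinct token."""
    # Counter is a dict subclass, so the result can be used like token_counts above.
    return Counter(text.split())


counts = count_tokens_with_counter("This is a clock. This is only a clock.")
print(counts["This"], counts["clock."], counts["only"])  # 2 2 1
```

As with the original, splitting on whitespace leaves punctuation attached, so the token counted here is "clock." rather than "clock".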
S.B
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Apr 10 '23 at 13:51