
I have the following DataFrames in Python that are stored in a list:

import pandas as pd

dataframe_list = []  # create an empty list

A = pd.DataFrame()
A["name"] = ["A", "A", "A"]
A["att"] = ["New World", "Hello", "Big Day now"]

B = pd.DataFrame()
B["name"] = ["A2", "A2", "A2"]
B["Col"] = ["L", "B", "B"]
B["CC"] = ["old", "Hello", "Big Day now"]

C = pd.DataFrame()
C["name"] = ["Brave old World", "A", "A"]

The DataFrames above have different shapes. They are stored in the list as follows:

dataframe_list.append(A)
dataframe_list.append(B)
dataframe_list.append(C)

I am trying to extract the two DataFrames that contain the word "world" (irrespective of case). I have tried the following code:

list1 = ["World"]
result = [x for x in dataframe_list if any(x.isin(list1))]

However, this returns all of the DataFrames. The expected output is DataFrames A and C. I am not sure where I am making a mistake.

Raghavan vmvs

1 Answer


Use DataFrame.stack to flatten each DataFrame into a Series, then test it with Series.str.contains, passing the word w as a string instead of a one-element list. Word boundaries (\b) are also added so that only whole words match:

w = "World"
result = [x for x in dataframe_list if x.stack().str.contains(rf"\b{w}\b", case=False).any()]
print(result)
[  name          att
0    A    New World
1    A        Hello
2    A  Big Day now,               name
0  Brave old World
1                A
2                A]
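
For anyone wondering why the original attempt returned every DataFrame, here is a minimal sketch of what is probably happening, using only frame A from the question. The variable names (mask, labels and so on) are only for illustration. Iterating over a DataFrame yields its column labels, so any() is testing the labels rather than the cell values, and isin compares whole cells instead of searching for substrings.

```python
import pandas as pd

A = pd.DataFrame({"name": ["A", "A", "A"],
                  "att": ["New World", "Hello", "Big Day now"]})

# isin returns a boolean DataFrame with the same shape as A.
mask = A.isin(["World"])

# Iterating over a DataFrame yields column labels, not values,
# so the built-in any() only checks that some label is truthy.
labels = list(mask)          # the column labels of A
always_true = any(mask)      # True for any non-empty string label

# isin compares whole cells, so "New World" is not equal to "World".
exact_match = bool(mask.to_numpy().any())

# stack() flattens all cells into one Series, so str.contains can
# search every cell for the substring, ignoring case.
stacked = A.stack()
substring_match = bool(stacked.str.contains("world", case=False).any())

print(labels, always_true, exact_match, substring_match)
```

So the built-in any() was the reason every frame passed the filter, and even with the DataFrame's own reduction methods, isin would not have found "World" inside "New World".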

EDIT: For a list of words, join the individual patterns with |, which is the regex "or":

list1 = ["World", "Hello"]
pat = "|".join(rf"\b{x}\b" for x in list1)
result = [x for x in dataframe_list if x.stack().str.contains(pat, case=False).any()]
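
If the words may contain regex metacharacters, or the frames may hold non-string columns, the same idea could be wrapped in a small helper. This is only a sketch: frames_containing is a hypothetical name, not a pandas function, and the re.escape and astype(str) calls are additions that assume you want literal word matching and want every cell treated as text.

```python
import re
import pandas as pd


def frames_containing(frames, words):
    """Return the frames in which any cell contains one of `words` as a whole word."""
    # re.escape makes each word literal; \b restricts matches to whole words.
    pat = "|".join(rf"\b{re.escape(w)}\b" for w in words)
    return [
        df for df in frames
        # astype(str) is an assumption: it lets .str work on non-string
        # columns, at the cost of turning missing values into the text "nan".
        if df.astype(str).stack().str.contains(pat, case=False, na=False).any()
    ]


A = pd.DataFrame({"name": ["A", "A", "A"],
                  "att": ["New World", "Hello", "Big Day now"]})
B = pd.DataFrame({"name": ["A2", "A2", "A2"],
                  "Col": ["L", "B", "B"],
                  "CC": ["old", "Hello", "Big Day now"]})
C = pd.DataFrame({"name": ["Brave old World", "A", "A"]})

world_only = frames_containing([A, B, C], ["World"])
world_or_hello = frames_containing([A, B, C], ["World", "Hello"])
```

With the sample data from the question, world_only holds A and C, while world_or_hello holds all three frames because B has "Hello" in its CC column.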
jezrael
  • Thank you. Quick question: is it mandatory to split a string in order to match a small substring, as done here? – Raghavan vmvs Sep 25 '20 at 09:09
  • Thank you again. Is it possible to pass a list of words instead of a single string, as you have done? – Raghavan vmvs Sep 25 '20 at 09:14
  • @Raghavanvmvs - The answer was edited. Word boundaries were also added to both solutions. This means `Brave old WorldNo` or `Brave old World1` does not match, but `Brave old World` does. – jezrael Sep 25 '20 at 09:20