I'm trying to do something simple, but I don't know how to read the actual rows from the DataFrame. I want to run a regex on each string.
The .csv file has no header; it is just a single column of strings.
import re
import pandas as pd

csv_data = pd.read_csv('list.csv', sep=',', header=None)
pattern = re.compile(r'(.*\/)(?!\/)(.*)', flags=re.DOTALL)
url_file = {
    pattern.findall(row)[0]:
    pattern.findall(row)[1]
    for index, row in csv_data.iterrows()
}
But I just get
TypeError: expected string or bytes-like object
Edit 1
I do not believe this is a duplicate: the other suggested SO question/solution is in a different context and has headers and multiple columns.
Edit 2
print(csv_data.dtypes)
0 object
dtype: object
print(csv_data.head())
0 https://...
1 https://...
2 https://...
3 https://...
4 https://...
Edit 3
Doing this:
for row in csv_data.iterrows():
    print(row.dtypes)
gave the error AttributeError: 'tuple' object has no attribute 'dtypes'
So it seems that iterrows() yields tuples, and I just need to figure out how to get the string out of each one.
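A minimal sketch of what the tuple observation points to: iterrows() yields (index, Series) pairs, so the string sits inside the Series under column label 0 (the label pandas assigns with header=None). The inline DataFrame and the example.com URLs are hypothetical stand-ins for list.csv, and the sketch assumes the intent is to map everything up to the last slash to whatever follows it.

```python
import re
import pandas as pd

# Hypothetical stand-in for list.csv: one unnamed column of URL strings.
csv_data = pd.DataFrame(
    ["https://example.com/a/file1.txt", "https://example.com/b/file2.txt"]
)

pattern = re.compile(r'(.*\/)(?!\/)(.*)', flags=re.DOTALL)

# Each item from iterrows() is an (index, Series) tuple, not a string.
first_index, first_row = next(csv_data.iterrows())
first_value = first_row[0]  # the cell under column label 0 is the string

# Iterating over the column itself yields the strings directly.
url_file = {}
for value in csv_data[0]:
    match = pattern.match(value)
    if match:  # skip values the pattern does not match
        prefix, name = match.groups()
        url_file[prefix] = name
```

Iterating over csv_data[0] avoids the tuple unpacking entirely. Rows that share the same prefix would overwrite each other in the dict, so this only fits if the prefixes are unique.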