//EDIT: This question is kind of a sub-question. For a shorter and better example, which has better replies, check This Post
I'm very new into python and even newer into pandas. I'm working with it for at least a month and I think I got most of the basics together.
My current task is to write values into a certrain cell, in a certain space inside of an xslx-file.
Situation
I have a very big excel-file including various data, from names to email-adresses and everything. As well I have two lists (.txt-files) with the same email-adresses of the excel-file in it, yet those emails got verified either if they match certain security-cheks or not. Depending on the outcome, they got stored inside of the "Secured.txt" or the "Unsecured.txt" file.
To write and read in the excel-file, I use pandas.
Task
Next to the 'Emails'-column in the excel file, there is a column in which you mark with an entry either if the email is secured, or unsecured. My actual task is to insert those entrys, depending in which text-file the email lies.
Possible Solution
My approach to solve this problem is to read out each .txt-file and store each email-adress in a variable using a list and a for-loop. Iterating through those emails, I know want to look for the location of the email-adress inside of the excel-file and access the cell right next to it. Same row, different column. Since the emails got sorted matching for their security-validation before, I just can put in the according value into the validation-cell right next to the email.
Question
My question is the following: How do I approach a specific row based on a value in it? I want to find the place of the cell which includes the actual content of the variable "mails", so I can move over to the cell right next to it. Since I know all the names of the columns, I actually just need the index of the row in which the email lies. I got the x-coordinate and need the y-coordinate.
Example
What I have up until now is the readout of the .txt-file:
import pandas as pd
import os
import re
#fetching the mail adress through indexnumber out of the list
with open('Protected/Protected G.txt', 'r') as file:
#creating the regex pattern to sort out the mail adresses
rgx = '\S+@\S+'
#read the file and convert the list into a string
content = file.readlines()
content_str = ''.join(content)
#get the mails out of the "list" with regex
mails = re.findall(rgx, content_str)
#put each mailadress in a variable
for item in mails:
print(item)
This dummy-dataframe represents the excel sheet I'm working with:
Dummy-Dataframe:
Forename Last Name Email Protection
1 John Kennedy John@gmx.net
2 Donald Trump Donald@gmx.net
3 Bill Clinton Bill@gmx.net
4 Richard Nixton Richard@gmx.net
I know want to pass the actual adress, stored in the variable 'item', to some kind of "locate"-function of pandas in order to find out in which row the actual email lies. As soon as I know in which row the adress lies, I can now tell pandas to write either an "x", saying the mail is protected, or an "o", meaning the mail is unprotected, in the very next column.
My finished dataframe could look like this:
Finished Dataframe:
Forename Last Name Email Protection
1 John Kennedy John@gmx.net x
2 Donald Trump Donald@gmx.net o
3 Bill Clinton Bill@gmx.net x
4 Richard Nixton Richard@gmx.net x
I really appreciate the help.