A Sample line of my data:
12808|08.12.2008|13:44:35|-0.05||||||||0.26|1.53|2.94|0.81|1.75|5.53|79.56||||2|K:\Path\to\File\TE08-08-Chla-12.08.2008.xls|19.01.2009 09:34:57|9|15||
The search patterns and function:
oldpatdat='[|][0-3][0-9]\.{1}[0-1][0-9]\.{1}[1-2][0-9][0-9][0-9][|]'
#Date like |30.12.2009|
oldpatdat='\|{1}[0-3][0-9]\.{1}[0-1][0-9]\.{1}[1-2][0-9][0-9][0-9]\|{1}'
#same, works finding, but substituting... nope
oldpatdattim='[0-3][0-9]\.{1}[0-1][0-9]\.{1}[1-2][0-9][0-9][0-9]\ {1}[0-5][0-9]\:{1}[0-5][0-9]\:{1}[0-5][0-9]'
#DateTime like 24.10.2010 12:34:56 works, as I don't need the use of Vertical lines
def setdef(line):
print("line in setdef: "+line)
if re.search(oldpatdattim,line):
print("oldpatdattim found")
klem=re.findall(oldpatdattim,line)
print("Klem+time: "+str(klem))
for ele in klem:
print("ele: "+str(ele))
new=ele[6:10]+"-"+ele[3:5]+"-"+ele[0:2]+ele[10:]
print("new: "+new)
line=re.sub(ele,new,line)
print("line after sub1(dattim): "+str(line))
if re.search(oldpatdat,line):
print("oldpatdat found")
klem=re.findall(oldpatdat,line)
print("Klem: "+str(klem))
for ele in klem:
print("ele: "+str(ele))
new="|"+ele[7:11]+"-"+ele[4:6]+"-"+ele[1:3]+"|"
print("new: "+new)
line=re.sub(ele,new,line)
print("line after sub2(dat): "+str(line))
Output:
bash>line in setdef: 12806|08.12.2008|13:43:34|-0.06||||||||0|1.53|3.54|0.36|1.66|5.44|79.59||||2|K:\Path\to\File\TE08-08-Chla-12.08.2008.xls|19.01.2009 09:34:57|9|15||
bash>oldpatdattim found
bash>Klem+time: ['19.01.2009 09:34:57']
bash>ele: 19.01.2009 09:34:57
bash>new: 2009-01-19 09:34:57
bash>line after sub1(dattim): 12806|08.12.2008|13:43:34|-0.06||||||||0|1.53|3.54|0.36|1.66|5.44|79.59||||2|K:\Path\to\File\TE08-08-Chla-12.08.2008.xls|2009-01-19 09:34:57|9|15||
bash>oldpatdat found
bash>Klem: ['|08.12.2008|']
bash>ele: |08.12.2008|
bash>new: |2008-12-08|
bash>line after sub2(dat): |2008-12-08|1|2008-12-08|2|2008-12-08|8|2008-12-08|0|2008-12-08|6|2008-12-08|||2008-12-08||2008-12-08||2008-12-08|||2008-12-08|1|2008-12-08|3|2008-12-08|:|2008-12-08|4|2008-12-08|3|2008-12-08|:|2008-12-08|3|2008-12-08|4|2008-12-08|||2008-12-08|-|2008-12-08|0|2008-12-08|.|2008-12-08|0|2008-12-08|6|2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|||2008-12-08|0|2008-12-08|||2008-12-08|1|2008-12-08|.|2008-12-08|5|2008-12-08|3|2008-12-08|||2008-12-08|3|2008-12-08|.|2008-12-08|5|2008-12-08|4|2008-12-08|||2008-12-08|0|2008...etc
I'm working with python 3.8.2 and having a problem with regular expressions sub()
.
I have a CSV table with dates and times seperated by vertical lines. As I need to get the dates and times in the right format for my database, I tried regular expressions to look them up and work on them.
Because I don't want to edit dates from filenames, I'm searching the lines for dates like '|30.12.2009|'
to be sure that it is a table column and not part of a filename. So I'm trying to replace it by '|2009-12-30|'
. The first if
clause works as expected, as there are no vertical lines in the search/replace pattern. The problem I encounter is with re.sub()
. The element I'm looking for is found, the new string is prepared, but using re.sub()
squashes the newly prepared date between every char of the line.
Escaping the vertical lines didn't work out for me, the way I did it. In fact it did, but only for the lookup.
I'm a bit puzzled about it and haven't found any solution so far.
How do I get my re.sub()
doing the right thing?