6

I am trying to add some random data into text file and I am successful in that but I am facing problem with the header line. I want to add header line once and then every time I run my script , it should add just data into file and ignore the header line if exists. I tried something like this but I fail. I try to look at this example code in SO python csv, writing headers only once, but couldn't implement properly. If somebody help me to correct my code. I will be thankful.

import random
import csv
import os.path
from time import gmtime, strftime

filename = '/home/robdata/collection1.dat'

file_exists = os.path.isfile(filename)

v = random.randint(0, 100)

with open(filename, "a") as csvfile:
    headers = ['DATE', 'value']
    writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
    if not file_exists:
        writer.writeheader()  # file doesn't exist yet, write a header

    writer.writerow({'DATE': strftime("%Y-%m-%d %H:%M:%S", gmtime()), 'value': v})

it insert data but without adding any header line. I want to include headers on the first run of script, and next time when I run script several times , it should only add data and not header line. thank a lot for any idea or help.

Community
  • 1
  • 1
rob
  • 153
  • 2
  • 6
  • 13
  • Could you edit into the question a description of what your code is doing incorrectly? – glibdud Mar 23 '17 at 13:38
  • @glibdud I add below in my question. I am able to insert data into file but without any headers. – rob Mar 23 '17 at 13:42
  • The first time you run it (before the file exists), you should get an AttributeError. Take a look at the differences between your `csv.writer` object and what was used in the question you linked. – glibdud Mar 23 '17 at 13:47
  • @glibdud I tried that example also which I mention. I change my question according to that example. as i am new to programming , might it is very simple mistake I am making here. – rob Mar 23 '17 at 13:56
  • The code you have there now, once you indent it properly, works perfectly for me. – glibdud Mar 23 '17 at 13:59
  • @glibdud yes it works but it includes no header when I run the script, and my problem is that i want to include header when i first run the script and then after on second run it should add data and ignore the headers. – rob Mar 23 '17 at 14:02
  • works for me too, includes header – Bill Bell Mar 23 '17 at 14:03
  • I'm not able to reproduce that behavior. I copied and pasted the code and ran it, and the first run it created headers. The second run it didn't. – glibdud Mar 23 '17 at 14:03
  • I would suggest that you simply copy the code that's available above and try it. – Bill Bell Mar 23 '17 at 14:04
  • @glibdud I will try it again and will come back soon. thanx for the help. – rob Mar 23 '17 at 14:05
  • @BillBell I simply copy the above code and run it and it add data but does not include the header . I don't know where the problem is because glibdud said that he used the same code and get the headers on the first attempt. – rob Mar 23 '17 at 14:10
  • Did you delete any existing version of `catalog1.dat` before you ran the code for the first time? – Bill Bell Mar 23 '17 at 14:12
  • @BillBell I just remove the data and make the file empty. – rob Mar 23 '17 at 14:14
  • 1
    That won't do it, will it? Because the file still exists. Notice that your code asks whether the file exists, not whether it's empty. – Bill Bell Mar 23 '17 at 14:16
  • @BillBell thanx a lot. i got the point. It works perfectly fine. .you save my efforts. – rob Mar 23 '17 at 14:20

5 Answers5

8

A slightly simpler alternative to Mr Evans approach would be to use the following test in place of the test for existence:

fileEmpty = os.stat('collection1.dat').st_size == 0

This obviates the need to do a seek, etc.

EDIT: Complete code:

import random
import csv
import os.path
from time import gmtime, strftime

filename = '/home/robdata/collection1.dat'

fileEmpty = os.stat(filename).st_size == 0

v = random.randint(0, 100)

with open(filename, "a") as csvfile:
    headers = ['DATE', 'value']
    writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
    if fileEmpty:
        writer.writeheader()  # file doesn't exist yet, write a header

    writer.writerow({'DATE': strftime("%Y-%m-%d %H:%M:%S", gmtime()), 'value': v})
Jason Roman
  • 8,146
  • 10
  • 35
  • 40
Bill Bell
  • 21,021
  • 5
  • 43
  • 58
  • thank you so much for your help. the above mention code in the question works perfectly fine according to my requirements. thanx for the help. – rob Mar 23 '17 at 14:36
  • would you like to paste the answer so that I accept, because you first edit my question code and it works fine. I ask you because you edit it first. – rob Mar 23 '17 at 14:40
  • You're most welcome. If this answer is the one that helped you then the etiquette on SO is for you to mark it 'accepted'. When you have more points you will be allowed to up-vote. – Bill Bell Mar 23 '17 at 14:40
  • Sure, I can do that. – Bill Bell Mar 23 '17 at 14:42
  • I mean the code in my question was edited by you and it works perfectly fine. If you just paste that code, I will accept your answer – rob Mar 23 '17 at 14:43
  • Done ----------- – Bill Bell Mar 23 '17 at 14:44
  • Thanks. Mr Anil and Mr Evans solutions also works fine that why I click them as useful and I accept your answer because you replied first. Once again thank you so much. – rob Mar 23 '17 at 14:50
5

Rather than testing if the file exists, you could instead check to see if has zero length by seeking to the end and asking for the file location (which is of course still needed to append).

This approach will then also write the header for the case when the file is empty but still exists. It also avoids needing to import os to make a separate os.stat() to determine the file size:

import random
import csv
from time import gmtime, strftime
    
headers = ['DATE', 'value']
v = random.randint(0, 100)
    
with open('collection1.dat', 'a', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=headers)
    f_output.seek(0, 2)
    
    if f_output.tell() == 0:
        csv_output.writeheader()

    csv_output.writerow({'DATE': strftime("%Y-%m-%d %H:%M:%S", gmtime()), 'value': v})        

writeheader() is then used to write the header only when the file is empty.


Note: If Python 2.x is still in use, you should always open the file in binary mode e.g. ab as follows:

with open('collection1.dat', 'ab') as f_output:
Martin Evans
  • 45,791
  • 17
  • 81
  • 97
1

From the documentation of Python CSV module, writeheader can be used only with DictWriter. [https://docs.python.org/2/library/csv.html#csv.DictWriter.writeheader][Documentation of writeheader]

Hence, the code should not be using writeheader. You can try this.

if not file_exists:
    writer.writerow(headers)
else:
    writer.writerow([strftime("%Y-%m-%d %H:%M:%S", gmtime()), v, w, z])
Roopak A Nelliat
  • 2,009
  • 3
  • 19
  • 26
  • I edit my question. but I tried this idea before and i got this error. ValueError: dict contains fields not in fieldnames: '2017-03-23 13:59:39', 94 – rob Mar 23 '17 at 14:00
1

Looks like you edited your question while we were posting answer. Please add any changes as edit or mention what you have changed. Anyway, while dict fields are different, concept is same.

On top of dictwriter, you need to pass row data as values to the dict keys for each column.

I added a loop for 5 writes delayed by 5 sec as sample.
See below.

Working Code:

import random
import csv
import os.path
from time import gmtime, strftime, sleep

filename = 'collection1.csv'

file_exists = os.path.isfile(filename)

v = random.randint(0, 100)
w = random.randint(0, 100)
z = random.randint(0, 100)
with open(filename, "a") as csvfile:
    for i in range(5):        
        headers = ['timestamp','val1', 'val2', 'val3']
        writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
        if not file_exists:
            writer.writeheader()
        else:
            mytime = strftime("%Y-%m-%d %H:%M:%S", gmtime())
            writer.writerow({'timestamp':mytime, 'val1': v,'val2': w, 'val3':z})

        sleep(5)

collection1.csv

timestamp,val1,val2,val3
2017-03-23 14:07:20,83,60,11
2017-03-23 14:07:25,83,60,11
2017-03-23 14:07:30,83,60,11
2017-03-23 14:07:47,55,44,88
2017-03-23 14:07:52,55,44,88
2017-03-23 14:07:57,55,44,88
2017-03-23 14:08:02,55,44,88
2017-03-23 14:08:07,55,44,88
Anil_M
  • 10,893
  • 6
  • 47
  • 74
  • I tried your code but I get the same results. wait I will examine the code again. I will come back soon to update . thank you very much for your help. – rob Mar 23 '17 at 14:16
  • Just copy/paste my code as it is. It's working on my end. And this is your old code (modified).Let me know. – Anil_M Mar 23 '17 at 14:18
  • @accept my apology that I didn't inform and added that I edit my code. next time I will be careful. – rob Mar 23 '17 at 14:26
  • your code works fine. it also create in the run the headers and then in the subsequent run, it adds the data. thank you so much for your help. – rob Mar 23 '17 at 14:35
  • @rob , Glad to help. If it resolves your question, do you mind accepting the answer so that loop is closed. – Anil_M Mar 23 '17 at 14:37
  • once again thank you so much for your time and help. your code works fine but Bill Bell edited my code in the first place and it works accordingly , so I ask him if he want to paste his answer, then i will accept his answer otherwise I will accept your answer because it also fullfill my requirements. – rob Mar 23 '17 at 14:45
1

In case that you are appending pandas DataFrame to a csv file, and want to keep the header only at the first time you write out, here is what worked for me, much simpler than solutions before:

start = 100
with open('path_to_your_csv', "r",encoding = 'UTF-8') as myfile:
    reader = csv.DictReader(myfile)
    for i,line in itertools.islice(enumerate(reader) , start, 105): # iterate through row 100:105
    df # this is the data frame you want to export
    if i == start:
        df.iloc[0:0].to_csv(recent,sep = ",") # keep only header if this is the first you export
    df.to_csv(total,sep = ",",header =False)                                  
    del df
Jia Gao
  • 1,172
  • 3
  • 13
  • 26