
We have a local website that tracks the number of people using a certain license. I have created a scraper with Scrapy that should run every hour. The only issue is that it produces data that looks like this:

active_users,date,time
35,22/03/2022,11:38:30.397745
active_users,date,time
36,22/03/2022,11:44:04.753589

The issue is that every time scrapy crawl users is run, it adds the header again. I know Scrapy has CsvItemExporter, which can omit the header, but I'm not sure how to use it.

I just need the output CSV to look like this:

active_users,date,time
35,22/03/2022,11:38:30.397745
36,22/03/2022,11:44:04.753589
jack
  • hi, perhaps this might be of interest https://stackoverflow.com/questions/36710262/scrapy-how-to-output-to-csv-without-column-headings – jspcal Mar 22 '22 at 00:57
  • @jspcal ohh thanks! I also found a related page from the link you sent was also pretty useful. https://stackoverflow.com/questions/34485789/scrapy-csv-output-without-header?noredirect=1&lq=1 – jack Mar 22 '22 at 01:41
  • https://stackoverflow.com/questions/34485789/scrapy-csv-output-without-header?noredirect=1&lq=1 in case someone comes across this post looking for the same answer. I found the link above pretty useful and solved my issue – jack Mar 22 '22 at 01:44

1 Answer


If you are using Scrapy 2.4 or later, you can set this directly when defining the feed in FEEDS, for example in the spider's custom_settings:

custom_settings = {"FEEDS": { "items.csv": {"format": "csv", "item_export_kwargs": {"include_headers_line": False}}}}

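With this setting no header line is written on any run, and since a local feed file is appended to by default, each run simply adds its rows to the existing file. Note that the header is also left out on the very first run, so the file needs its header written once some other way if you want one.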
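If the header should still appear once at the top of the file, one option is to compute the flag instead of hard-coding False. The following is only a minimal sketch: build_feeds is a hypothetical helper (not part of Scrapy), and it assumes the feed is a local file and that the path is resolved from the same working directory Scrapy runs in. The "overwrite" key is spelled out for clarity; for local files appending is already the default.

```python
import os


def build_feeds(path="items.csv"):
    """Hypothetical helper: build a FEEDS dict that asks for a header
    line only when the target file is missing or still empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    return {
        path: {
            "format": "csv",
            # Append to an existing file instead of replacing it.
            "overwrite": False,
            # Passed through to the CSV item exporter (Scrapy 2.4+).
            "item_export_kwargs": {"include_headers_line": needs_header},
        }
    }


# Usage inside the spider class body (assumed layout):
#     custom_settings = {"FEEDS": build_feeds("items.csv")}
```

Because custom_settings is evaluated each time scrapy crawl users starts, the check runs once per hourly invocation: the first run writes the header, later runs only append rows.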

msenior_