CSV file value comparison producing a new CSV file with the results

Question

I have two .csv files with a single row but with many columns. I wish to compare the data in the columns (except the first 3 columns) and output a new .csv containing the subtraction of the files, computed as baseline - test.

test1.csv

20170223, 433000000, 8k, -50, -50, -10, -50, -50

baseline.csv

20170223, 433000000, 8k, -50, -50, -50, -50, -50

The resultant .csv file should read something like:

20170223, 433000000, 8k,   0,   0, -40,  -0,  -0

I am able to bring up the .csv files but it is the column position and calculation that is proving difficult.

This is what i have so far:

import csv
with open('test001.csv', 'r') as f:
reader = csv.reader(f, delimiter = ',')
first_list = list(reader)
f.close()


with open('test002.csv', 'r') as f:
reader = csv.reader(f)
second_list = list(reader)
f.close()

result_list = list()
list_a = list()
list_b = list()

for row in first_list:
    for x in range(0, 6):
    result_list.append(row[x])

for x in range(6, len(row)-1):
    list_a.append(row[x])

for row in second_list:
 for x in range(6, len(row)-1):
    print(row[x])

    list_b.append(row[x])

for x in range(0, len(list_a)-1):    
 a = float(list_a[x])
 b = float(list_b[x])
 c = a-b
result_list.append(c)

myfile = open('difference.csv', 'w')
wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)

wr.writerow(result_list)

myfile.close()

Please post what you have tried so far -- code and error it gives -- so we can try to help! — nigel222, Feb 23 '17 at 12:52
You can use pandas to read the csv files.. then, you can have two dataframes (df1 -> with test.csv / df2 -> with baseline) you can create df3 = df1 - df2 — Ika8, Feb 23 '17 at 13:15
Could you please show the code you're using and where specifically your difficulty is? Which tools are you using to load the `.csv`? Would it be acceptable if somebody suggested different tools? — Dev-iL, Feb 23 '17 at 15:29
may you edit your code so the indents are preserved? Also I see no point in printing the content of the rows. — abukaj, Feb 23 '17 at 21:15
Why do you iterate over rows of your lists when you do know there is exactly one row in each? — abukaj, Feb 23 '17 at 21:24

score 0 · Answer 1 · answered Feb 23 '17 at 13:00

Let's say you have read those files into two lists one and two

then you can compare these lists element by element using zip as follows:

>>> one = [1, 2, 3]
>>> two = [4, 5, 6]
>>> for o, t in zip(one, two):
...     print(o, t)
... 
(1, 4)
(2, 5)
(3, 6)
>>>

instead of print implement you own logic. To start from 4th column just use

`zip(one, two)[3:]`

zipa · Answer 2 · 2017-03-02T13:06:56.273

0

You can use pandas like this:

import pandas as pd
df1 = pd.read_csv('test1.csv', header=None)
df2 = pd.read_csv('baseline.csv', header=None)

diff = df1.copy()
diff[diff.columns[3:]] -= df2[df2.columns[3:]]
diff.to_csv('difference.csv', index=False, header=None)

edited Mar 02 '17 at 13:06

answered Feb 24 '17 at 14:48

zipa

27,316
6
40
58

Will give Pandas a try. Going to download Anaconda3-4 as I get a **"No module named 'pandas'"** error. – Anon Feb 24 '17 at 15:43
in cmd type `conda install -c anaconda pandas=0.19.2` – zipa Feb 24 '17 at 17:02
with the code above I get this error **AttributeError: 'function' object has no attribute 'columns'** – Anon Feb 28 '17 at 00:52
Forgot to call copy method :) – zipa Mar 02 '17 at 13:07

CSV file value comparison producing a new CSV file with the results

2 Answers2