-1

I have one file and I am trying to populate second file using its data in Python. While populating I am applying some if statements to manipulate one column in the 2nd file.

What I want to perform using these if statements: Check 31st column in file and if its value is 0 then return 0 value in 32nd column, if its value is blank then return 2 value in 32nd column and for every other non-zero and non-blank value return 1 value in 32nd column.

with open('DE_Combined_' + time.strftime("%Y-%m-%d")+".csv", "rb") as in_file, open('DE_Combined_PD_' + time.strftime("%Y-%m-%d")+".csv", "wb") as out_file:
   reader = csv.reader(in_file)
   writer = csv.writer(out_file)
   headers = next(reader, None)  # returns the headers or `None` if the input is empty
   if headers:
        writer.writerow(headers)
   for row in reader:
       if row[30] != 0:
           row[31] = 1
       else:
           row[31] = 0
       if row[30] == "":
           row[31] = 2
       writer.writerow(row)

This seems like a pretty straightforward problem but my output file is giving me the wrong results. It has 1 value in 32nd column for 0 value in 31st column. It should have been 0 in both columns. Although 2 value is coming against the blank value in 31st column.

Please help me in resolving this problem as I am clueless now after several attempts to find something wrong with the logic of if statements I have used.

Snippet of actual O/P: ( last 2 entries in both the rows are the columns concerned)

05/23/2013  May 2013    2013    4   2   Thursday    UK  O2  £   NOKIA   100 NOKIA 100   Smartphone  Symbian NA          9.99            N       Pay & Go    Pay & Go                    0.64222 0   1
05/23/2013  May 2013    2013    4   2   Thursday    UK  O2  £   NOKIA   100 NOKIA 100   Smartphone  Symbian NA  Pink        9.99            N       Pay & Go    Pay & Go                    0.64222 0   1

I/P Snippet:(First 2 rows) ( Last 2 columns are the concerned columns)

Date,Month,Fiscal_Year,Calendar_Year,FY_Quarter,CY_Quarter,Day_of_Week,Geography,MO,Currency,Device_OEM,Device_Name,GDN,Device_Type,Device_OS,Device_Franchise,Device_Color,Device_Storage_in_GB,Device_Price,Device_Monthly_Price,Additional,Device_Refurb,Plan_COMPARISON,Plan_Name,Plan_Contract_Length,Plan_Monthly_Price,Plan_Data_in_GB,Plan_Minutes,Plan_Texts,Exchange_Rate_vs_1USD,Difference,Difference_Flag
05/23/2013,May,2013,2013,4,2,Thursday,UK,O2,£,NOKIA,100,NOKIA 100,Smartphone,Symbian,NA,,,9.99,,,N,,Pay & Go,Pay & Go,,,,,0.64222,0.0,
05/23/2013,May,2013,2013,4,2,Thursday,UK,O2,£,NOKIA,100,NOKIA 100,Smartphone,Symbian,NA,Pink,,9.99,,,N,,Pay & Go,Pay & Go,,,,,0.64222,0.0,
atams
  • 2,739
  • 3
  • 16
  • 14
  • 2
    add a `print row` just after the for-loop, does it look like you expect it to do? Or update your question with the actual input. Right now we can do nothing but guess... – Fredrik Pihl May 23 '13 at 12:38
  • have you considered splitting your read and write functions? first read the input csv, populate a list with the data you want to write, manipulate as desired, then write the ouput file. i am thinking something is going on between the `row[30] != 0 `and `row[31] = 1` – pedram May 23 '13 at 12:54
  • @multiphrenic Can you help me in devise that approach? I am very new to programming – atams May 23 '13 at 13:05
  • the `O/P` part contains 28 and 29 columns. Is that part the input or the output? We need to input to be able to help you... – Fredrik Pihl May 23 '13 at 13:08
  • it looks like 28 or 29 because of empty columns present in the row. – atams May 23 '13 at 13:09
  • if it is the input, the data isn't in csv-format which require it to be separated with a comma (`,`) (or another delimiter) – Fredrik Pihl May 23 '13 at 13:11
  • last 2 entries in the output rows snippet are the columns concerned. – atams May 23 '13 at 13:11
  • we don't care about the output, we would like to see you input-data, please! – Fredrik Pihl May 23 '13 at 13:12
  • @FredrikPihl I think I have found the problem, I opened the csv file in notepad to look at the entries and found out that entries are mentioned as '0.0' and I also have to write my conditions according to that. – atams May 23 '13 at 13:20
  • good for you, but if you had posted you input 50 minutes ago, your problem whould have been solved 50 minutes ago... – Fredrik Pihl May 23 '13 at 13:21
  • @atams, I'd still recommend splitting your tasks, even if you found the cause of your errors. If you have any other problems, you will be able to peek into your data structures, something not possible with the way you've written your code. See answer below. – pedram May 23 '13 at 13:25
  • @FredrikPihl Sorry for that, I have attached the i/p now. I tried using 0.0 in the code but it didn't work – atams May 23 '13 at 13:35

4 Answers4

3

From the docs:

Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed.

Perhaps your if row[30] != 0 should be changed to if row[30] != "0"

E.Z.
  • 6,393
  • 11
  • 42
  • 69
  • This method works partially, around half of my rows are still showing `1` against the `0` value. – atams May 23 '13 at 12:37
1

Separate your tasks into functions that read, manipulate, then write the data.

To read your table into a list:

def importCSV(fname):
    """ Returns a tuple using a CSV as input. First tuple is the
    header. 
    Second tuple is a dictionary with all the data.
    """
    data = []
    with open(fname, 'rb') as csvfile:
        reader = csv.DictReader(csvfile, delimiter=',')
        header = reader.fieldnames
        for lines in reader:
            data.append(lines)
        return (header, data)

header, data = importCSV('myfile.csv')

now manipulate your data:

for line in data:
    do something to line...
....

then write your data:

def CreateCSVFromList(header, source, outputfile):
    """Creates a CSV from a 2D List.
    Header should be a simple list for each column in the resulting list.
    Source is the list.
    Output file is what is written. Include the .csv extension please.
    No error-checking is being done at the moment.
    """
    import csv
    with open(outputfile, 'wb') as csvfile:
        writer = csv.writer(csvfile, dialect='excel')
        writer.writerow(header)
        for row in source:
            writer.writerow(row)


CreateCSVFromList(header, data, 'output.csv')
pedram
  • 2,931
  • 3
  • 27
  • 43
  • I tried using this and this part for populating the last column ` row[31] = 1 if row[30] == "0.0": row[31] = 0 if row[30] == "0": row[31] = 0 if row[30] == "": row[31] = 2` ` It is giving me this error ` if row[30] == "0.0": KeyError: 30` – atams May 23 '13 at 13:36
  • I don't understand what you are doing. Did you create a header and data using `importCSV()`? Did you give the results from that function a name, like header, data? If so, you will have a list called `data`. You can then modify items in that list as you wish. If you wanted to validate the 30th item in that list and replace the 31st item with 1, you could do: `if data[29] == "0.0": data[30] = 1` – pedram May 23 '13 at 13:59
  • Turns out, I should use float() and int() while comparing numbers. float() resolved my problem but thanks for your answer as I know now a new approach to perform this task. – atams May 23 '13 at 14:01
1

If I change your logic to this (remember, everything is strings unless you transform them using int(), float() etc.)

for row in reader:
    print "'%s'" % row[30]
    if float(row[30]) != 0:
        row[31] = '1'
    else:
        row[31] = '0'
    if row[30] == "":
        row[31] = 2
    writer.writerow(row)

I get this output:

05/23/2013,May,2013,2013,4,2,Thursday,UK,O2,£,NOKIA,100,NOKIA 100,Smartphone,Symbian,NA,,,9.99,,,N,,Pay & Go,Pay & Go,,,,,0.64222,0.0,0
05/23/2013,May,2013,2013,4,2,Thursday,UK,O2,£,NOKIA,100,NOKIA 100,Smartphone,Symbian,NA,Pink,,9.99,,,N,,Pay & Go,Pay & Go,,,,,0.64222,0.0,0

where the last elements are 0; that was the problem, right?

Fredrik Pihl
  • 44,604
  • 7
  • 83
  • 130
  • What I want is 0 in last column for values which are 0.0 or 0 in second last column and 1 in last column if values are non-zero and non-blank. For blanks I want 2 in last column – atams May 23 '13 at 13:53
  • Issue resolved, I converted it to float. Got the hint from your comment. I used `float(row[30]) != 0` and it worked for whole file – atams May 23 '13 at 14:00
  • Can you make the change in your answer using this line `float(row[30]) != 0` As it will help the other users better. – atams May 23 '13 at 14:15
0

Try this:

_buffer = ''
for row in reader:
    if reader.line_num == 31:
        if row == ['']:             #check whether 31st row is empty
            _buffer = 2               #if that's true write `2`to line 32

        elif row == ['0']:      #if that's false check wheter 31st row = 0
            _buffer = 1           #if true, write `1`to line 32

        else:                 
            _buffer = 1 

    if reader.line_num == 32:
        row = _buffer

    writer.writerow(row)
MaxPowers
  • 5,235
  • 2
  • 44
  • 69
  • Nopes, didn't work. Can there be issue with the csv file processing? – atams May 23 '13 at 12:40
  • I guess the problem is the `for`loop. You don't have to scale the whole file just to check for one row! Try something like `if reader[30]==""` ... – MaxPowers May 23 '13 at 12:47
  • I am populating 2nd file in `wb` mode row by row. How do I perform this check and change outside the for loop? – atams May 23 '13 at 12:50
  • 1
    @atams - please update your question with some lines from your actual input; without it we can only guess! – Fredrik Pihl May 23 '13 at 12:51
  • That's because you're not using the same input as the OP, apparently the data is in format "0.0". OP didn't describe that though... – Fredrik Pihl May 23 '13 at 13:32
  • `writer.writerow(row) Error: sequence expected` I am getting this error. ANd I have both type of entries in the data `0` and `0.0` – atams May 23 '13 at 13:42