How to delete a particular column in csv file without pandas library

Question

I am trying to delete particular columns from a CSV file.

CSV file:

Name,Age,YearofService,Department,Allocation
Birla,49,12,Welding,Production
Robin,38,10,Molding,Production

I am trying to delete the entire columns with the headers "Department" and "Allocation".

My code :

with open("input.csv", "r") as i:
    with open("output.csv", "w", newline="") as o:
        reader = csv.reader(i)
        writer = csv.writer(o)
        for row in reader:
            for i in range(len(row)):
                if row[i] != "Department" and row[i] != "Allocation":
                    writer.writerow(row)

My output:

Name
Birla
Robin
Age
49
38
YearofService
12
10

Expected output:

Name,Age,YearofService
Birla,49,12
Robin,38,10

We cannot guarantee that "Department" and "Allocation" will be at header positions 3 and 4. That is why I am iterating over the length of the row.

Read first row (the one with the headers). Find index numbers for the columns you want to delete. Delete them in the first row (larger index first) and write row to output CSV. Then read the remaining rows one by one and for each delete the unwanted columns and write the row. — Michael Butscher, Aug 10 '22 at 12:24
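
The steps in this comment can be sketched as follows. This is a minimal sketch, assuming the first row is the header row; the function name drop_columns and its parameters are illustrative and do not come from the thread.

```python
import csv


def drop_columns(in_path, out_path, unwanted):
    """Copy in_path to out_path, omitting columns whose header is in unwanted."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # The first row holds the headers; an empty file has nothing to copy.
        header = next(reader, None)
        if header is None:
            return

        # Indexes of the unwanted columns, largest first, so deleting one
        # does not shift the positions of those still to be deleted.
        doomed = sorted(
            (i for i, name in enumerate(header) if name in unwanted),
            reverse=True,
        )

        def prune(row):
            for i in doomed:
                if i < len(row):  # tolerate rows shorter than the header
                    del row[i]
            return row

        writer.writerow(prune(header))
        for row in reader:
            writer.writerow(prune(row))
```

With the sample data from the question, drop_columns("input.csv", "output.csv", {"Department", "Allocation"}) keeps only Name, Age and YearofService, wherever the two unwanted headers happen to sit.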

Hai Vu · Accepted Answer · 2022-08-10T14:14:30.717

In this case, the csv.DictReader and csv.DictWriter classes are very handy:

import csv

with open("input.csv", newline="") as instream, open("output.csv", "w", newline="") as outstream:
    # Setup the input
    reader = csv.DictReader(instream)

    # Setup the output fields
    output_fields = reader.fieldnames
    output_fields.remove("Department")
    output_fields.remove("Allocation")

    # Setup the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )

    # Write to the output
    writer.writeheader()
    writer.writerows(reader)

Notes

For input, each row will be a dictionary such as

{'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}

For output, we remove the columns (fields) that we don't need; see output_fields.
The extrasaction parameter tells DictWriter to ignore extra keys/values in the dictionaries.
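
To illustrate the last note, here is a small sketch that contrasts the default extrasaction="raise" with extrasaction="ignore". It writes to an in-memory io.StringIO buffer instead of a file, and the sample row is made up for the demonstration.

```python
import csv
import io

row = {"Name": "Birla", "Age": "49", "Department": "Welding"}

# Default extrasaction="raise": a key that is not in fieldnames is an error.
strict = csv.DictWriter(io.StringIO(), fieldnames=["Name", "Age"])
try:
    strict.writerow(row)
except ValueError as exc:
    print("raise:", exc)

# extrasaction="ignore": the extra "Department" key is silently dropped.
buffer = io.StringIO()
lenient = csv.DictWriter(buffer, fieldnames=["Name", "Age"], extrasaction="ignore")
lenient.writeheader()
lenient.writerow(row)
print(buffer.getvalue())
```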

Update

To remove columns from a CSV file in place, we need to

Open the input file, read all the rows, and close it.
Open it again for writing.

Here is the code, which I modified from the above

import csv

with open("input.csv", newline="") as instream:
    # Setup the input
    reader = csv.DictReader(instream)
    rows = list(reader)

    # Setup the output fields
    output_fields = reader.fieldnames
    output_fields.remove("Department")
    output_fields.remove("Allocation")

with open("input.csv", "w", newline="") as outstream:
    # Setup the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )

    # Write to the output
    writer.writeheader()
    writer.writerows(rows)
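
The code above keeps every row in memory and truncates input.csv before the new content is written. As an alternative that is not part of the original answer, one could stream the rows into a temporary file and then swap it into place. The sketch below assumes the temporary file can be created in the same directory as the input; the function name drop_columns_in_place is hypothetical.

```python
import csv
import os
import tempfile


def drop_columns_in_place(path, unwanted):
    """Rewrite the CSV at path without the columns named in unwanted."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, newline="") as instream:
        reader = csv.DictReader(instream)
        kept = [name for name in (reader.fieldnames or []) if name not in unwanted]

        # Write to a temporary file in the same directory so the final
        # rename stays on one filesystem.
        with tempfile.NamedTemporaryFile(
            "w", newline="", dir=directory, suffix=".tmp", delete=False
        ) as outstream:
            writer = csv.DictWriter(outstream, fieldnames=kept, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(reader)
            temp_path = outstream.name

    # Both files are closed at this point; swap the new file into place.
    os.replace(temp_path, path)
```

If writing fails partway through, the original file is still untouched, although the leftover temporary file would need to be cleaned up by the caller.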

If I don't need a separate output file, can I remove those columns in the same input file? Do we need a writer in that case? — Balaji R B, Aug 10 '22 at 14:07
@BalajiRB Would you please post a portion of your input data file? Are there empty lines in there? — Hai Vu, Aug 10 '22 at 20:00
Your code worked as expected, but it gives empty rows in between the values. I am checking on those. My CSV file doesn't have empty values. Thanks — Balaji R B, Aug 11 '22 at 03:36
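
The empty rows described in this comment are consistent with a known pitfall, assuming the script ran on Windows (the thread does not say). csv writers end each row with "\r\n" by default, and a file opened in text mode without newline="" translates the "\n" once more, producing "\r\r\n". The csv module documentation recommends opening files with newline="" for this reason. The sketch below reproduces the effect on any platform by passing newline="\r\n" to io.TextIOWrapper to mimic the Windows default; the helper name written_bytes is illustrative.

```python
import csv
import io


def written_bytes(newline):
    """Return the bytes csv.writer produces through a text layer using newline."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    csv.writer(text).writerow(["Birla", "49"])
    text.flush()
    return raw.getvalue()


# newline="\r\n" mimics the Windows default translation of "\n" on write:
# the row ends with \r\r\n, which many tools display as an extra blank row.
print(written_bytes("\r\n"))

# newline="" disables the translation: the row ends with a single \r\n.
print(written_bytes(""))
```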

score 1 · Answer 2 · answered Aug 10 '22 at 12:49

The quickest and easiest way is to just open it in Excel and delete the columns you want. I know it isn't what you asked for, but it is the first workaround that comes to mind.

Aherontas

score 1 · Answer 3 · answered Aug 10 '22 at 13:32

You can write something like this (but still it's better to use pandas):

import csv

def delete_cols(file: str, cols_to_delete: list):
    cols_to_delete = set(cols_to_delete)
    # newline='' lets the csv module handle line endings itself
    with open(file, newline='') as infile, open('output.csv', 'w', newline='') as output:
        rows = list(csv.reader(infile))
        headers = rows[0]

        # Positions of the columns whose header is marked for deletion
        indexes_to_delete = {idx for idx, elem in enumerate(headers) if elem in cols_to_delete}
        result = [[o for idx, o in enumerate(row) if idx not in indexes_to_delete] for row in rows]

        writer = csv.writer(output)
        writer.writerows(result)


delete_cols('data.csv', ['Department', 'Allocation'])

file output.csv:

Name,Age,YearofService
Birla,49,12
Robin,38,10

It also creates empty rows. My CSV file doesn't have any empty rows or columns. — Balaji R B, Aug 11 '22 at 06:42
