2

I have a csv file, 300 lines:

ID,HEIGHT,MEAN WEIGHT,20-Nov-2002,05-Mar-2003,09-Apr-2003,23-Jul-2003
1,1.80,80,78,78,82,82
2,1.60,58,56,60,60,56
3,1.90,100,98,102,98,102

I want to delete all lines where the MEAN WEIGHT column is > 75 and write the result to a new file:

ID,HEIGHT,MEAN WEIGHT,20-Nov-2002,05-Mar-2003,09-Apr-2003,23-Jul-2003
1,1.80,80,78,78,82,82
3,1.90,100,98,102,98,102

Pedro Sousa

5 Answers

2

If you're open to non-Python solutions and have access to a bash shell with awk:

$ awk -F, '$3>75' filename 

ID,HEIGHT,MEAN WEIGHT,20-Nov-2002,05-Mar-2003,09-Apr-2003,23-Jul-2003
1,1.80,80,78,78,82,82
3,1.90,100,98,102,98,102
karakfa
1

Using plain python:

orig = open('original.csv', 'r')
modi = open('modified.csv', 'w')

#header
modi.write(orig.readline())

# data lines
for line in orig:
    if float(line.split(',')[2]) <= 75:
        modi.write(line)

orig.close()
modi.close()
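
The same idea can be written so that the files are closed automatically and the comparison direction is a parameter. This is only a minimal sketch: the names keep_rows, filter_file and keep_above are hypothetical, and it assumes the weight sits in the third column of a simple comma-separated file with no quoted fields.

```python
def keep_rows(lines, threshold=75.0, column=2, keep_above=False):
    """Yield the header line, then the data lines that pass the weight test.

    keep_above=False keeps rows with value <= threshold;
    keep_above=True keeps rows with value > threshold.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return  # empty input: nothing to yield
    yield header
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        value = float(line.split(",")[column])
        if (value > threshold) == keep_above:
            yield line


def filter_file(src, dst, **kwargs):
    # Both files are closed automatically, even if a row fails to parse.
    with open(src) as orig, open(dst, "w") as modi:
        modi.writelines(keep_rows(orig, **kwargs))
```

Calling filter_file('original.csv', 'modified.csv') would drop the rows above 75, while passing keep_above=True would keep only those rows, which matches the expected output shown in the question.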
kikocorreoso
1

As @Vignesh Kalai suggested, use pandas:

import pandas as pd

df = pd.read_csv("yourfile.csv", sep=",")

df[ df["MEAN WEIGHT"] > 75 ].to_csv("yournewfile.csv", index=False)

And it's done.

P.S. You're asking for values less than 75, but you're displaying the opposite. If you want the first case, replace "> 75" with "<= 75".
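
To make the two readings of the question concrete, here is a small sketch. It assumes the sample rows from the question (cut down to a single date column for brevity) are read from an in-memory string instead of a file; the variable names heavy and light are only illustrative.

```python
import io

import pandas as pd

# Sample data from the question, shortened to one date column.
sample = io.StringIO(
    "ID,HEIGHT,MEAN WEIGHT,20-Nov-2002\n"
    "1,1.80,80,78\n"
    "2,1.60,58,56\n"
    "3,1.90,100,98\n"
)
df = pd.read_csv(sample)

heavy = df[df["MEAN WEIGHT"] > 75]   # the rows shown in the question's expected output
light = df[df["MEAN WEIGHT"] <= 75]  # what remains after deleting weights above 75

# With no path argument, to_csv returns the CSV text as a string.
# index=False keeps pandas from writing its row index as an extra first column.
heavy_csv = heavy.to_csv(index=False)
light_csv = light.to_csv(index=False)
```

Note that pandas parses HEIGHT as a float, so a value such as 1.80 is written back as 1.8.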

YOBA
  • It works, but it adds a new column: ,ID,HEIGHT,MEAN WEIGHT,20-Nov-2002,05-Mar-2003,09-Apr-2003,23-Jul-2003 0,1,1.8,80,78,78,82,82 2,3,1.9,100,98,102,98,102 – Pedro Sousa Sep 08 '15 at 12:31
  • @PedroSousa Sure, use index = False (See Edit) – YOBA Sep 08 '15 at 12:41
0

You can use the Python csv library as follows:

import csv

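# Python 3: open both files in text mode with newline='' (on Python 2, use 'rb' and 'wb' instead)
with open('input.csv', 'r', newline='') as f_input, open('output.csv', 'w', newline='') as f_output: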
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)

    # Write the header
    csv_output.writerow(next(csv_input))

    for cols in csv_input:
        if float(cols[2]) <= 75:    # Keep weights <= 75 (float also accepts non-integer means)
            csv_output.writerow(cols)

So with the data you have given, you will get the following output.csv file:

ID,HEIGHT,MEAN WEIGHT,20-Nov-2002,05-Mar-2003,09-Apr-2003,23-Jul-2003
2,1.60,58,56,60,60,56
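
If the column position might change, the same filter can look the column up by its header name instead of by index. A minimal sketch using csv.DictReader and csv.DictWriter follows; the function name filter_by_column and its parameters are hypothetical, and it assumes the input has a header row containing MEAN WEIGHT.

```python
import csv
import io


def filter_by_column(f_input, f_output, column="MEAN WEIGHT", threshold=75.0):
    """Copy the header and every row whose `column` value is <= threshold."""
    reader = csv.DictReader(f_input)
    # Reuse the input header so the column order is preserved.
    writer = csv.DictWriter(f_output, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if float(row[column]) <= threshold:
            writer.writerow(row)


# Usage with in-memory text; real files would be opened with newline=''.
src = io.StringIO(
    "ID,HEIGHT,MEAN WEIGHT,20-Nov-2002\n"
    "1,1.80,80,78\n"
    "2,1.60,58,56\n"
    "3,1.90,100,98\n"
)
dst = io.StringIO()
filter_by_column(src, dst)
```

Because the values are never converted back from strings when written, fields such as 1.60 keep their original formatting.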
Martin Evans
0

Perl solution which prints to screen, similar to karakfa's Awk solution:

perl -F, -ane 'print if $. == 1 or $F[2] > 75' filename

The @F autosplit array starts at index $F[0], while awk fields start at $1, so the third column (MEAN WEIGHT) is $F[2].

This variation edits the file in place:

perl -i -F, -ane 'print if $. == 1 or $F[2] > 75' filename

This variation edits the file in place and keeps a backup as filename.bak:

perl -i.bak -F, -ane 'print if $. == 1 or $F[2] > 75' filename
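
For readers who would rather stay in Python, an in-place edit with a backup can be approximated with the standard library's fileinput module. This is a sketch rather than a drop-in replacement for the Perl one-liner: the function name filter_in_place and its parameters are hypothetical, and it assumes MEAN WEIGHT is the third column (index 2) of a simple comma-separated file.

```python
import fileinput


def filter_in_place(path, threshold=75.0, column=2, backup=".bak"):
    """Keep the header and the rows whose `column` value exceeds `threshold`."""
    # inplace=True redirects print() into the file being read;
    # the original content is preserved as path + backup.
    with fileinput.input(files=[path], inplace=True, backup=backup) as lines:
        for line in lines:
            if lines.isfirstline():
                print(line, end="")  # always keep the header
            elif line.strip() and float(line.split(",")[column]) > threshold:
                print(line, end="")
```

Calling filter_in_place('filename') would rewrite the file and leave the untouched original in filename.bak.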
Chris Koknat