
I'm very new to Python, and I have a CSV file with three columns. They represent a transmission time in milliseconds, signal amplitude, and FM radio frequency in kHz. There are a lot of lines, but they look something like this:

[screenshot of sample rows from the data file]

My task is to find out which radio frequency is generating random noise and which is carrying a structured signal. To do this, I'm first trying to find the unique values in the frequency column of my data file (column 3) and then plot them individually to find the structured data. My guess is that the 71.231012 frequency is the white noise (it seemed less frequent in the file), so I'm basically trying to plot both frequencies to see if my guess is somewhat correct.

So far, this is my code:

from __future__ import division
import matplotlib.pyplot as mplot
import numpy as np

file=open("data.csv", "r")
data=file.read()
data=data.replace(" ", ",")
data=data.split("\n")

xscatter=[]
yscatter=[]

for row in data:
    row=row.split(",")
    row[2]=float(row[2])
    if row[2] == 71.231012:
        xscatter.append(row[2])
        yscatter.append(row[1])

mplot.scatter(xscatter, yscatter, color="blue", marker="o")
mplot.show()

But I keep getting this error:

row[2]=float(row[2])

IndexError: list index out of range

I'm not sure why this is the case; I thought that, with the split, I would have three indexes per row (0,1,2). And because I'm so new to Python, I'm also not sure how accurate or efficient my code is at doing what I want, but it's a start. I'd greatly appreciate some help.

EDIT: Here is a sample of my output after splitting the file, before the for loop:

[screenshot of the printed data after the split]

  • I see how you converted the data. I think that's where your mistake is, could you tell me what `print(data)` prints out? – kiyah Feb 04 '18 at 22:23
  • Oh, also you're re-declaring variable `row`, you've used the variable `row` already in the for-loop, I prefer using a temporary variable to avoid confusion. (I don't really remember if that matters though) – kiyah Feb 04 '18 at 22:25
  • Sure -- do you mean printing the data within the for loop or before? Before the split or after? –  Feb 04 '18 at 22:27
  • Before the for loop, after the split. :) – kiyah Feb 04 '18 at 22:30
  • Sure, I've edited my original post and included a screenshot (it's an awful lot of output). –  Feb 04 '18 at 22:32
  • don't post images of your data. post your data – Paul H Feb 04 '18 at 23:15

2 Answers


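The line row=row.split(",") sets row to something like ['0.000000', '', '0.000000', '', '0.000000'], because every space was replaced by a comma and the columns are separated by two spaces. The IndexError is raised when a row has fewer than three items: an empty line, for example, splits into [''], which has no index 2.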

There are 2 ways of doing this:

  1. Remove the empty strings by changing row=row.split(",") to row=row.split(",,"), so that each double comma is treated as a single separator.

  2. Change data=data.replace(" ", ",") to data=data.replace("  ", ",") (two spaces in the first argument), so that each pair of spaces becomes a single comma.
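Another way to avoid both the empty strings and the blank last line is to split on whitespace and skip short rows. This is only a minimal sketch, assuming the file holds three numeric columns separated by spaces or commas; the helper name parse_rows and the two sample rows are made up for illustration.

```python
def parse_rows(text):
    """Parse space- or comma-separated lines into lists of three floats."""
    rows = []
    for line in text.splitlines():
        # Treat commas like spaces; split() with no argument then
        # collapses any run of whitespace into a single separator.
        fields = line.replace(",", " ").split()
        if len(fields) < 3:
            # Skip blank or incomplete lines, such as the empty line
            # left by a trailing newline at the end of the file.
            continue
        rows.append([float(f) for f in fields[:3]])
    return rows


# Hypothetical sample: two data rows followed by an empty line.
sample = "0.000000  1.624345  7.614170\n0.000000  -0.528172  71.231012\n\n"
print(parse_rows(sample))
```

With real data, the text would come from open("data.csv").read() in place of the sample string.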

kiyah
  • I tried this, but unfortunately I still get an index out of range error. –  Feb 04 '18 at 23:07
  • @Garrett McClure I'm sorry to hear that, but do you have an extra newline at the end of your file? – kiyah Feb 04 '18 at 23:10
  • It looks like it does (I originally had opened my file in Excel by default and didn't see it). When I opened it in a text editor, there was an extra line. I'll delete the line and try again. –  Feb 04 '18 at 23:28

If you have an input csv file like the following,

0,1.62435,7.61417
0,-0.611756,7.61417
0,-0.528172,71.231
0,-1.07297,71.231
0,0.865408,7.61417
0,-2.30154,7.61417
0,1.74481,7.61417
0,-0.761207,7.61417
0,0.319039,71.231
0,-0.24937,71.231
1,1.46211,71.231
1,-2.06014,7.61417
1,-0.322417,71.231
1,-0.384054,7.61417
1,1.13377,7.61417
1,-1.09989,71.231
1,-0.172428,71.231
1,-0.877858,7.61417
1,0.0422137,71.231
1,0.582815,71.231

You can read it with numpy.loadtxt and plot each frequency separately by looping over the unique values in the last column.

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt("data/filename.csv", delimiter=",")

for freq in np.unique(data[:,2]):
    thisdata = data[data[:,2] == freq]
    plt.scatter(thisdata[:,0], thisdata[:,1], label="{}".format(freq))

plt.legend()
plt.show()
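If NumPy is not at hand, the same grouping by unique frequency can be sketched with the standard library alone. This is an illustration rather than part of the approach above; the function name group_by_frequency is hypothetical, and the sample is the first four rows of the input file shown earlier.

```python
import csv
import io
from collections import defaultdict


def group_by_frequency(lines):
    """Map each frequency to its list of (time, amplitude) pairs."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        time, amplitude, freq = (float(v) for v in row[:3])
        groups[freq].append((time, amplitude))
    return dict(groups)


# First four rows of the sample file, wrapped in a file-like object.
sample = io.StringIO(
    "0,1.62435,7.61417\n"
    "0,-0.611756,7.61417\n"
    "0,-0.528172,71.231\n"
    "0,-1.07297,71.231\n"
)
groups = group_by_frequency(sample)
for freq, points in sorted(groups.items()):
    print(freq, len(points))
```

Each list in groups can then be unpacked into x and y values and passed to plt.scatter, one call per frequency.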
ImportanceOfBeingErnest