
I have a CSV file with a few rows and columns of data. I want to interpolate or extrapolate a new value when the input values do not match any row in the CSV.

My CSV looks like this:

type,depth,io,mux,enr
perf,1024,32,4,103.8175
perf,1024,64,4,85.643125
perf,1024,128,4,76.5559375
perf,1024,256,4,72.01246094
dense,1024,32,4,107.391875
dense,1024,64,4,88.99640625
dense,1024,128,4,79.79851563
dense,1024,256,4,75.19976563

If the input does not match a depth or io value present in the CSV, I would like to generate the output by extrapolation/interpolation.

For that, I need to build an array from a column.

Is there any way to store one of the columns from this CSV as an array, or to store the whole CSV as a list and work from there?

I tried the following.

import os
import pandas as pd

this_dir, this_filename = os.path.split(__file__)
memory_file_path = os.path.join(this_dir, 'memory.csv')
memData = pd.read_csv(memory_file_path, delimiter=',')

class InvecasMem:
    def csvImport(self, depth, io):
        # Rows that match the requested configuration exactly
        csvmem = memData.loc[(memData["type"] == "perf")
                 & (memData['depth'] == depth)
                 & (memData['io'] == io)]

        if len(csvmem) == 0:
            print("Error: Wrong configuration")
            # No exact match: fall back to every "perf" row
            csvmem = memData.loc[memData["type"] == "perf"]
        # Return the selected rows as a list of lists (defined on both paths)
        return [list(row) for row in csvmem.values]

However, I am unable to store the column in an array.

Also, is it possible to produce extrapolated values from multiple inputs, as in this case?

Thanks in advance.

Edit: The desired output is a list such as io = [32,64,128,256,32,64,128,256], and I want to compute the value for depth = 512 and io = 256.

1 Answer


To interpolate, I'll start by loading the data:

import pandas
import numpy
from io import StringIO # https://stackoverflow.com/a/43312861/1164295

myfile="""type,depth,io,mux,enr
perf,1024,32,4,103.8175
perf,1024,64,4,85.643125
perf,1024,128,4,76.5559375
perf,1024,256,4,72.01246094
dense,1024,32,4,107.391875
dense,1024,64,4,88.99640625
dense,1024,128,4,79.79851563
dense,1024,256,4,75.19976563"""

df = pandas.read_csv(StringIO(myfile))

then insert a NaN:

df.at[4,'enr'] = numpy.nan

Now we can use the interpolate method

df.interpolate()
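To make the effect concrete, here is a small self-contained sketch that rebuilds the same table inline and interpolates only the numeric `enr` column. The names `csv_text` and `filled` are just for illustration. By default `interpolate` is linear over row position, so the blanked row 4 becomes the average of rows 3 and 5. Those two neighbours belong to different `type` groups, so whether the filled value is meaningful depends on the data.

```python
import numpy
import pandas
from io import StringIO

# Same table as above, rebuilt so this snippet runs on its own
csv_text = """type,depth,io,mux,enr
perf,1024,32,4,103.8175
perf,1024,64,4,85.643125
perf,1024,128,4,76.5559375
perf,1024,256,4,72.01246094
dense,1024,32,4,107.391875
dense,1024,64,4,88.99640625
dense,1024,128,4,79.79851563
dense,1024,256,4,75.19976563"""

df = pandas.read_csv(StringIO(csv_text))

# Blank out one value, as in the answer
df.at[4, 'enr'] = numpy.nan

# Interpolate only the numeric column; the default method is linear by row position
filled = df['enr'].interpolate()

# Row 4 is now the mean of rows 3 and 5
print(filled[4])
```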

See the interpolation section of the Pandas guide to missing data
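For the lookup described in the question (return `enr` for a given combination, interpolating when `io` is not in the table), one possible approach is to pull the columns out as arrays and pass them to `numpy.interp`. This is only a sketch under some assumptions: `lookup_enr` is a hypothetical helper, interpolation is linear and done over `io` only, and `numpy.interp` does not extrapolate (outside the tabulated range it returns the nearest endpoint value). The sample table contains a single `depth` (1024), so a query such as depth = 512 cannot be estimated from it; the sketch raises an error in that case.

```python
import numpy
import pandas
from io import StringIO

csv_text = """type,depth,io,mux,enr
perf,1024,32,4,103.8175
perf,1024,64,4,85.643125
perf,1024,128,4,76.5559375
perf,1024,256,4,72.01246094
dense,1024,32,4,107.391875
dense,1024,64,4,88.99640625
dense,1024,128,4,79.79851563
dense,1024,256,4,75.19976563"""

df = pandas.read_csv(StringIO(csv_text))

# A column can be stored as a plain list (or as an array with .to_numpy())
io_column = df['io'].tolist()


def lookup_enr(frame, mem_type, depth, io):
    """Hypothetical helper: 'enr' for (mem_type, depth, io), interpolated over io."""
    rows = frame[(frame['type'] == mem_type) & (frame['depth'] == depth)]
    if rows.empty:
        # Nothing tabulated for this type/depth, so there is nothing to interpolate
        raise ValueError(f"no rows for type={mem_type!r}, depth={depth}")
    rows = rows.sort_values('io')  # numpy.interp needs increasing x values
    io_values = rows['io'].to_numpy()
    enr_values = rows['enr'].to_numpy()
    # Linear between tabulated points; clamped to the end values outside them
    return float(numpy.interp(io, io_values, enr_values))


print(io_column)
print(lookup_enr(df, 'perf', 1024, 32))  # io present in the table
print(lookup_enr(df, 'perf', 1024, 48))  # io between 32 and 64
```

Interpolating over both `depth` and `io` would need rows for more than one depth; with such data, a two-dimensional method (for example from SciPy) could replace the one-dimensional call here.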

Ben
  • Hi, I tried your solution, but after calling interpolate I get the following output: /home/saikatchatrg/tmp/memory.csv enr 4 NaN NaN. I do not understand how to feed in inputs that are not available in the CSV. – Saikat Chatterjee Jun 07 '21 at 19:56
  • I don't have sufficient context to understand the specific problem you're seeing. Are you asking how to insert a new row? Also, the idea of interpolation relies on the adjacent values being relevant to the new (missing) entry. – Ben Jun 07 '21 at 20:01
  • The data in myfile is stored in a CSV. My intention is to look into this CSV and get the 'enr' value for a particular combination. When it is not found, it should be interpolated. – Saikat Chatterjee Jun 08 '21 at 07:52