
I have a 4-column array, and one of its columns is made up of about 6 or 7 different repeating values. What I wish to do is create a set of smaller arrays, one per value, by pulling out the rows associated with that value.

For example:

1 2 3 4
3 6 5 4
3 2 9 8
5 3 0 8
4 6 9 5
7 3 4 7

In the second column, 2, 3 and 6 are repeated. How would I go about extracting all the rows which have 3 in the second column and then placing the result into a new array?

EDIT: I forgot to mention the data is located in a .dat file as a 2D array

Krios101

2 Answers


Using Python lists

# Build a list of all rows of matrix whose column 'colm' (counted from 1) equals 'value'.
# matrix is a list of rows read from file.dat, e.g. [[...], [...], ...]
def construct(colm, value, matrix):
    result = [] 
    for row in matrix:
        if row[colm-1] == value:
            result.append(row)
    return result

# Read file.dat and return its rows as a list of lists of integers
def read():
    var = []
    # 'with' closes the file even if an error occurs while reading
    with open("file.dat", 'r') as file:
        for line in file:
            # split() with no argument splits on any run of whitespace
            values = line.split()
            if values:  # skip blank lines
                var.append([int(y) for y in values])
    return var

So you can use it as

construct(2, 3, read())  # all rows whose second column has value 3
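
To illustrate how one line of the file becomes a row of integers, and what the filter returns for the question's sample data, here is a minimal sketch. It does not read file.dat; the sample line and the in-memory matrix are assumptions made only so the snippet is self-contained.

```python
# One line of the file as Python reads it, including the trailing newline.
line = "5 3 0 8\n"

stripped = line.strip()        # removes leading/trailing whitespace: "5 3 0 8"
parts = stripped.split()       # splits on whitespace: ["5", "3", "0", "8"]
row = [int(p) for p in parts]  # converts each piece to int: [5, 3, 0, 8]

# The sample data from the question, held in memory instead of file.dat.
matrix = [
    [1, 2, 3, 4],
    [3, 6, 5, 4],
    [3, 2, 9, 8],
    [5, 3, 0, 8],
    [4, 6, 9, 5],
    [7, 3, 4, 7],
]

# Same filter as a loop that appends matching rows, written as a list
# comprehension: column 2 (counted from 1) is index 2 - 1.
selected = [r for r in matrix if r[2 - 1] == 3]
print(selected)
```

The empty list that the loop version starts from is not a dummy value; it is the container that collects every matching row.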

NOTE: I haven't explored NumPy much, but if your .dat file contains a large amount of data, it's a good idea to use NumPy instead of lists for efficient operations.
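
As a rough sketch of the NumPy route, np.loadtxt can parse whitespace-separated numbers directly. The snippet below assumes the file holds plain integer columns like the question's example, and uses an in-memory io.StringIO as a stand-in for file.dat so it runs on its own.

```python
import io
import numpy as np

# Stand-in for the contents of file.dat (assumed: whitespace-separated integers).
sample = io.StringIO(
    "1 2 3 4\n"
    "3 6 5 4\n"
    "3 2 9 8\n"
    "5 3 0 8\n"
    "4 6 9 5\n"
    "7 3 4 7\n"
)

# With a real file this would be: np.loadtxt("file.dat", dtype=int)
data = np.loadtxt(sample, dtype=int)

# Keep only the rows whose second column (index 1) equals 3.
rows_with_3 = data[data[:, 1] == 3]
print(rows_with_3)
```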

Emmanuel Mtali
  • This is useful and I'll learn it for future use, which I thank you for, but it contains many aspects I'm not familiar with. I have worked with numpy but for this case – Krios101 Dec 08 '16 at 11:56
  • (Have to re-add comment as something failed) The matrix argument is the whole matrix from the file, correct? Is result = [] a dummy variable, and how does line.strip().split() work? I have worked with numpy before and know some aspects of it, but I don't know how to take rows out of a matrix based on a certain number in numpy. – Krios101 Dec 08 '16 at 12:08
  • I have added a few comments. Most of the things there are pretty basic, so try to work on your basic Python skills first before diving into complicated stuff. It will save you from unnecessary headaches. I recommend "The Quick Python Book" @Krios101 – Emmanuel Mtali Dec 08 '16 at 13:38
  • Book link -> https://github.com/mhcrnl/Python/blob/master/PythonBookInRead/The%20Quick%20Python%20Book,%20Second%20Edition%20(2010).pdf – Emmanuel Mtali Dec 08 '16 at 13:38

You can use NumPy's boolean indexing feature:

>>> import numpy as np
>>> data = np.array([[1, 2, 3, 4], 
                     [3, 6, 5, 4], 
                     [3, 2, 9, 8], 
                     [5, 3, 0, 8], 
                     [4, 6, 9, 5], 
                     [7, 3, 4, 7]])

>>> print(data[data[:,1] == 3, :])
[[5 3 0 8]
 [7 3 4 7]]
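
To get one smaller array per distinct value, as the question asks, the same boolean mask can be applied once for each value returned by np.unique. This is only a sketch; the helper name split_by_column is hypothetical and not part of NumPy.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [3, 6, 5, 4],
                 [3, 2, 9, 8],
                 [5, 3, 0, 8],
                 [4, 6, 9, 5],
                 [7, 3, 4, 7]])

def split_by_column(arr, col):
    """Return a dict mapping each distinct value in column `col`
    to the sub-array of rows that hold that value."""
    return {int(v): arr[arr[:, col] == v] for v in np.unique(arr[:, col])}

groups = split_by_column(data, 1)
for value, rows in groups.items():
    print(value)
    print(rows)
```

If the goal is instead to drop the rows containing a value, inverting the comparison works the same way: data[data[:, 1] != 3] keeps every row whose second column is not 3.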
Chris Mueller