I have this 4x10 (nxm) data matrix in csv:

1, 5, 19, 23, 7, 51, 18, 20, 35, 41
15, 34, 17, 8, 11, 93, 13, 46, 3, 10
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
10, 9, 8, 7, 6, 5, 4, 3, 2, 1

First, I try to build a list of all possible sums from the first n/2 rows. Then I do the same with the remaining n/2 rows.

By "all possible sums" of the first rows I mean the following:

Example:
Row 1: 1, 2, 3
Row 2: 3, 2, 1

All possible sums list: 1 + [3, 2, 1]; 2 + [3, 2, 1]; 3 + [3, 2, 1]

Final list: [4, 3, 2, 5, 4, 3, 6, 5, 4] (At the moment I do not want to remove duplicates)
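
The definition above can be sketched in plain Python with `itertools.product`, which pairs every element of the first row with every element of the second. The variable names here are illustrative only:

```python
from itertools import product

row1 = [1, 2, 3]
row2 = [3, 2, 1]

# Pair each element of row1 with each element of row2, keeping duplicates.
all_sums = [a + b for a, b in product(row1, row2)]
print(all_sums)  # [4, 3, 2, 5, 4, 3, 6, 5, 4]
```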

For my logic I have this code:

import csv

def loadCsv(filename):
    # Read every CSV row and convert each field to float.
    with open(filename, newline="") as f:
        return [[float(x) for x in row] for row in csv.reader(f)]

data = loadCsv('btest2.txt')
divider = len(data) // 2  # number of rows in each half

firstPossibleSumsList = []
secondPossibleSumsList = []


#Possible sum list for the first n/2 rows:
for i in range(len(data[0])):
    for j in range(len(data[0])):
        firstPossibleSumsList.append(data[0][i] + data[1][j])

#Possible sum list for the last n/2 rows:
for i in range(len(data[0])):
    for j in range(len(data[0])):
        secondPossibleSumsList.append(data[2][i] + data[3][j])

The problem is that I select the rows manually with data[0][i], data[1][j], data[2][i] and so on. I would like to do this more efficiently and make use of the divider variable, but I can't figure out how. My code depends on the hard-coded indices 0, 1, 2, 3, whereas I want to split the matrix rows into halves regardless of the matrix dimensions.

Erba Aitbayev
  • I could edit my answer depending on exactly how you plan to treat e.g. a `6x10` matrix, as that will determine what the outer loop should look like. Would you also try to sum the third row with both the first and second? – M.T Feb 22 '16 at 08:22
  • @M.T Is it possible to implement logic that works for any specified number of rows? I may need the list of all possible sums for 2, 3, or maybe 4 or 5 rows if the overall number of rows in the matrix is 4, 6, 8 or 10, respectively (since I am trying to get the sum list for one half of the matrix). For example, instead of a 4x10 matrix I could provide a 6x10 matrix, and it should still return the list of all possible sums of the first 3 rows (instead of 2 rows). – Erba Aitbayev Feb 22 '16 at 08:41

1 Answer

One option is to think of it as the sum of a vector and a transposed vector: adding a column vector to a row vector broadcasts to a grid of every pairwise sum. Then you could do:

import numpy as np

data = np.array(loadCsv('btest2.txt'))

firstPossibleSumsArray = (data[0,:,np.newaxis] + data[1]).flatten()

#output for the first two rows (values shown as integers):
array([ 16,  35,  18,   9,  12,  94,  14,  47,   4,  11,
        20,  39,  22,  13,  16,  98,  18,  51,   8,  15,
        34,  53,  36,  27,  30, 112,  32,  65,  22,  29,
        38,  57,  40,  31,  34, 116,  36,  69,  26,  33,
        22,  41,  24,  15,  18, 100,  20,  53,  10,  17,
        66,  85,  68,  59,  62, 144,  64,  97,  54,  61,
        33,  52,  35,  26,  29, 111,  31,  64,  21,  28,
        35,  54,  37,  28,  31, 113,  33,  66,  23,  30,
        50,  69,  52,  43,  46, 128,  48,  81,  38,  45,
        56,  75,  58,  49,  52, 134,  54,  87,  44,  51])

The final flatten turns the 10x10 array into a one-dimensional array of 100 elements; it can be omitted if the 2-D layout is acceptable.
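
To make the shapes concrete, here is a minimal sketch of the same broadcasting step on the small 3-element example from the question. The names `column`, `grid` and `flat` are illustrative and do not appear in the code above:

```python
import numpy as np

row1 = np.array([1, 2, 3])
row2 = np.array([3, 2, 1])

column = row1[:, np.newaxis]  # shape (3, 1): row1 as a column vector
grid = column + row2          # broadcasts to shape (3, 3); grid[i, j] == row1[i] + row2[j]
flat = grid.flatten()         # one-dimensional copy, shape (9,)

print(grid.shape, flat.shape)  # (3, 3) (9,)
print(flat.tolist())           # [4, 3, 2, 5, 4, 3, 6, 5, 4]
```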

A downside of using arrays is that they are less flexible than lists when it comes to resizing or appending data.

Edit:

The full code could be something like:

div = data.shape[0] // 2              # number of rows in each half
row_len_squared = data.shape[1] ** 2  # number of sums per pair of rows

# A half has div*(div-1)/2 row pairs, each contributing row_len_squared sums.
firstPossibleSumsArray = np.empty(div * (div - 1) // 2 * row_len_squared, dtype=data.dtype)

idx = 0
for row in range(div):
    for col in range(row + 1, div):
        firstPossibleSumsArray[idx:idx + row_len_squared] = \
            (data[row, :, np.newaxis] + data[col]).flatten()
        idx += row_len_squared
# Repeat the process for the second possible-sums array by replacing
# range(div) with range(div, 2*div) and range(row+1, div) with range(row+1, 2*div).

This goes through each row and sums it with each of the remaining rows in the same half of the matrix (row #1 + row #2, ..., row #1 + the last row of the half, row #2 + row #3, etc.).
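
The loop above sums rows pairwise. If the goal is instead a single list that takes one element from every row of a half (as the follow-up comment on the question describes for 3 or more rows), one possible sketch chains `np.add.outer` over the rows. The helper names `all_sums` and `half_sums` are hypothetical, and the sketch assumes that for an odd number of rows the extra row goes to the second half:

```python
from functools import reduce

import numpy as np


def all_sums(rows):
    """Return every sum that takes exactly one element from each row (1-D array)."""
    # np.add.outer(acc, row)[i, j] == acc[i] + row[j]; ravel() keeps the
    # accumulator one-dimensional so the next row can be combined the same way.
    return reduce(lambda acc, row: np.add.outer(acc, row).ravel(), rows)


def half_sums(data):
    """Split the matrix rows into two halves and build the sum list of each."""
    data = np.asarray(data)
    divider = data.shape[0] // 2
    return all_sums(data[:divider]), all_sums(data[divider:])


first, second = half_sums([[1, 2, 3],
                           [3, 2, 1],
                           [1, 1, 1],
                           [0, 0, 1]])
print(first.tolist())   # [4, 3, 2, 5, 4, 3, 6, 5, 4]
print(second.tolist())  # [1, 1, 2, 1, 1, 2, 1, 1, 2]
```

Note that a half with k rows of m elements yields m**k sums, so the result grows quickly with the number of rows.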

M.T