
I want to write data with a header line into a file. The first three data lines are unique and form a 'block', which is then repeated with increments of 0.12 in x and 1 in y. The data in the file should look like:

#X  #Y  Xmin    Ymin    Z
1   1   0.0000  0.000   0.0062
1   2   0.0000  0.350   0.0156
1   3   0.0000  0.750   0.0191
1   4   0.0000  1.000   0.0062
1   5   0.0000  1.350   0.0156
1   6   0.0000  1.750   0.0191
1   7   0.0000  2.000   0.0062
1   8   0.0000  2.350   0.0156
1   9   0.0000  2.750   0.0191
2   1   0.1200  0.000   0.0062
2   2   0.1200  0.350   0.0156
2   3   0.1200  0.750   0.0191
2   4   0.1200  1.000   0.0062
2   5   0.1200  1.350   0.0156
2   6   0.1200  1.750   0.0191
2   7   0.1200  2.000   0.0062
2   8   0.1200  2.350   0.0156
2   9   0.1200  2.750   0.0191
3   1   0.2400  0.000   0.0062
3   2   0.2400  0.350   0.0156
3   3   0.2400  0.750   0.0191
3   4   0.2400  1.000   0.0062
3   5   0.2400  1.350   0.0156
3   6   0.2400  1.750   0.0191
3   7   0.2400  2.000   0.0062
3   8   0.2400  2.350   0.0156
3   9   0.2400  2.750   0.0191

I tried storing the first three lines as three lists and writing the header and the first two columns with two nested for loops, but I failed to write the repeating three-line block.

l1 = [0.0000, 0.000, 0.0062]
l2 = [0.0000, 0.350, 0.0156]
l3 = [0.0000, 0.750, 0.0191]
pitch_x = 0.12
pitch_y = 1

with open('dataprep_test.txt', 'w') as f:
    f.write('#x #y  Xmin    Ymin    Z   \n')
    for i in range(1,4,1):
        k =1
        for j in range (1,4,1):
            d_x = pitch_x*(i-1)
            d_y = pitch_y*(j-1)
            f.write('%d %d  %f  %f  %f  \n'%(i,k,(l1[0]+d_x),(l1[1]+d_y), l1[2]))
            f.write('%d %d  %f  %f  %f  \n'%(i,k+1,(l2[0]+d_x),(l2[1]+d_y), l2[2]))
            f.write('%d %d  %f  %f  %f  \n'%(i,k+2,(l3[0]+d_x),(l3[1]+d_y), l3[2]))
            k=k+3

Is there a smarter way to do it using Python's built-in functions and data structures (lists, dictionaries, etc.)?

  • Yes, absolutely. There are a few methods (your method is good, just not executed well). Alternatively, it could be faster and more convenient to use a `pandas` DataFrame. – D.L Aug 30 '22 at 10:14
  • Thanks. I am working on the loops. I do not want to involve pandas since this is not really a big data science task. I will update the code shortly. I think that if I make the first three rows instances of a class, they would be convenient to print. – rohan kundu Aug 30 '22 at 10:45
  • Okay, then go with `DictWriter`. Probably more suitable. – D.L Aug 30 '22 at 11:10
  • Here is my implementation, which does work, but I am not happy with the noodle code. Will check out `DictWriter`! Thanks for the hint! – rohan kundu Aug 30 '22 at 11:31
  • Honestly, I don't see much wrong with this code. You could rewrite generating the 5-tuples for a given range and initial data as a generator. – AKX Aug 30 '22 at 11:32
  • DictWriter won't buy you much here, imo! – AKX Aug 30 '22 at 11:32
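
For reference, the `DictWriter` route mentioned in the comments could look roughly like this. It is a minimal sketch, not the approach either answer uses: `block_rows`, `FIELDS`, and the tab delimiter are assumptions made for illustration, and an in-memory `io.StringIO` buffer stands in for the real output file.

```python
import csv
import io

# Column names taken from the header line in the question.
FIELDS = ["#X", "#Y", "Xmin", "Ymin", "Z"]


def block_rows(block, pitch_x, pitch_y, x_count=3, y_count=3):
    """Yield one dict per output row, repeating `block` on an x/y grid.

    Hypothetical helper: the name and parameters are not from the question.
    """
    n = len(block)
    for i in range(x_count):
        for j in range(y_count):
            for k, (x, y, z) in enumerate(block):
                yield {
                    "#X": i + 1,
                    "#Y": j * n + k + 1,  # continuous y index: 1..9 for a 3-row block
                    "Xmin": f"{x + pitch_x * i:.4f}",
                    "Ymin": f"{y + pitch_y * j:.3f}",
                    "Z": f"{z:.4f}",
                }


block = [(0.0, 0.0, 0.0062), (0.0, 0.35, 0.0156), (0.0, 0.75, 0.0191)]

# Stand-in for open('dataprep_test.txt', 'w', newline='').
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(block_rows(block, pitch_x=0.12, pitch_y=1))
dictwriter_text = buffer.getvalue()
```

Whether this is any nicer than plain `f.write` calls is debatable; the main gain is that the column order lives in one place (`FIELDS`).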

2 Answers


I'd just refactor the data generation into a generator function. You can also easily accept an arbitrary number of vectors.

def generate_data(initial_vectors, pitch_x, pitch_y, i_count=3, j_count=3):
    n = len(initial_vectors)  # rows per block
    for i in range(i_count):
        for j in range(j_count):
            d_x = pitch_x * i
            d_y = pitch_y * j
            # Number the y index continuously across blocks: 1..n, n+1..2n, ...
            for k, (x, y, z) in enumerate(initial_vectors, j * n + 1):
                yield (i + 1, k, (x + d_x), (y + d_y), z)


def main():
    l1 = [0.0000, 0.000, 0.0062]
    l2 = [0.0000, 0.350, 0.0156]
    l3 = [0.0000, 0.750, 0.0191]
    with open('dataprep_test.txt', 'w') as f:
        f.write('#x #y  Xmin    Ymin    Z   \n')
        for i, k, x, y, z in generate_data([l1, l2, l3], pitch_x=0.12, pitch_y=1):
            f.write(f'{i:d} {k:d} {x:.4f} {y:.3f} {z:.4f}\n')


if __name__ == '__main__':
    main()

Furthermore, if a future version of your project needs JSON output instead, you can simply call json.dumps(list(generate_data(...))).
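
As a rough illustration of the JSON idea, here is a self-contained sketch. The generator is restated so the snippet runs on its own, with the y index numbered continuously to match the table in the question; the names `vectors`, `rows`, `payload`, and `restored` are assumptions for the example.

```python
import json


def generate_data(initial_vectors, pitch_x, pitch_y, i_count=3, j_count=3):
    """Yield (x index, y index, x, y, z) tuples for the repeated block."""
    n = len(initial_vectors)
    for i in range(i_count):
        for j in range(j_count):
            # Start the y index where the previous block left off.
            for k, (x, y, z) in enumerate(initial_vectors, start=j * n + 1):
                yield (i + 1, k, x + pitch_x * i, y + pitch_y * j, z)


vectors = [[0.0, 0.0, 0.0062], [0.0, 0.35, 0.0156], [0.0, 0.75, 0.0191]]
rows = list(generate_data(vectors, pitch_x=0.12, pitch_y=1))

# Tuples are serialised as JSON arrays, so they come back as lists.
payload = json.dumps(rows)
restored = json.loads(payload)
```

Note that JSON keeps the raw floats, so any rounding to a fixed number of decimals would have to happen before dumping or when the data is displayed.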

AKX
  • I like the generator function. Still think that a dataframe is cleaner than both of our methods... – D.L Aug 30 '22 at 11:47
  • @D.L If OP wanted to use a dataframe, they'd still need to generate it somehow. Happily, my method works for that too: `pd.DataFrame(generate_data(...), columns=["nx", "ny", "x", "y", "z"])`... – AKX Aug 30 '22 at 11:56
  • Thanks!! This is a nice and cleaner implementation! Of course a DataFrame might be even better, but I am trying to avoid specialised packages :) – rohan kundu Aug 30 '22 at 11:59

You could build one list per column, extend each list with whatever rules you need, and then write the lists out row by row:

file = r'F:\code\some_file.csv'  # raw string so the backslashes are not treated as escapes
some_headers = ['x#', 'y#', 'Xmin','Ymin','Z']

# new lists
list_x = [1,1,1]
list_y = [1,2,3]
list_xmin = [0,0,0]
list_ymin = [0,0.35,0.75]
list_z = [0.0062,0.0156,0.0191]

# build new lists with whatever rules you need
for i in range(10):
    list_x.append(i)
    list_y.append(i*2)
    list_xmin.append(i)
    list_ymin.append(i*3)
    list_z.append(i)


# write to file
with open(file, 'w') as csvfile:

    # write headers (join avoids a trailing comma)
    csvfile.write(','.join(some_headers) + '\n')

    # write data, one comma-separated row per list index
    for i in range(len(list_x)):
        row = [list_x[i], list_y[i], list_xmin[i], list_ymin[i], list_z[i]]
        csvfile.write(','.join(str(value) for value in row) + '\n')

# finished
print('done')

The result would be a CSV file like this:

[image: screenshot of the resulting CSV file]
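
To apply the actual rules from the question to this column-list approach, something like the following might work. It is only a sketch: the names `base_ymin`, `base_z`, `x_count`, and `y_count` are hypothetical, `csv.writer` with `zip` is swapped in for the manual string concatenation, and a `StringIO` buffer stands in for the file.

```python
import csv
import io

# The unique three-row block from the question (Xmin is 0 for all three rows).
base_ymin = [0.0, 0.35, 0.75]
base_z = [0.0062, 0.0156, 0.0191]
pitch_x, pitch_y = 0.12, 1
x_count, y_count = 3, 3

list_x, list_y, list_xmin, list_ymin, list_z = [], [], [], [], []
for i in range(x_count):
    for j in range(y_count):
        for k in range(len(base_z)):
            list_x.append(i + 1)
            list_y.append(j * len(base_z) + k + 1)
            # round() keeps float noise out of the written values
            list_xmin.append(round(pitch_x * i, 4))
            list_ymin.append(round(base_ymin[k] + pitch_y * j, 3))
            list_z.append(base_z[k])

# Stand-in for open(file, 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["x#", "y#", "Xmin", "Ymin", "Z"])
# zip turns the five column lists into rows.
writer.writerows(zip(list_x, list_y, list_xmin, list_ymin, list_z))
csv_text = buffer.getvalue()
```

Unlike the fixed-width table in the question, this writes the values without padding zeros; formatting each number with an f-string first would be needed to reproduce the exact layout.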

D.L