
I'm reading data from an LSM9DS1 IMU over SPI and want to store the data in a file. I tried saving it as a .txt file using with open ... as file and .write(); each sample takes about 0.002 s.

while flag:
    file_path_g = '/home/pi/Desktop/LSM9DS1/gyro.txt'
    with open(file_path_g, 'a') as out_file_g:
        dps = dev.get_gyro()
        out_file_g.write(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f'))
        out_file_g.write(" {0:0.3f}, {1:0.3f}, {2:0.3f}\n".format(dps[0], dps[1], dps[2]))

    file_path_a = '/home/pi/Desktop/LSM9DS1/accel.txt'
    with open(file_path_a, 'a') as out_file_a:
        acc = dev.get_acc()
        out_file_a.write(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f'))
        out_file_g.write(" {0:0.3f}, {1:0.3f}, {2:0.3f}\n".format(acc[0], acc[1], acc[2]))
    # time.sleep(0.2)

print("interrupt occured")

dev.close()

I also tried using pandas to save the data as a .csv file; this was slower than the first approach.

while flag:
    t = time.time()
    acc = dev.get_acc()
    dps = dev.get_gyro()
    ax = acc[0]
    ay = acc[1]
    az = acc[2]
    gx = dps[0]
    gy = dps[1]
    gz = dps[2]
    result = pd.DataFrame({'time': t, 'ax': ax, 'ay': ay, 'az': az, 'gx': gx, 'gy': gy, 'gz': gz}, index=[0])
    result.to_csv('/home/pi/Desktop/LSM9DS1/result.csv', mode='a', float_format='%.6f',
                  header=False, index=False)

dev.close()

What can I do to improve the reading speed?

Update: I changed the code to move the file path outside the loop.

file_path = '/home/pi/Desktop/LSM9DS1/result.txt'
while flag:
    with open(file_path, 'a') as out_file:
        acc = dev.get_acc()
        dps = dev.get_gyro()
        out_file.write(datetime.datetime.now().strftime('%S.%f'))
        out_file.write(" {0:0.3f}, {1:0.3f}, {2:0.3f}".format(acc[0], acc[1], acc[2]))
        out_file.write(" {0:0.3f}, {1:0.3f}, {2:0.3f}\n".format(dps[0], dps[1], dps[2]))

This is the other approach:

while flag:
    t = time.time()
    acc = dev.get_acc()
    dps = dev.get_gyro()
    arr = [[t, acc[0], acc[1], acc[2], dps[0], dps[1], dps[2]]]  # one row, 2-D
    np_data = np.array(arr)
    result = pd.DataFrame(np_data, index=[0])
    result.to_csv('/home/pi/Desktop/LSM9DS1/result.csv', mode='a', float_format='%.6f', header=False, index=False)

Thanks to Mark for his answer. I followed his suggestion and changed the code as below.

samples=[]
for i in range(100000):
    t = time.time()
    acc = dev.get_acc()
    dps = dev.get_gyro()
    # Append a tuple (containing time, acc and dps) onto sample list
    samples.append((t, acc, dps))

name = ['t', 'acc', 'dps']
f = pd.DataFrame(columns=name, data=samples)
f.to_csv('/home/pi/Desktop/LSM9DS1/result.csv', mode='a', float_format='%.6f', header=False, index=False)
print('done')

I measured the interval between consecutive samples (over the first 600); the average is 0.000265 s, almost 10 times faster than before.
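One issue with this DataFrame: the acc and dps columns hold whole lists, so every CSV cell is a stringified list. Flattening each tuple into seven scalars first avoids that; a minimal sketch, assuming get_acc and get_gyro return 3-element sequences:

import pandas as pd

# Flatten each (t, acc, dps) tuple into seven scalars so that
# no CSV cell contains a list.
flat = [(t, a[0], a[1], a[2], d[0], d[1], d[2]) for t, a, d in samples]
name = ['t', 'ax', 'ay', 'az', 'gx', 'gy', 'gz']
f = pd.DataFrame(flat, columns=name)
f.to_csv('/home/pi/Desktop/LSM9DS1/result.csv', mode='a',
         float_format='%.6f', header=False, index=False)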

Yu Bohang

2 Answers


As I said in the comments: "The answer is vastly different depending on what you are trying to do! If the gyro is on a drone and you are sending the data to a PC to control the direction, you need to get the latest reading to the PC with the minimum latency - this requires no storage, and data from 4 seconds ago is useless. If you are running an experiment for 4 hours and analysing the results later, you probably want to read the gyro at the maximum rate, storing it all locally and transferring it at the end - this requires more storage."

The fastest place to store a large number of samples is in a list in RAM:

samples=[]
while flag:
    t = time.time()
    acc = dev.get_acc()
    dps = dev.get_gyro()
    # Append a tuple (containing time, acc and dps) onto sample list
    samples.append((t, acc, dps))

Benchmark

Running in IPython on my desktop, this can store 2.8 million tuples per second, each containing the time and 2 lists of 3 elements each:

In [92]: %%timeit 
...:  
...: samples=[] 
...: for i in range(2800000): 
...:     t = time.time() 
...:     acc = [1,2,3] 
...:     dps = [4,5,6] 
...:     # Append a tuple (containing time, acc and dps) onto sample list 
...:     samples.append((t, acc, dps))

1.05 s ± 7.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
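At the end of the experiment you can flush the whole list to disk in one go, when a few extra seconds no longer matter. A minimal sketch using the standard csv module, assuming acc and dps are 3-element sequences, flattening each tuple so the file contains only scalar columns:

import csv

# After the acquisition loop: write all samples in a single pass,
# flattening each (t, acc, dps) tuple into seven scalar columns.
with open('/home/pi/Desktop/LSM9DS1/result.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['t', 'ax', 'ay', 'az', 'gx', 'gy', 'gz'])
    for t, acc, dps in samples:
        writer.writerow([t, *acc, *dps])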
Mark Setchell
  • Hello Mark, I want to know whether the speed of saving data to a file matters: if the saving speed is slower than the reading speed, will we lose some data? – Yu Bohang Feb 22 '20 at 09:00
  • The code I show does not write to a file. It stores the samples in a Python list in memory (RAM). – Mark Setchell Feb 22 '20 at 09:44
  • What should I do if I want to store the samples in a file? Another answer said it is better to save in a binary format. I tried using struct, but got a TypeError: "a bytes-like object is required, not 'tuple'" – Yu Bohang Feb 22 '20 at 09:57
  • I am suggesting you store your data in memory during your experiment because you said you wanted to go as fast as possible and memory is 1000s of times faster than disk. I am then suggesting you write your data to disk at the end of the experiment when it doesn't matter if it takes 12 or 15 seconds. So the format on disk is unimportant. – Mark Setchell Feb 22 '20 at 10:27
  • Thanks a lot! I tried to save the data as a csv file, but the values in the second and third columns are lists. How can I read the data in MATLAB? – Yu Bohang Feb 22 '20 at 14:24

Some ideas that may improve speed, which you could try:

  1. Use a binary format instead of text: write a binary timestamp (see: Write and read Datetime to binary format in Python) and binary floats. You can process them offline later. (A sketch combining this with idea 3 follows the list.)
  2. Call get_acc and get_gyro in parallel.
  3. Buffer a number of measurements in memory and write the whole buffer at once instead of calling write many times.
  4. Use a separate thread for writing and another for taking measurements.
  5. Rewrite in C.
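A minimal sketch of ideas 1 and 3 together, reusing dev and flag from the question and assuming get_acc and get_gyro each return three floats; the output path is just an example. Each record is packed as one double (timestamp) plus six floats, and records are flushed in batches. Note that file.write() on a binary file needs bytes, not a tuple (the TypeError mentioned in the comments above); struct converts the scalars to bytes first.

import struct
import time

RECORD = struct.Struct('<d6f')  # little-endian: 1 double + 6 floats = 32 bytes
BATCH = 1000                    # records to buffer before each write

with open('/home/pi/Desktop/LSM9DS1/result.bin', 'wb') as out:
    buf = bytearray()
    while flag:
        acc = dev.get_acc()
        dps = dev.get_gyro()
        buf += RECORD.pack(time.time(), *acc, *dps)  # scalars -> 32 bytes
        if len(buf) >= BATCH * RECORD.size:
            out.write(buf)   # one write per batch instead of per sample
            buf.clear()
    out.write(buf)              # flush the remainder

Since every record has a fixed 32-byte layout, the file can be read back with struct.iter_unpack in Python or fread in MATLAB.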
Łukasz Ślusarczyk