
I am working on characterizing an SSD drive to determine max TBW / life expectancy.

Currently I am using Bash to generate 500 MB files with random (non-zero) content:

dd if=<(openssl enc -aes-128-cbc -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) of=/media/m2_adv3d/abc${stamp1} bs=1MB count=500 iflag=fullblock&

Note: ${stamp1} is a timestamp used to ensure unique file names.

I am looking to accomplish the same result in Python but have not found an efficient way to generate the files quickly.

Looking for suggestions.

Thanks!


Update

I have been experimenting with the following and seem to have achieved a 2-second write; the files appear to be random and different:

import os

newfile = open("testfile.001", "wb")     # binary mode so the bytes from os.urandom can be written
newfile.write(os.urandom(500000000))     # generate a 500 MB file of random content
newfile.close()

I'm a little skeptical that this is truly good enough to stress an SSD. Basically I am going to loop this indefinitely; once the drive is full, delete the oldest file and write a new one, collecting SMART data every 500 files written to trend the aging (a rough sketch of that loop is below).
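Roughly like this (assuming Python 3.7+, the /media/m2_adv3d mount from the dd command above, smartmontools installed, and a placeholder /dev/sda device node; sizes and thresholds are only illustrative):

import os
import shutil
import subprocess
import time

TARGET_DIR = "/media/m2_adv3d"      # mount point of the SSD under test
FILE_SIZE = 500 * 1000 * 1000       # 500 MB per file, matching the dd command
MIN_FREE = 2 * FILE_SIZE            # start deleting old files below this much free space
SMART_EVERY = 500                   # collect SMART data every 500 files written
DEVICE = "/dev/sda"                 # placeholder device node for smartctl

files_written = 0
while True:
    # once the drive is nearly full, drop the oldest test file(s)
    while shutil.disk_usage(TARGET_DIR).free < MIN_FREE:
        test_files = [os.path.join(TARGET_DIR, f)
                      for f in os.listdir(TARGET_DIR) if f.startswith("abc")]
        if not test_files:
            raise RuntimeError("drive full but no test files left to delete")
        os.remove(min(test_files, key=os.path.getmtime))

    # write a new 500 MB file of random content with a unique, timestamped name
    name = os.path.join(TARGET_DIR, "abc%d" % time.time_ns())
    with open(name, "wb") as f:
        f.write(os.urandom(FILE_SIZE))

    files_written += 1
    if files_written % SMART_EVERY == 0:
        # append the SMART attribute table to a log for trending (smartctl usually needs root)
        with open("smart_log.txt", "ab") as log:
            log.write(subprocess.run(["smartctl", "-A", DEVICE],
                                     capture_output=True).stdout)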

Thoughts?

Thanks,

Dan.

    Perhaps if you edited the question to show the code you would like speeded up people will suggest improvements. Hard to answer without seeing the existing code. – holdenweb Feb 28 '19 at 16:22
    Thank you for the feedback @holdenweb; updated with code. – Dan G Feb 28 '19 at 17:16
  • One thought: since the IO operation is bound to take time, a threaded or asynchronous solution that allows a new random block to be generated while the last one is being written might speed things up. – holdenweb Mar 01 '19 at 12:42
  • @holdenweb ; thank you for the suggestions. Tried threading and took a performance hit ... while I seem to be able to consistently write 500MB files at 3 ~ 5 seconds a piece (linear); when I attempt to do two in parallel using threads, I am hitting between 10 ~ 17 seconds ... more towards the 17 seconds. Will post the code for reference and close this one off. Thanks! – Dan G Mar 05 '19 at 23:17
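For reference, the threaded variant discussed above was roughly a producer/consumer setup: one thread generates random blocks into a queue while the main thread drains them to disk. This is a sketch with placeholder names and sizes, not the exact benchmark code, and in my tests it came in slower than the plain linear writes:

import os
import queue
import threading

BLOCK_SIZE = 1000 * 1000        # 1 MB random blocks
BLOCKS_PER_FILE = 500           # 500 blocks -> 500 MB per file
buf = queue.Queue(maxsize=8)    # small hand-off buffer between generator and writer

def generate():
    # producer: keep a few random blocks ready ahead of the writer
    for _ in range(BLOCKS_PER_FILE):
        buf.put(os.urandom(BLOCK_SIZE))
    buf.put(None)               # sentinel: no more blocks

def write_file(path):
    # consumer: drain the queue to disk
    with open(path, "wb") as f:
        while True:
            block = buf.get()
            if block is None:
                break
            f.write(block)

gen = threading.Thread(target=generate)
gen.start()
write_file("testfile.002")      # placeholder file name
gen.join()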

2 Answers


The os.urandom option works best for generating large random files.
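If holding a full 500 MB buffer in memory is a concern, the same thing can be done in chunks; this is a minimal sketch (chunk size and file name are placeholders):

import os

CHUNK = 4 * 1024 * 1024             # 4 MiB of fresh random data per write
TOTAL = 500 * 1000 * 1000           # 500 MB file, matching the question

with open("testfile.001", "wb") as f:
    written = 0
    while written < TOTAL:
        n = min(CHUNK, TOTAL - written)
        f.write(os.urandom(n))      # random data is effectively incompressible
        written += n

Either way the data stays effectively incompressible, so a compressing controller cannot shortcut the writes.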

– Dan G

You could try something as easy as this.

import pandas as pd
import numpy as np

rows = 100000
cols = 10000

table_size = [rows, cols]       # 100,000 x 10,000 table

path = "testfile.csv"           # placeholder output path
x = np.ones(table_size)         # note: all ones (not random), ~8 GB in memory as float64
pd.DataFrame(x).to_csv(path)    # write the table out as CSV

You can adjust the table size to make the file larger or smaller. I am not sure whether this is more or less efficient than what you are already trying.