Testing durability by automatically killing a C program at file/line?

Question

I'm writing a bit of software that uses the fsync() system call to ensure a file is persisted to disk. I've done this before, and I'm aware of the various "gotchas" (e.g., when replacing a file, you need to fsync() the file, issue a rename(), then fsync() the containing directory). Coding the software that durably writes files to disk is fine.
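For concreteness, the replace sequence described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `replace_file_durably` and the `.tmp` suffix are hypothetical, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: durably replace dir/name with the given bytes.
 * Returns 0 on success, -1 on any failure. */
int replace_file_durably(const char *dir, const char *name,
                         const void *data, size_t len)
{
    char tmp[4096], dst[4096];

    assert(dir != NULL && name != NULL);
    if (snprintf(tmp, sizeof tmp, "%s/%s.tmp", dir, name) >= (int)sizeof tmp ||
        snprintf(dst, sizeof dst, "%s/%s", dir, name) >= (int)sizeof dst)
        return -1;                       /* path too long */

    /* 1. Write the new contents to a temporary file and fsync() it. */
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) { close(fd); unlink(tmp); return -1; }
        p += n;
        left -= (size_t)n;
    }
    if (fsync(fd) != 0) { close(fd); unlink(tmp); return -1; }
    if (close(fd) != 0) { unlink(tmp); return -1; }

    /* 2. Atomically move the temporary file over the destination. */
    if (rename(tmp, dst) != 0) { unlink(tmp); return -1; }

    /* 3. fsync() the containing directory so the rename itself persists. */
    int dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc == 0 ? 0 : -1;
}
```

A call such as `replace_file_durably("/var/lib/myapp", "state", buf, len)` (path made up for illustration) would leave either the old or the new contents under that name, never a partial file.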

Testing the software is another matter. I want to verify that it operates correctly in the face of, e.g., power outages. My physical reflexes are pretty good, but not quite good enough to unplug the power cable between two CPU instructions.

How should I test the durability of a C program that writes to disk?

(As implied by the title, I'm assuming you use some sort of debugger-like software, perhaps using ptrace(), to automatically kill your program at particular files/lines, as listed in the debugging information of the executable.)
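To illustrate the kind of deterministic kill I have in mind, here is a rough sketch that compiles crash points into the program instead of driving it with ptrace(). It is not an existing tool: `CRASH_POINT()`, `crash_target`, `run_with_crash_at()` and `example_workload()` are all hypothetical names. Each call site counts itself, and the process sends itself SIGKILL when the count reaches the chosen target.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical instrumentation: which crash point (1-based) should fire.
 * 0 disables crashing. */
static int crash_target = 0;
static int crash_count = 0;

static void crash_point(const char *file, int line)
{
    if (crash_target != 0 && ++crash_count == crash_target) {
        fprintf(stderr, "crashing at %s:%d\n", file, line);
        kill(getpid(), SIGKILL);   /* no atexit handlers, no stdio flush */
    }
}
#define CRASH_POINT() crash_point(__FILE__, __LINE__)

/* Run workload() in a child that dies at the target-th crash point.
 * Returns 1 if the child was killed by SIGKILL, 0 if it ran to
 * completion, -1 on error. */
int run_with_crash_at(int target, void (*workload)(void))
{
    assert(workload != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        crash_target = target;
        crash_count = 0;
        workload();
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}

/* Stand-in workload. A real one would place CRASH_POINT() between
 * write(), fsync(), rename() and the directory fsync(). */
void example_workload(void)
{
    CRASH_POINT();   /* e.g. after writing the temporary file */
    CRASH_POINT();   /* e.g. after fsync() of the file */
    CRASH_POINT();   /* e.g. after rename() */
}
```

A harness could call `run_with_crash_at(n, workload)` for n = 1, 2, 3, ... until it returns 0, running the recovery and verification code after each killed run. Note that killing a process leaves the kernel's page cache intact, so this checks the ordering and atomicity of the steps rather than the loss of unflushed data that a real power cut would cause.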

score 0 · Answer 1 · answered Oct 11 '18 at 05:44

Be prepared for the fact that, under normal circumstances, you cannot guarantee that 100% of the data reaches the disk when an unforeseen event such as a power outage occurs. However, you can improve the odds in various ways, such as writing to disk in chunks. Choose the chunk size so that, if a chunk is not saved, either part of it can be recovered once power returns or it can be safely ignored. The chunk size should also take into account the disk cache, which lets data be written quickly, and the rate of the data stream that the program reads. Data recovery can be supported by designing your HW/SW so that some sort of backup power is available during an outage. The design either gives the system enough time to signal the SW to finish saving any remaining data, or keeps the data in RAM reserved for this purpose until main power to the CPU returns. There are several ways to test this. One is to perform the disk write operations in a thread, and to start and kill that thread from the main thread in a loop, at whatever rate you require.
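The start/kill loop suggested above could be sketched roughly as follows. Note the substitution: this sketch uses a forked child process rather than a thread, because SIGKILL always terminates the whole process and cannot target a single thread. The names `kill_after`, `kill_loop`, `example_slow_workload` and `example_verify` are hypothetical, and the verifier is a stand-in for whatever consistency check the real program needs.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child running workload(), kill it after delay_us microseconds.
 * Returns 1 if the child was killed, 0 if it finished first, -1 on error. */
int kill_after(void (*workload)(void), unsigned delay_us)
{
    assert(workload != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        workload();
        _exit(0);
    }
    struct timespec ts = {
        .tv_sec = delay_us / 1000000,
        .tv_nsec = (long)(delay_us % 1000000) * 1000
    };
    nanosleep(&ts, NULL);
    kill(pid, SIGKILL);   /* no effect on the exit status if the child already exited */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}

/* Repeat the kill with random delays below max_delay_us. verify() is a
 * hypothetical checker that returns nonzero when the on-disk state is
 * consistent. Returns the number of iterations that failed. */
int kill_loop(void (*workload)(void), int (*verify)(void),
              int iterations, unsigned max_delay_us)
{
    assert(verify != NULL);
    int failures = 0;
    for (int i = 0; i < iterations; i++) {
        unsigned delay = max_delay_us ? (unsigned)rand() % max_delay_us : 0;
        if (kill_after(workload, delay) < 0 || !verify())
            failures++;
    }
    return failures;
}

/* Stand-ins for illustration only. */
void example_slow_workload(void) { for (;;) pause(); }
int example_verify(void) { return 1; }
```

In a real test, the workload would be the code that writes to disk and the verifier would reopen the files and check that they are either fully old or fully new.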

The main part of this answer relates to durability techniques, which I specifically said weren't my main concern (testability is). The remainder of the answer gave a good suggestion, killing from an alternate thread, but randomly killing a thread is not very scientific and may not work well at all on short-lived processes. — magnus, Oct 11 '18 at 05:49
I am referring to a tester program, which would simulate the behavior by executing your portion of code in a thread. But this can only be used for testing a **portion of code**! — Vivek, Oct 11 '18 at 09:47

score 0 · Answer 2 · answered Oct 11 '18 at 10:23

The types of test you are looking for are load tests and stress tests. A load test checks whether your application can cope when a large amount of data has to be written to disk very frequently. This requires you to simulate that behaviour by feeding it input externally, e.g. via a test application that simulates the data source being read. A stress test checks how the program behaves when CPU utilization is at 100%, or when there is no free RAM or disk space left on the system. The stress-ng tool can be used to simulate CPU load scenarios. Please see [link](https://www.tecmint.com/linux-cpu-load-stress-test-with-stress-ng-tool/). Hope it helps.
