2

We have an application that does a LOT of logging. The medium we log to is SLC SSD drives; however, we are starting to see some failures in the field. We could turn logging off (we do) and we have log levels, but sometimes an engineer turns on logging to diagnose a fault and forgets to turn it off, which results in a failed SSD some time later.

Looking at the logging code, we save each log entry to a queue and, every 5 seconds, iterate over the collection and use File.AppendAllText to write each line to the file.
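
In rough outline, the current code looks something like this (a sketch with illustrative names, not our real code):

    using System;
    using System.Collections.Concurrent;
    using System.IO;

    static class Logger
    {
        static readonly ConcurrentQueue<string> Pending = new ConcurrentQueue<string>();
        const string LogPath = @"D:\logs\app.log"; // placeholder path

        public static void Log(string message)
        {
            Pending.Enqueue(DateTime.UtcNow.ToString("o") + " " + message);
        }

        // Runs on a 5-second timer: one open/append/close per queued line.
        public static void FlushTick()
        {
            string line;
            while (Pending.TryDequeue(out line))
            {
                File.AppendAllText(LogPath, line + Environment.NewLine);
            }
        }
    }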

According to MSDN, this opens the file, appends the text and then closes the file.

What would be a better regime to use to achieve the same functionality but prevent (or reduce) damage to the SSD?

Would it be better to open a FileStream at software start, write to the stream during use and close it before the software quits? How would this alleviate the situation at the disk level? What processes are involved, and how is this better than opening the file and closing it immediately? Using a FileStream 'feels' better, but I need a more concrete rationale before making changes.
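
For comparison, the long-lived stream version I have in mind would look roughly like this (again only a sketch):

    using System;
    using System.IO;

    // One FileStream/StreamWriter opened at startup and kept for the life of the process.
    public sealed class PersistentLogWriter : IDisposable
    {
        private readonly StreamWriter _writer;

        public PersistentLogWriter(string path)
        {
            var stream = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read);
            _writer = new StreamWriter(stream);
        }

        public void WriteLine(string message)
        {
            _writer.WriteLine(message);   // buffered in the StreamWriter until flushed
        }

        public void Flush()
        {
            _writer.Flush();              // push buffered text to the OS, e.g. on the 5-second tick
        }

        public void Dispose()
        {
            _writer.Dispose();            // flush and close before the software quits
        }
    }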

Maybe there is a better way that we haven't considered.

DidIReallyWriteThat
Sparers
  • SSD has a limited write life. Look to a different type of disk technology or another brand of SSD. http://www.pcworld.com/article/2043634/how-to-stretch-the-life-of-your-ssd-storage.html – paparazzo Dec 01 '14 at 16:10
  • We use SSD for other reasons and we cannot change the 100's of systems out there. We already use what we think is the most robust brand of SSD (cost vs quality). I'm looking for an answer that looks at how .NET writes to the drive and the mechanics of that process. – Sparers Dec 01 '14 at 16:13
  • A write every 5 seconds is a write every 5 seconds. I don't think leaving a file open is going to be a magic bullet. A FileStream writes to disk. An MLC has a nominal life of 10,000 write cycles. At one write every 5 seconds that is only 14 hours. – paparazzo Dec 01 '14 at 16:50
  • Oops. I meant SLC, not MLC – Sparers Dec 01 '14 at 17:18
  • Even at 1 million cycles that is less than 60 days at one write every 5 seconds. I hope you get a fix with .NET, but I would still go at this with a database on a regular drive. Maybe hold it in memory and only log every hour. – paparazzo Dec 01 '14 at 17:28
  • I now can't be positive that the logging is causing the issue, as new information this morning seems to indicate two brands, of which one is having more issues. I guess the solution (if logging is causing the problem) is to queue and commit less regularly (as you suggest, Blam) and have a timeout on the logging. – Sparers Dec 02 '14 at 10:19
  • How much volume are you writing? The SSD is not burned out after 100k writes. It is burned out after 100k *full* writes. – usr Dec 02 '14 at 13:40
  • For the amount of information that would kill an SSD, I'd look for another method of logging... like dropping the logs into a queue (maybe RabbitMQ? MSMQ? a JMS queue?) and having a separate process that dumps the data onto another medium... the SSD will be useful for fast searches on the older events. – HiperiX Dec 02 '14 at 13:44

2 Answers

1

Queue and commit less often, if you have enough memory to hold the log messages. The problem there is that if the system goes down, you lose the most recent log messages.
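
Something along these lines (the one-hour interval and the names are only for illustration):

    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Text;
    using System.Threading;

    public sealed class BufferedLogger : IDisposable
    {
        private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();
        private readonly Timer _timer;
        private readonly string _path;

        public BufferedLogger(string path)
        {
            _path = path;
            // Commit once an hour instead of every 5 seconds.
            _timer = new Timer(_ => Flush(), null, TimeSpan.FromHours(1), TimeSpan.FromHours(1));
        }

        public void Log(string message)
        {
            _queue.Enqueue(message);   // held in memory; lost if the process dies before the next flush
        }

        public void Flush()
        {
            var sb = new StringBuilder();
            string line;
            while (_queue.TryDequeue(out line))
            {
                sb.AppendLine(line);
            }
            if (sb.Length > 0)
            {
                File.AppendAllText(_path, sb.ToString());   // one physical write per hour
            }
        }

        public void Dispose()
        {
            _timer.Dispose();
            Flush();   // last chance to get buffered messages onto disk
        }
    }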

paparazzo
0

This is not so much about the number of writes as about the number of SSD pages written. The more you buffer, and the fewer physical writes you cause, the better.

Calling AppendAllText to append a single line is a very inefficient way to do this. It burns a lot of CPU because many objects and handles must be created, opened and closed for each line. Each change in file size also causes an NTFS log flush when that change hardens.

Write all the data out with one AppendXxx call every five seconds, or build something similar using a FileStream. You can leave the stream open or not; it doesn't matter. One additional IO every five seconds is meaningless for endurance.

It is not possible to be more efficient than this. This scheme writes the minimal amount of data in a sequential way.
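
Roughly like this (pendingLines and logPath stand in for whatever you already have; the 5-second timer wiring is omitted):

    using System.Collections.Generic;
    using System.IO;

    // On each 5-second tick: drain whatever is queued, then make a single append call.
    static void FlushBatch(Queue<string> pendingLines, string logPath)
    {
        if (pendingLines.Count == 0)
        {
            return;
        }

        var batch = new List<string>(pendingLines);
        pendingLines.Clear();

        // One open/write/close per interval instead of one per log line.
        File.AppendAllLines(logPath, batch);
    }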

Consider compressing what you write.
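
For example, compress each batch and append it as a self-contained gzip member (names are placeholders; a file made of concatenated gzip members is still valid for tools like gzip and zcat):

    using System.IO;
    using System.IO.Compression;
    using System.Text;

    // Compress one batch of log text and append the result to the log file.
    static void AppendCompressed(string logPath, string batchText)
    {
        byte[] raw = Encoding.UTF8.GetBytes(batchText);

        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(raw, 0, raw.Length);
            }

            using (var file = new FileStream(logPath, FileMode.Append, FileAccess.Write))
            {
                buffer.WriteTo(file);   // fewer bytes written means fewer SSD pages touched
            }
        }
    }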

usr