
What's the cleanest way in PHP to open a file, read the contents, and subsequently overwrite the file's contents with some output based on the original contents? Specifically, I'm trying to open a file populated with a list of items (separated by newlines), process/add items to the list, remove the oldest N entries from the list, and finally write the list back into the file.

$handle = fopen($path, 'a+');
flock($handle, LOCK_EX);
$contents = fread($handle, filesize($path));
// process contents and remove old entries
fwrite($handle, $contents);
flock($handle, LOCK_UN);
fclose($handle);

Note that I need to lock the file with flock() in order to protect it across multiple page requests. Will the 'w+' flag when fopen()ing do the trick? The PHP manual states that it will truncate the file to zero length, so it seems that may prevent me from reading the file's current contents.

Phillip
  • I would suggest writing to a different temporary file, then deleting the first and renaming the temp. – BishopRook Aug 19 '11 at 19:33
  • Did you ever find a solution to this? I'm interested in exactly this scenario (read then overwrite while the file is locked) – ralbatross Mar 02 '16 at 21:01
  • As I've not been able to find a way of getting file_put_contents to work inside of a flock (i.e. fopen/flock/file_put_contents/fclose), you can consider using fopen/flock/fread/[ftruncate/rewind]/fwrite/fclose; see the code I posted on http://php.net/manual/en/function.file-put-contents.php#122781. – user1432181 May 30 '18 at 14:11
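
A minimal sketch of the fopen/flock/fread/[ftruncate/rewind]/fwrite/fclose sequence described in the last comment; the path is a placeholder, and 'c+' opens the file for reading and writing without truncating it, creating it if necessary:

$fh = fopen('/path/to/list.txt', 'c+'); // unlike 'w+', 'c+' does not truncate
flock($fh, LOCK_EX);                    // blocks until the lock is acquired

$contents = stream_get_contents($fh);   // with 'c+' the pointer starts at the beginning

// ... process $contents here ...

ftruncate($fh, 0);                      // now discard the old contents
rewind($fh);                            // and move back to the start before writing
fwrite($fh, $contents);

flock($fh, LOCK_UN);
fclose($fh);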

3 Answers


If the file isn't overly large (that is, you can be confident that loading it won't blow PHP's memory limit), then the easiest way to go is to read the entire file into a string (file_get_contents()), process the string, and write the result back to the file (file_put_contents()); see the sketch after the list below. This approach has two problems:

  • If the file is too large (say, tens or hundreds of megabytes), or the processing is memory-hungry, you're going to run out of memory (even more so when multiple instances of the script are running).
  • The operation is destructive; if the write fails halfway through, you lose all your original data.
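
For example, a minimal sketch of that round trip for the newline-separated list in the question (the path, the new item, and $n are placeholders):

$path  = '/path/to/list.txt'; // hypothetical path
$n     = 10;                  // how many of the oldest entries to drop
$items = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$items[] = 'new item';              // add whatever this request produced
$items   = array_slice($items, $n); // assumes the oldest entries come first

file_put_contents($path, implode("\n", $items) . "\n");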

If either of these is a concern, plan B is to process the file and at the same time write to a temporary file; after successful completion, close both files, rename (or delete) the original file, and then rename the temporary file to the original filename.
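
A minimal sketch of that plan B, assuming a hypothetical process_line() helper and placeholder paths:

$source = '/path/to/list.txt';
$temp   = $source . '.tmp';

$in  = fopen($source, 'r');
$out = fopen($temp, 'w');

while (($line = fgets($in)) !== false) {
    fwrite($out, process_line($line)); // process_line() is a placeholder
}

fclose($in);
fclose($out);

// Replace the original only once the temporary file is complete;
// if anything above fails, the original stays intact.
rename($temp, $source);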

tdammers
  • Thanks. So file_get_contents() and file_put_contents() will replace fopen(), fread(), fwrite(), and fclose(). At what point should I lock & unlock the file? Unfortunately, flock() requires a file handle. – Phillip Aug 19 '11 at 20:01
  • You don't lock the original file. You simply don't touch it until you have written all the data successfully to another, temporary file. When this procedure is complete, delete or rename (for backup, in case something goes wrong) the original file and rename the temporary file to the name of the original file. This way, if something goes wrong midway, you still have your original file and a broken temporary file which you can easily discard and start over. – Dzhuneyt Jul 08 '13 at 09:51
  • Fine as long as no other process wants to change that file. Think of a counter... So you are safe on the first try, but since the original file was not locked, it can be changed by another process writing into it, or another process may read outdated information while waiting for the file lock to go away. – Calamity Jane Jul 22 '14 at 12:55

One solution is to use a separate lock file to control access.

This solution assumes that only your script, or scripts you control, will want to write to the file, since every writer needs to know to check the separate lock file before touching the main one.

$file_lock = obtain_file_lock();
if ($file_lock) {
    $old_information = file_get_contents('/path/to/main/file');
    $new_information = update_information_somehow($old_information);
    file_put_contents('/path/to/main/file', $new_information);
    release_file_lock($file_lock);
}

function obtain_file_lock() {

    $attempts = 10;
    // There are probably better ways of dealing with waiting for a file
    // lock but this shows the principle of dealing with the original
    // question.

    for ($ii = 0; $ii < $attempts; $ii++) {
        // 'c' creates the lock file if it does not exist yet and never truncates it
        $lock_file = fopen('/path/to/lock/file', 'c');
        if ($lock_file === false) {
            return false; // could not even open the lock file
        }
        // LOCK_NB makes flock() non-blocking; without it the call would simply
        // block until the lock is free and the retry loop would never run.
        if (flock($lock_file, LOCK_EX | LOCK_NB)) {
            return $lock_file;
        }
        fclose($lock_file);
        // give the other process time to release the lock
        usleep(100000); // 0.1 seconds
    }
    // This is only reached if all attempts fail.
    return false;
}

function release_file_lock($lock_file) {
    flock($lock_file, LOCK_UN);
    fclose($lock_file);
}

This should prevent a concurrently running script from reading old information, updating it, and overwriting changes that another script made after you read the file. It allows only one instance of the script at a time to read the file and then overwrite it with updated information.

While this hopefully answers the original question, it doesn't give a good solution to making sure all concurrent scripts have the ability to record their information eventually.

Mehmet Karatay

Read

$data = file_get_contents($filename);

Write

file_put_contents($filename, $data);
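
Note that neither call locks the file by itself. As a partial safeguard, file_put_contents() accepts the LOCK_EX flag, which takes an exclusive lock for the duration of the write (it does not, however, protect the whole read-modify-write cycle):

file_put_contents($filename, $data, LOCK_EX);
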
KingCrunch