
I have a website where multiple users can write to the same file at the same moment. An example of my code is below.

PHP 5.6.20

<?php
$root=realpath($_SERVER["DOCUMENT_ROOT"]);
$var=time()."|";
echo $var."<br/>";
$filename=$root."/Testing/Testing/test1.txt";

$myfile=fopen($filename,"a");
fwrite($myfile,$var);
fclose($myfile);

$myfile=fopen($filename,"r");
$contents = fread($myfile, filesize($filename));
echo $contents;
fclose($myfile);
?>

I read that in PHP, if multiple users are trying to write to the same file at the same time, there is a chance of data corruption. I of course don't want code that may cause data corruption over the long run.

I tried to run the above code at almost the same time in my browser to simulate multiple users writing to the same file at once, and it produced no errors and no corruption in the file being written to, but I'm still not sure about future data corruption.

I read that I can use flock to make sure that two users cannot write to the file at the same time, but since I tested the above code and it produced no data corruption, I'm not sure whether I should update my code with flock or just leave it as it is.

My questions are:

1) Is there any chance of the above code corrupting the file that it's writing to?

2) If yes, will using flock solve this issue? If so, how should I implement flock in the above code?

Edit:

I know that this uncertainty can be solved by using a database, but for this case it's better to use a plain text file, so please don't suggest a DB.

thanks in advance.

John Colbir
  • This code MAY cause the file to get corrupted. You are appending to the file so it may be ok. If you are using this as a log file, then it will be fine. Using flock will cause the code to wait and will result in delays to those using it. – ryantxr Sep 02 '17 at 18:17
  • What do you mean that I'm appending to the file so it may be ok? When is it not ok? Is it when writing to the file with "w"? Because I tried that too and got no data corruption either. – John Colbir Sep 02 '17 at 18:21
  • Is there a reason your users can't write to a database instead? Files are great for reading (via includes, etc.), but you're right, you could have contention during writes. That might not even be solved with a database - for instance, if you use MySQL with its old MyISAM table type. But if you stick with InnoDB, you're all set. So just wondering why you're choosing a file in the first place. Sorry to not solve your question, though. – Lucas Krupinski Sep 02 '17 at 18:32

4 Answers

2

If two scripts attempt to write to a file at the same time, nothing stops them: fopen() does not prevent the same file from being opened by another script, which means you might find one script reading from a file while another is writing, or, worse, two scripts writing to the same file simultaneously. So it is good to use flock(). You can get more help at http://www.hackingwithphp.com/8/11/0/locking-files-with-flock . For your code you may use flock() like this:

<?php
$root=realpath($_SERVER["DOCUMENT_ROOT"]);
$var=time()."|";
echo $var."<br/>";
$filename=$root."/Testing/Testing/test1.txt";

$myfile=fopen($filename,"a");
if (flock($myfile, LOCK_EX)) {        
    fwrite($myfile,$var);
    flock($myfile, LOCK_UN); // unlock the file
} else {
    // flock() returned false, no lock obtained
    print "Could not lock $filename!\n";
}

fclose($myfile);

$myfile=fopen($filename,"r");
if (flock($myfile, LOCK_EX)) {        

    $contents = fread($myfile, filesize($filename));
    echo $contents;
    flock($myfile, LOCK_UN); // unlock the file
} else {
    // flock() returned false, no lock obtained
    print "Could not lock $filename!\n";
}
fclose($myfile);
?>
Vijay Rathore
  • Why did you use LOCK_EX when reading the file? Shouldn't you use LOCK_SH, so that multiple users can read the file at the same time, which wouldn't cause any data corruption? What do you think? – John Colbir Sep 02 '17 at 23:54
  • Yeah, you can use LOCK_SH while reading the file. But while writing you must use LOCK_EX, as only one process should be able to write at a time. – Vijay Rathore Sep 04 '17 at 03:49
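A minimal sketch of the shared-lock read discussed in the comments above (the file path and the seed line are assumptions for the demo, not from the question): readers take LOCK_SH, which lets many readers in at once while still excluding a writer holding LOCK_EX.

```php
<?php
// Demo seed so the example is self-contained (the path is an assumption).
$filename = "test1.txt";
file_put_contents($filename, time() . "|");

// Shared lock: many readers may hold LOCK_SH at the same time, while a
// writer holding LOCK_EX excludes them all (and they exclude the writer).
$myfile = fopen($filename, "r");
if ($myfile !== false && flock($myfile, LOCK_SH)) {
    $contents = stream_get_contents($myfile); // no separate filesize() call needed
    flock($myfile, LOCK_UN); // release the lock
    echo $contents;
} else {
    print "Could not lock $filename!\n";
}
if ($myfile !== false) {
    fclose($myfile);
}
?>
```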
1

I'm not familiar with flock, but my first thought is a locking or queueing mechanism. If the written data does not have to be used or shown back to the users, then a work queue would be the best choice. Write the data to a Redis- or Memcached-based system, SQL, or another type of queueing system, or just dump unique timestamped files with the interesting content into a directory that a worker can aggregate in ascending order into the master file.

For use cases where the written data triggers something the user needs feedback on - a report, a result, a response - locking might be the way to go if you can't re-architect to an async stack with eventual consistency. It's also difficult to know without knowing the load, the number of concurrent users, the number of servers, etc.
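The "unique timestamped files" idea above can be sketched roughly as follows (the spool directory, filenames, and master file are all assumptions for illustration): each request writes its own file, so no two requests ever touch the same file, and a worker later concatenates the spool in ascending order.

```php
<?php
// Request side: each request writes to its own uniquely named file,
// so there is no shared write target and no locking needed here.
$spoolDir = __DIR__ . "/spool"; // assumed directory
if (!is_dir($spoolDir)) {
    mkdir($spoolDir, 0775, true);
}
$entry = time() . "|";
// Leading timestamp keeps the names sortable; uniqid() avoids collisions.
$name = sprintf("%s/%d-%s.txt", $spoolDir, time(), uniqid("", true));
file_put_contents($name, $entry);

// Worker side: aggregate spool files into the master file in ascending order.
$files = glob($spoolDir . "/*.txt");
sort($files); // filenames start with the timestamp, so this is chronological
foreach ($files as $f) {
    file_put_contents(__DIR__ . "/master.txt", file_get_contents($f), FILE_APPEND);
    unlink($f); // remove the spool entry once it has been merged
}
?>
```

The request path never contends with another request; only the single worker touches the master file.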

Canis
1

This is a common theoretical problem for many web applications in many programming languages.

The answer is yes, it CAN cause trouble. However, the problem is largely theoretical if the content you are appending isn't very big and you don't have heavy traffic. Today's operating systems and file systems are so well optimized (caching, lazy writing, etc.) that corruption is very unlikely to happen when you close your file handles immediately after using them.

You could add something like a buffer: if you run into an error (check access rights before writing in PHP, or catch exceptions in other languages), try again after some delay, or write your buffer to a temp file and merge it with another process - you have several possibilities.
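The retry-after-a-delay idea could be sketched like this (the function name, attempt count, and delay are arbitrary assumptions): take the lock non-blockingly with LOCK_NB, back off briefly on failure, and give up after a few attempts so the caller can fall back to a buffer.

```php
<?php
// Non-blocking lock with retry: LOCK_NB makes flock() return immediately
// instead of waiting, so we can sleep briefly and try again a few times.
function appendWithRetry($filename, $data, $attempts = 5, $delayUs = 100000)
{
    $fh = fopen($filename, "a");
    if ($fh === false) {
        return false;
    }
    for ($i = 0; $i < $attempts; $i++) {
        if (flock($fh, LOCK_EX | LOCK_NB)) {
            fwrite($fh, $data);
            flock($fh, LOCK_UN); // release the lock before closing
            fclose($fh);
            return true;
        }
        usleep($delayUs); // back off before retrying
    }
    fclose($fh);
    return false; // caller could buffer $data to a temp file here
}
?>
```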

And yes, flock() is a function that could serve this purpose, but I think that would already be over-engineered.

iquellis
  • Regarding "it is very unlikely to happen, when you close your file handles immediately after using them": I can only tell you that I thought exactly the same, up to the moment cron ran my PHP script twice at the same moment on my server. It was a badly configured task, but it happens. – Artur Poniedziałek Sep 02 '17 at 21:13
  • Good example. But for a cronjob, it would be easy to write a lock ;-) – iquellis Sep 02 '17 at 22:20
  • @iquellis Can using flock really guarantee that I won't get corrupted data in the file? If yes, can you edit your answer with flock added to the code? – John Colbir Sep 03 '17 at 00:00
  • Never used flock() myself; I've only read the documentation on php.net. However, I have seen lots of primitive and bad locking code that works on extremely high-traffic APIs and does not cause problems. After all, everyone puts his pants on one leg at a time. – iquellis Sep 03 '17 at 00:20
-1

You need to change the saving mechanism from a simple file to a database with transactions.

  • I don't want to use a database, I want to use a plain text file. – John Colbir Sep 02 '17 at 18:20
  • I understand what you want, but you need to know that you cannot fully control any file from a PHP script. All the tricks above work in 99% of cases, but you cannot control the remaining 1%. This is why databases have a mechanism of transactions. – Artur Poniedziałek Sep 02 '17 at 21:04