How to save memory when reading a file in PHP?

Question

I have a 200 KB file that I use on multiple pages, but each page needs only 1-2 lines of it. How can I read only the lines I need if I know their line numbers?

For example, if I need only the 10th line, I don't want to load all the lines into memory, just the 10th line.

Sorry for my bad English!

Gordon · Answer 1 · 2010-04-09T07:08:04.947

19

Try SplFileObject

echo memory_get_usage(), PHP_EOL;        // 333200

$file = new SplFileObject('bible.txt');  // 996kb
$file->seek(5000);                       // jump to line 5000 (zero-based)
echo $file->current(), PHP_EOL;          // output current line 

echo memory_get_usage(), PHP_EOL;        // 342984 vs 3319864 when using file()

For outputting the current line, you can either use current() or just echo $file. I find it clearer to use the method though. You can also use fgets(), but that would get the next line.

Of course, you only need the middle three lines. I've added the memory_get_usage() calls just to show that this approach uses almost no extra memory.
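If you need this on several pages, the seek-and-read steps could be wrapped in a small helper. This is only a sketch: the function name readLineAt and the 1-based numbering are my own assumptions, not part of SPL. It deliberately does not handle line numbers past the end of the file, because what seek() does there has changed between PHP versions.

```php
<?php
// Hypothetical helper (not part of SPL): return one line of a file,
// counting from 1 the way a text editor does.
function readLineAt(string $path, int $lineNumber): string
{
    if ($lineNumber < 1) {
        throw new InvalidArgumentException('Line numbers start at 1.');
    }

    $file = new SplFileObject($path);   // opens the file, does not load it
    $file->seek($lineNumber - 1);       // seek() itself is zero-based

    // current() includes the line ending, so strip it.
    return rtrim((string) $file->current(), "\r\n");
}

// Usage with a throwaway file so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "first\nsecond\nthird\n");

echo readLineAt($path, 2), PHP_EOL;     // second
unlink($path);
```

Subtracting one inside the helper keeps the zero-based detail mentioned in the comments out of the calling code.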

edited Apr 09 '10 at 07:08

answered Apr 08 '10 at 22:35

Gordon


Nice. I didn't notice that `seek` was line rather than byte based. – Yacoby Apr 08 '10 at 22:43
+1 I prefer this code because it's just less work for the programmer, and it's clearer what's happening (seeking to certain line) than `fgets`. – davidtbernal Apr 08 '10 at 22:43
@Yacoby there is `SplFileObject::fseek()` and `SplFileObject::seek()`. The latter is line based, the other is byte based. `seek()` is a method from the `SeekableIterator` interface. – Gordon Apr 08 '10 at 22:45
3

Note that the line number being `seek`-ed to is not line 5,000. The `$line_pos` parameter is zero-based, so the example seeks to line number 5,001 as it would be seen in a text editor, etc. – salathe Apr 08 '10 at 23:17

score 3 · Accepted Answer · edited May 23 '17 at 12:31


Unless you know the offset of the line, you will need to read every line up to that point. You can just throw away the old lines (that you don't want) by looping through the file with something like fgets(). (EDIT: Rather than fgets(), I would suggest @Gordon's solution)

Possibly a better solution would be to use a database, as the database engine will do the grunt work of storing the strings and allow you to (very efficiently) get a certain "line" (it wouldn't be a line but a record with a numeric ID, however it amounts to the same thing) without having to read the records before it.
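A rough sketch of what the database route might look like, assuming the pdo_sqlite extension is available. The table name, column names and the fetchLine helper are all made up for this illustration, and the in-memory database only keeps the example self-contained. In practice you would import the file once into a database on disk.

```php
<?php
// Assumption: pdo_sqlite is installed. "lines", "id" and "content" are
// illustrative names, not anything from the question.
$db = new PDO('sqlite::memory:');   // use a file path instead to persist data
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE lines (id INTEGER PRIMARY KEY, content TEXT NOT NULL)');

// One-time import: one row per line, id = 1-based line number.
$insert = $db->prepare('INSERT INTO lines (id, content) VALUES (?, ?)');
foreach (array('alpha', 'beta', 'gamma') as $index => $text) {
    $insert->execute(array($index + 1, $text));
}

// Per-page lookup: the primary key lets the engine find the row directly,
// without reading the rows before it.
function fetchLine(PDO $db, int $id): ?string
{
    $select = $db->prepare('SELECT content FROM lines WHERE id = ?');
    $select->execute(array($id));
    $content = $select->fetchColumn();  // false when no row matches
    return $content === false ? null : $content;
}

echo fetchLine($db, 2), PHP_EOL;    // beta
```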


answered Apr 08 '10 at 22:17

Yacoby


Whether a database will be faster is subjective. If the information he is trying to access is at the beginning of the file, reading the file will be a lot faster. Reading from a database is still reading from a file. He will get an improvement from the database index only if he is looking for something far from the beginning of his file. It also depends on what exactly he is trying to achieve. – Ivo Sabev Apr 08 '10 at 22:26
2

He never said the database would be faster. Only that it would be better. The OP's concern could be seen as an issue of memory rather than speed. – webbiedave Apr 08 '10 at 22:32
1

@Ivo As @webbiedave said, I never mentioned faster. I was trying to add in the suggestion that there are alternatives that *may* be a better solution to the problem rather than the first solution I suggested. – Yacoby Apr 08 '10 at 22:54

score 2 · Answer 3 · answered Apr 09 '10 at 02:41

Do the contents of the file change? If it's static, or relatively static, you can build a list of offsets where you want to read your data. For instance, if the file changes once a year, but you read it hundreds of times a day, then you can pre-compute the offsets of the lines you want and jump to them directly like this:

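 $wanted = array(10, 20);               // 1-based line numbers to index
 $offsets = array();
 $filehandle = fopen('file.txt', 'rb');
 for ($n = 1; !feof($filehandle); $n++) {
     if (in_array($n, $wanted)) {
         $offsets[$n] = ftell($filehandle); // byte offset where line $n starts
     }
     fgets($filehandle);                // skip over the line itself
 }
 fclose($filehandle);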

and so on. Afterwards, you can trivially jump to that line's location like this:

 $fh = fopen('file.txt', 'rb');
 fseek($fh, $offsets[20]); // jump to line 20

But this could be complete overkill. Try benchmarking the operations: compare how long it takes to do an old-fashioned "read 20 lines" versus precompute/jump.
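For completeness, here is a fuller sketch of the precompute/jump idea that indexes every line rather than a chosen few. The helper names buildLineOffsets and readLineByOffset are hypothetical, and how you cache the offsets between requests (a serialized file, APC, etc.) is left open.

```php
<?php
// Hypothetical helper: map 1-based line numbers to the byte offset at which
// each line starts. Run this once whenever the file changes.
function buildLineOffsets(string $path): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $offsets = array();
    $lineNumber = 1;
    $position = 0;
    while (fgets($handle) !== false) {
        $offsets[$lineNumber++] = $position; // where the line just read began
        $position = ftell($handle);          // where the next line begins
    }
    fclose($handle);
    return $offsets;
}

// Hypothetical helper: jump straight to a line using the stored offsets.
function readLineByOffset(string $path, array $offsets, int $lineNumber): ?string
{
    if (!isset($offsets[$lineNumber])) {
        return null;                         // line was never indexed
    }
    $handle = fopen($path, 'rb');
    fseek($handle, $offsets[$lineNumber]);
    $line = fgets($handle);
    fclose($handle);
    return $line === false ? null : rtrim($line, "\r\n");
}

$path = tempnam(sys_get_temp_dir(), 'offs');
file_put_contents($path, "one\ntwo\nthree\n");
$offsets = buildLineOffsets($path);          // cache this between requests
echo readLineByOffset($path, $offsets, 3), PHP_EOL; // three
unlink($path);
```

The stored offsets are only valid for the exact file contents they were built from, so they have to be rebuilt after every change to the file.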

Ivo Sabev · Answer 4 · 2010-04-08T22:52:34.777

<?php
    $lines = array(1, 2, 10);

    $handle = @fopen("/tmp/inputfile.txt", "r");
    if ($handle) {
        $i = 0;
        while (!feof($handle)) { 
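            // Read up to the next "\n"; the delimiter itself is not returned.
            $line = stream_get_line($handle, 1000000, "\n");

            if (in_array($i, $lines)) {
                echo $line, PHP_EOL;
                $line = ''; // Don't forget to clean the buffer!
            }

            // Stop as soon as the last wanted line has been handled.
            if ($i >= end($lines)) {
                break;
            }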

            $i++;
        } 
        fclose($handle);
    }
?>

score 0 · Answer 5 · answered Apr 08 '10 at 22:17


Just loop through them without storing, e.g.

$i = 1;
$file = fopen('file.txt', 'r');
while (!feof($file)) {
   $line = fgets($file); // this gets whole line from the file;
   if ($i == 10) {
       break; // break on tenth line
   }
   $i++;
}
fclose($file);

The above example would keep memory for only the last line it got from the file, so this is the most memory efficient way to do it.
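The same loop can be turned into a function that collects several lines in one pass and stops as soon as the last one has been read. This is just a sketch: the name readLines and the 1-based numbering are my own choices.

```php
<?php
// Hypothetical helper: return the requested 1-based lines, keyed by line
// number, holding only one line in memory at a time while scanning.
function readLines(string $path, array $wanted): array
{
    $found = array();
    if ($wanted === array()) {
        return $found;
    }
    $last = max($wanted);
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $lineNumber = 0;
    // Testing the fgets() result directly avoids the extra failed read
    // that a feof() loop can produce at the end of the file.
    while (($line = fgets($handle)) !== false) {
        $lineNumber++;
        if (in_array($lineNumber, $wanted, true)) {
            $found[$lineNumber] = rtrim($line, "\r\n");
        }
        if ($lineNumber >= $last) {
            break;   // nothing further is needed, stop reading
        }
    }
    fclose($handle);
    return $found;
}

// Usage with a throwaway 12-line file.
$path = tempnam(sys_get_temp_dir(), 'loop');
$content = '';
for ($n = 1; $n <= 12; $n++) {
    $content .= "line $n\n";
}
file_put_contents($path, $content);

print_r(readLines($path, array(2, 10)));  // lines 2 and 10, keyed by number
unlink($path);
```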


bisko


1

1. you forget $i++, 2. why not just check if $i == 10? – zerkms Apr 08 '10 at 22:18
Bleh, I always forget to put the increments. As for the == 10 ... again, a bad habit from parsing too much stuff with repetitions. Really sorry, fixed :) – bisko Apr 08 '10 at 22:20
stream_get_line() is faster than fgets() – Ivo Sabev Apr 08 '10 at 22:20
1

@Ivo: can you measure this difference? btw, C++ code will be faster, than php - so we need to rewrite this in C++? – zerkms Apr 08 '10 at 22:23
10,000 lines file fgets() - 27 seconds, stream_get_line() - 0.5 seconds. You can use assembler if you want. – Ivo Sabev Apr 08 '10 at 22:29
@Ivo, check your hard drive, please. 10,000 lines with fgets = ~0.000327, while stream_get_line goes by ~0.0000532. So this confirms it's faster. Not sure why, though. – bisko Apr 08 '10 at 22:36
@bisko It might be from the version as stream_get_line is faster in new versions of PHP 5+ – Ivo Sabev Apr 08 '10 at 22:52
@bisko I checked the PHP source code. fgets() is defined in file.c and stream_get_line() is defined in streamsfuncs.c. You can read their source code and see that fgets() actually calls stream_get_line with a couple of argument checks before that and some result improvements at the back, which make fgets() a bit slower. This is version 5.3.2 – Ivo Sabev Apr 08 '10 at 23:05
@Ivo, that's what I said. What puzzles me is that it took 27 seconds for fgets. – bisko Apr 08 '10 at 23:06

score 0 · Answer 6 · answered Apr 08 '10 at 22:17


Use fgets() 10 times :-) That way you never hold all 10 lines in memory at once.


zerkms


score 0 · Answer 7 · answered Apr 08 '10 at 22:55

Why are you only trying to load the first ten lines? Do you know that loading all those lines is in fact a problem?

If you haven't measured, then you don't know that it's a problem. Don't waste your time optimizing for non-problems. Chances are that any performance change you'll have in not loading the entire 200K file will be imperceptible, unless you know for a fact that loading that file is indeed a bottleneck.
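If you do want to measure, something along these lines would give a first impression. It is only a sketch under assumptions: the generated test file (2000 lines of 100 bytes, roughly 200 KB) is invented to mirror the question, and the figures will differ between PHP versions and machines.

```php
<?php
// Build a throwaway file of about 200 KB so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'bench');
$handle = fopen($path, 'w');
for ($i = 1; $i <= 2000; $i++) {
    fwrite($handle, str_pad("line $i", 99, '.') . "\n");
}
fclose($handle);

// Variant 1: load everything with file(), then pick the 10th line.
$before = memory_get_usage();
$start = microtime(true);
$lines = file($path);
$line = $lines[9];
$timeFile = microtime(true) - $start;
$withFile = memory_get_usage() - $before;   // measured while $lines is alive
unset($lines);

// Variant 2: seek to the 10th line with SplFileObject.
$before = memory_get_usage();
$start = microtime(true);
$file = new SplFileObject($path);
$file->seek(9);                             // zero-based, so this is line 10
$line = $file->current();
$timeSpl = microtime(true) - $start;
$withSpl = memory_get_usage() - $before;    // measured while $file is open

printf("file():        %d bytes, %.6f s\n", $withFile, $timeFile);
printf("SplFileObject: %d bytes, %.6f s\n", $withSpl, $timeSpl);

$file = null;                               // close the file before deleting it
unlink($path);
```

Run it a few times against your real file before deciding whether the difference matters for your pages.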
