3

What happens if you do not unset an array before the script is done executing?

I am running through thousands of CSV files, parsing data for hundreds of thousands of customers into arrays. It works fine for the first 5-6 hours, then starts bogging down badly.

I run about 5-10 CSVs per execution. I'm wondering whether unsetting the arrays in the script would help this or not; I thought they would be deallocated after the script ends. Am I wrong?

Mark
  • You don't mention if you are using command line PHP or accessing it inside a web server. – Jon May 12 '11 at 20:20
  • You'd need to unset the array to clean it up between each CSV in any given execution run, assuming that each CSV has to be dealt with independently (a minimal sketch follows these comments). But as a general rule, all memory used by a program is released when the program exits, and this applies to PHP as well. – Marc B May 12 '11 at 20:25
  • @Jon it's a script I invoke through Firefox... – Mark May 12 '11 at 20:27
  • A PHP script that runs for six (or more) hours? I would really try to change the script so it doesn't need to run that long. If that's not possible, a command line script could be better since it does not require you to keep your browser open. – Arjan May 12 '11 at 20:32
  • @Arjan, it usually takes up to 10 minutes to run a batch of 5 or 10 files. I run the 5 files, then remove them from the dir, put new ones in, and run it again. I am doing so few because I didn't really code a whole lot of error handling, and if there is a problem, I don't want to sift through a huge amount of data to find it. I do this all day, since last Thursday, ha. – Mark May 12 '11 at 20:34
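
A minimal sketch of what Marc B suggests above: unset the per-file arrays between CSVs within a single run. The loop shape and file layout here are assumptions for illustration, not Mark's actual code:

    <?php
    // Hypothetical per-file loop: free each file's arrays before moving on.
    foreach (glob('csv/*.csv') as $file) {
        $rows = array();
        if (($handle = fopen($file, 'r')) !== false) {
            while (($data = fgetcsv($handle)) !== false) {
                $rows[] = $data; // accumulate this file's rows
            }
            fclose($handle);
        }
        // ... process $rows for this file ...
        unset($rows); // release the array before the next file
    }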

5 Answers

1

As far as I'm aware, arrays -- like all memory -- should die when the script does.

Is your PHP script being invoked by another PHP script? If you're doing it by 'include', that essentially takes your 'lower-level' PHP script and plugs it into the higher-level one, which would cause its variables to persist for the lifetime of the outer script.
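
For illustration, a hedged sketch of that pitfall; the file names and the $allRows variable are hypothetical. Because include executes the included file in the caller's scope, anything it accumulates lives until the outer script exits:

    <?php
    // runner.php (hypothetical): re-includes the parser once per file.
    $allRows = array();
    foreach (glob('csv/*.csv') as $csvFile) {
        include 'parse_one.php'; // assumed to append this file's rows to $allRows
        // $allRows keeps growing; nothing is freed until runner.php itself ends.
    }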

RonLugge
  • This script is a patch to correct an error I made in importing the data the first time. I had to go back and set a date field I forgot the first time, so it's bare bones: one try statement and one error-catching if statement. I am on my local machine, where I have a cloned environment of the running database, but after about 5 hours of running the script my machine stops responding until I stop and restart Apache and PHP... I wonder what the problem is then. – Mark May 12 '11 at 20:23
  • How are you looping across the files? Somehow, I don't think you're manually pointing the script at each individual file! – RonLugge May 12 '11 at 20:28
  • I have a dir /csv where I put the files I want to parse... It then reads the dir, puts the list into an array, and foreaches through each one. – Mark May 12 '11 at 20:30
  • If you're using foreach, then the program doesn't die between files. I'm assuming you're invoking a function on the elements in the array, rather than generating a new process, but I'm betting I'm right. – RonLugge May 12 '11 at 21:06
  • $files = dirfiles('csv/'); foreach ($files as $el => $val) { foreach ($val as $file) { try { if (($handle = fopen($file, "r")) != FALSE) { // Do stuff (a runnable reconstruction of this loop follows these comments) – Mark May 12 '11 at 21:08
  • Sorry, can't really paste code in a comment; it's just a procedural script to patch an error I made in processing the data the first time... nothing deployment-level. – Mark May 12 '11 at 21:10
  • You could have edited the question :D However, my statement is completely correct. You're opening the file, creating arrays from it, then moving on to the next file. You never actually end the script, so the memory never gets reclaimed. – RonLugge May 12 '11 at 21:11
  • Right, but why would it work for the first, say, 20 executions (first 5 hours) and then start going downhill after that? I only run a few at a time... weird problem. I will try to put some memory handlers in after the foreach iteration completes. – Mark May 12 '11 at 21:16
  • I'm assuming that we're talking about it taking 5 hours of continuous running before it slows down; if so, it's just because it takes that long before the memory leak takes up enough memory to cause issues. – RonLugge May 12 '11 at 21:17
  • Sorry, let me explain better... I copy 5-10 CSVs to /csv then run the script. The script can process those files in under 10 minutes. When it gets done with that, I move the files out of the dir and put new ones in... and run the script again. After about 5 hours of doing this over and over, it starts to cause my system to hang. – Mark May 12 '11 at 21:20
  • Oh. In that case I have no clue. – RonLugge May 12 '11 at 21:26
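
For reference, a runnable reconstruction of the loop from Mark's comments above, assuming dirfiles() is his custom helper returning lists of paths keyed by directory:

    <?php
    $files = dirfiles('csv/'); // Mark's helper; assumed to return path lists
    foreach ($files as $el => $val) {
        foreach ($val as $file) {
            try {
                if (($handle = fopen($file, 'r')) !== false) {
                    // Do stuff: read rows into arrays, patch the database...
                    fclose($handle); // close the handle before the next file
                }
            } catch (Exception $e) {
                echo "Failed on $file: " . $e->getMessage() . "\n";
            }
        }
    }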
1

All memory is cleared when the script ends. Have you tried using memory_get_peak_usage() and memory_get_usage()? They can be useful for finding memory allocation problems.
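
For example, a minimal sketch that logs both counters after each file; processCsv() is a hypothetical stand-in for the parsing code:

    <?php
    foreach (glob('csv/*.csv') as $file) {
        processCsv($file); // hypothetical per-file parser
        // Current allocation vs. the high-water mark for this run:
        printf("%s: now %.1f MB, peak %.1f MB\n",
            $file,
            memory_get_usage() / 1048576,
            memory_get_peak_usage() / 1048576);
    }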

mcrumley
0

All used memory should be cleaned up after the script successfully finishes. If not, that is a bug in PHP. Unsetting arrays won't help here.

0

I think it depends on which version of PHP you are running. PHP 5.3 has an improved garbage collection mechanism which should prevent this sort of memory leak. This page (http://www.php.net/manual/en/features.gc.performance-considerations.php) documents the issue in versions prior to 5.3, but suggests that you can manually invoke garbage collection using the gc_collect_cycles() function (if I've read it properly).
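
For anyone on 5.3+, a hedged sketch of forcing a collection between files; processCsv() again stands in for the real parsing code:

    <?php
    gc_enable(); // the cycle collector is on by default in 5.3, but be explicit
    foreach (glob('csv/*.csv') as $file) {
        processCsv($file);            // hypothetical parser
        $freed = gc_collect_cycles(); // reclaim circular references right now
        echo "$file: collected $freed cycles\n";
    }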

Matt Asbury
  • I'm running 5.3.4...but will definitely look into that documentation. Thanks! – Mark May 12 '11 at 20:26
  • You might want to have a look at your code as well. I recently came across something called closures in a JavaScript application that I was helping to debug. What was happening was that in IE, the memory usage was increasing every time the script interacted with the DOM. It turned out that part of the script was creating a circular reference to an object that was supposed to be destroyed, thus freeing up memory, but due to the circular reference IE's garbage collection couldn't clean these unused objects up. You can read more here - http://www.javascriptkit.com/javatutors/closures.shtml ..... – Matt Asbury May 12 '11 at 20:35
0

unset() just destroys the variable; the memory goes back to PHP's internal allocator for reuse rather than straight to the operating system. If that behaviour only occurs after hours of runtime, and this is a one-time script, maybe split the CSV files into smaller blocks to speed things up and help find out where the problem starts to occur.
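
A small self-contained demo of what unset() actually does to the counters:

    <?php
    $before = memory_get_usage();
    $big = range(1, 100000);   // allocate a large array
    $during = memory_get_usage();
    unset($big);               // destroy the variable
    $after = memory_get_usage();
    // $after drops back toward $before: PHP can reuse that space internally,
    // even though the process may not shrink from the OS's point of view.
    printf("before: %d, during: %d, after unset: %d bytes\n",
        $before, $during, $after);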

Bjoern
  • I'm running 5-10 files, where each is on average 2 MB. It usually takes about 10 minutes to run one batch. – Mark May 12 '11 at 20:29
  • Seems like a solid size. Are you sure it's PHP that's causing the problems? Maybe it's just the database files getting too big or something similar. – Bjoern May 12 '11 at 20:31
  • Could be... I'm not sure. The only way to regain control of my system is to restart Apache and PHP... – Mark May 12 '11 at 20:35
  • That just stops the script in the middle of whatever it is doing at the time. My advice: break it down into smaller pieces and start those one by one. This might narrow down the issue. – Bjoern May 12 '11 at 20:41
  • Yeah, that's why I reduced the batches to 5-10 files... I'll do them individually for a while and see what I come up with. – Mark May 12 '11 at 20:52
  • Good Luck! Please post here if you have narrowed it down. – Bjoern May 12 '11 at 20:53