
I am having some problems parsing large text strings.

I have this line:

$tokens = token_get_all("<?php\n" . $string . "\n?>");

and when the string is small it works fine, but when the string is around 15 MB my app crashes and shows just a blank page. If I debug it with die(), a die() placed above this line works fine, but one placed below it is never reached.

Does anybody have any ideas on how to make token_get_all() parse large strings?

user2707590
  • There's really no way we can diagnose this without more detail. There could be a syntax error in your string, or it could be too big, or any number of other possible errors. Have you tried switching on `error_reporting` and `display_errors` (see the snippet below)? That way PHP will output an error message instead of a blank screen, which should help you debug. Alternatively, check the server error logs to see the error details. – Simba Aug 07 '15 at 08:03
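
For reference, here is a minimal sketch of the settings Simba mentions, placed at the top of the script. Note that `display_errors` is intended for development only and shouldn't stay on in production:

error_reporting(E_ALL);            // report every error and warning
ini_set('display_errors', '1');    // print them instead of a blank page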

1 Answer


Depending on the version of PHP you're running, 15 MB may exceed the default `memory_limit` defined in php.ini.

Add this line to the top of your script and try again:

ini_set('memory_limit', '128M');

Adding this line raises the amount of memory available to this script to 128 MB.

Evan
  • The default memory limit is already 128M and I also tried 512M, but it still doesn't work. Any other ideas? – user2707590 Aug 07 '15 at 06:44
  • It depends on the content of _$string_. – Evan Aug 07 '15 at 07:37
  • The content of _$string_ is data from a text file of around 14 MB. I increased the memory limit even more with `ini_set('memory_limit', '1024M');` and now it's working, but my PC gets damn slow with that. Do you think it's a good idea to set it to 1024M? – user2707590 Aug 07 '15 at 07:58
  • There's nothing wrong with a 1024M memory limit per se -- some programs do need that much. But if it's slowing your whole PC down, it implies that you're out of physical memory and the computer is resorting to using the hard disk swap space. You need to avoid that because it will make everything painfully slow. It's not clear what you're trying to do with your call to `token_get_all()`, but it sounds like you're giving it way too much data; maybe you need to find another way to achieve what you're trying to do. – Simba Aug 07 '15 at 08:06
  • I don't see a problem with it. Try running your script again, but with `ini_set('memory_limit', -1);` instead. That removes the restriction on the amount of memory PHP has access to for the duration of the script; I wonder if that would improve performance. Removing memory restrictions is often considered bad practice, though. – Evan Aug 07 '15 at 08:08
  • As Simba is saying, you'll need to intelligently reduce the amount of data `token_get_all()` is dealing with. Try chopping up the `$string` into smaller substrings and calling `token_get_all()` on those smaller strings, then merge the output arrays into one array (see the sketch at the end of this thread). – Evan Aug 07 '15 at 08:15
  • Actually I am using a parser to parse VDF files (Valve Data Format). I need to make an API call to obtain the file, which is around 14-15 MB, so I have no control over the file size or the data in it, and there is no other way of obtaining this data. The main problem is that this file will grow over time, and who knows if Valve will ever make an easier way of getting data from it. **FYI:** VDF is very similar to JSON but with no colons and no commas in arrays, so I can't use json_decode :( I think the whole process of parsing the 15 MB file takes around 5-7 minutes, which is OK. – user2707590 Aug 07 '15 at 09:46
  • @user2707590, check out this [existing thread](http://stackoverflow.com/questions/9301511/parsing-valve-data-format-files-in-php) – Evan Aug 07 '15 at 18:01
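
As a rough sketch of the chunking idea from the comments above: split the input on newlines, tokenize each chunk separately, and merge the results. This assumes the data is line-oriented (as VDF largely is), so that no token ever spans a chunk boundary; that assumption does not hold for arbitrary PHP source, where heredocs and multi-line comments cross lines. The function name `tokenize_in_chunks` and the chunk size here are illustrative, not from the thread:

function tokenize_in_chunks($string, $linesPerChunk = 10000)
{
    $lines  = explode("\n", $string);
    $tokens = [];
    foreach (array_chunk($lines, $linesPerChunk) as $chunk) {
        // Wrap each chunk in its own open/close tags so token_get_all()
        // only ever sees a small piece of the input at a time.
        $source = "<?php\n" . implode("\n", $chunk) . "\n?>";
        foreach (token_get_all($source) as $token) {
            // Skip the artificial open/close tags added around each chunk.
            if (is_array($token) && ($token[0] === T_OPEN_TAG || $token[0] === T_CLOSE_TAG)) {
                continue;
            }
            $tokens[] = $token;
        }
    }
    return $tokens;
}

Note that this only lowers the peak memory used while tokenizing; the merged $tokens array for a 15 MB file is still large, so processing each chunk's tokens immediately instead of merging them all would save considerably more memory.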