
I know there are already some questions about chunking a MySQL array in PHP, but my problem is that I want to keep the output as JSON.

Scenario: I want to get data from MySQL, do some processing on it (like time formatting) and output it as JSON. The JSON is parsed in the browser and visualized with a JavaScript chart.

Problem: All of the above works, but because of the huge amount of data I get an out-of-memory error when I select bigger date ranges.

The idea of directly sending out every x lines of data does not work because of the JSON format it needs to be in: several separate JSON chunks won't do, the chart needs one single JSON document.

So in the end I need to chunk the data but keep it as one big JSON document. (And raising the memory limit is not really a solution.)
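Just to be clear about the target format: the chart expects one single JSON array of [timestamp_ms, value] pairs, so whatever the chunking looks like internally, the output has to end up in this shape (the values here are purely illustrative):

[[1388534400000,1.5],[1388538000000,1.7],[1388541600000,1.2]]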

Ideas: One idea would be to let the browser chunk the date range, request the data in chunks and then put them together. Of course this would work, but if there is a way to do this server-side, it would be better.
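The server side of that idea would only need a thin endpoint around the existing query, roughly like this sketch (chunk.php, its parameters and $chart are made-up names, and getDB() below would have to be reachable from it):

// chunk.php (hypothetical) - the browser splits the date range itself,
// requests one sub-range at a time and concatenates the resulting arrays
$from = isset($_GET['from']) ? $_GET['from'] : '';
$to   = isset($_GET['to'])   ? $_GET['to']   : '';

header('Content-Type: application/json');
echo $chart->getDB($from, $to); // $chart = instance of the class holding getDB() below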

Code:

private function getDB($date1, $date2){
    // $table comes from elsewhere in the class (not shown); `date` is a DATETIME column
    $query = 'SELECT * FROM `db1`.`'.$table.'` WHERE `date` BETWEEN "'.$date1.'" AND "'.$date2.'" ORDER BY `date`;';
    $result = $this->db->query($query);

    $array = array();
    while ($row = $result->fetch_assoc()) {
        // format each row as [timestamp in ms, float value] so the chart accepts it
        $array[] = array(strtotime($row['date']) * 1000, (float)$row['var']);
    }

    $result->close();

    return json_encode($array);
}
proc
  • How big is your data? Did you keep only the useful data to preserve memory? Show us your code, maybe we can help you preserve your memory. – Kevin Labécot Aug 15 '14 at 13:07
  • It's for an analysis; I'm talking about 500k rows for a 1-year timespan when you only select 1 table. Yes, I keep only the data I need, it's not a hobby project. (I could refuse to allow selecting a whole year, but that wouldn't satisfy the client.) The last measurement with this "bad" unchunked way took 145 MB of RAM, so no question that raising the memory size is not an option. – proc Aug 15 '14 at 13:23
  • Are you sure that you want to show all 500k points in the chart? Try to decrease the number of records using "group by". For example: SELECT *, avg(var) FROM `db1`.`'.$table.'` WHERE `date` BETWEEN "'.$date1.'" AND "'.$date2.'" group by month(`date`), day(`date`) order by `date`; – Alexander Kononenko Aug 15 '14 at 13:34
  • Yes, I'm sure about that. The chart itself also has a selector with an overview and a detailed view. – proc Aug 15 '14 at 13:42

3 Answers


Since increasing the memory limit is not an option,

ini_set("memory_limit", "32M")

perhaps you can add a LIMIT clause to the function parameters and the query:

private function getDB($date1, $date2, $start, $pageSize){
    // LIMIT has to be concatenated in - variables don't interpolate inside single-quoted strings
    $query = 'SELECT * FROM `db1`.`'.$table.'` WHERE `date` BETWEEN "'.$date1.'" AND "'.$date2.'" ORDER BY `date` LIMIT '.(int)$start.', '.(int)$pageSize.';';
    $result = $this->db->query($query);

    $array = array();
    while ($row = $result->fetch_assoc()) {
        // same per-row formatting as before, so the chart accepts it
        $array[] = array(strtotime($row['date']) * 1000, (float)$row['var']);
    }

    $result->close();

    return json_encode($array);
}

Then set up a loop in JavaScript that calls this via Ajax, incrementing the $start variable on each request.

Store each page's responseText with its outer brackets stripped (responseText.slice(1, -1)) in an array.

When a page comes back empty (the JSON for an empty page is "[]"), all of the records have been returned.

Join the array with commas, add a new opening and closing "[ ]", and you should have JSON equivalent to all the records.
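For example, two pages that come back as [[1,2],[3,4]] and [[5,6],[7,8]] get stored as [1,2],[3,4] and [5,6],[7,8]; joined with a comma and wrapped in [ ] again, that gives [[1,2],[3,4],[5,6],[7,8]], the same JSON the unchunked query would have produced (the numbers are just placeholders).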

Minimal parsing, and you'll be using built-in functions for most of it.

var startRec = 0;
var pageSize = 50;
var xmlhttp = new XMLHttpRequest();
var aryJSON = [];
var fullJSON; // renamed from "JSON" so the built-in JSON object isn't shadowed
xmlhttp.onreadystatechange = function(){
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200){
        if (xmlhttp.responseText == "" || xmlhttp.responseText == "[]"){ // an empty page comes back as "[]"
            // All records are received - rebuild one big JSON array
            fullJSON = "[" + aryJSON.join(",") + "]";
            aryJSON = [];
            startRec = 0;
        } else {
            aryJSON.push(xmlhttp.responseText.slice(1, -1)); // strip the outer [ and ]
            startRec += pageSize;
            getNextPage();
        }
    }
};
function getNextPage(){
    xmlhttp.open("GET", "your.php?start=" + startRec + "&pageSize=" + pageSize, true);
    xmlhttp.send();
}
getNextPage(); // kick off the first request
alfadog67
  • Like I wrote, increasing the memory limit is not an option, simply because that would only block multi-user usage. Also, 32 MB is not enough - at least 140 MB for a 1-year timespan in my last measurement. – proc Aug 15 '14 at 13:18
  • Yes, I see that now - sorry. I have updated my response to reflect a more appropriate answer. – alfadog67 Aug 15 '14 at 13:52
  • That would be my mentioned idea from the question, only with LIMIT instead of date-span chunks like 30-day chunks. I would give +1 for the adaptation & the non-date-processing solution if I could. – proc Aug 15 '14 at 14:03
  • Yes, it does sound much like your idea; it occurred to me too. Also, this solution should work for everyone who has a similar problem. – alfadog67 Aug 15 '14 at 14:09
  • I like it, especially because you gave it another try, but I'm waiting to see if someone figures out the real solution. I found one and don't want to waste rep. As soon as I can, you'll get a +1 :D – proc Aug 15 '14 at 14:19
  • Hi ProcTrap, I added some javascript to the answer. It's not tested, but it shouldn't need too much coaxing to get working. – alfadog67 Aug 15 '14 at 20:49
  • Ok, I marked yours as the answer; the reason for this you can check in mine. It seems like Apache is still the problem. – proc Aug 17 '14 at 07:46

I would recommend that you have the server send the browser exactly what it needs to create the table. Parsing can be a heavy task, so why have the client do that lifting?

I would have your backend send the browser some kind of data structure that represents the table (i.e. list of lists), with all the formatting already done. Rendering the table should be faster and less memory-intensive.

kbuilds
  • I'm not talking about memory intensity on the client. The server IS sending the data formatted to the client; the real problem lies in the cost of this unchunked processing. – proc Aug 15 '14 at 13:31

One way to answer this is to do the chunking on the server by streaming out the JSON piece by piece, removing the leading [ and trailing ] from each piece.

@apache_setenv('no-gzip', 1);           // stop Apache from gzipping (and therefore buffering) the output
@ini_set('zlib.output_compression', 0);
@ini_set('implicit_flush', 1);

$array = array();
echo '[';                               // open the one big JSON array by hand
$started = false;
while ($row = $result->fetch_assoc()) {
    $array[] = [strtotime($row['datetime']) * 1000, (float)$row['var']];
    if (sizeof($array) == 1000) {       // flush every 1000 rows
        if ($started) {
            echo ',';                   // separator between chunks
        } else {
            $started = true;
        }
        echo substr(substr(json_encode($array), 1), 0, -1);
        // converting  [[datetime1, value1],[datetime2, value2]]
        //         to   [datetime1, value1],[datetime2, value2]
        ob_flush();
        $array = array();
    }
}
if ($started && !empty($array)) echo ','; // separator before the remainder (skip it if nothing is left)
$this->flushJSON($array);                 // flushJSON() (not shown) writes out the remaining rows the same way
echo ']';                                 // close the array
flush();

$result->close();

This works and reduces the RAM usage to 40%. Still, it seems that Apache is buffering something, so the RAM usage increases over the time the script is running. (Yes, the flush is working; I debugged that, so that's not the problem.)

But because of that remaining increase, the fastest way to achieve clean chunking is to do it like alfadog67 pointed out.

Also, worth mentioning: I had to disable output compression, otherwise Apache wouldn't flush the output directly.
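For reference, a slightly fuller version of that setup (a sketch; apache_setenv() only exists under mod_php, and the @ suppresses warnings where a call doesn't apply):

// Make sure nothing between the echo calls and the client buffers the stream (SAPI-dependent)
@apache_setenv('no-gzip', 1);            // mod_php only: disable mod_deflate for this request
@ini_set('zlib.output_compression', 0);  // don't let PHP gzip the output
@ini_set('implicit_flush', 1);           // flush automatically after each output call
while (ob_get_level() > 0) {
    ob_end_flush();                      // close any user-level output buffers that are still open
}
header('Content-Type: application/json');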

proc