
I am using an API service from a provider. The API usage looks like:

https://api.thesite.com/getTable1Records?offset=0 
https://api.thesite.com/getTable2Records?offset=0 
https://api.thesite.com/getTable3Records?offset=0 

(not the real addresses). Each API call returns JSON containing 1000 records.

The first time, I retrieved all the records and saved them to my database server. User searching/processing runs on my database server. This is the way the API service provider recommended.

The API service provider updates their database whenever the data change. I have no way to know when they change or what they change. They might add new records, update existing ones, or delete some. I need to update my database periodically (weekly, every Monday, or twice a week is OK).
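For the scheduling part, a cron entry can trigger the sync script at the intervals mentioned above; the script path and log path here are just examples:

    # Run the sync every Monday at 02:00 (paths are placeholders)
    0 2 * * 1 php /path/to/sync_tables.php >> /var/log/api_sync.log 2>&1

Running it off-peak matters because the sync holds the database busy for a while on million-record tables.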

Here is my PHP code which updates one of the tables:

// Update Table1 
echo "STARTED@" . time() . "<br />\n"; // just for log
$offset   = 0;
$username = "username";
$password = "password";
$url      = "https://api.thesite.com/getTable1Records";

$c = curl_init();
// Options that don't change between requests only need to be set once
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
// WARNING: disabling SSL verification allows man-in-the-middle attacks;
// in production use CURLOPT_SSL_VERIFYHOST = 2 and CURLOPT_SSL_VERIFYPEER = 1
curl_setopt($c, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($c, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($c, CURLOPT_USERPWD,        "$username:$password");
do{
    curl_setopt($c, CURLOPT_URL, "$url?offset=$offset");
    $json    = curl_exec($c);
    $phpobj  = json_decode($json);
    if( !is_array($phpobj) ){
        break; // network or JSON failure: stop instead of looping forever
    }
    $offset  += 1000;
    update($phpobj);
    echo "1000UPDATED@" . time() . "<br />\n"; // just for log
}while( count($phpobj) > 0 );
curl_close($c);
echo "ENDED@" . time() . "<br />\n"; // just for log

function update($phpobj){    
    // NOTE: this opens a new DB connection for every 1000-record page;
    // opening one connection once and reusing it would be faster
    $host  = "localhost";
    $user  = "root";
    $pass  = "";
    $db    = "theapitest";
    $link  = mysqli_connect($host, $user, $pass, $db);    

    for( $i = 0; $i < count($phpobj); $i++ ){
        $row   = $phpobj[$i];
        $id    = mysqli_real_escape_string($link, $row->id);
        $name  = mysqli_real_escape_string($link, $row->name);
        // One query per record: a separate round trip for each of the 1000 rows
        $query = "INSERT INTO `tablename` VALUES('$id', '$name') 
                  ON DUPLICATE KEY UPDATE `name`='$name'";
        mysqli_query($link, $query);
    }
    mysqli_close($link);
}//end function

The problems are:

  1. It is too slow. Some tables have millions of records. (Any better ways?)
  2. Some tables have no primary key, so I cannot use INSERT INTO ... ON DUPLICATE KEY UPDATE.
  3. Record deletion: I don't know how to detect deleted records; deleting everything and re-inserting it all doesn't seem like the best idea.
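For problem 1, one common speedup is to batch each 1000-record page into a single multi-row INSERT instead of issuing 1000 separate queries. A minimal sketch, assuming the same two-column table as above and an already-open mysqli connection:

    // Sketch: one multi-row upsert per API page instead of 1000 single-row queries.
    // Assumes $link is an open mysqli connection and $phpobj the decoded JSON array.
    function updateBatch($link, $phpobj){
        if( count($phpobj) === 0 ) return;
        $values = array();
        foreach( $phpobj as $row ){
            $id     = mysqli_real_escape_string($link, $row->id);
            $name   = mysqli_real_escape_string($link, $row->name);
            $values[] = "('$id', '$name')";
        }
        // One round trip per page; VALUES(`name`) refers to the incoming value
        $query = "INSERT INTO `tablename` VALUES " . implode(',', $values) . "
                  ON DUPLICATE KEY UPDATE `name`=VALUES(`name`)";
        mysqli_query($link, $query);
    }

This still relies on a primary key for the upsert, so it helps with problem 1 but not problems 2 and 3.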
  • You can approach it this way: 1. Create another table where you dynamically insert and update. 2. When all the inserts/updates for a given interval are done, dump all the records from this table into your primary table. 3. The primary table is what your site uses for searching and other tasks. 4. Don't update your primary table until all the records have been fetched from the API. – Viswanath Polaki Feb 24 '14 at 08:19
  • @eggyal ... is that all we can do? I hope there's a better way. – Tun Zarni Kyaw Feb 24 '14 at 08:19
  • You might want to try more of a database approach using LOAD DATA INFILE. Get the data as csv from the API and then load it into MySQL. – smoore4 Feb 24 '14 at 08:49
  • @ViswanathPolaki I have that idea also. – Tun Zarni Kyaw Feb 24 '14 at 08:55
  • @SQLDBA I will have a try `LOAD DATA INFILE`. – Tun Zarni Kyaw Feb 24 '14 at 08:56
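The staging-table idea from the comments can be sketched in SQL: load each full API dump into a shadow table, then atomically swap it in. Because the old data is replaced wholesale, this also covers deletions (problem 3) and tables without a primary key (problem 2). Table names here are illustrative:

    -- 1. Build an empty copy of the live table
    CREATE TABLE `tablename_staging` LIKE `tablename`;

    -- 2. Insert all fetched API records into `tablename_staging`
    --    (via the PHP loop, or LOAD DATA INFILE from a CSV dump)

    -- 3. Atomic swap: searches keep hitting the old data until this instant
    RENAME TABLE `tablename`         TO `tablename_old`,
                 `tablename_staging` TO `tablename`;

    -- 4. Discard the previous snapshot
    DROP TABLE `tablename_old`;

RENAME TABLE in MySQL renames multiple tables in one atomic operation, so readers never see a half-loaded table.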
