3

I have a lot of (about 1 million in second)structural data that must be insert to database I see a lot of benchmark about sql vs noSql and type of Nosql then collect cassandra as database

but I create a benchmark to test mysql vs cassandra in write/update/select speed mysql have better performance in my benchmark, I want to know what is my mistake??

php use as programming language YACassandraPDO and cataloniaframework use as php driver and PDO use as mysql driver

my server is centOS 6.5 with 2 core CPU and 2GB RAM, mysql and cassandra have default configuration

detail of benchmark:

cassandra keyspace and column family structure: CREATE KEYSPACE test2 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 } AND durable_writes = false;

CREATE TABLE test (
    uuid int PRIMARY KEY,
    symbol_id int,
    bid int,
    ask int,
    time timestamp,
);

mysql database and table structure: CREATE DataBase test;

CREATE TABLE `test` (
    `id` INT NOT NULL ,
    `symbol_id` int,
    `bid` int,
    `ask` int,
    time timestamp,
    PRIMARY KEY (id)
)ENGINE=MyISAM;

my result of benchmark:

Insert each 100000 record in cassandra in about 26s, Insert each 100000 record in mysql in about 11s

UPDATE each 100000 in 24s in about cassandra, UPDATE each 100000 in 12s in about mysql

SELECT each 10000 in 741s in about cassandra, SELECT each 10000 in 21s in about mysql

my php code use for benchmark:

cassandra code:

$db_handle = new PDO("$dbtype:host=$dbhost;port=$dbport;cqlversion=3.0.0;dbname=$dbname", $dbuser, $dbpass);
while ($count < $rowNum){
    $stmt = $db_handle->prepare("INSERT INTO test (uuid, symbol_id, bid, ask, time) values ($count, " . rand(1, 100) . ", " . rand(1, 10000) . ", ".  rand(1, 10000).", dateof(now())); ");
    $exec = $stmt->execute();     
 }
 unset($db_handle);

mysql code:

$db_handle = new PDO("$dbtype:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);
while ($count < $rowNum){
    $stmt = $db_handle->prepare("INSERT INTO test (id, symbol_id, bid, ask, time) values ($count, " . rand(1, 100) . ", " . rand(1, 10000) . ", ".  rand(1, 10000).", now()); ");
    $exec = $stmt->execute();
}
unset($db_handle);
Community
  • 1
  • 1
shahab
  • 171
  • 2
  • 11
  • Why do you think you did something wrong? – Dan Bracuk Sep 03 '14 at 13:28
  • 2
    @shahab . . . Your test is likely more of the `prepare` time for the queries than of the underlying database, and I wouldn't be surprised if MySQL makes better use of a query cache. You can test this by moving the prepare outside the while and running it a zillion times. Another possibility is to prepare the statement once with parameters and then just assign the parameters inside the loop. – Gordon Linoff Sep 03 '14 at 13:53
  • 1
    It could be also that the driver is slow. MySQL’s driver is years old and was improved over that time. Cassandra’s driver is rather new comparingly. – peter Sep 03 '14 at 16:53
  • @DanBracuk In most benchmark show cassandra write is better than mysql but my benchmark show opposite of it, why? – shahab Sep 04 '14 at 04:46
  • @zahorak for cassandra 2.0 I just find this two framewrok good, and no one good enough – shahab Sep 04 '14 at 04:47
  • @GordonLinoff I test this way that you say. but don't change time – shahab Sep 04 '14 at 06:15
  • Are you just using a single client process? You are unlikely to be able to max out Cassandra without using multiple clients, preferably on multiple machines. – DNA Sep 17 '14 at 21:58
  • @DNA yes, I use just one node, but my mysql is not cluster and is in a just one system, but I think cassnadra must have more speed in equal position – shahab Sep 18 '14 at 15:35
  • 1
    I'm talking about client processes, not server. If you have a single client writing to cassandra, you won't get anywhere near to the maximum write speed, even for a single server node - cassandra is designed for a lot of concurrency. – DNA Sep 18 '14 at 23:21

2 Answers2

0

Disable possible MySQL caching by saying

SELEC SQL_NO_CACHE ...

MySQL checks the PRIMARY KEY integrity with each INSERT/UPDATE. MariaDB at least can disable this for updates, might work in MySQL, too.

Benvorth
  • 7,416
  • 8
  • 49
  • 70
0

if you wanna test cassandra, you can simply use cassandra-stress tools, installed with datastax. you can find it in C:\Program Files\DataStax-DDC\apache-cassandra\tools\bin it's a bat file. no need to write even a line of code, simply execute it with desired parameter and benchmark the cassandra.

Ali Ziaee
  • 569
  • 5
  • 10