13

I've a web app built in PHP with Zend on a LAMP stack. I have a list of 4000 words I need to load into memory. The words have categories and other attributes, and I need to load the whole collection every time. Think of a dictionary object.

What's the best way to store it for quick recall? A flat file with something like XML, JSON or a serialized object? A database record with a big chunk of XML, JSON or a serialized object? Or 4000 records in a database table?

I realize different server configs will make a difference, but assume an out-of-the-box shared hosting plan, or WAMP locally or some other simple setup.

tshepang
  • 12,111
  • 21
  • 91
  • 136
Corey
  • 1,977
  • 4
  • 28
  • 42

9 Answers9

11

If you're using APC (or similar), your fastest result is probably going to be coding the word list directly into a PHP source file and then just require_once()'ing it.

Alex Howansky
  • 50,515
  • 8
  • 78
  • 98
  • 3
    If I would have it in a json file and do decode the json file into an array, would this still benefit from APC? Or would this approach be much slower? – Adam Jul 11 '17 at 12:33
5

In an ideal system I would say memory (memcached), disk and database. But depending on setup the database could be on several occasions faster than disk because the result could stick in the query cache.

It all depends on the environment; and if it's that critical, you should measure it. Otherwise place it where you think it is more accessible.

mhitza
  • 5,709
  • 2
  • 29
  • 52
  • 2
    "if it's that critical, you should measure it." - I could't agree more – carlosvini Feb 13 '14 at 17:55
  • Nice advice, I too believe measuring is best option, in my case the db got better results so I will stick with the db instead of the file system. – Melsi Nov 13 '14 at 12:57
3

I'd place it in a file that can be cached, saving you a lot of unnecessary database calls on a (or maybe even every?) page load. How you store it doesn't really matter, whatever works best for you. Speed-wise, 4000 words shouldn't be a problem at all.

For translations in projects I work on I always use language files containing serialized php-data which is simply easy to retrieve:

$text = unserialize(file_get_contents('/language/en.phpdata'));
Alec
  • 9,000
  • 9
  • 39
  • 43
  • I do this from time to time, I use the database as the original and run a cron Job to update the file. Just cache the file. It can save so many resources. Basically one query for the last updated date in the database if it equals the file date use the file. Make a session or cookie to always reference the file if you want, but it can easily save 1 second of queries for big projects. It isn't for every project, but there are times it is useful. – Dillon Burnett Feb 20 '17 at 20:46
3

Format the list as a PHP source and include it.

Failing that, ask yourself if it really matters how fast this will load. 4000 words isn't all that many.

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
1

if you are able to use memcached, creating the array once using any of the methods above, sending it to memcached, and then reuse it from there is probably fastest. Check the answer of Can you store a PHP Array in Memcache for an example. Basically it would look like this:

$cache = new Memcache;
$cache->connect('localhost', 11211) or die ("Could not connect");
$cache->set('words', $myarray);

and to get it:

$myarray = $cache->get('words');
Community
  • 1
  • 1
Tuncay Göncüoğlu
  • 1,699
  • 17
  • 21
1

If you need all 4000 in memory all the time, that defeats the purpose of querying a database, although I could be wrong. Serialized object sounds simple enough and I would think it would perform alright on that number of words.

Aaron Anodide
  • 16,906
  • 15
  • 62
  • 121
0

I had similar problem and run some test for it. Here are the timings for 25 000 loops:

Read one long text from DB: 9.03s Read one file: 6.26s Include php file where is variable containing the text: 12.08s

Maybe the fastest way would be to read this data (once, after restart of server) with any of this options and create database stored in memory (storage engine: memory), but it can be little to complicated, so I would prefer "read from file" option.

Juraj.Lorinc
  • 503
  • 6
  • 26
0

If you're going to serialise the list of words as XML/JSON anyway, then just use a file. I think a more natural approach is to include the list in the PHP source though.

If that list is going to change, you will have more flexibility with a database.

Tim McNamara
  • 18,019
  • 4
  • 52
  • 83
0

If you just need to know which one is faster, I'm going with the DB. In addition to the speed, using the DB is safer and easier to use. But, be careful to use a proper data type, like ntext (MS-SQL server) or BLOB (oracle).