
I was working on something when I ran into this scenario:

I have to build a history log of what the user does, and of course the user can perform a lot of different actions.

I have thought of two different ways to build it; I just need someone to help me pick the right one.

First way:

Create 2 different tables

  • History_user
  • History_type

History_user table

id | user_id | history_type (int)
 1 |    1    |        1
 2 |    3    |        2

History_type

id | name_action (string)
 1 | The user has posted on the wall
 2 | The user has changed his profile picture

and then simply join in the query on History_user.history_type = History_type.id.
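
For example, reading the log back would then be a single joined query, roughly like this (the PDO usage and connection details here are just illustrative placeholders, not part of my actual code):

<?php

// Illustrative connection only; adjust the DSN and credentials to your setup.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// One join on the indexed history_type column returns a readable log.
$stmt = $pdo->prepare(
    "SELECT hu.user_id, ht.name_action
       FROM History_user AS hu
       JOIN History_type AS ht ON ht.id = hu.history_type
      WHERE hu.user_id = ?"
);
$stmt->execute(array(1));

foreach ($stmt as $row) {
    echo $row['name_action'], "\n";
}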

Second way:

Create only the History_user table plus a helper class, for example called Converter:

<?php

class Converter {

    // Map a numeric history type to a human-readable description.
    public function history($type_history) {
        switch ($type_history) {
            case 1:
                $human_history = "The user has posted on the wall";
                break;

            case 2:
                $human_history = "The user has changed his profile picture";
                break;

            default:
                // Guard against unknown types so $human_history is always set.
                $human_history = "Unknown action";
                break;
        }

        return $human_history;
    }

}

$converter = new Converter();
$converter->history(1);

I am looking for the better way to do this, in terms of performance and maintainability. Thank you.

Fabrizio Fenoglio
  • I would just do the join, which if indexed will be very quick. Keeping the data in the tables is also far better for maintenance. – Kickstart Jan 28 '14 at 11:33

1 Answer


Both the helper and the History_type table are useful for presenting the information. In terms of performance it doesn't really matter, because on a user action you insert into only one table. When you need to display the data, you only need one extra query to fetch the action descriptions (without joins, of course, if you care about performance). So the two-table approach is more flexible and extensible.
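
For illustration, the flow could look roughly like this (PDO and the connection details are placeholders, not part of the original question):

<?php

// Placeholder connection; swap in your own data access layer.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// Logging an action touches only History_user.
$insert = $pdo->prepare("INSERT INTO History_user (user_id, history_type) VALUES (?, ?)");
$insert->execute(array(3, 2)); // user 3 changed his profile picture

// Displaying the log: one query for the raw rows, one for the descriptions.
$types = $pdo->query("SELECT id, name_action FROM History_type")
             ->fetchAll(PDO::FETCH_KEY_PAIR); // id => name_action

$rows = $pdo->prepare("SELECT user_id, history_type FROM History_user WHERE user_id = ?");
$rows->execute(array(3));

foreach ($rows as $row) {
    echo $types[$row['history_type']], "\n";
}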

You could still have that helper function, which, let's say, would hold a static cache variable: an array of id => action name that is lazy-loaded by the history() function, like this:

class Converter {
    protected static $_cache;

    // Lazily load the id => action name map from History_type, once per request.
    protected static function _loadCache() {
        if (null !== self::$_cache) {
            return;
        }
        self::$_cache = array();
        $query = "SELECT * FROM `History_type`";
        $res = mysql_query($query);
        while ($row = mysql_fetch_assoc($res)) {
            self::$_cache[(int) $row['id']] = $row['name_action'];
        }
    }

    // Return the readable description for a history type id.
    public static function history($id) {
        self::_loadCache();
        return isset(self::$_cache[$id]) ? self::$_cache[$id] : 'Undefined action';
    }
}

Converter::history(1);
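
As a usage sketch, rendering a user's log could then look like this ($historyRows here stands in for rows already fetched from History_user by whatever data-access code you use):

<?php

// Illustrative rows only; in practice these come from History_user.
$historyRows = array(
    array('user_id' => 1, 'history_type' => 1),
    array('user_id' => 3, 'history_type' => 2),
);

foreach ($historyRows as $row) {
    // The Converter resolves each numeric type to its readable description.
    echo Converter::history((int) $row['history_type']), "\n";
}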
mr. Pavlikov
  • Thank you for the answer, but when you said `you only need one extra query to fetch the action descriptions (without joins, of course, if you care about performance)`, how is running 2 queries faster than 1? Could you be more precise? I didn't understand how the helper can be useful when I have both tables. I appreciate your help. :) Cheers! – Fabrizio Fenoglio Jan 28 '14 at 11:43
  • With the two-table approach you only lose performance when presenting the log of a user's actions. The loss amounts to one extra query on an indexed field, which is not much. When inserting into the log there is no difference: you still insert into only one table. – mr. Pavlikov Jan 28 '14 at 11:45
  • Or do you mean the "no joins" part? Joins are not faster than two queries, especially when it comes to "repeated" joined values. – mr. Pavlikov Jan 28 '14 at 11:46
  • Also, a join returns too much information. If one user posted on the wall 10 times, you will get the "wall post" string 10 times in the reply to your SQL query. You don't need that if you already know that id=1 is "wall post". You can also reuse that list of id => name pairs later in code, and so on. – mr. Pavlikov Jan 28 '14 at 11:47
  • Here is a converter class example which will run the MySQL query only once to output the action name: http://pastebin.com/3XQNnLpf – mr. Pavlikov Jan 28 '14 at 11:53
  • Thank you, now it makes more sense!! :) And if I add a new "action" to the table, I just need to clear the previous cache and it will refresh with the new one, right? – Fabrizio Fenoglio Jan 28 '14 at 12:05
  • This is not the kind of "cache" like memcached or similar. It renews itself on every page load whenever this class is called, so you don't need to do anything when a new action is added. – mr. Pavlikov Jan 28 '14 at 12:08
  • Ok! I got it! Cheers! – Fabrizio Fenoglio Jan 28 '14 at 12:14
  • Repeated queries are generally a lot slower than a join on indexed fields when using PHP / MySQL. – Kickstart Jan 28 '14 at 13:27
  • Who said anything about repeating? – mr. Pavlikov Jan 28 '14 at 13:32
  • It is still 2 separate queries rather than a single fast join. Join on an indexed integer field (as in this case) will take no time at all compared to the overhead of sending an extra query to MySQL. – Kickstart Jan 28 '14 at 13:37
  • Well, sorry, I'm used to high-load systems, and there are no joins there. Joins on non-indexed fields are prohibited; joins on indexed fields are not recommended. – mr. Pavlikov Jan 28 '14 at 13:39
  • I knocked up a quick test script: executing a join 100000 times, and on the same tables executing 100000 times a read to get the lookup values (i.e. the history types) followed by a read to get the individual history user values, assigning the history type from the array. The join method takes about 14 seconds, while splitting it into 2 separate queries takes 21 seconds. A fairly major difference. – Kickstart Jan 28 '14 at 14:09
  • Oh, this depends on many things: SQL caching of the repeated requests, the number of rows in the joined query, and the number of columns you get from it. What happens when there is something like a JSON field in the DB, or just a few other columns? Duplicate them too on each join? Separate queries make abstraction easier; it would be easy to redo in MongoDB or in some API that doesn't have joins. – mr. Pavlikov Jan 28 '14 at 14:22
  • While it can be more complicated, not using joins pretty much defeats the point of using a relational database. If the fields are required on each row I would say to duplicate them on each join unless the volumes are very high (in which case I am not sure that the caching would help). Also, if the SQL is getting the data related to what turns out to be (say) half a dozen users, there is no point in getting and caching the details of several thousand other users not required at that moment. – Kickstart Jan 28 '14 at 14:30
  • Yeah. Relational databases are no more. – mr. Pavlikov Jan 28 '14 at 14:31