6

When reading from database, I want to sort my Post entities based on two factors:

  • likes count (the more the better)
  • age (the newer the better)

Currently I have implemented it this way (as a calculated value):

@Entity
public class Post {

    // divide timestamp by a day length so after each day score decrements by 1
    @Formula("UNIX_TIMESTAMP(creation_date_time) / 24 * 60 * 60 * 1000 + likes_count")
    private long score;

    @CreationTimestamp
    private LocalDateTime creationDateTime;

    @Min(0)
    private long likesCount;
}

It works fine but may not be the best approach because:

  1. I think the RDBMS cannot make any index for score attribute.
  2. The hard-coded function UNIX_TIMESTAMP() is specific to MySQL. So this will cause problems if I want to use another database (say H2) in my test environment.
Mahozad
  • 18,032
  • 13
  • 118
  • 133
  • 3
    Store the score of the post in the database. When you update the like count, update the score. Now it can be indexed and there is no need to use database specific functions. – Strelok Oct 18 '18 at 15:58
  • But the main problem is age of the `Post`. How about that? – Mahozad Oct 18 '18 at 16:00
  • For the age of the `Post`, instead of using `UNIX_TIMESTAMP()`, which is like you said specific to MySQL, you could use the Java equivalent (see https://tecadmin.net/get-current-timestamp-in-java/) and INSERT the result in the database. I agree with @Strelok's suggestion. – Nic3500 Oct 18 '18 at 17:32
  • Using a Numeric field to store the timestamp in the database instead of one of the Date types would solve issue number 2. And maybe issue 1, I'm not sure. – Bernie Oct 25 '18 at 05:49
  • Actually, the score in your question only changes when `likesCount` changes, so @Strelok's suggestion would save you having an `@Formula`, and would easily support the use of an index. If your formula is meant to calculate a different result over time, you may want to edit your question. – Bernie Oct 25 '18 at 22:04

2 Answers2

1

I think of a solution that could be interesting and could help you maintain your score updated. Would be to create a scheduler that will be, every certain amount of time (for my example it will be done daily at 1 am), will go through all posts updating their scores, that way maintaining an updated score.

@Component
public class Scheduler {

    @Autowired
    PostService postService;

    // Dialy routine that will start at 1:00 am.
    @Scheduled(cron="0 0 1 * * *")
    public void updateDateScore() {

        // Of course, I wouldn't recommend doing this all at once.
        // I would do it in batches, but this is just to give you an idea.
        List<Post> posts = postService.getAll();
        for(Post p: posts) {
           LocalDateTime time = p.getCreationTime();
           ZoneId zoneId = ZoneId.systemDefault(); 
           long epoch = time.atZone(zoneId).toEpochSecond();
           // Your formula.
           long score = epoch / 24 * 60 * 60 * 1000 + p.getLikesCount();
           p.setScore(score);
           postService.update(p);
        }

    }

}

In order for scheduled tasks to work, you must add the following annotation to your Main class, @EnableScheduling. Of course, this will work on all RDBMS, so you won't need to worry about what DB you are using and you would have an updated index at all times.

Recommendations

  • This should be done in batches, that way it would perform a lot better.
  • I would of course paginate my getPost() method, so that I would only fetch a plausible amount to update each loop.
  • Also, I would set a maximum date to fetch posts. Anyways, after a certain amount of time a post might not be so relevant.
Alain Cruz
  • 4,757
  • 3
  • 25
  • 43
1

Use database triggers to update/maintain those side aggregate tables. Running heavy scheduled jobs for such things (which cause load spikes) really wouldn't make sense...

Also, below WHERE clause will never use indexes. Never ever.

UNIX_TIMESTAMP(creation_date_time) / 24 * 60 * 60 * 1000 + likes_count
Harly Hallikas
  • 610
  • 4
  • 13
  • Part of the requirements, is that the solution shouldn't be DB specific. So programmings triggers does fall out of the scope. Nonetheless, triggers are not very promoted, because of the obscurity they bring along with them. Adding business logic to the DB could be a bad idea. – Alain Cruz Oct 22 '18 at 11:48
  • 1
    Well... yeah... there are some companies where such requirements do exist. Howerver, based on my experience over last 20 years I haven't seen many database server changes for mature products. And most popular database engines have implemented similar functionality, although with different dialect. Usually desision to move to different database is coming from performance issues (which in most cases are a direct result of such requirement to begin with)... – Harly Hallikas Oct 23 '18 at 08:46