I have a news site that has a lot of topics. There might be millions of users following topics. I maintain a sorted set for each user to load the news belonging to the topics they follow. When an article is added or updated, I write it to the affected users' lists. Specifically, the pseudo code is as follows:
if an article is added/updated
    get all topics that the article belongs to (each article may belong to many topics)
    for each topic: get all topic followers
        for each follower: update_user_news_list(userId, articleId)
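To make the fan-out concrete, here is a minimal in-memory sketch of the pseudo code above. It uses plain Java collections instead of Redis, and every name in it (FanOutSketch, follow, onArticleChanged, newsFor) is made up for illustration; it only models the write pattern, not the real storage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory model of the fan-out on write; all names here are hypothetical.
class FanOutSketch {
    // topic -> users following that topic
    final Map<String, Set<Integer>> followersByTopic = new HashMap<>();
    // user -> article ids in that user's news list (no duplicates)
    final Map<Integer, Set<Integer>> newsListByUser = new HashMap<>();

    void follow(int userId, String topic) {
        followersByTopic.computeIfAbsent(topic, t -> new LinkedHashSet<>()).add(userId);
    }

    // Called when an article is added or updated.
    void onArticleChanged(int articleId, List<String> topics) {
        for (String topic : topics) {
            Set<Integer> followers =
                followersByTopic.getOrDefault(topic, Collections.<Integer>emptySet());
            for (int userId : followers) {
                // One write per (follower, article) pair: this is the expensive part.
                newsListByUser.computeIfAbsent(userId, u -> new LinkedHashSet<>()).add(articleId);
            }
        }
    }

    List<Integer> newsFor(int userId) {
        return new ArrayList<>(
            newsListByUser.getOrDefault(userId, Collections.<Integer>emptySet()));
    }

    public static void main(String[] args) {
        FanOutSketch feed = new FanOutSketch();
        feed.follow(1, "sports");
        feed.follow(2, "sports");
        feed.follow(2, "tech");
        feed.onArticleChanged(100, Arrays.asList("sports", "tech"));
        System.out.println(feed.newsFor(1));
        System.out.println(feed.newsFor(2));
    }
}
```

In this model a user who follows two topics of the same article still ends up with the article only once, which mirrors how adding an existing member to a sorted set does not create a duplicate.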
This is the Java code, using Jedis:
import java.util.Set;
import redis.clients.jedis.Jedis;

static final int LIMIT_BATCH = 1000;

// Processes followers in batches of LIMIT_BATCH, starting at position index.
static void addToUserHomeFeed(int index, Jedis jd) {
    int range_limit = index + LIMIT_BATCH - 1; // ZRANGE stop index is inclusive
    Set<String> list = jd.zrange("Follower:Topic:Id", index, range_limit); // get list of followers
    if (list.isEmpty()) return; // no followers left
    for (String userId : list) {
        // update user list
    }
    addToUserHomeFeed(range_limit + 1, jd); // recurse into the next batch
}
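The same batch walk can also be written as a loop instead of a recursive call. The sketch below shows only the index arithmetic, against an in-memory list rather than Redis. BatchPagingSketch, range and walk are hypothetical names, and range only imitates the inclusive start/stop behaviour of ZRANGE for non-negative indexes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for paging through a sorted set; no Redis involved.
class BatchPagingSketch {
    // Imitates ZRANGE for non-negative indexes: start and stop are both
    // inclusive, and a stop past the end is clamped to the last element.
    static List<String> range(List<String> members, int start, int stop) {
        if (start >= members.size()) return new ArrayList<>();
        int end = Math.min(stop, members.size() - 1);
        return new ArrayList<>(members.subList(start, end + 1));
    }

    // Iterative batch walk; returns the number of non-empty batches read.
    static int walk(List<String> members, int batchSize) {
        int batches = 0;
        for (int index = 0; ; index += batchSize) {
            List<String> batch = range(members, index, index + batchSize - 1);
            if (batch.isEmpty()) return batches;
            batches++; // a real implementation would update each follower's list here
        }
    }

    public static void main(String[] args) {
        List<String> followers = new ArrayList<>();
        for (int i = 0; i < 2500; i++) followers.add("user" + i);
        System.out.println(walk(followers, 1000));
    }
}
```

With 2500 members and a batch size of 1000 the walk reads three non-empty batches (1000, 1000 and 500 members) and stops on the fourth, empty read.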
The problem is that my site currently has nearly 1 million users, and some popular topics are followed by around 800,000 users; sometimes the system produces "buffer overflow" errors. Am I doing something wrong, or are there better approaches? I use Redis 2.4.