I have an Activity table that records which user follows whom (fromUser and toUser). I am building a leaderboard to see who has posted the most ratings among the users I follow.

So I created this query:

ParseQuery<ParseObject> queryActivityFollowing = new ParseQuery<>("Activity");
queryActivityFollowing.whereEqualTo("type", "follow");
queryActivityFollowing.whereEqualTo("fromUser", ParseUser.getCurrentUser());
queryActivityFollowing.setLimit(500);

// innerQuery, only get Users posted by the users I follow
ParseQuery<ParseUser> queryUserFollowing = ParseUser.getQuery();
queryUserFollowing.whereMatchesQuery("toUser", queryActivityFollowing);

// querySelf
ParseQuery<ParseUser> querySelf = ParseUser.getQuery();
querySelf.whereEqualTo("objectId", ParseUser.getCurrentUser().getObjectId());

List<ParseQuery<ParseUser>> queries = new ArrayList<>();
queries.add(queryUserFollowing);
queries.add(querySelf);

ParseQuery<ParseUser> query = ParseQuery.or(queries);
query.orderByDescending("rating_count");
query.setLimit(20);

But somehow, it times out and never displays a result. Is there something inefficient with my query?

Thanks!

Edit: Data description: Activity is a class with 3 columns: fromUser, toUser, and type. fromUser and toUser are pointers to the _User class, and type is a string.

In _User, I have the standard attributes plus an integer named rating_count, which is the orderBy criterion (code updated above).

Actually, I think the query doesn't time out, but just returns 0 results. I follow some of my users so it's definitely not the expected output.

danh
Stephane Maarek
  • Please further describe the data. There are a couple of hints in the question about the relevant classes and attributes, but please include their types (e.g. fromUser is a pointer to _User?) and what they mean (e.g. fromUser follows the toUser or the other way around?). I don't see any reference to a "rating" attribute, which the text indicates might be important. – danh May 25 '15 at 19:07
  • @dahn: Thanks for your comments, I have updated the question. If I follow userB, then fromUser contains the pointer of me, and toUser contains the pointer of userB – Stephane Maarek May 25 '15 at 19:14

2 Answers

It's a tough one, because Parse's query API supports this sort of thing only minimally. The best idea I can offer is this one:

  1. One query on the Activity table with whereEqualTo("type", "follow") and whereEqualTo("fromUser", ParseUser.getCurrentUser()).
  2. No queryUserFollowing, no querySelf. These are unnecessary, which also frees you from ParseQuery.or().
  3. setLimit(1000) (the reason is explained below).
  4. include("toUser").
  5. Upon completion, loop through the results and find the largest result.getParseUser("toUser").getInt("rating_count"). The results will be instances of Activity, and you'll have eagerly fetched their related toUsers.
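
The in-memory part of step 5 could look roughly like the sketch below. It is written in JavaScript with plain objects standing in for the fetched Activity rows, so `rankFollowed` and the sample data are made-up names for illustration only; with the real SDK you would read `toUser` and `rating_count` through the object getters instead of plain properties.

```javascript
// Plain objects stand in for Activity rows fetched with include("toUser").
// rankFollowed is a hypothetical helper, not part of the Parse SDK.
function rankFollowed(activities, n) {
    return activities
        .map(function (activity) { return activity.toUser; })
        // skip rows whose toUser is missing (for example, a deleted user)
        .filter(function (user) { return user != null; })
        // map/filter return a new array, so sorting in place leaves the input untouched
        .sort(function (a, b) { return b.rating_count - a.rating_count; })
        .slice(0, n);
}

var sample = [
    { type: "follow", toUser: { username: "ann", rating_count: 12 } },
    { type: "follow", toUser: { username: "bob", rating_count: 40 } },
    { type: "follow", toUser: { username: "cy", rating_count: 7 } }
];
var top2 = rankFollowed(sample, 2);
console.log(top2.map(function (u) { return u.username; }));
```

Sorting the whole list is the simplest option; with at most a few thousand rows the cost is negligible compared with the network round trip.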

This scheme is simpler than what you coded, and will get the job done. However, a potentially major problem is that it will miss data for users with more than 1000 followers, since a single query returns at most 1000 rows. Let me know if that's a problem, and I can suggest a more complex answer. A minor shortcoming is that you'll have to do the search (or a sort) yourself in memory to find the maximum rating_count.

EDIT - For > 1k followers, you're stuck with calling the query multiple times, setting the skip to the total number of records received so far, and collecting the results in one big array.

Your point about transmitting so much data is well taken, and you can minimize the network use by putting all this into a cloud function, doing the in-memory work in the cloud and only returning the records the client needs. (This approach has the added benefit of being coded in JavaScript, which I speak more fluently than Java, so I can be more prescriptive about the code.)

EDIT 2 - Doing this in cloud code has the benefit of reducing the network traffic to just those users (say, 20) that have maximum ratings. It doesn't solve the other problems I indicated earlier. Here's how I'd do it in the cloud...

var _ = require('underscore');

Parse.Cloud.define("topFollowers", function(request, response) {
    var user = new Parse.User({id:request.params.userId});
    topFollowers(user, 20).then(function(result) {
        response.success(result);
    }, function(error) {
        response.error(error);
    });
});

// return the n top-rated users that the passed user follows (the toUser of each follow Activity)
function topFollowers(user, n) {
    var query = new Parse.Query("Activity");
    query.equalTo("type", "follow");
    query.equalTo("fromUser", user);
    query.include("toUser");
    return runQuery(query).then(function(results) {
        var allFollowers = _.map(results, function(result) { return result.get("toUser"); });
        // _.sortBy sorts ascending, so negate the rating to put the highest first
        var sortedFollowers = _.sortBy(allFollowers, function(user) { return -(user.get("rating_count") || 0); });
        return _.first(sortedFollowers, n);
    });
}

// run and rerun a query using skip until all results are gathered in results array
function runQuery(query, results) {
    results = results || [];
    query.limit(1000); // ask for the largest page; the default is only 100 rows
    query.skip(results.length);
    return query.find().then(function(nextResults) {
        results = results.concat(nextResults);
        return (nextResults.length)? runQuery(query, results) : results;
    });
}

Note - I haven't tested this, but have similar stuff working in production.
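
One way to sanity-check the skip-based paging loop without a Parse backend is to run it against a stub. In the sketch below, `makeFakeQuery` and `fetchAll` are hypothetical names: the stub only mimics `skip()` and `find()` (with `find()` returning a promise), and `fetchAll` repeats the same recursion as `runQuery` above.

```javascript
// makeFakeQuery is a hypothetical stand-in for a Parse.Query: it only mimics
// skip() and find(), returning at most pageSize rows per call.
function makeFakeQuery(rows, pageSize) {
    var offset = 0;
    return {
        calls: 0,
        skip: function (n) { offset = n; return this; },
        find: function () {
            this.calls += 1;
            return Promise.resolve(rows.slice(offset, offset + pageSize));
        }
    };
}

// Same idea as runQuery: keep skipping past what has already been collected
// until a page comes back empty.
function fetchAll(query, results) {
    results = results || [];
    query.skip(results.length);
    return query.find().then(function (page) {
        return page.length ? fetchAll(query, results.concat(page)) : results;
    });
}

var fake = makeFakeQuery([1, 2, 3, 4, 5], 2);
var done = fetchAll(fake).then(function (all) {
    console.log(all.length, fake.calls);
    return all;
});
```

Note that stopping on an empty page costs one extra round trip at the end; stopping as soon as a page comes back shorter than the page size would avoid it, at the price of having to know the page size.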

danh
  • Hi @Danh, It seems like it would work, but I think there is another inefficiency. I would have to fetch 1000 users through network (kinda costly) and then sort through them to extract the top 20. (extreme case, but you see my point). What would be a more complex answer to make sure all the processing and computing is done on Parse's side? – Stephane Maarek May 25 '15 at 19:35
  • I'm thinking I can remove the query self, add `getCurrentUser()` when the results of the followers are fetched, and see if it falls between the top20. Still interested in your second more complex answer – Stephane Maarek May 25 '15 at 19:37
  • The root problem is that what we really want is to sort on toUser.rating_count, but this kind of relational sort is not available in parse. I'm trying to think of a better approach altogether, but I don't think there is. Will edit answer to describe what I mean by the more complex solution for > 1k followers – danh May 25 '15 at 19:40
  • @Stephane - please see edit. I don't see any scenario where either user query gets you any further. The problem is that the follower query is limited to 500 in your code, or 1000 in mine. Also the first parameter to whereMatchesQuery must be a column on the class being queried. toUser is a column on Activity, not user. – danh May 25 '15 at 19:51
  • @Stephane, I have another idea, much better, imo, but it requires a rethink to your data model. (in a nutshell, a class that combines the information for following and rating... think of it as a user's public persona to other users of your app). With it, this query will be trivial, no matter how many followers, and there are a couple other benefits. I can write it up in another answer, but it only makes sense if you're willing to do some brain surgery on the app to change the data model. Should I write it? – danh May 25 '15 at 19:59
  • hi @Danh. I like the approach of using cloud code as a middleware! I'll explore that. If you don't mind, I'm still interested in how the data model should be changed, maybe it's not too big of a change and I can integrate that. The main concern is that the app is already in production :) – Stephane Maarek May 25 '15 at 20:05
  • @Stephane - glad to help. Since it's a different idea entirely, I added a different answer about a bigger design change. I think the setup described there is more desirable for social apps, since it explicitly separates users' public and private info. – danh May 25 '15 at 20:57

If you're up for changing the data model, there's a solution that gets what you need, plus some side benefits. Consider a system where the User class only pertains to the relationship between the app and a real person. Users' public faces to each other are presented by a new class (call it PublicUser or Persona).

In this PublicUser class, you have a pointer to the user that owns it, and an array of pointers to the other PublicUsers that this one is following. This class also contains the rating attribute. Now the query in the OP is simple:

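  1. Query PublicUser where "following" equals the current user's PublicUser (on an array column, an equality constraint matches rows whose array contains that value).
  2. Order by rating, descending, and limit to 20 or whatever number you wish.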
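
To make the shape of that data model concrete, here is a small in-memory simulation of the two steps. Everything in it is hypothetical (the `personas` array, the ids, and the `leaderboard` helper); on a real Parse backend the filter, ordering, and limit would all be expressed as constraints on a single PublicUser query and evaluated server-side.

```javascript
// Hypothetical in-memory model of the PublicUser class: each persona stores
// the ids of the personas it follows, plus its own rating.
var personas = [
    { id: "p1", owner: "u1", rating: 5,  following: ["p2"] },
    { id: "p2", owner: "u2", rating: 30, following: ["p1", "p3"] },
    { id: "p3", owner: "u3", rating: 18, following: ["p1"] },
    { id: "p4", owner: "u4", rating: 99, following: ["p2"] }
];

// Mirrors the two steps: keep personas whose "following" array contains
// the given persona, then order by rating (descending) and cut to n.
function leaderboard(all, personaId, n) {
    return all
        .filter(function (p) { return p.following.indexOf(personaId) !== -1; })
        .sort(function (a, b) { return b.rating - a.rating; })
        .slice(0, n);
}

var board = leaderboard(personas, "p1", 20);
console.log(board.map(function (p) { return p.id; }));
```

Because the rating lives on the same row that is being filtered, no relational sort is needed, which is exactly what the Activity-based layout could not offer.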

That's it. Another benefit of this scheme is access control. It's understood by the system that anything in PublicUser is readable by other PublicUsers, and everything about _User is kept between that individual person and the app.

danh
  • I really like it - it's not too disruptive of my data model, but it would still require a good amount of change in the way my apps work. It also forces me to implement a much better security system, which I've always been interested in. The only caveat I see here is that a row can only have 128KB of data. An array of 1000 followers might go over that...? Who knows. But this is very extreme, I just thought I'd mention it as a possible limitation. Thanks so much @dahn – Stephane Maarek May 25 '15 at 21:02
  • If there's a good chance that followers count will exceed 1k, then we'd switch the array of pointers to a relation. – danh May 25 '15 at 21:04