I have two classes called School and Student, shown below. I want to run two searches: "students whose school name is bla bla bla" and "schools that have at least one student with a grade higher than 90". I have read some of the documentation, but I am still a little confused.

public class School extends BasicDBObject  {
  private int id;
  private String name;
  private String number;
  private List<Student> studentList = new ArrayList<Student>();

  //getter and setters
}

public class Student extends BasicDBObject{
  private int id;
  private String name;
  private String grade;
  private School school;

  //getter and setters
}
hellzone
  • From my little experience I'd say that both require you to write map reduce queries. The first could somehow be solved by first querying the schools and then you have the `Student` lists. – Prinzhorn Jul 07 '13 at 21:04
  • @Prinzhorn So I will query and get the list of Schools, and then for every School id I will search for its Student list? What if I get 1000 Schools whose name is bla bla bla? – hellzone Jul 07 '13 at 21:09
  • There are no joins in MongoDb. However, your query doesn't look like a join is needed if you've got the student details in the School document. Show us the queries you have tried. Search by name, and then search in the student list for a grade > 90. I always find it easiest to build sample docs using the MongoDb console, then build the queries there, and then translate them as needed to the destination programming language. – WiredPrairie Jul 07 '13 at 22:39
  • @WiredPrairie I have nearly 100 classes. I can't build sample docs for every situation. I want to find a general solution for join operation but I think nobody has any idea about join queries in mongodb. – hellzone Jul 08 '13 at 06:39
  • @hellzone I don't know about the Java driver, but with Mongoose (node.js) you'd query the Schools (one roundtrip) and then fill all the lists with another roundtrip (one request for all Students). Doesn't the Java driver do that (fill the `studentList` when querying for `School`?). – Prinzhorn Jul 08 '13 at 08:14
  • @Prinzhorn Of course it doesn't. I don't understand how Mongoose does it. How does it know which fields are related between documents? – hellzone Jul 08 '13 at 08:22
  • @hellzone The `School` documents have a field `studentList`, which contains an array of `ObjectId`s. – Prinzhorn Jul 08 '13 at 09:20
  • @Prinzhorn - not all the drivers support the lazy loading like Mongoose. – WiredPrairie Jul 08 '13 at 10:55
  • If you really have 100 classes, especially with relationships, you may find that you can't efficiently use MongoDb. Without joins, the driver must make one or more queries to each collection to gather data relationships (or it must be done manually) – WiredPrairie Jul 08 '13 at 10:59
  • @WiredPrairie Forget the 100 classes. For the scenario above (Student and School), what is the best entity design? I want to understand: if all entities are designed for MongoDB, then what does MongoDB do differently from relational DBs? – hellzone Jul 08 '13 at 11:17
  • This isn't a tutorial site. If you're asking the basic questions like "what does MongoDB do differently", you should start with the docs, and importantly the FAQ: http://docs.mongodb.org/manual/faq/. – WiredPrairie Jul 08 '13 at 12:11
  • @WiredPrairie Just read my comment carefully. I didn't ask anyone to tell me the difference between MongoDB and relational DBs. My question is what the best entity design is for this problem. I am asking because nobody knows how to query "students whose school name is bla bla bla", and everyone just says to change my entities. – hellzone Jul 08 '13 at 12:31
  • I did read your comment carefully. I was trying to understand and help. "then what MongoDB does different from Relational DBs". But this question is beyond the scope of StackOverflow as written. – WiredPrairie Jul 08 '13 at 12:40

1 Answer

MongoDB is not a relational database. It doesn't support joins. To simulate a join, you have to query the first collection, get the results, and then query the second collection with a large $in query filled with the applicable key values of the documents returned by the first query. This is as slow and ugly as it sounds, so the database schema should be designed to avoid it.
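
The two-step lookup described above can be sketched roughly as follows. This is only an illustration under assumptions: the `School` and `Student` records, the `schoolId` field, and both helper methods are hypothetical, and in-memory lists stand in for the two collections so that the sketch runs without a database. Against a real server, step 2 would be a single query with a filter of the form `{ "schoolId": { "$in": [ ... ] } }`.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class TwoStepJoinSketch {
    // Hypothetical document shapes: each Student references its School by id.
    record School(int id, String name) {}
    record Student(int id, String name, int grade, int schoolId) {}

    // Step 1: "query" the schools collection by name and keep only the keys.
    static Set<Integer> schoolIdsByName(List<School> schools, String name) {
        return schools.stream()
                .filter(s -> s.name().equals(name))
                .map(School::id)
                .collect(Collectors.toSet());
    }

    // Step 2: "query" the students collection with the keys from step 1.
    // This plays the role of the $in filter on the (assumed) schoolId field.
    static List<Student> studentsInSchools(List<Student> students, Set<Integer> schoolIds) {
        return students.stream()
                .filter(st -> schoolIds.contains(st.schoolId()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<School> schools = List.of(
                new School(1, "Springfield High"),
                new School(2, "Shelbyville High"));
        List<Student> students = List.of(
                new Student(10, "Ann", 95, 1),
                new Student(11, "Bob", 70, 1),
                new Student(12, "Cid", 92, 2));

        // Two round trips: first the school keys, then the matching students.
        Set<Integer> ids = schoolIdsByName(schools, "Springfield High");
        System.out.println(studentsInSchools(students, ids));
    }
}
```

The size of the key set from step 1 is what makes this approach expensive: with 1000 matching schools, step 2 carries 1000 ids in its filter.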

For your example, I would add the school name to each Student document. This would let you satisfy both of your use cases with a single query each.
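
A minimal sketch of that denormalized design, with the same caveats as any illustration: the `schoolName` field and both method names are hypothetical, `grade` is assumed to be stored as a number (the question declares it as a `String`, which would not compare numerically), and an in-memory list again stands in for the students collection. The comments show the kind of MongoDB filter each method corresponds to.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class DenormalizedStudentSketch {
    // Hypothetical Student document that carries a copy of its school's name.
    // grade is numeric here so that "higher than 90" is a numeric comparison.
    record Student(int id, String name, int grade, String schoolName) {}

    // Use case 1: students whose school has the given name.
    // Comparable filter on the students collection: { "schoolName": "<name>" }
    static List<Student> studentsBySchoolName(List<Student> students, String schoolName) {
        return students.stream()
                .filter(st -> st.schoolName().equals(schoolName))
                .collect(Collectors.toList());
    }

    // Use case 2: names of schools with at least one student above the threshold.
    // Comparable to a distinct on "schoolName" with filter { "grade": { "$gt": 90 } }.
    static Set<String> schoolsWithGradeAbove(List<Student> students, int threshold) {
        return students.stream()
                .filter(st -> st.grade() > threshold)
                .map(Student::schoolName)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<Student> students = List.of(
                new Student(10, "Ann", 95, "Springfield High"),
                new Student(11, "Bob", 70, "Springfield High"),
                new Student(12, "Cid", 88, "Shelbyville High"));

        System.out.println(studentsBySchoolName(students, "Springfield High"));
        System.out.println(schoolsWithGradeAbove(students, 90));
    }
}
```

Note that the second method yields school names rather than full School documents; fetching the other school fields would still need a lookup in the schools collection, or further copied fields. The copied name also has to be updated in every Student document if a school is ever renamed.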

Anyone coming from a relational database background would now say "But that's a redundancy! You've violated the second normal form!". That's true, but normalization is a design pattern specific to relational databases. It doesn't necessarily apply to document-oriented databases. So what design patterns are applicable for document-oriented databases? Tough call. It's a new technology. We are still figuring this out.

Philipp
  • "For your example, I would add the school names to the Student documents. This would allow to satisfy both of your use-cases with a single query." I don't understand. What if I want to query "students whose school number is ..."? Will I then add school numbers to the Student documents as well? – hellzone Jul 08 '13 at 06:33
  • Exactly the point! In general, for a NoSQL DB you first have to think about which kinds of queries you will perform and then design the schema. This is in contrast to the SQL world, where you design a normalized schema that can serve any kind of query you might think of in the future. – ALoR Jul 08 '13 at 06:57
  • @ALoR Then I will add all Student fields (id, name, grade) to the School document and all School fields (id, name, number) to the Student document. I have nearly 100 classes, and I will add the fields of all the other 99 classes (thousands of fields) to the School document. I think there is a problem here. Every class can have millions of records. – hellzone Jul 08 '13 at 07:57
  • @hellzone Regarding your first comment: I understood your question to mean that you want to search students by school name. When you want to search them by school *id*, you would put the school id into the student documents. Regarding your second comment: Do you actually need queries for all those fields? When you want to allow extensive data analysis in the future but you don't know yet what kind of analysis, you should use a relational database. – Philipp Jul 08 '13 at 08:19
  • @Philipp I think I must put the School id into the Student documents and the Student ids into the School document. Then I will search for schools by name and get their ids, and then I will search for students with these ids. I can do this with a relational DB, so what is the main point of NoSQL? You are saying (relational DB = NoSQL + join). – hellzone Jul 08 '13 at 08:45
  • @hellzone There is no such thing as NoSQL. When you actually mean "what's the main point of MongoDB over relational databases", among its advantages are better clustering, better sharding, support for nested data and that it allows non-homogenous data (storing documents with different fields in the same collection). But as long as you try to reduce MongoDB to the subset of features it has in common with relational databases, you won't become happy with it. – Philipp Jul 08 '13 at 08:59
  • @Philipp What I got from this is that NoSQL just makes things complicated because it has no join operation. If you have 2-3 tables and there is no relation between them, then you can use MongoDB. I think MongoDB is not enough for real-world problems, only for some simple school projects. – hellzone Jul 08 '13 at 10:31
  • @hellzone This is a question&answer website. When you want to discuss opinions, please post on a discussion forum. – Philipp Jul 08 '13 at 12:21