How to model Student/Classes with DynamoDB (NoSQL)

Question

I'm trying to find my way around DynamoDB and NoSQL.

What is the best (right?) approach for modeling a student table and a class table, given that I need a student-is-in-class relationship? I'm taking into account that there is no secondary index available in DynamoDB.

The model needs to answer the following questions:

Which students are in a specific class?

Which classes does a student take?

Thanks

Is this homework? If so, please tag it as such (not trying to be mean, but just checking). — Kiril, Feb 07 '12 at 16:30
It's not homework; I've tried to come up with the easiest constructive question I could think of after spending some time trying to understand NoSQL and non-relational models... — Chen Harel, Feb 07 '12 at 18:06

Niels Christensen · Accepted Answer · 2012-02-07T21:59:34.213

15

A very simple suggestion (without range keys) would be to have two tables, one per query type. This is not unusual in NoSQL databases.

In your case we'd have:

A table Student with attribute StudentId as (hash type) primary key. Each item might then have an attribute named Attends, whose value is a list of the Ids of the classes the student attends.
A table Class with attribute ClassId as (hash type) primary key. Each item might then have an attribute named AttendedBy, whose value is a list of the Ids of the students attending the class.

Performing your queries would be simple. Updating the database with one "attends"-relationship between a student and a class requires two separate writes, one to each table.
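To make the two-table design concrete, here is a minimal sketch in Python. It does not call DynamoDB at all: plain dictionaries stand in for the Student and Class tables, with the dictionary key playing the role of the hash key. The function names (enroll, classes_of, students_in) and the sample Ids are hypothetical, chosen only for illustration.

```python
# In-memory stand-ins for the two tables (NOT real DynamoDB calls).
# The dict key plays the role of the hash-type primary key.
student_table = {}  # StudentId -> {"Attends": set of ClassIds}
class_table = {}    # ClassId   -> {"AttendedBy": set of StudentIds}


def enroll(student_id, class_id):
    """Record one "attends" relationship: one write to each table."""
    student_item = student_table.setdefault(student_id, {"Attends": set()})
    student_item["Attends"].add(class_id)
    class_item = class_table.setdefault(class_id, {"AttendedBy": set()})
    class_item["AttendedBy"].add(student_id)


def classes_of(student_id):
    """Which classes does a student take? One lookup by StudentId."""
    return student_table.get(student_id, {}).get("Attends", set())


def students_in(class_id):
    """Which students are in a class? One lookup by ClassId."""
    return class_table.get(class_id, {}).get("AttendedBy", set())


enroll("alice", "math101")
enroll("alice", "cs50")
enroll("bob", "math101")
print(sorted(classes_of("alice")))
print(sorted(students_in("math101")))
```

Each question is answered by a single key lookup, at the cost of keeping the two tables in step: if only one of the two writes in enroll succeeds, the tables disagree, so a real implementation would need to handle that case.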

Another design would have one table Attends with a hash and range primary key. Each item would represent the attendance of one student in one class. The hash key could be the Id of the class and the range key could be the Id of the student. Supplementary data on the class and the student would then reside in other tables.
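The single-table variant can be sketched the same way. Again this is an in-memory stand-in rather than DynamoDB code: a dictionary keyed by ClassId (the hash key) holds the set of StudentIds (the range keys). The helper names put_attendance, query_students and scan_classes are assumptions made for this illustration.

```python
from collections import defaultdict

# In-memory stand-in for the Attends table (NOT real DynamoDB calls).
# Hash key: ClassId. Range key: StudentId. One (class, student) pair per item.
attends = defaultdict(set)


def put_attendance(class_id, student_id):
    """Record that one student attends one class: a single write."""
    attends[class_id].add(student_id)


def query_students(class_id):
    """Which students are in a class? Only that hash key's items are read."""
    return sorted(attends.get(class_id, ()))


def scan_classes(student_id):
    """Which classes does a student take? Every item has to be examined."""
    return sorted(c for c, students in attends.items() if student_id in students)


put_attendance("math101", "alice")
put_attendance("math101", "bob")
put_attendance("cs50", "alice")
print(query_students("math101"))
print(scan_classes("alice"))
```

In this sketch a relationship needs only one write, and listing the students of a class stays a lookup on the hash key. Listing the classes of a student, however, walks the whole table. If that query matters as much as the first, a mirrored table with StudentId as hash key and ClassId as range key would serve it, which brings back the two writes of the first design.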

edited Feb 07 '12 at 21:59

answered Feb 07 '12 at 21:49


Since (to my understanding) I am limited to one hash key in DynamoDB (no column indexes), it must be the username and not some arbitrary userId, because I would like to get a student's classes by providing his name to a "query". – Chen Harel Feb 08 '12 at 09:16
Yes, if you have a unique username for each student, that's the right identifier to use. – Niels Christensen Feb 08 '12 at 12:29
Is it considered de-normalization? – Chen Harel Feb 08 '12 at 13:18
Which "it"? In general, normalization and denormalization tend to be used in the context of relational databases. – Niels Christensen Feb 09 '12 at 22:31

score -3 · Answer 2 · answered Apr 25 '12 at 20:09

To join two Amazon DynamoDB tables

The following example maps two Hive tables to data stored in Amazon DynamoDB. It then calls a join across those two tables. The join is computed on the cluster and returned. The join does not take place in Amazon DynamoDB. This example returns a list of customers and their purchases for customers that have placed more than two orders.

CREATE EXTERNAL TABLE hive_purchases(customerId bigint, total_cost double, items_purchased array<String>) 
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Purchases",
"dynamodb.column.mapping" = "customerId:CustomerId,total_cost:Cost,items_purchased:Items");

CREATE EXTERNAL TABLE hive_customers(customerId bigint, customerName string, customerAddress array<String>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Customers",
"dynamodb.column.mapping" = "customerId:CustomerId,customerName:Name,customerAddress:Address");

SELECT c.customerId, c.customerName, count(*) as count from hive_customers c
JOIN hive_purchases p ON c.customerId=p.customerId
GROUP BY c.customerId, c.customerName HAVING count > 2;

This is not DynamoDB, and does not answer the question about data modeling — Kyeotic, Mar 31 '17 at 16:12
