1

My question is regarding ID generation for sharded environment. I am following the same steps as instagram did for generating unique ids. I have a few question on the implementation of this id generation in MySQL.

This is how the ID is being generated (This is a PGQL stored procedure.)

CREATE OR REPLACE FUNCTION insta5.next_id(OUT result bigint) AS $$
DECLARE
    our_epoch bigint := 1314220021721;
    seq_id bigint;
    now_millis bigint;
    shard_id int := 5;
BEGIN
    SELECT nextval('insta5.table_id_seq') %% 1024 INTO seq_id;

    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
    result := (now_millis - our_epoch) << 23;
    result := result | (shard_id << 10);
    result := result | (seq_id);
END;
$$ LANGUAGE PLPGSQL;



CREATE TABLE insta5.our_table (
    "id" bigint NOT NULL DEFAULT insta5.next_id(),
    ...rest of table schema...
)

MY question is how can I do that in MySQL (equivalent code). This stored procedure should be called from a query to get the next id.

My other question is regarding querying the shards. From what I understand they use logical shards that are maps to actual servers. If they decide to map a logical shard to a new server, this means that they have to query two servers at the same time to aggregate the results, because before they assigned a new server for the same logical shard, some data was added to the older server. I just wanted to know if there's a better way to use logical shards and having the option to query a single server where the data resides instead of querying all the servers that are and were belong to that logical shard?

Thanks.

Community
  • 1
  • 1
Liron Harel
  • 10,819
  • 26
  • 118
  • 217

6 Answers6

1

By the look of the code it looks like you just need to replicate sequence, you can do it by creating table in MySQL with AUTO_INCREMENT and use it for generating identity numbers.

Farfarak
  • 1,497
  • 1
  • 8
  • 8
  • but how still can't ordered results from several shards if I run AI on every shard start from 1. I want to page the data (ex. 10 last records, page 2 next 10 records and so on). – Liron Harel Oct 22 '12 at 21:36
  • What I meant was to replace this line of code in PL/SQL code: SELECT nextval('insta5.table_id_seq') %% 1024 INTO seq_id;, with Auto_Increment column value from the table. And rewrite the rest of the query in mysql. – Farfarak Oct 23 '12 at 08:10
1

This is postgres equivalent

Specify TABLE_SCHEMA and MASKTABLE below

I am creating a MASKTABLE with a dummy first entry

DELIMITER $$
CREATE OR REPLACE FUNCTION generate_next_id() RETURNS bigint NOT DETERMINISTIC
MAIN: BEGIN
DECLARE our_epoch bigint;
DECLARE seq_id bigint;
DECLARE now_millis bigint;
DECLARE shard_id int;
DECLARE param bigint ;
SET @our_epoch = 1568873367231;
SET @shard_id = 1;
SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA = "SCHEMANAME" AND TABLE_NAME = "MASKTABLE" into @seq_id;
SELECT FLOOR(UNIX_TIMESTAMP()) * 1000 into @now_millis;
SELECT (@now_millis - @our_epoch) << 23 into @param;
SELECT @param | (@shard_id <<10) into @param; 
select @param | (@seq_id) into @param; 
RETURN @param;
END MAIN;$$ 
DELIMITER ;

Usage

select generate_next_id()

You can use it in trigger like

CREATE TRIGGER trigger_name
BEFORE INSERT ON TableName 
FOR EACH ROW
SET new.id = generate_next_id();
Yugandhar Chaudhari
  • 3,831
  • 3
  • 24
  • 40
0

At ScaleBase we found a nice way to provide support for autoincrements, through the variable: SET @@auto_increment_increment=4;.

Just make sure each shard has a starting offset and you're free to go. It's straight, simple, compatible with existing MySQL and stuff.

David Wolever
  • 148,955
  • 89
  • 346
  • 502
Doron Levari
  • 566
  • 4
  • 4
  • Does scalebase can maintain the incremental Id's. I ask that because my URLs are bound to the primary key (website.com/?record=112 , 112 is a primary key to that specific record) – Liron Harel Oct 22 '12 at 23:58
0

Regarding the other question, all data for one logical shard only live on one server in production at a time (without taking replication into account). When they're running out of capacity of their existing servers, they duplicate data in each server to a follower, and when the new server is ready (at this point both the original server and the new server contain all data for a specific logical shard), they start reading all data for half of the shards from the new server, and the other half stays in the original server. See e.g. http://www.craigkerstiens.com/2012/11/30/sharding-your-database/ for illustration.

Petr Bela
  • 8,493
  • 2
  • 33
  • 37
0

I have a question out for the conversion routine I wrote, but it does appear to work!

Can AUTO_INCREMENT be safely used in a BEFORE TRIGGER in MySQL

Once I get the question answered, I'll update this answer as well.

MySQL Version:

CREATE TRIGGER shard_insert BEFORE INSERT ON tablename
FOR EACH ROW BEGIN

DECLARE seq_id BIGINT;
DECLARE now_millis BIGINT;
DECLARE our_epoch BIGINT DEFAULT 1314220021721;
DECLARE shard_id INT DEFAULT 1;

SET now_millis = (SELECT UNIX_TIMESTAMP());
SET seq_id = (SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA = "dbname" AND TABLE_NAME =     "tablename");
SET NEW.id = (SELECT ((now_millis - our_epoch) << 23) | (shard_id << 10) | (SELECT MOD(seq_id, 1024)));
END
Community
  • 1
  • 1
jsidlosky
  • 407
  • 1
  • 5
  • 17
0

Implementing the exact same id generation logic with a MySQL stored procedure is not possible. However, this can be done using a MySQL UDF.

Here is a UDF that recreates instagrams id generator with a few changes. Modifying it to work exactly like the instagram id generator would involve updating the MAX_* constants and the bit shifting logic in next_shard_id and shard_id_to_ms.

Nate
  • 1
  • 1