2

When creating a Gmail account, we are asked to enter a username. After entering the username and password and clicking the Next button, within a couple of seconds we get an error like "That username is taken. Try another." There are billions of Gmail accounts. My question is: what algorithm does Google use to determine whether a username is already taken, and how does it respond within 1-2 seconds?


Emma
Aman Shekhar
  • 1
    @Emma, Hi, I have gone through Bloom Filters, but it also says that it is not 100% effective. – Aman Shekhar Jun 09 '20 at 04:22
  • A Bloom filter might prevent you from registering a name that's available, but it definitely won't let you register a name that's already taken. So it's fine for checking gmail names. – user3386109 Jun 09 '20 at 04:28
  • 3
    On a single PC with a 4TB drive, you could store about 10 billion email addresses with password checking data and account IDs in a perfectly ordinary database like Postgres or Oracle, and that DB could check any given email address in about 50ms if it's a spinning magnetic disk or much faster if it's an SSD. It would use a B+tree index or similar to accomplish this: https://en.wikipedia.org/wiki/B%2B_tree – Matt Timmermans Jun 09 '20 at 04:36
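To illustrate the Bloom filter property mentioned in the comments above (false positives are possible, false negatives are not), here is a minimal Python sketch. The class name `BloomFilter`, the sizes `num_bits` and `num_hashes`, and the use of SHA-256 with double hashing are all assumptions chosen for illustration; nothing here reflects what Google actually runs.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k derived hash positions (illustrative only)."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        # num_bits and num_hashes are arbitrary illustrative values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


taken = BloomFilter()
for name in ("alice", "bob", "carol"):
    taken.add(name)
```

A name that was added always reports `True` from `might_contain`, so a taken username can never slip through. A name that was never added usually reports `False`, but may occasionally report `True`, which is why a real system would likely confirm a positive against the authoritative store.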

2 Answers

0

Gmail probably sends a query to its server. If the lookup succeeds, the username already exists; otherwise the server returns a "not found" result, meaning the name is still available.

You can read more here.
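As a rough sketch of what such a server-side query could look like, the snippet below uses Python's built-in `sqlite3` module. The table name `accounts`, the column `username`, and the helper `is_taken` are hypothetical, and lowercasing the name before lookup is an assumption. Declaring the column as `PRIMARY KEY` makes SQLite maintain a B-tree index on it, so the lookup does not scan the whole table.

```python
import sqlite3

# In-memory database standing in for the real account store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (username TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO accounts (username) VALUES (?)",
    [("alice",), ("bob",)],
)


def is_taken(conn: sqlite3.Connection, username: str) -> bool:
    """Return True if the username already exists (hypothetical helper)."""
    row = conn.execute(
        "SELECT 1 FROM accounts WHERE username = ? LIMIT 1",
        (username.lower(),),  # assumed normalization: case-insensitive names
    ).fetchone()
    return row is not None
```

Here `is_taken(conn, "alice")` finds a row and `is_taken(conn, "dave")` does not, which maps onto the "taken" versus "available" outcomes described above.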

Emma
πΛσ
0

If you inspect the sign-up page, you will see that it sends a request to Google's server for each username you enter. Behind the scenes they could have implemented a Bloom filter or anything else, but the check is not done on the client side: the client always sends a request.

Once the request reaches their back end, it is not difficult to cache all existing user email IDs and query that cache for that one value.

Those back-end servers and their corresponding caches could be geographically distributed, so that the response gets back to the user within a few milliseconds.
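A minimal sketch of the caching idea, assuming the cache is just an in-memory hash set on one back-end node. The class `UsernameCache`, its methods, and the trim-and-lowercase normalization are hypothetical; a real deployment would presumably use a distributed cache and an authoritative database behind it.

```python
class UsernameCache:
    """In-memory set of existing usernames (illustrative stand-in for a cache)."""

    def __init__(self, existing):
        self._names = {self._normalize(name) for name in existing}

    @staticmethod
    def _normalize(name: str) -> str:
        # Assumed normalization: trim whitespace and ignore case.
        return name.strip().lower()

    def is_taken(self, name: str) -> bool:
        # Average-case O(1) hash lookup, regardless of how many names are cached.
        return self._normalize(name) in self._names

    def register(self, name: str) -> bool:
        # Returns False if the name was already present, True if it was added.
        key = self._normalize(name)
        if key in self._names:
            return False
        self._names.add(key)
        return True


cache = UsernameCache(["alice", "bob"])
```

The hash-set lookup takes roughly constant time however many names are cached, so on this design the response time would be dominated by the network round trip rather than the search itself.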

Aman Shekhar
Sandeep Kaul