I have a few sample databases, but they are quite small. I want to create a database holding up to 50 GB of records. I want to learn many aspects of the database and also test our application's performance against it.

How can I gather random data?

RPK

3 Answers

Just script a lot of appropriate inserts with randomised data.

I can't be any more specific as you tell us nothing about your database.
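
As a rough sketch of what such a script might look like, assuming Python and SQLite purely for illustration: the `customers` table, its columns, and the `random_row` and `populate` helpers are all made up here, since the question doesn't describe the schema. Inserting in batches keeps memory use flat however many rows you ask for.

```python
import random
import sqlite3
import string


def random_row(rng):
    """Build one fake (name, email, balance) tuple. The columns are hypothetical."""
    name = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(5, 12)))
    email = f"{name}@example.com"
    balance = round(rng.uniform(0, 10_000), 2)
    return name, email, balance


def populate(conn, n_rows, batch_size=10_000, seed=None):
    """Insert n_rows random rows in batches, committing after each batch."""
    rng = random.Random(seed)  # pass a seed to get a repeatable data set
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT, balance REAL)"
    )
    remaining = n_rows
    while remaining > 0:
        size = min(batch_size, remaining)
        rows = [random_row(rng) for _ in range(size)]
        conn.executemany(
            "INSERT INTO customers (name, email, balance) VALUES (?, ?, ?)",
            rows,
        )
        conn.commit()
        remaining -= size


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path for a real test database
    populate(conn, 100_000, seed=42)
    print(conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0])
```

To approach a target size such as 50 GB, you would raise `n_rows` (or loop until the database file is large enough) and swap in your own table definition and value generators.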

Chopper3

You can use a "Faker" API such as this one: http://faker.rubyforge.org/

Or, if you want real data, you can gather it from the internet. Some months ago I needed real data for tests, so I wrote an IRC bot that logged all the messages on the top 10 Freenode channels and let it run 24/7 for many weeks. That gave me a lot of data (~1 million rows) :)

Kedare

A number of sources on the internet allow you to download their content, and the dumps often reach tens or hundreds of GB. Two that I can think of off the top of my head are:

Though these dumps are in XML, they can be imported easily into an empty database by any modern RDBMS. A number of other sites offer dumps as well, especially wiki-ish sites (like both of these examples).
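
If your RDBMS's own XML import tooling isn't convenient, a streaming import can also be scripted. The following is only a sketch in Python with SQLite: it assumes a dump made of `<row .../>` elements directly under the root, each with `Id` and `Body` attributes. Those names, the `posts` table, and the `load_xml_rows` helper are hypothetical, so check the actual layout of the dump you download. Parsing incrementally avoids loading a multi-GB file into memory.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_xml_rows(source, conn, tag="row", batch_size=5_000):
    """Stream <row Id=... Body=.../> elements from source into a posts table.

    Returns the number of rows inserted. Tag and attribute names are assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, body TEXT)"
    )
    insert_sql = "INSERT INTO posts (id, body) VALUES (?, ?)"
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element
    batch, total = [], 0
    for event, elem in context:
        if event != "end" or elem.tag != tag:
            continue
        batch.append((int(elem.get("Id")), elem.get("Body")))
        root.clear()  # discard rows already parsed so memory stays bounded
        if len(batch) >= batch_size:
            conn.executemany(insert_sql, batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        conn.executemany(insert_sql, batch)
        conn.commit()
        total += len(batch)
    return total


if __name__ == "__main__":
    sample = io.BytesIO(
        b'<posts><row Id="1" Body="first"/><row Id="2" Body="second"/></posts>'
    )
    conn = sqlite3.connect(":memory:")
    print(load_xml_rows(sample, conn))
```

For a real dump you would pass the path of the downloaded file instead of the in-memory sample and adjust the column list to match the attributes it actually contains.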

SqlRyan