
Sorry if this is off topic - but it is certainly programming related.

I need to test my web application at scale (both concurrent users and the amount of data in the system). For the latter, I need some way of generating dummy data for a variety of types (name, address, email and some other data types).

Are there any open source (free) or commercial providers of dummy data dictionaries, in any format but preferably MySQL? I don't really need a whole application - just the data.

How have others solved this problem?

edit: Sorry if I wasn't clear. I don't need a way to code this - I just need the dummy data(base) files to provide the raw information. I don't want nonsense data (like randomly generated characters) because this won't allow us to perform usability tests or demonstrations. If this isn't available in open source - does anyone know why not?

edit 2: I've seen generatedata.com, but the database that backs the application is too small. I need to test around 100,000 users, and I need data types that are not supported by that application. Even just an English dictionary in database form would be useful.

calumbrodie

2 Answers

1

This website offers a lot of free data for testing purposes: www.fakenamegenerator.com

Kevin Labécot
  • Thanks. What I need is the database file that backs this web application (or similar). There is also http://www.generatedata.com/, but the amount of data in the included database is too small. – calumbrodie Aug 25 '11 at 09:45
  • What's the problem? You can "order" up to 50,000 entries at a time on this website. If you need more, simply order again. It's free. http://fr.fakenamegenerator.com/order.php – Kevin Labécot Aug 25 '11 at 09:47
  • Nice - didn't see that. That's very awesome. Now I just need some for the rest of my data (which, given that it's so different, could easily come from a regular English dictionary). It would be best if this site provided the database that it runs off (and it would save their server as well) - but I'm not complaining. Thanks again. – calumbrodie Aug 25 '11 at 09:58
  • Annnnnnddd I'm an idiot - there's a link to download 1000000 users on the thank you page when creating an order. That will do nicely :-) – calumbrodie Aug 25 '11 at 10:06
0

Could you just write a simple script to generate the required data randomly? I would use Python, but you could do it in practically any language.

Something along the lines of this pseudocode should do the trick:

for i in range(0, 100000):
    name = randomName()
    email = randomEmail()
    insertIntoSomeTable(name, email)

Here randomName generates a random name, randomEmail generates a random email, and insertIntoSomeTable takes the randomly generated data and inserts it into one of your tables. These functions should be trivial to implement.

Repeat for all of the tables you need random data for.
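
As a rough illustration, the pseudocode above could be fleshed out like this. It is only a sketch under assumptions: the `FIRST_NAMES` and `LAST_NAMES` lists, the `users` table, and the `example.com` addresses are all made up for the example, and SQLite stands in for whatever database the application really uses. Real name lists would need to be far larger to give realistic variety.

```python
import random
import sqlite3

# Hypothetical seed lists; a real run would load much larger lists from a file.
FIRST_NAMES = ["Alice", "Bob", "Carol", "David", "Erin", "Frank"]
LAST_NAMES = ["Smith", "Jones", "Brown", "Taylor", "Wilson", "Evans"]


def random_name(rng):
    """Combine a random first and last name into a plausible full name."""
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


def random_email(name, index):
    """Derive an address from the name; the index keeps addresses unique."""
    local = name.lower().replace(" ", ".")
    return f"{local}{index}@example.com"


def populate(conn, count, seed=0):
    """Create an assumed `users` table and fill it with `count` generated rows."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, email TEXT)")
    rows = []
    for i in range(count):
        name = random_name(rng)
        rows.append((name, random_email(name, i)))
    # executemany inserts the whole batch in one call instead of row by row
    conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
populate(conn, 100_000)
```

Drawing from word lists rather than random characters keeps the rows readable, which matters if the same data set is also used for demonstrations.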

Spycho
  • It's not writing the code - I can do that. It's the data itself that I need. I need something to provide the results for randomName() and randomEmail(). Ideally this would run a 'SELECT RAND() FROM derp' type query against an SQL table. I don't have the SQL table. – calumbrodie Aug 25 '11 at 09:38
  • What I mean is, can you just randomly create the data? e.g. randomly generate a series of alphabetic characters for a name? – Spycho Aug 25 '11 at 09:40
  • I may need to do that if I can't find a better solution. Ideally I would use the same application state to test at scale and to test usability and other functional testing. To do this I need data that isn't nonsense. – calumbrodie Aug 25 '11 at 09:43
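
The "dictionary in database form" idea from the comments might look something like the sketch below, once a word list is available. Everything here is an assumption for illustration: the `words` table, the tiny `WORDS` sample, and the use of SQLite rather than MySQL. In MySQL the equivalent query would use `ORDER BY RAND() LIMIT 1`.

```python
import sqlite3

# Hypothetical sample; a real run might read a word list file such as
# /usr/share/dict/words (present on many Unix systems, but not all).
WORDS = ["apple", "bridge", "candle", "garden", "harbor", "window"]


def build_dictionary(conn, words):
    """Load the words into an assumed `words` table, skipping duplicates."""
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        [(w,) for w in words],
    )
    conn.commit()


def random_word(conn):
    """Return one word picked at random by the database."""
    # SQLite spells the function RANDOM(); this sorts the whole table,
    # which is fine for a sketch but slow on very large tables.
    row = conn.execute(
        "SELECT word FROM words ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
build_dictionary(conn, WORDS)
print(random_word(conn))
```

A helper like `random_word` could then back functions such as randomName() or randomEmail() with real words instead of random characters.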