I've been using ScalaCheck for automatic unit testing. Its default String generator (i.e., its default Arbitrary[String] instance) is a little too powerful, generally producing an unreadable jumble made up mainly of characters I'm not trying to support and my system can't even render.

I've set out to create some more Arbitrary[String] instances, and am trying to find out what's out there. Here are some examples of String classes that would be helpful for testing:

  • basic multilingual plane strings
  • astral strings
  • latinate strings (including extensions a/b)
  • French words
  • left-to-right language strings
  • right-to-left language strings
  • Chinese sentences
  • "web strings" (strings drawn from a character set that constitutes 99.9999% of web content)
  • use your imagination ...

Are there libraries out there that can make these, or similar strings, at random?

Yuvi Masory

2 Answers

I'd take a different approach. Most of your examples map to blocks (or ranges of blocks) of characters in Unicode; see http://www.fileformat.info/info/unicode/block/index.htm for the list. Just pick the blocks you like, and then generate random strings that are limited to those ranges.

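import java.util.Random;

// firstCharInRange is the first code unit of the chosen block; numCharsInRange is the block's size.
static String randomString(int count, int firstCharInRange, int numCharsInRange) {
  StringBuilder out = new StringBuilder();
  Random rand = new Random(0); // fixed seed makes the output reproducible
  for (int i = 0; i < count; i++) {
    // nextInt returns an int, so the sum must be cast back to char
    char ch = (char) (rand.nextInt(numCharsInRange) + firstCharInRange);
    out.append(ch);
  }
  return out.toString();
}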
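
One limitation of the loop above is that a Java char holds a single UTF-16 code unit, so it cannot reach the astral planes the question asks about. A minimal sketch of the same idea on whole code points follows. The class name CodePointStrings and its method signature are made up for illustration, and the sketch assumes the chosen range contains at least one assigned, non-surrogate code point (otherwise the loop would never finish).

```java
import java.util.Random;

// Hypothetical helper, not part of any library.
final class CodePointStrings {
  // Returns a string of `count` random code points drawn from [first, last], inclusive.
  static String random(Random rand, int count, int first, int last) {
    StringBuilder out = new StringBuilder();
    int appended = 0;
    while (appended < count) {
      int cp = first + rand.nextInt(last - first + 1);
      boolean surrogate = cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE;
      // Skip lone surrogates and code points this JVM's Unicode tables leave unassigned.
      if (surrogate || !Character.isDefined(cp)) {
        continue;
      }
      // appendCodePoint writes a surrogate pair when cp lies above U+FFFF.
      out.appendCodePoint(cp);
      appended++;
    }
    return out.toString();
  }
}
```

For example, CodePointStrings.random(new Random(0), 20, 0x1D400, 0x1D7FF) draws 20 code points from the Mathematical Alphanumeric Symbols block, which lies entirely outside the basic multilingual plane, so the resulting String has 40 chars.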

Another approach would be to grab random snippets of pre-composed text from different languages. You can find some at http://www.unicode.org/standard/WhatIsUnicode.html by looking at the translations.
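
The snippet idea could be sketched roughly as follows, assuming you have already copied some sample sentences into a non-empty array of non-empty strings. SnippetStrings and randomSnippet are hypothetical names, and the samples are whatever text you supply. The cut points are computed in code points so that a surrogate pair is never split.

```java
import java.util.Random;

// Hypothetical helper, not part of any library.
final class SnippetStrings {
  // Picks one sample at random and returns a random non-empty slice of it.
  static String randomSnippet(Random rand, String[] samples) {
    String text = samples[rand.nextInt(samples.length)];
    int cpCount = text.codePointCount(0, text.length());
    int from = rand.nextInt(cpCount);            // index of the first code point to keep
    int len = 1 + rand.nextInt(cpCount - from);  // keep at least one code point
    // Convert code point positions to char offsets before slicing.
    int begin = text.offsetByCodePoints(0, from);
    int end = text.offsetByCodePoints(begin, len);
    return text.substring(begin, end);
  }
}
```

Because the slices come from real text, they keep realistic character frequencies and word shapes, which uniform sampling from a block does not.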

ccleve

Try

RandomStringUtils.random(10, true, false)

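The parameters are, in order: int count (the length of the string), boolean letters, and boolean numbers. The call above therefore produces 10 random letters and no digits.

This class is part of Apache Commons Lang, so you will need to import org.apache.commons.lang.RandomStringUtils;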

shaunw