
For some reason, this test utilizing InlineData fails in xUnit:

[Theory]
[InlineData("\uD800", 1)]
public static void HasLength(string s, int length)
{
    Assert.Equal(length, s.Length);
}

while this, which uses MemberData, passes:

public static IEnumerable<object[]> HasLength_TestData()
{
    yield return new object[] { "\uD800", 1 };
}

[Theory]
[MemberData(nameof(HasLength_TestData))]
public static void HasLength(string s, int length)
{
    Assert.Equal(length, s.Length);
}

What is the reason for this? Have I discovered a bug in xUnit.net? (I think it may have something to do with the fact that \uD800 is a surrogate character, and it's somehow getting translated to 2 characters when passed through InlineData. Not sure why, though.)

James Ko

2 Answers


No, it's not a bug.

If you want to represent a value greater than U+FFFF in UTF-16, you need to use two UTF-16 code units: a high surrogate (in the range 0xD800 to 0xDBFF) followed by a low surrogate (in the range 0xDC00 to 0xDFFF). So a high surrogate on its own makes no sense. It’s a valid UTF-16 code unit in itself, but it only has meaning when followed by a low surrogate.
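To make the pairing concrete, here is a minimal C# sketch using only the standard `char` helpers. The class name `SurrogateDemo` and the choice of U+1F600 are mine, purely for illustration.

```csharp
using System;

public static class SurrogateDemo
{
    public static void Main()
    {
        // U+1F600 lies above U+FFFF, so UTF-16 needs two code units for it.
        string pair = char.ConvertFromUtf32(0x1F600);
        Console.WriteLine(pair.Length);                     // 2
        Console.WriteLine(char.IsHighSurrogate(pair[0]));   // True
        Console.WriteLine(char.IsLowSurrogate(pair[1]));    // True
        Console.WriteLine(char.IsSurrogatePair(pair, 0));   // True

        // A lone high surrogate is still a single char (one UTF-16 code unit),
        // but nothing follows it to complete the pair.
        string lone = "\uD800";
        Console.WriteLine(lone.Length);                     // 1
        Console.WriteLine(char.IsHighSurrogate(lone[0]));   // True
    }
}
```

So within a .NET string the value from the question really does have length 1; the trouble only starts when it has to be converted to another encoding.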

More info in this article: http://codeblog.jonskeet.uk/2014/11/07/when-is-a-string-not-a-string/

nicolas2008

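Nicolay's answer doesn't directly answer the question, but the article it links to does:

Attribute values are stored in the assembly metadata as UTF-8, and an isolated high surrogate cannot be converted from UTF-16 to UTF-8 successfully. The MemberData version passes because its string literal never goes through an attribute, so it stays UTF-16 all the way to the test method.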

Here's a (VB.NET) LinqPad query confirming the situation.
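For comparison, here is a rough C# sketch of the same kind of round-trip failure, using the ordinary `System.Text` encoders. This is only an analogy: the compiler's attribute serialization is a different code path and may mangle the value differently, and the class name `LoneSurrogateRoundTrip` is made up for the example.

```csharp
using System;
using System.Text;

public static class LoneSurrogateRoundTrip
{
    public static void Main()
    {
        string original = "\uD800"; // isolated high surrogate

        // A strict UTF-8 encoder (throwOnInvalidBytes: true) refuses the input outright.
        var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                      throwOnInvalidBytes: true);
        try
        {
            strict.GetBytes(original);
            Console.WriteLine("encoded (unexpected)");
        }
        catch (EncoderFallbackException)
        {
            Console.WriteLine("strict UTF-8 encoding rejects the lone surrogate");
        }

        // Encoding.UTF8 substitutes a replacement instead of throwing,
        // so the round trip does not give back the original string.
        string roundTripped = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(original));
        Console.WriteLine(roundTripped == original); // False
    }
}
```

Either way, the original string cannot survive a trip through UTF-8, which is why the value the test receives via InlineData no longer matches what was written in the source.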

Mark Hurd
    Ah, that makes much more sense now. So it isn't a bug in xUnit; this is built into the .NET Framework. Thanks for your answer! – James Ko Mar 20 '16 at 15:02