I have an HttpClient that fetches arbitrary JSON data from an endpoint. I previously used Newtonsoft.Json for this without trouble, but after migrating all of the functions to System.Text.Json (STJ), I started to notice improper parsing.

Platforms tested: macOS & Linux (Google Kubernetes Engine)

Framework: .NET Core 3.1 LTS

The code below calls an API that returns a JSON array. I simply stream it, load it into a JsonDocument, and then attempt to peek into it. Nothing comes out as expected. The code is provided along with the variable values seen while step debugging.

using System;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Web;
using System.Xml;

namespace HttpCallDemo
{
    class Program
    {
        static async Task Main(string[] args)
        {
            using (var httpClient = new HttpClient())
            {
                // FLUSH
                httpClient.DefaultRequestHeaders.Clear();
                httpClient.MaxResponseContentBufferSize = 4096;
                string body = string.Empty, customMediaType = string.Empty; // For POST/PUT

                // Setup the url
                var uri = new UriBuilder("https://api-pub.bitfinex.com/v2/tickers?symbols=ALL");
                uri.Port = -1;

                // Pull in the payload
                var requestPayload = new HttpRequestMessage(HttpMethod.Get, uri.ToString());
                HttpResponseMessage responsePayload;

                responsePayload = await httpClient.SendAsync(requestPayload,
                    HttpCompletionOption.ResponseHeadersRead);

                var byteArr = await responsePayload.Content.ReadAsByteArrayAsync();
                if (byteArr.LongCount() > 4194304) // 4MB
                    return; // Too big.

                // Pull the content
                var contentFromBytes = Encoding.Default.GetString(byteArr);
                JsonDocument payload;

                switch (responsePayload.StatusCode)
                {
                    case HttpStatusCode.OK:
                        // Return the payload distinctively
                        payload = JsonDocument.Parse(contentFromBytes);

#if DEBUG
                        var testJsonRes = Encoding.UTF8.GetString(
                            Utf8Json.JsonSerializer.Serialize(payload.RootElement));
                        // var testRawRes = contentStream.read
                        var testJsonResEl = payload.RootElement.GetRawText();
#endif
                        break;
                    default:
                        throw new InvalidDataException("Invalid HTTP response.");
                }
            }
        }
    }
}

Simply run the minimal code above and notice that the payload differs from the original after parsing. I suspect something is wrong with the options for STJ; it seems we have to tune or explicitly define its limits to allow it to process that JSON payload.

[Screenshot: initial code from HttpClient]

Diving deeper into the debug content made things even weirder. When the HttpClient obtains the payload and reads it into a string, I get the entire JSON string as is. However, once I parse it into a JsonDocument and then invoke RootElement.Clone(), I end up with a JsonElement that holds far less data and carries an invalid JSON structure (below).

ValueKind = Array : "[["tBTCUSD",11418,70.31212518,11419,161.93475693,258.02141213,0.0231,11418,2980.0289306,11438,11003],["tLTCUSD",58.919,2236.00823543,58.95,2884.6718013699997,1.258,0.0218,58.998,63147.48344762,59.261,56.334],["tLTCBTC",0.0051609,962.80334198,0.005166,1170.07399991,-0.000012,-0.0023,0.0051609,4178.13148459,0.0051852,0.0051],["tETHUSD",396.54,336.52151165,396.55,384.37623341,8.26964946,0.0213,396.50930256,69499.5382821,397.77,380.5],["tETHBTC",0.034731,166.67781664000003,0.034751,356.03450125999996,-0.000054,-0.0016,0.034747,5855.04978836,0.035109,0.0343],["tETCBTC",0.00063087,15536.813429530002,0.00063197,16238.600279749999,-0.00000838,-0.0131,0.00063085,73137.62192801,0.00064135,0.00062819],["tETCUSD",7.2059,9527.40221867,7.2176,8805.54677899,0.0517,0.0072,7.2203,49618.78868196,7.2263,7],["tRRTUSD",0.057476,33577.52064154,0.058614,20946.501210000002,0.023114,0.6511,0.058614,210741.23592011,0.06443,0.0355],["tZECUSD",88.131,821.28048322,88.332,880.37484662,5.925,0.0

And of course, attempting to read its contents would result in:

System.InvalidOperationException: Operation is not valid due to the current state of the object.
   at System.Text.Json.JsonElement.get_Item(Int32 index)
   at Nozomi.Preprocessing.Abstracts.BaseProcessingService`1.ProcessIdentifier(JsonElement jsonDoc, String identifier) in /Users/nicholaschen/Projects/nozomi/Nozomi.Infra.Preprocessing/Abstracts/BaseProcessingService.cs:line 255

Here's proof that a proper 38 KB of data is coming in from the endpoint.

[Screenshot: size of the response from the endpoint]

UPDATE

Further testing with this

if (payload.RootElement.ValueKind.Equals(JsonValueKind.Array))
{
    string testJsonArr;
    testJsonArr = Encoding.UTF8.GetString(
        Utf8Json.JsonSerializer.Serialize(
            payload.RootElement.EnumerateArray()));
}

shows that a larger array of arrays (more than 9 elements, each with 11 elements) results in an incomplete JSON structure, causing the issue I'm facing.

Nicholas
  • Might you please [edit] your question to include your code and JSON as **text** rather than as a screenshot? It's requested here not to use images for this purpose, see [*Discourage screenshots of code and/or errors*](https://meta.stackoverflow.com/a/307500) and [*Why not upload images of code on SO when asking a question*](https://meta.stackoverflow.com/a/285557) for why. – dbc Aug 05 '20 at 15:15
  • @dbc roger, will add it once I’m at my desk. The reason for screenshots are for step debug variable output reasons – Nicholas Aug 05 '20 at 15:19
  • Your code doesn't call `JsonDocument.ParseAsync()` though? Can you share a [mcve]? – dbc Aug 05 '20 at 17:04
  • @dbc roger will work on it in a moment. I changed it to non asynchronous to see if it makes a difference, made too many attempts debugging till I tried that for one last try – Nicholas Aug 05 '20 at 17:05
  • OK. Did changing to non-async parsing fix the problem? Also, are you using .Net Core 3.1, or a .Net 5 preview? – dbc Aug 05 '20 at 17:12
  • @dbc Updated. No it did not, that was my "hopeless" last attempt before coming here haha. – Nicholas Aug 05 '20 at 17:30
  • Are you using [`Utf8Json`](https://github.com/neuecc/Utf8Json) to serialize a `JsonElement` deserialized via `System.Text.Json`? I can't see why that would work, you need to use `System.Text.Json` itself to re-serialize a `JsonElement`. If I do, there's no problem. See https://dotnetfiddle.net/nWBiuH – dbc Aug 06 '20 at 07:11
  • Nope i wasn't, I attempted that just to be sure its not plausible. Following your fiddle, I'm very baffled.. I can't get my flow to work still – Nicholas Aug 06 '20 at 07:20
  • I have validated your code. I think ```var testJsonRes2 = JsonSerializer.Serialize(payload.RootElement.EnumerateArray());``` If you peek into the RootElement of ```payload``` without serializing it, you still won't be able to obtain everything. – Nicholas Aug 06 '20 at 07:31

1 Answer

For those who are working with JsonDocument and JsonElement, take note that the values shown for them while step debugging are not accurate. Don't rely on inspecting these variables in the debugger at runtime, because large values are not displayed in full.

@dbc has shown that re-serializing the deserialized data with System.Text.Json produces the complete dataset. I strongly suggest wrapping any serializer calls used for debugging in a DEBUG preprocessor block so that these redundant lines are not executed outside of development.
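A minimal sketch of that debug-only check, assuming a small inline array in place of the real response. The `DebugJson` class and its `Dump` method are hypothetical names used only for illustration; the re-serialization itself goes through `System.Text.Json.JsonSerializer`, as in @dbc's fiddle.

```csharp
using System;
using System.Text.Json;

static class DebugJson
{
    // Hypothetical helper: re-serializes a JsonElement with System.Text.Json
    // so the complete value can be read as a plain string.
    public static string Dump(JsonElement element)
    {
        return JsonSerializer.Serialize(element);
    }
}

class Program
{
    static void Main()
    {
        // Assumed sample data standing in for the real API response.
        string json = "[[\"tBTCUSD\",11418,70.31],[\"tLTCUSD\",58.919,2236.0]]";

        using (JsonDocument payload = JsonDocument.Parse(json))
        {
#if DEBUG
            // Compiled only in DEBUG builds, so it never runs in production.
            string dump = DebugJson.Dump(payload.RootElement);
            Console.WriteLine(dump.Length);
            Console.WriteLine(dump);
#endif
        }
    }
}
```

Reading the resulting string (or its length) gives a view of the data that does not depend on how the debugger renders a JsonElement.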

To interact with these types, call Clone() on any JsonElement you need to keep, so that it remains valid after its JsonDocument is disposed. When step debugging, access the RootElement and traverse into the individual elements before viewing their values, because large values will not be displayed in full.
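As a rough sketch of that approach, assuming a small inline array in place of the real response: the `JsonInspection` class and `ParseAndDetach` method are hypothetical names, while `Clone()`, `GetArrayLength()`, `EnumerateArray()` and the indexer are existing `JsonElement` members.

```csharp
using System;
using System.Text.Json;

static class JsonInspection
{
    // Hypothetical helper: parses the text and returns a cloned root element,
    // which stays usable after the JsonDocument has been disposed.
    public static JsonElement ParseAndDetach(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            return document.RootElement.Clone();
        }
    }
}

class Program
{
    static void Main()
    {
        // Assumed sample data standing in for the real API response.
        string json = "[[\"tBTCUSD\",11418,70.31],[\"tLTCUSD\",58.919,2236.0]]";
        JsonElement root = JsonInspection.ParseAndDetach(json);

        // Check the kind before indexing; the indexer throws
        // InvalidOperationException when the element is not an array.
        if (root.ValueKind != JsonValueKind.Array)
            return;

        Console.WriteLine(root.GetArrayLength());

        // Traverse into each row and read small values individually instead
        // of relying on the debugger's rendering of the whole array.
        foreach (JsonElement row in root.EnumerateArray())
        {
            Console.WriteLine(row[0].GetString() + " " + row[1].GetDouble());
        }
    }
}
```

Counting elements with `GetArrayLength()` is a quick way to confirm that the whole payload was parsed, whatever the debugger shows for the root element.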

Nicholas