
I am fetching JSON data from a txt file with LINQ and performing some actions on it to make the process fast, but when I call that query it throws an error while deserializing the JSON object. How do I deserialize it?

I am getting an error like:

Cannot deserialize the current JSON object (e.g. {"name":"value"}) into type 'System.Collections.Generic.List`1[MMF.LiveAMData]' because the type requires a JSON array (e.g. [1,2,3]) to deserialize correctly.

I searched for a solution, but almost all answers perform the deserialization without LINQ. I need to use LINQ because of latency requirements.

Below is the method I am calling:

public static void something()
{
    File.ReadLines(filePath)
        .AsParallel()
        .Select(x => x.TrimStart('[').TrimEnd(']'))
        .Select(JsonConvert.DeserializeObject<List<LiveAMData>>)
        .ForAll(WriteRecord);
}

and below is the class I am using:

public class LiveAMData
{
    public string ev { get; set; }
    public string sym { get; set; }
}
Hardik Dhankecha
  • What do you think `.Select(x => x.TrimStart('[').TrimEnd(']'))` is doing? – ProgrammingLlama Feb 15 '19 at 08:03
  • Why do you use such code in the first place? It won't work with valid JSON text. It can only work if one valid JSON value was stored per line. Otherwise each line would contain incomplete JSON strings. JSON-per-line is used in event analytics where there may be hundreds of thousands of events stored in a single file – Panagiotis Kanavos Feb 15 '19 at 08:06
  • Second, you're *explicitly* removing the array brackets, making it impossible to deserialize any array data. Even if the line contained a valid JSON array, it ends up as an invalid JSON string that consists of multiple JSON objects separated by commas – Panagiotis Kanavos Feb 15 '19 at 08:07
  • What does the JSON file look like? – Panagiotis Kanavos Feb 15 '19 at 08:09
  • Why are you trying to deserialize the lines in parallel? How many lines are there? Most of the time is spent in IO, which means your code will end up using a single thread and take *more* time than the single-threaded equivalent. JSON-per-line is used to allow partitioning of the file along lines, with each core/task/thread handling a chunky partition, thus minimizing IO costs – Panagiotis Kanavos Feb 15 '19 at 08:11
    "due to time latency." ? You mean reading / DeserializeObject the Json is slow? Did you stopwatch it? Is your WriteRecord a database insert, could it be the slow part? If it's really a serialisation/Deserialisation performance issue the best thing is to adress that before thinking about parallel. with custom Deserilizer you can shrink the deserialisation time https://www.newtonsoft.com/json/help/html/Performance.htm by one order of magnetude in most case. Because reflexion is slow – xdtTransform Feb 15 '19 at 08:44
  • `I need to use linq due to time latency.` You aren't doing that though; you are *adding* delays by reading the file line by line. Then you have a huge delay by adding records to your target one by one instead of in batches. All databases perform faster when you send a batch of 1000 objects at once instead of making 1000 individual connections, round trips, etc. Never mind the *concurrency conflicts* between 1000 connections vs a single one. – Panagiotis Kanavos Feb 15 '19 at 09:33
  • Most databases have fast bulk-import mechanisms too, which use minimal logging, streaming, etc. to reduce the IO and logging needed when performing bulk operations. The SQL Server, Oracle and MySQL ADO.NET clients provide bulk-copy classes that can take a stream of records and send it to the database in a single bulk operation. An alternative is to store the data in a format recognized by the database's bulk-import utilities and use *that* instead of individual INSERTs – Panagiotis Kanavos Feb 15 '19 at 09:35 (a rough sketch of the batching idea follows these comments)
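
Building on the batching advice in the comments above, here is a rough sketch (not from the original post) that reads the file line by line and collects records into batches of 1000 before writing. It assumes each line is a complete JSON array and that the LiveAMData class from the question is available; WriteBatch is a hypothetical stand-in for whatever bulk-write mechanism the target store offers (for example a bulk copy class or a multi-row INSERT):

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class BatchingSketch
{
    private const int BatchSize = 1000;

    // Assumes each line of the file is a complete JSON array such as
    // [{"ev":"AM","sym":"MSFT"},{"ev":"AM","sym":"AAPL"}]
    public static void LoadInBatches(string filePath)
    {
        var batch = new List<LiveAMData>(BatchSize);

        foreach (var line in File.ReadLines(filePath))
        {
            batch.AddRange(JsonConvert.DeserializeObject<List<LiveAMData>>(line));

            if (batch.Count >= BatchSize)
            {
                WriteBatch(batch);   // hypothetical bulk write
                batch.Clear();
            }
        }

        if (batch.Count > 0)
        {
            WriteBatch(batch);       // flush the remainder
        }
    }

    // Placeholder: swap in the real bulk-write mechanism (bulk copy, batched INSERTs, ...).
    private static void WriteBatch(IReadOnlyCollection<LiveAMData> records) { }
}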

2 Answers


You're trying to deserialize a JSON array, but you're trimming the [ and ] parts so that it's no longer a JSON array. Remove the trim line:

public static void something()
{
    File.ReadLines(filePath)
        .AsParallel()
        .Select(JsonConvert.DeserializeObject<List<LiveAMData>>)
        .ForAll(WriteRecord);
}

If each line of your file is a JSON array like so:

[{"ev":"Test1", "sym": "test"},{"ev":"Test2", "sym": "test"}]

Your trim line will change it to this invalid JSON:

{"ev":"Test1", "sym": "test"},{"ev":"Test2", "sym": "test"}

Which certainly can't be deserialized to a List<LiveAMData>.
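
As a quick sanity check (a standalone sketch, not part of the original answer, assuming Json.NET and the LiveAMData class from the question), the full line deserializes while the trimmed one reproduces the exception from the question:

using System;
using System.Collections.Generic;
using Newtonsoft.Json;

var line = "[{\"ev\":\"Test1\",\"sym\":\"test\"},{\"ev\":\"Test2\",\"sym\":\"test\"}]";

// The untouched line is a JSON array, so this works and yields 2 items.
var ok = JsonConvert.DeserializeObject<List<LiveAMData>>(line);
Console.WriteLine(ok.Count);

// The trimmed line starts with a JSON object instead of an array, so this
// throws the "Cannot deserialize the current JSON object..." error.
try
{
    JsonConvert.DeserializeObject<List<LiveAMData>>(line.TrimStart('[').TrimEnd(']'));
}
catch (JsonSerializationException ex)
{
    Console.WriteLine(ex.Message);
}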

ProgrammingLlama
  • I suspect `WriteRecord` expects single records, which means SelectMany should be used instead of Select – Panagiotis Kanavos Feb 15 '19 at 08:15
  • @Panagiotis Perhaps, but I'll wait for more input from OP before I go changing anything. For all I know, the method is just badly named. OP is getting a runtime exception, after all. – ProgrammingLlama Feb 15 '19 at 08:17
  • Hi @PanagiotisKanavos, when I try to use SelectMany it shows me an error like `cannot convert from method group to Action`. I searched for it but all the solutions were outside of LINQ. – Hardik Dhankecha Feb 15 '19 at 10:30
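
A minimal sketch of what the SelectMany variant discussed above could look like, reusing filePath and WriteRecord from the question. It assumes WriteRecord has the signature void WriteRecord(LiveAMData record); if it instead takes a whole List<LiveAMData>, ForAll(WriteRecord) would produce a "cannot convert from method group" error like the one mentioned in the last comment:

public static void something()
{
    File.ReadLines(filePath)
        .AsParallel()
        // Flatten each deserialized List<LiveAMData> into individual records.
        .SelectMany(line => JsonConvert.DeserializeObject<List<LiveAMData>>(line))
        // Assumes WriteRecord takes a single LiveAMData.
        .ForAll(WriteRecord);
}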

As you are deserializing each object individually, you don't need the List type in the DeserializeObject call. Try this:

public static void something()
{
    File.ReadLines(filePath)
        .AsParallel()
        .Select(x => x.TrimStart('[').TrimEnd(']'))
        .Select(JsonConvert.DeserializeObject<LiveAMData>)
        .ForAll(WriteRecord);
}
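
Note that this only works if each line contains exactly one JSON object once the brackets are trimmed, e.g. [{"ev":"Test1","sym":"test"}] per line (an assumption about the input format). With the two-object lines shown in the other answer, DeserializeObject<LiveAMData> reads the first object and then throws because of the trailing comma-separated content. A tiny standalone check, assuming Json.NET and the LiveAMData class above:

using Newtonsoft.Json;

// Succeeds: the (trimmed) line is a single object.
var single = JsonConvert.DeserializeObject<LiveAMData>("{\"ev\":\"Test1\",\"sym\":\"test\"}");

// Throws: the trimmed line still contains additional comma-separated objects.
// JsonConvert.DeserializeObject<LiveAMData>("{\"ev\":\"Test1\",\"sym\":\"test\"},{\"ev\":\"Test2\",\"sym\":\"test\"}");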
Nimesh Madhavan