-2

My project is to take large amounts of logs, output inside text files and parse some of the data to be made into Excel format.

There is a lot of garbage data in between not needed

This is how one portion of it is like:

2018-05-17 07:16:57.105>>>>>>
{"status":"success", "statusCode":"0", "statusDesc":"Message Processed Sucessfully", "messageNumber":"451", "payload":{"messageCode":"SORTRESPONSE","Id":"5L","Id":"28032","messageTimestamp":"2018-05-16 23:16:55"}}

I will first need to take the time stamp befor the "{}" as it differs from the messageTimestamp

When generating the excel workbook This is how it will look like in Excel:

------A-----------------------------------B--------------C

1. Overall time stamp ---------- status------- statusCode 

2. 2018-05-17 07:16:57.105   - success --- -0

And so on...

payload has its own section of logs within its "{}"

so its section in excel will look like this:

  F

1. payload
2. {"messageCode":"SORTRESPONSE","Id":"5L","Id":"28032","messageTimestamp":"2018-05-16 23:16:55"}

its content can be in one section that's not an issue.

A friend of mine have done something similar but it can take a few minutes to even generate even one relatively small excel document

My Question:

What is the most optimal way can I parse the data needed to then store it in an array or multidimensional array

to then push it into an excel document.

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
Antonio
  • 3
  • 4

1 Answers1

0

I would try to split the input text on newline characters, then parse the JSON part with Newtonsoft.Json.

I would highly advise to not try to parse the JSON yourself. The bottleneck here will be disk IO not in-memory processing, so make it easy to write correct code and use 3rd party libraries.

Once you have structured data representing the input, you can write each entry to an output file with only the fields you need.

For an Excel file, is CSV okay or do you need XLSX files? For CSV you can just write to a file directly, for XLSX I would recommend the EPPlus library.

https://www.newtonsoft.com/json

https://archive.codeplex.com/?p=epplus

JamesFaix
  • 8,050
  • 9
  • 37
  • 73
  • Thank you, Ill try that. I was thinking regex at first. Looping each line in txt for the key words then taking the context after ' :" ' and before ' " ' but I was afraid that would take too long as I have many key words to look for, as the error text can change. And as for the CSV question. I don't believe they have a preference. They wanted to be able to read the logs in a manageable fashion either in a DB or Excel format. – Antonio May 21 '18 at 13:47
  • This worked thank you it turned just like I wanted =) – Antonio May 24 '18 at 19:31