I have this example JSONL data and I am trying to sort it by timestamp:

{"log_level": "DEBUG", "timestamp": "2022-12-04 18:04:09", "message": "Fry saw a man on the Mars yesterday"}

{"log_level": "INFO", "timestamp": "2022-12-03 11:21:35", "message": "Bender played a mall near the Square Garden today"}

{"log_level": "ERROR", "timestamp": "2022-12-03 11:21:42", "message": "Dr. Zoidberg took a mall at park day after tomorrow"}

{"log_level": "DEBUG", "timestamp": "2022-12-03 11:21:49", "message": "Fry built a fish at park today"}

{"log_level": "WARNING", "timestamp": "2022-12-03 11:21:55", "message": "Dr. Zoidberg brought a boat at park tomorrow"}

{"log_level": "ERROR", "timestamp": "2022-12-03 11:21:57", "message": "Farnsworth killed an apple near the Square Garden today"}

This is my code that should do the sorting:

def sort_merged_files(merged_file):

    with open(merged_file) as writer:
        dict = collections.defaultdict(list)
        for obj in jsonlines.Reader(writer):
            for k1, v1 in obj.items():
                dict[k1].append(v1)

        sorted_date = sorted(
            dict, key=lambda x: datetime.strptime(x["timestamp"], "%Y-%m-%d")
        )
        print(sorted_date)

My error:

    sorted_date = sorted(dict, key=lambda x:datetime.strptime(x["timestamp"], "%Y-%m-%d"))
TypeError: string indices must be integers

EDIT 1: I solved this problem.

import json
import time

import jsonlines


def sort_merged_files(merged_file):
    # Read every JSON object from the file into a list of dictionaries.
    list_log = []
    with open(merged_file) as reader:
        for obj in jsonlines.Reader(reader):
            list_log.append(obj)

    # Sort by the full timestamp (date and time).
    sorted_list = sorted(
        list_log,
        key=lambda x: time.mktime(time.strptime(x["timestamp"], "%Y-%m-%d %H:%M:%S")),
    )

    # Write the sorted list back to the same file, one JSON object per line.
    with open(merged_file, "w") as f:
        for dic in sorted_list:
            json.dump(dic, f)
            f.write("\n")
  • Only the date, or with seconds too? – Bhargav - Retarded Skills Dec 05 '22 at 14:32
  • Date and time, but now I am trying to sort by date only and it doesn't work. – ChungusNAPAS Dec 05 '22 at 14:34
  • 1.) The timestamp string is formatted so that you can sort by the string itself. No need to convert it to something else. 2.) Don't name variables after types, such as `dict`; better call it `log_dict` or whatever. 3.) Iterating a dictionary iterates the keys, not the values. The keys are strings, which is why `x["timestamp"]` attempts to index a string. 4.) You probably want to sort the individual lists which you have as values in your dictionary, not the dictionary itself. 5.) Use a debugger. – Adrian W Dec 05 '22 at 14:51
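
Points 1 and 3 of the last comment can be sketched as follows. This is a minimal sketch, assuming the records are already loaded into a list of dictionaries; the `records` list (the question's data with messages omitted) and the helper name `sort_by_timestamp` are illustrative and not part of the original post. Because the timestamps use the fixed-width, zero-padded `YYYY-MM-DD HH:MM:SS` layout, comparing them as plain strings gives chronological order.

```python
from operator import itemgetter

# Illustrative sample data, taken from the question's log lines (messages omitted).
records = [
    {"log_level": "DEBUG", "timestamp": "2022-12-04 18:04:09"},
    {"log_level": "INFO", "timestamp": "2022-12-03 11:21:35"},
    {"log_level": "ERROR", "timestamp": "2022-12-03 11:21:42"},
]


def sort_by_timestamp(rows):
    """Return rows ordered by their "timestamp" string (hypothetical helper)."""
    # Zero-padded, most-significant-first timestamps sort correctly as strings.
    return sorted(rows, key=itemgetter("timestamp"))


sorted_records = sort_by_timestamp(records)
for row in sorted_records:
    print(row["timestamp"], row["log_level"])

# Point 3: iterating a dict yields its keys (strings), so x["timestamp"]
# inside a key function applied to those keys indexes a string and fails.
keys_seen = [key for key in records[0]]
print(keys_seen)
```

Keeping the records as a list of dictionaries, rather than regrouping them into one dictionary of lists, is what lets `sorted` see one whole record per element.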

1 Answer

Sorting by dates is pretty easy. Use natsort as follows.

Let's say you have this text in a text file.

from natsort import natsorted, ns

# Read the file and strip trailing whitespace from each line.
with open('ChungusNAPAS.txt') as file:
    lines = [line.rstrip() for line in file]

# Drop empty lines.
lines = list(filter(None, lines))

# Natural sort of the raw lines as text.
sorted_ = natsorted(lines, alg=ns.REAL | ns.LOCALE | ns.IGNORECASE)

for x in sorted_:
    print(x)

This gives:

{"log_level": "DEBUG", "timestamp": "2022-12-04 18:04:09", "message": "Fry saw a man on the Mars yesterday"}
{"log_level": "DEBUG", "timestamp": "2022-12-03 11:21:49", "message": "Fry built a fish at park today"}
{"log_level": "ERROR", "timestamp": "2022-12-03 11:21:42", "message": "Dr. Zoidberg took a mall at park day after tomorrow"}
{"log_level": "ERROR", "timestamp": "2022-12-03 11:21:57", "message": "Farnsworth killed an apple near the Square Garden today"}
{"log_level": "INFO", "timestamp": "2022-12-03 11:21:35", "message": "Bender played a mall near the Square Garden today"}
{"log_level": "WARNING", "timestamp": "2022-12-03 11:21:55", "message": "Dr. Zoidberg brought a boat at park tomorrow"}
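
Note that the output above is ordered by `log_level` first (DEBUG, ERROR, INFO, WARNING), apparently because each whole line is compared as text and every line starts with that field; it is not in timestamp order. A minimal sketch that sorts the same kind of lines by the `timestamp` field instead, using only the standard library, could look like this. The function name `sort_lines_by_timestamp` and the shortened inline sample lines are assumptions for illustration.

```python
import json
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def sort_lines_by_timestamp(lines):
    """Sort JSONL lines chronologically by their "timestamp" field (illustrative helper)."""
    # Skip blank lines, like the filter(None, ...) step above.
    non_empty = [line.strip() for line in lines if line.strip()]
    return sorted(
        non_empty,
        key=lambda line: datetime.strptime(
            json.loads(line)["timestamp"], TIMESTAMP_FORMAT
        ),
    )


# Sample lines shortened from the question's data (messages omitted).
sample_lines = [
    '{"log_level": "DEBUG", "timestamp": "2022-12-04 18:04:09"}',
    "",
    '{"log_level": "INFO", "timestamp": "2022-12-03 11:21:35"}',
    '{"log_level": "ERROR", "timestamp": "2022-12-03 11:21:42"}',
]

sorted_lines = sort_lines_by_timestamp(sample_lines)
for line in sorted_lines:
    print(line)
```

To order by date only, the key could call `.date()` on the parsed value; `sorted` is stable, so lines from the same day would keep their original relative order.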