Below is the code I am trying to run on my cleaned CSV log. When I run it, I get an error:

Traceback (most recent call last):
  File "page_hit_analysis.py", line 12, in <module>
    line = parser(line)

import apache_log_parser
from collections import Counter
from pandas import DataFrame
import seaborn


parser = apache_log_parser.make_parser('%h %l %u %t "%r" %>s')

pages = []
with open('cleaned_log7.csv') as in_f:
    for line in in_f:
        line = parser(line)
        pages.append(line['request_url'])

counts = Counter(pages)

selected_pages = [pair[0] for pair in counts.most_common(5)]
print(selected_pages)

graph_pages = [page for page in pages if page in selected_pages]
data = DataFrame({'pages': graph_pages})
print(data)

plot = seaborn.countplot(data=data, x='pages', order=selected_pages)
plot.get_figure().savefig('pages_plot7.png')

The above code works with the uncleaned log but not with the cleaned one.
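Since the script works on the uncleaned log but not on the cleaned one, one way to narrow this down is to find out which lines the parser rejects. The following is a minimal sketch, not a confirmed fix: `find_bad_lines` and `stand_in_parser` are hypothetical names, and the stand-in only imitates a parser that raises on a non-matching line so that the example runs without `apache_log_parser` installed. The comma-separated sample line is an assumption about what a cleaned file might look like.

```python
import re

# Hypothetical stand-in for the callable returned by apache_log_parser.make_parser.
# It is an assumption for illustration only: it raises ValueError when a line
# does not look like '%h %l %u %t "%r" %>s'.
_PATTERN = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')


def stand_in_parser(line):
    match = _PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the expected format")
    return {'request_url': match.group(6)}


def find_bad_lines(lines, parse):
    """Return (line_number, text, error) for every line that `parse` rejects."""
    failures = []
    for number, text in enumerate(lines, start=1):
        try:
            parse(text)
        except Exception as exc:  # broad on purpose: this is a diagnostic pass
            failures.append((number, text.rstrip('\n'), exc))
    return failures


# Made-up sample data: one line in the original layout, one comma-separated.
sample = [
    '127.0.0.1 - - [18/Dec/2018:13:38:00 +0000] "GET /index.html HTTP/1.1" 200\n',
    '127.0.0.1,-,-,18/Dec/2018:13:38:00,GET /index.html HTTP/1.1,200\n',
]

bad = find_bad_lines(sample, stand_in_parser)
for number, text, exc in bad:
    print(f"line {number}: {text!r} -> {exc}")
```

To try this on the real file, pass the open file object and the `parser` from the script above in place of `sample` and `stand_in_parser`. If every line is reported, the cleaning step probably changed the layout so that it no longer matches the format string given to `make_parser`.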

  • Is there a reason to reuse the `line` variable? – EdChum Dec 18 '18 at 13:38
  • I'm not sure; that's how it was in the tutorial I learned this from. I don't get the error with the uncleaned sad access log. – user9056985 Dec 18 '18 at 13:40
  • Can you try changing it to `l = parser(line)` followed, on the next line, by `pages.append(l['request_url'])`? – EdChum Dec 18 '18 at 13:41
  • When I removed the two lines under `for line in in_f` and replaced them with what you suggested, I got a syntax error on the `pages` part. – user9056985 Dec 18 '18 at 13:44
  • Then you have more issues than just your original code. You need to address each one and do some debugging. Also, don't add a comment saying that you get an error; update your question with the updated code and include the full error. – EdChum Dec 18 '18 at 13:45
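The syntax error mentioned in the comments is consistent with the two suggested statements having been typed on a single line, which Python does not accept. The sketch below checks both spellings with the standard-library `ast` module. It only parses the source and never runs it, so `parser`, `line`, and `pages` do not need to exist; `is_valid_python` is a hypothetical helper name.

```python
import ast

# The suggestion as it would read if both statements were put on one line.
one_line = "l = parser(line) pages.append(l['request_url'])"

# The same two statements, each on its own line.
two_lines = "l = parser(line)\npages.append(l['request_url'])"


def is_valid_python(source):
    """Return True if `source` parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python(one_line))   # False
print(is_valid_python(two_lines))  # True
```

Inside the `for` loop, both statements also need the same indentation as the lines they replace.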

0 Answers