I am using the lars framework (https://lars.readthedocs.io/en/latest/index.html) to analyze sample logs from an Apache server, stored in a txt file. I want to store the path_str of each log entry in a list, so I am doing this:

from lars import apache
path_logs = []

with open('sample.txt', 'r') as f:
    with apache.ApacheSource(f) as source:
        for row in source:
            path_logs.append(row.request.url.path_str)

print(path_logs)

In theory this should be correct; however, I get this error:

'NoneType' object has no attribute 'url'

The funny thing is that if I add a counter and stop after a certain number of rows, it works:

with open('sample.txt', 'r') as f:
    with apache.ApacheSource(f) as source:
        count = 0
        for row in source:
            path_logs.append(row.request.url.path_str)
            count += 1
            if count == 5:
                break
print(path_logs)

Out:
['/api/buscador/filtros', 
'/api/buscador/busqueda', 
'/api/buscador//busqueda', 
'/api/buscador/filtros',
'/api/buscador/busqueda']

Of course, there are thousands of rows. Does anyone know why this is happening? Am I missing something?

1 Answer

Somewhere further down in your data, row.request is None. That is why stopping after the first five rows works while the full run fails. You'll want to add a guard against that:

import logging

from lars import apache

path_logs = []

with open('sample.txt', 'r') as f:
    with apache.ApacheSource(f) as source:
        for row in source:
            if row.request is None:
                logging.warning('Skipping row %r', row)
            else:
                path_logs.append(row.request.url.path_str)
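
If you want to see the guard in isolation, here is a minimal sketch that runs without lars or a log file. The Url, Request and Row namedtuples and the collect_paths helper are hypothetical stand-ins I made up for illustration; they only mimic the row.request.url.path_str attribute chain from the question and are not the lars API. It also counts the skipped rows, so you can tell how many entries in the real file have no parsed request.

```python
import logging
from collections import namedtuple

# Hypothetical stand-ins that mimic row.request.url.path_str;
# these are NOT the real lars classes.
Url = namedtuple('Url', ['path_str'])
Request = namedtuple('Request', ['url'])
Row = namedtuple('Row', ['request'])


def collect_paths(rows):
    """Return (paths, skipped): path_str of each row that has a request,
    plus the number of rows whose request is None."""
    paths = []
    skipped = 0
    for row in rows:
        if row.request is None:
            # Same guard as above: log the row and move on.
            skipped += 1
            logging.warning('Skipping row %r', row)
        else:
            paths.append(row.request.url.path_str)
    return paths, skipped


# Made-up sample: the second row has no request, like the bad row in the file.
rows = [
    Row(Request(Url('/api/buscador/filtros'))),
    Row(None),
    Row(Request(Url('/api/buscador/busqueda'))),
]

paths, skipped = collect_paths(rows)
print(paths)
print('skipped rows:', skipped)
```

With the real data you would pass the ApacheSource object in place of the rows list, assuming its rows expose request the way your original loop expects.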
tripleee