Some sample lines from my access.log file:
202.134.9.131 - - [24/Jun/2020:05:03:28 +0000] "GET /static/img/p-logos/ruby-rails.png HTTP/1.1" 200 7289 "http://35.230.90.99/static/css/main.css" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/58.2.2878.53403"
202.134.9.131 - - [24/Jun/2020:05:03:28 +0000] "GET /static/img/p-logos/aws.png HTTP/1.1" 200 7230 "http://35.230.90.99/static/css/main.css" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/58.2.2878.53403"
202.134.9.131 - - [24/Jun/2020:05:03:28 +0000] "GET /static/img/p-logos/js.png HTTP/1.1" 200 7335 "http://35.230.90.99/static/css/main.css" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/58.2.2878.53403"
202.134.9.131 - - [24/Jun/2020:05:03:26 +0000] "GET /static/img/business-img.png HTTP/1.1" 200 853648 "http://35.230.90.99/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/58.2.2878.53403"
I can upload the logs to Elasticsearch like this:
    # es is an Elasticsearch client and index is the target index name;
    # both are created earlier in the script.
    with open("access.log", "r") as ral:
        for line in ral:
            try:
                # Split each line into the fields of the access log format.
                log = {
                    "user_IP": line.split(" ")[0],
                    "request_date": line.split("[")[1].split("]")[0],
                    "request_method": line.split('"')[1].split(" ")[0],
                    "internal_url": line.split('"')[1].split(" ")[1],
                    "HTTP_version": line.split('"')[1][-3:],
                    "request_status": line.split('"')[2][1:4],
                    "request_size": line.split('"')[2].split(" ")[2],
                    "external_url": line.split('"')[3],
                    "user_agent": line.split('"')[5],
                }
                res = es.index(index=index, body=log)
                print(res)
            except Exception:
                # Fall back to indexing the raw line if anything above fails.
                log = {"log": line}
                res = es.index(index=index, body=log)
                print(res)
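For comparison, the chained split calls above could also be written as a single compiled regular expression, which scans each line only once. This is only a sketch, assuming every line follows the format of the sample lines; `LOG_PATTERN` and `parse_line` are hypothetical names, not part of any library.

```python
import re

# Hypothetical pattern for the log format seen in the sample lines.
# The group names mirror the dictionary keys used in the loop above.
LOG_PATTERN = re.compile(
    r'(?P<user_IP>\S+) \S+ \S+ '
    r'\[(?P<request_date>[^\]]+)\] '
    r'"(?P<request_method>\S+) (?P<internal_url>\S+) HTTP/(?P<HTTP_version>[^"]+)" '
    r'(?P<request_status>\d{3}) (?P<request_size>\S+) '
    r'"(?P<external_url>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return the parsed fields, or the raw line if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        # Same fallback idea as the except branch: keep the raw line.
        return {"log": line}
    return match.groupdict()
```

This still parses on the client side, so it does not answer the question of uploading the file as-is; it only makes the existing parsing step cheaper and easier to read.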
But I'm facing a few problems with this approach:
- It creates heavy traffic on the Elasticsearch server, because every log line is sent as a separate index request.
- It takes a long time and consumes a lot of resources for big log files.
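To make the traffic problem concrete: the requests could in principle be batched instead of sent one per line. The sketch below is an assumption, not something I have running. It only builds the action dictionaries in the shape that the bulk helper of the `elasticsearch` Python client accepts; `iter_bulk_actions` and its `parse` argument are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Iterator


def iter_bulk_actions(
    lines: Iterable[str],
    index: str,
    parse: Callable[[str], Dict[str, str]],
) -> Iterator[dict]:
    """Yield one bulk action per non-empty log line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        try:
            source = parse(line)
        except Exception:
            # Keep the raw line when parsing fails.
            source = {"log": line}
        # "_index" and "_source" are the keys the bulk helper reads.
        yield {"_index": index, "_source": source}
```

With the client installed, something like `helpers.bulk(es, iter_bulk_actions(ral, index, parse))` (after `from elasticsearch import helpers`) would send the documents in batches rather than one request per line, where `parse` is any function that turns a line into a dict. This reduces the number of requests, but the parsing still happens in Python.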
In Kibana, there are options for uploading CSV, LOG, and JSON files, and Elasticsearch parses the logs automatically. Here, however, I'm parsing the logs with Python, which takes time and resources.
My question is:
- Is there any way to upload the "access.log" file without parsing it?
I want to upload the logs as a file. I don't want to parse them and upload them as JSON. I can upload logs as a file from Kibana, but is there any way to do this with Python?