I need to count the number of lines matching a set of patterns across S3 buckets. The command I am using is:
s3cmd ls --recursive s3://mys3.com/bucket1/ | awk '{print $4}' | grep '.lzo' | xargs -I@ s3cmd get @ - | zgrep 'my-pattern-of-interest-1' | zgrep 'my-pattern-of-interest-2'|wc -l
However, this still downloads the files. Is there an external utility (built on boto, for example) that can do the same thing without physically downloading the files? I need to scan through 4-5 months of data, so I want to avoid downloading at all costs.
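To make the goal concrete, here is a minimal sketch of the counting step I have in mind, written so that it only ever holds one chunk of data in memory. The function name `count_lines_matching_all` is hypothetical, and where the chunks come from (for example a streamed object body from boto, already decompressed from LZO) is an assumption that is not shown here.

```python
import re
from typing import Iterable, Sequence


def count_lines_matching_all(chunks: Iterable[bytes], patterns: Sequence[str]) -> int:
    """Count lines that match every regex in `patterns`, reading chunk by chunk.

    `chunks` is any iterable of byte strings (assumed to be already
    decompressed); nothing is written to disk.
    """
    compiled = [re.compile(p.encode()) for p in patterns]
    count = 0
    pending = b""  # partial line carried over between chunks
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is a complete line;
        # the remainder waits for the next chunk.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            if all(rx.search(line) for rx in compiled):
                count += 1
    # A final line without a trailing newline still counts.
    if pending and all(rx.search(pending) for rx in compiled):
        count += 1
    return count
```

This is the equivalent of the two chained greps followed by `wc -l`: a line is counted only if it matches both patterns. It does not avoid the network transfer itself, only the need to keep a local copy.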