You asked multiple questions, so I'll try to answer each one.
Where should I read the files: through Lambda or through Spark? Which one would be beneficial?
You can let S3 trigger a Lambda function, and have the Lambda function submit a Spark job to EMR. There are many examples of this pattern; a minimal sketch follows.
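Here is a rough sketch of such a Lambda handler using boto3, assuming you already have a running EMR cluster and a PySpark script uploaded to S3. The cluster id `j-XXXXXXXXXXXX` and the script path `s3://bucket/project/job.py` are placeholders, not real values:

```python
import boto3

emr = boto3.client("emr")

def lambda_handler(event, context):
    # The S3 put-event notification carries the bucket and key
    # of the object that triggered this invocation.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Submit a spark-submit step to an already-running EMR cluster.
    response = emr.add_job_flow_steps(
        JobFlowId="j-XXXXXXXXXXXX",  # placeholder: your EMR cluster id
        Steps=[{
            "Name": f"process {key}",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "spark-submit",
                    "s3://bucket/project/job.py",   # placeholder: your script
                    f"s3://{bucket}/{key}",         # pass the new file as an argument
                ],
            },
        }],
    )
    return response["StepIds"]
```

You would wire this up by adding an S3 event notification (on `ObjectCreated`) that invokes the Lambda function, and giving the function an IAM role allowed to call `elasticmapreduce:AddJobFlowSteps`.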
How should I read compressed files in Spark?
First, it depends on the compression format. Spark and Hadoop support the following compression types:
| name    | ext      | codec class                                |
|---------|----------|--------------------------------------------|
| bzip2   | .bz2     | org.apache.hadoop.io.compress.BZip2Codec   |
| default | .deflate | org.apache.hadoop.io.compress.DefaultCodec |
| deflate | .deflate | org.apache.hadoop.io.compress.DeflateCodec |
| gzip    | .gz      | org.apache.hadoop.io.compress.GzipCodec    |
| lz4     | .lz4     | org.apache.hadoop.io.compress.Lz4Codec     |
| snappy  | .snappy  | org.apache.hadoop.io.compress.SnappyCodec  |
If your compression type is supported, you can read the compressed files with code like the following:
rdd = sc.textFile("s3://bucket/project/logfilexxxxx.*.gz")
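Decompression is transparent: Spark picks the codec from the file extension, so no extra configuration is needed. The same applies to the DataFrame API. One caveat worth knowing is that gzip is not a splittable format, so each .gz file is read by a single task; repartitioning after the read restores parallelism. A small sketch (the path and partition count are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-compressed").getOrCreate()

# The DataFrame reader also decompresses based on the file extension.
df = spark.read.text("s3://bucket/project/logs/*.gz")

# gzip is not splittable: each .gz file becomes a single partition,
# so repartition after reading if the files are large.
df = df.repartition(64)
```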