We have a requirement to scan files uploaded by users, check whether they contain a virus, and tag them as infected if so. I checked a few blogs and other Stack Overflow answers and learned that we can use clamscan for this.

However, I'm confused about what the path for the virus scan should be in the clamscan config. Also, is there a tutorial I can refer to? Our application backend is in Node.js.

I'm open to other libraries/services as well.

Nikhil
  • I would consider staging uploads in a dedicated S3 bucket (or at a dedicated prefix in an existing bucket) and triggering an anti-virus workflow on each object upload. The workflow might use Step Functions rather than Lambda so that it could scan large files on EC2 (no disk space limits, no time limits), and the workflow would finally move the (clean) scanned file to its ultimate S3 location. – jarmod Sep 25 '19 at 14:02

3 Answers

Hard to say without further info (e.g., the architecture your code runs on).

I would say the easiest possible way to achieve what you want is to hook up a trigger on every PUT event on your S3 Bucket. I have never used any virus scan tool, but I believe that all of them run as a daemon within a server, so you could subscribe an SQS Queue to your S3 Bucket event and have a server (which could be an EC2 instance or an ECS task) with a virus scan tool installed poll the SQS queue for new messages.
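As a rough sketch of the polling side, the worker first has to pull the bucket and key out of each SQS message body. The function below assumes the queue receives the standard S3 event notification JSON directly (not wrapped by SNS); the names `parseS3Event` and `UploadedObject` are made up for illustration.

```typescript
// Only the fields this sketch reads from an S3 event notification record.
interface S3EventRecord {
  eventName: string;
  s3: { bucket: { name: string }; object: { key: string } };
}

// Hypothetical result type: one entry per uploaded object to scan.
interface UploadedObject {
  bucket: string;
  key: string;
}

// Extract the uploaded objects from the body of one SQS message.
function parseS3Event(messageBody: string): UploadedObject[] {
  const parsed = JSON.parse(messageBody) as { Records?: S3EventRecord[] };
  // S3 sends an "s3:TestEvent" message without Records when the
  // notification is first configured, so treat a missing array as empty.
  const records = parsed.Records ?? [];
  return records
    // Keep only object-creation events (Put, Post, Copy, multipart upload).
    .filter((r) => r.eventName.indexOf("ObjectCreated:") === 0)
    .map((r) => ({
      bucket: r.s3.bucket.name,
      // Keys arrive URL-encoded, with spaces encoded as "+".
      key: decodeURIComponent(r.s3.object.key.replace(/\+/g, " ")),
    }));
}
```

The worker would call this on each received message, then download and scan every returned object before deleting the message from the queue.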

Once the message is processed, if a virus is detected you could simply invoke the putObjectTagging API on the malicious object.
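For the tagging step, here is a minimal sketch of building the request for putObjectTagging. The `Bucket`/`Key`/`Tagging.TagSet` shape is what the AWS SDK for JavaScript expects; the tag names `scan-status` and `scanned-at` and the helper `buildScanTagging` are assumptions chosen for this example.

```typescript
type ScanStatus = "clean" | "infected";

// Request shape accepted by S3 putObjectTagging.
interface PutObjectTaggingParams {
  Bucket: string;
  Key: string;
  Tagging: { TagSet: { Key: string; Value: string }[] };
}

// Build the tagging request for a scanned object.
// The tag names used here are hypothetical, not an AWS convention.
function buildScanTagging(
  bucket: string,
  key: string,
  status: ScanStatus,
  scannedAt: Date
): PutObjectTaggingParams {
  return {
    Bucket: bucket,
    Key: key,
    Tagging: {
      TagSet: [
        { Key: "scan-status", Value: status },
        { Key: "scanned-at", Value: scannedAt.toISOString() },
      ],
    },
  };
}
```

Note that putObjectTagging replaces the object's whole tag set, so if the object already carries tags you want to keep, read them first and merge them into `TagSet`.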

Thales Minussi

We have been doing something similar, but in our case the scan happens before the file is stored in S3. I think that's OK; the solution would still work for you.

We have one EC2 instance on which we installed ClamAV. We then wrote a web service that accepts a multipart file, takes the file content, and internally invokes the ClamAV command to scan it. In its response, the service returns whether the file is infected or not.
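A service like this has to turn the result of the ClamAV command into an infected/clean answer. The sketch below is not the service described above; it only illustrates one way to interpret a `clamscan` run, relying on its documented exit codes (0 = no virus found, 1 = virus found, 2 = error) and on its `<file>: <signature> FOUND` report lines. The names `interpretClamscan` and `ScanVerdict` are hypothetical.

```typescript
// Hypothetical verdict type returned to the caller of the scan service.
interface ScanVerdict {
  status: "clean" | "infected" | "error";
  signatures: string[];
}

// Map a finished clamscan run (exit code + stdout) to a verdict.
function interpretClamscan(exitCode: number, stdout: string): ScanVerdict {
  if (exitCode === 0) return { status: "clean", signatures: [] };
  // Anything other than 0 or 1 means the scan itself failed.
  if (exitCode !== 1) return { status: "error", signatures: [] };

  const signatures: string[] = [];
  for (const line of stdout.split("\n")) {
    // Infected files are reported as "<path>: <signature name> FOUND".
    const match = /^.*: (.+) FOUND$/.exec(line.trim());
    if (match) signatures.push(match[1]);
  }
  return { status: "infected", signatures };
}
```

Treating every unexpected exit code as an error, rather than as clean, keeps a failed scan from being mistaken for a safe file.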

Your solution could be:

  1. Create a web service as described above and host it on EC2 (let's call it the virus scan service).
  2. In a Lambda function, call the virus scan service, passing the file content.
  3. Based on the virus scan service's response, tag your S3 file appropriately.
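The steps above can be sketched as a small function that the Lambda handler would call. The HTTP call to the scan service and the S3 tagging call are passed in as functions, since their exact shape depends on your setup; `ScanFn`, `TagFn`, and `scanAndTag` are hypothetical names, and the `{ infected: boolean }` response is an assumed contract for the scan service.

```typescript
type ScanStatus = "clean" | "infected";

// Assumed contract of the virus scan service: send bytes, get a flag back.
type ScanFn = (content: Uint8Array) => Promise<{ infected: boolean }>;

// Wrapper around whatever call applies the tag to the S3 object.
type TagFn = (bucket: string, key: string, status: ScanStatus) => Promise<void>;

// Step 2 and step 3: scan the content, then tag the object with the result.
async function scanAndTag(
  bucket: string,
  key: string,
  content: Uint8Array,
  scan: ScanFn,
  tag: TagFn
): Promise<ScanStatus> {
  const response = await scan(content);
  const status: ScanStatus = response.infected ? "infected" : "clean";
  await tag(bucket, key, status);
  return status;
}
```

Injecting `scan` and `tag` keeps the flow easy to test with fakes and lets you swap the self-hosted service for a paid one without touching this function.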

If you're open to a paid service too, then step 1 above doesn't apply; just replace the call to your own virus scan service with a call to the scanning service from Symantec or another such provider.

I hope it helps.

Red Boy
  • What if I get the object from the S3 bucket, convert it into a readable stream, and pass it through the **.scan_stream(stream[,callback])** method of the Clamscan library, then update the tag based on the output? Will it work? – Nikhil Sep 30 '19 at 03:26
  • @Nikhil, I think it should work, though I haven't done that myself, so I'm not sure how the Clamscan library would behave overall. The steps I wrote are high level and, obviously, you may have to make deviations to fit your specific use case. I hope it helps. – Red Boy Sep 30 '19 at 11:17

You can check this solution by AWS, it will give you an idea of a similar architecture: https://aws.amazon.com/blogs/developer/virus-scan-s3-buckets-with-a-serverless-clamav-based-cdk-construct/

Gabriel