Is there a good way to integrate beanstalkd with hadoop?

Question

I am using beanstalkd to collect log data from multiple front-end servers (a PHP application) and insert the data into MySQL. As the data grows, I need to move to Hadoop so I can analyze it for BI purposes using Hive. What's the best practice for integrating beanstalkd with Hadoop? I found FlumeNG, but it seems too heavy for my needs.

score 1 · Answer 1 · answered Jun 04 '14 at 14:16

Really interesting question.

Check out Monolog; there is also a great tutorial about handling logs with Fluentd.

You might want to consider a "triangle" of services: Laravel ships with Monolog and has Beanstalkd support, and you can quickly add Fluentd via Composer. So you could have a Laravel app whose workers watch your tubes and forward each message to Fluentd. Fluentd has some great features, such as delayed logs and tags. I'm not sure how you will handle delayed logs or tags on the logs, but you probably already have that information in your tube messages.
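To make the "worker on a tube that forwards to Fluentd" idea concrete, here is a minimal, language-neutral sketch in Python using only the standard library (a Laravel worker would follow the same steps in PHP). It speaks the plain-text beanstalkd protocol (`watch`, `reserve`, `delete`) and posts each job to Fluentd's HTTP input. Everything specific is an assumption for illustration: the tube name `logs`, the tag `app.log`, the host and port values, the helper names, and the premise that each job body is a JSON object and that Fluentd has an `http` source enabled.

```python
import json
import socket
import urllib.request


def parse_reserved(header):
    """Parse a beanstalkd 'RESERVED <id> <bytes>' reply into (job_id, size)."""
    parts = header.strip().split()
    if len(parts) != 3 or parts[0] != b"RESERVED":
        raise ValueError("unexpected beanstalkd reply: %r" % header)
    return int(parts[1]), int(parts[2])


def fluentd_request(base_url, tag, record):
    """Build a POST for Fluentd's HTTP input; the URL path is the event tag."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        base_url + "/" + tag,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_worker(host="127.0.0.1", port=11300, tube="logs",
               base_url="http://127.0.0.1:9880", tag="app.log"):
    """Reserve jobs from one tube forever and forward each one to Fluentd.

    All default values here are hypothetical; adjust them to your setup.
    """
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rwb")

        # Subscribe to the tube; beanstalkd answers "WATCHING <count>".
        stream.write(("watch %s\r\n" % tube).encode("ascii"))
        stream.flush()
        stream.readline()

        while True:
            # Block until a job is ready.
            stream.write(b"reserve\r\n")
            stream.flush()
            job_id, size = parse_reserved(stream.readline())
            payload = stream.read(size)
            stream.read(2)  # consume the trailing "\r\n" after the body

            # Assumption: the job body is a JSON object.
            record = json.loads(payload)
            urllib.request.urlopen(fluentd_request(base_url, tag, record)).close()

            # Delete only after Fluentd accepted the event.
            stream.write(("delete %d\r\n" % job_id).encode("ascii"))
            stream.flush()
            stream.readline()  # "DELETED"
```

Calling `run_worker()` starts the loop. The job is deleted only after the HTTP request succeeds, so if forwarding raises an exception the connection closes and beanstalkd returns the reserved job to the ready queue rather than losing it.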

You could use Beanstalkd Console to view your jobs and help you in development.
