Hi all, my goal is to analyze Hadoop log files, and there are two tools: Starfish (open source) and Splunk (a commercial product). Does anyone know the pros and cons of each, to help decide which one to choose? I'd really appreciate your answers. Thanks.
-
Is there a link to the starfish alternative? – Oerd Jan 26 '13 at 12:39
-
If you're looking for another alternative, I'd recommend taking a look at logscape.com. I've successfully used it in my last two roles. It's especially useful for root cause analysis and visualising your data. – Mar 18 '14 at 14:29
2 Answers
Well,
the pros and cons are the same as in any open source vs. commercial tool choice.
The main guideline should be: what are your requirements?
Splunk core is open source, and the free license allows you to index 500 MB/day.
Probably its main advantage is providing a BI tool that is cheaper than other commercial ones.
It also has an impressive number of plugins, including some for Hadoop,
and, like Hadoop, it has relied on a (different) MapReduce implementation since Splunk 4.x.
It has both a Python and a Java SDK, which may come in handy.
Its approach is: install it and, after a minimal setup, start playing with your data.
I don't know Starfish, though it does look promising. It only seems to require JavaFX, while Splunk comes with its own bundled Python installation.
But in the end, it all boils down to what your most important requirements are.
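
If the 500 MB/day figure is one of your constraints, it may help to measure how much log data your cluster produces before choosing. Below is a minimal sketch in plain Python (it does not use Splunk or Starfish at all). The function names and the directory you pass in are hypothetical, and treating 500 MB as 500 * 1024 * 1024 bytes is an assumption, so check the vendor's licensing documentation for the exact definition.

```python
import os

# Assumption: "500 MB/day" is interpreted as 500 * 1024 * 1024 bytes.
FREE_LIMIT_BYTES = 500 * 1024 * 1024


def total_log_bytes(log_dir):
    """Sum the sizes of all regular files under log_dir, recursively."""
    total = 0
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def fits_free_license(daily_bytes, limit=FREE_LIMIT_BYTES):
    """Return True if one day's log volume is within the daily limit."""
    return daily_bytes <= limit
```

Pointing `total_log_bytes` at a directory that holds one day's worth of Hadoop logs and passing the result to `fits_free_license` gives a rough yes/no answer. It measures raw file size only, which may differ from how a given tool counts indexed volume.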

-
Can you add a link to the source code of Splunk-core? I thought it was just "free" to index 500megs/day – Oerd Jan 25 '13 at 17:46
-
I'm sorry if I misled you. I meant it has an open-source-friendly policy: it open-sources some of its projects (https://github.com/splunk/). You can also find a growing list of supported add-ons at http://splunk-base.splunk.com/apps/ – Joao Figueiredo Jan 25 '13 at 18:16
-
But splunk-core is *not* on GitHub, nor is the source code available anywhere I know of. I've been running Splunk Free for quite a while (call me an early adopter). It comes as a blob; the indexer source is not available, only the apps are (as they are python/ruby/html/js). Again, it's not open source, it's free to use (as in: it-does-not-cost-to-index-up-to-500-megs). – Oerd Jan 26 '13 at 12:25
The barrier to entry is low for both. It's best to try both out for a while and see what works for you.
Depending on your use case, each tool has different strengths. What is your use case?
Generally speaking, Splunk is easy to use and modern, with great community support. Answers are usually a few searches away.
