How to serve a web request using hbase

Question

I have about 3 million documents: PDFs, Word documents and images. I have built a website, and when a user searches from the website interface, I have to serve the matching documents, which are stored in HBase.

How can I do it?

Is HBase a good choice for serving web documents (the number of documents will keep growing in the future)?

My Hadoop version is 1.2.1 and my HBase version is 0.94.

What does HBase give you over a regular filesystem in your use case? It is certainly slower and requires more disk space, more setup, etc. — kostya, Dec 03 '15 at 06:31
Can a regular filesystem handle such a large number of files? — Hafiz Muhammad Shafiq, Dec 03 '15 at 08:12
Yes. Search servers can handle more than this in a scalable fashion. — Mostafa, Dec 04 '15 at 16:02

score 0 · Answer 1 · answered Dec 04 '15 at 16:02

In this case I would use a search server to index the data and have the website integrate with the search server's API. Solr, for example, is an open source search server.
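As a rough illustration of that integration, here is a minimal Python sketch of a web backend querying Solr's HTTP `select` endpoint. The host and port, the core name `documents`, and the `id` and `title` fields are assumptions made for the example, not details from the question. One possible design (also an assumption) is to store the HBase row key as the Solr `id`, so that the website can fetch the file itself from HBase once Solr has returned the matching ids.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed values: adjust to your own Solr installation.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "documents"  # hypothetical core name


def build_select_url(user_query, rows=10, start=0, base=SOLR_BASE, core=CORE):
    """Build the URL for Solr's standard /select search handler."""
    params = {
        "q": user_query,     # the text the user typed into the search box
        "rows": rows,        # page size
        "start": start,      # offset, for pagination
        "wt": "json",        # ask for a JSON response
        "fl": "id,title",    # hypothetical fields to return
    }
    return f"{base}/{core}/select?{urlencode(params)}"


def extract_hits(response_text):
    """Return the list of matching documents from a Solr JSON response."""
    body = json.loads(response_text)
    return body["response"]["docs"]


def search(user_query, rows=10):
    """Run the query against a live Solr server and return the hits."""
    with urlopen(build_select_url(user_query, rows), timeout=5) as resp:
        return extract_hits(resp.read().decode("utf-8"))
```

Calling `search("annual report")` needs a running Solr instance; `build_select_url` and `extract_hits` can be tried without one.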

Hope this helps.

Mostafa

Can images be indexed in Solr and retrieved from it? – Hafiz Muhammad Shafiq Dec 07 '15 at 07:03
Solr is a search product that can index any document type including images. – Mostafa Dec 08 '15 at 14:54
Can you provide a link or reference to a Solr example similar to my case? – Hafiz Muhammad Shafiq Dec 09 '15 at 03:45
This is the Apache Solr reference on how to configure and index content in Solr: https://cwiki.apache.org/confluence/display/solr/Indexing+and+Basic+Data+Operations . It is very useful for getting started: it covers setting up and configuring Solr, ingesting data, and advanced settings. Hope this helps. – Mostafa Dec 09 '15 at 17:17
Also, if you have scanned images that contain text, this reference shows step by step how to configure Solr to extract the text within images: https://hortonworks.com/hadoop-tutorial/indexing-and-searching-text-within-images-with-apache-solr/ -- Hope this helps. – Mostafa Dec 09 '15 at 17:19
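
To make the indexing step discussed in these comments more concrete, here is a minimal Python sketch that prepares a request for Solr's extracting request handler (`/update/extract`, also known as Solr Cell), which extracts text from binary files such as PDFs and Word documents. The host, the core name `documents`, and the document id are assumptions for the example, and the handler has to be enabled in the Solr configuration. Scanned images without a text layer need the additional OCR setup described in the second link.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values: adjust to your own Solr installation.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "documents"  # hypothetical core name


def build_extract_request(doc_id, file_bytes, content_type,
                          base=SOLR_BASE, core=CORE):
    """Prepare a POST that sends one file to Solr's /update/extract handler."""
    params = {
        "literal.id": doc_id,  # sets the Solr "id" field for this document
        "commit": "true",      # commit right away; simple but slow for bulk loads
    }
    url = f"{base}/{core}/update/extract?{urlencode(params)}"
    return Request(
        url,
        data=file_bytes,                         # raw file content as the body
        headers={"Content-Type": content_type},  # e.g. application/pdf
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it to a running Solr server. For millions of documents it is usually better to drop `commit=true` from each request and commit once per batch.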
