0

We have ORACLE stored procedures like below:(psuedo code)

CREATE STOREDPROCEDURE(PARAMETERS)

BEGIN

SELECT STATMENTS;

CALLING CURSORS AND THEN DOING INSERTS INTO ANOTHER TABLE;

CALLING OTHER STORED PROCEDURES;

END;

We are having many stored procedures like above. Our data has grown and stored procedures are running slow. We would likto to convert these stored procedures to HIVE or ano other hadoop eco system. Please suggest us which is the hadoop ecosystem support this scenario.

1 Answers1

0

The answer really depends on what the whole picture looks like: what is stored in the database, how it gets there, what the stored procedures are doing, and how the results are accessed. I don't think you'll be able to get a question this large into a format that is answerable on stackoverflow.

To answer the specific question, it seems that you're doing ETL using stored procedures. There are many equivalents for that in Hadoop, the most well-known are: plain map-reduce, pig/hive scripts, spark. They won't work with data in your database though, so you'll need to get the data into Hadoop first.

If you're processing data with cursors and inserts that's probably the reason your procedures are running slowly. Have you considered hiring a consultant, who can suggest the ways to optimize your processing without rewriting everything from scratch on Hadoop?

Nickolay
  • 31,095
  • 13
  • 107
  • 185
  • Hi Nick,Thanks for your valuable suggestions. Yes,we are doing ETL using SPs. We can get the data into Hadoop using Sqoop. But after that I would like to know what are the stored procedure equivalents available in Hadoop.We dont want to live with the stored procedures. We would like to rewrite them into Hadoop or any other technology where we can scale our data and increase the performance. Could you please through some more suggestions this. – shanker valipireddy Mar 19 '15 at 19:23
  • @shankervalipireddy You'll have to read more about the technologies I mentioned to ask a more specific question; based on the information you provided all anybody can do is name the related technologies. And again, before ditching Oracle I urge you to check if you can optimize your ETL by avoiding cursor-based processing. See http://stackoverflow.com/questions/8714967/optimizing-row-by-row-cursor-processing-in-oracle-11g for an example of what I'm talking about. – Nickolay Mar 19 '15 at 22:19