2

I'd like to use a MySQL database to store the results of Hive analytics scripts in BAM 2.0.1. Looking at the supplied examples, I can see that I have to pass connection information using a number of properties (mapred.jdbc.*).

Is there a way to use a Carbon datasource instead of direct JDBC connections? My main concern is the use of cleartext passwords in a script, which is a major blocker in security-conscious organizations.

TIA

Philippe Sevestre

3 Answers

3

Yes, it is possible. You can use the wso2.carbon.datasource.name table property to pass the name of the Carbon datasource.

chamibuddhika
0

Using passwords is required because server-to-server authentication is not yet properly implemented in the Carbon framework. We hope to remove this limitation in the near future with an improvement to BAM.

Maninda
0

Using the property as chamibuddhika described did the trick. The table declaration below shows a complete example:

CREATE EXTERNAL TABLE IF NOT EXISTS BatchSummaryByWeek(
    execYear SMALLINT,
    execWeek SMALLINT,
    job_name STRING,
    exit_code INT,
    totalExecutions INT,
    avgElapsed FLOAT,
    maxElapsed INT,
    minElapsed INT
)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
    'wso2.carbon.datasource.name' = 'MYSQL_BAM',
    'hive.jdbc.update.on.duplicate' = 'true',
    'hive.jdbc.primary.key.fields' = 'execYear,execWeek,job_name,exit_code',
    'hive.jdbc.table.create.query' = 'CREATE TABLE BatchSummaryByWeek(execYear INTEGER, execWeek SMALLINT, job_name VARCHAR(250), exit_code INT, totalExecutions INT, avgElapsed FLOAT, maxElapsed INT, minElapsed INT)'
);
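
Once the table is declared, the summary can be written to MySQL through the datasource with an ordinary Hive insert. The query below is only a sketch: the source table BatchExecutions and its columns start_time (assumed to be a 'yyyy-MM-dd HH:mm:ss' string) and elapsed (assumed to be an INT) are hypothetical names that do not appear in my actual script, so adjust them to your own schema.

```sql
-- Hypothetical source table: BatchExecutions(job_name, exit_code, start_time, elapsed)
-- Rows are aggregated per year/week/job/exit code and written through the
-- JDBC storage handler to the MySQL table behind the MYSQL_BAM datasource.
INSERT OVERWRITE TABLE BatchSummaryByWeek
SELECT
    CAST(year(start_time) AS SMALLINT)       AS execYear,
    CAST(weekofyear(start_time) AS SMALLINT) AS execWeek,
    job_name,
    exit_code,
    CAST(count(*) AS INT)                    AS totalExecutions,
    CAST(avg(elapsed) AS FLOAT)              AS avgElapsed,
    max(elapsed)                             AS maxElapsed,
    min(elapsed)                             AS minElapsed
FROM BatchExecutions
GROUP BY year(start_time), weekofyear(start_time), job_name, exit_code;
```

Note that no mapred.jdbc.* property, and therefore no password, appears anywhere in the script; the credentials stay in the Carbon datasource definition.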
Philippe Sevestre