Could someone please tell me how to access the column name in a simple Hive UDF?

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
import Utils; // asker's own helper class; its package is not shown

@Description(name = "Encrypt", value = "Encrypt the given column", extended = "SELECT Encrypt('Hello World!');")
public class Encrypt extends UDF {

    private Text result = new Text();

    public Text evaluate(Text str) {
        if (str == null) {
            return null;
        }

        // Access the column name and pass it to Utils to look up the encryption key.
        // NOTE: columnName is not defined anywhere; how to obtain it is the question.
        String secretKey = Utils.getSecretKey(columnName);
        String encryptedText = AES.encrypt(str.toString(), secretKey);
        result.set(encryptedText);
        return result;
    }

}
Gaurang Shah
  • You build a jar and add it to Hive. Then you create a function from the class name used in the jar, and that function can be called on columns: it takes the column data as input and returns the desired result. See https://cwiki.apache.org/confluence/display/Hive/HivePlugins#HivePlugins-CreatingCustomUDFs – yammanuruarun Apr 22 '20 at 14:40
  • The problem is: inside the UDF, how do I know which `column` the `function` was called on? – Gaurang Shah Apr 22 '20 at 16:34
  • As far as I know, we call the function on a column at run time, in the select clause. I am unsure of your scenario. If the function is encrypt_coludf, we use it like `select encrypt_coludf(col1), col2 from table;`. Functions are mostly written to be generic so that they can be applied to any suitable column of any table. – yammanuruarun Apr 22 '20 at 17:11
  • As far as I know, column names are stored in the Hive Metastore, so when you refer to a column by name, Hive consults the Metastore to find the position of that column in the data. There is probably no direct way to get the column name inside a UDF. – serge_k Apr 27 '20 at 06:56
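
If, as the last comment suggests, the UDF cannot see the column name directly, one possible workaround is to pass the name in as a second, literal argument. The sketch below shows that shape in plain Java so it compiles without Hive on the classpath. In an actual UDF the method would presumably be an `evaluate(Text, Text)` overload on the class that extends `UDF`. The class name `ColumnAwareEncrypt`, the key table, the column names `ssn` and `email`, and the cipher settings are all assumptions for illustration; they stand in for the asker's `Utils.getSecretKey` and `AES.encrypt`, whose real behaviour is not shown in the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Plain-Java sketch of the "pass the column name as an argument" workaround.
// Hive types (UDF, Text) are deliberately left out so this compiles on its own.
class ColumnAwareEncrypt {

    // Hypothetical stand-in for Utils.getSecretKey(columnName):
    // one 16-byte AES key per column name. Real keys belong in a key store.
    private static final Map<String, String> KEYS = Map.of(
            "ssn", "0123456789abcdef",
            "email", "fedcba9876543210");

    static String getSecretKey(String columnName) {
        String key = KEYS.get(columnName);
        if (key == null) {
            throw new IllegalArgumentException("No key for column: " + columnName);
        }
        return key;
    }

    // Mirrors evaluate(value, columnName): null in, null out, like the original UDF.
    static String evaluate(String value, String columnName) {
        if (value == null || columnName == null) {
            return null;
        }
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, columnName,
                value.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(encrypted);
    }

    // Inverse operation, included only so the round trip can be checked.
    static String decrypt(String encoded, String columnName) {
        byte[] plain = run(Cipher.DECRYPT_MODE, columnName,
                Base64.getDecoder().decode(encoded));
        return new String(plain, StandardCharsets.UTF_8);
    }

    // ECB mode keeps the sketch short; it is not a recommendation for real data.
    private static byte[] run(int mode, String columnName, byte[] input) {
        try {
            SecretKeySpec key = new SecretKeySpec(
                    getSecretKey(columnName).getBytes(StandardCharsets.UTF_8), "AES");
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES operation failed", e);
        }
    }
}
```

With a function registered from such a class, the query would repeat the column name as a string literal, for example `select encrypt_coludf(ssn, 'ssn') from table;`. The cost of this design is that the caller must keep the literal in sync with the column, since nothing checks that the two match.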
