I am trying to solve this problem for a UDF I am writing for a HiveQL environment.

public ObjectInspector initialize(ObjectInspector[] arguments)
            throws UDFArgumentException {
    if (arguments.length != 1) {
        throw new UDFArgumentException("Usage : multiple_prop(primitive var) ");
    }
    // This will be a string
    moi = (PrimitiveObjectInspector) arguments[0];

    List<String> structFieldNames = new ArrayList<>();
    List<ObjectInspector> structFieldObjectInspectors = new ArrayList<>();

    structFieldNames.add("fields name"); // <-- the issue is here

How can I get the field name of the input at that point? This is easy for structObjectInspectors, but how can it be done for PrimitiveObjectInspectors?

The complete code is:

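import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

public class prop_step2 extends GenericUDF {
    private PrimitiveObjectInspector moi;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments)
            throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentException("Usage: multiple_prop(primitive var)");
        }
        // The single argument is expected to be a string
        moi = (PrimitiveObjectInspector) arguments[0];

        List<String> structFieldNames = new ArrayList<>();
        List<ObjectInspector> structFieldObjectInspectors = new ArrayList<>();
        // TODO: use the input variable name here, not the type name
        structFieldNames.add(moi.getTypeName());
        structFieldObjectInspectors.add(PrimitiveObjectInspectorFactory.writableStringObjectInspector);

        return ObjectInspectorFactory.getStandardStructObjectInspector(structFieldNames, structFieldObjectInspectors);
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        // Convert the input to a Java object; a null input yields a null field
        Object value = moi.getPrimitiveJavaObject(arguments[0].get());
        Object[] result = new Object[1];
        result[0] = (value == null) ? null : new Text(value.toString());
        return result;
    }

    @Override
    public String getDisplayString(String[] children) {
        return "stop";
    }
}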

Once this is finished, I would like to call the UDF from Hive:

CREATE TEMPORARY FUNCTION step AS 'UDFpack.prop_step2';
SELECT
  step(bit) AS sd
FROM my_table;

I would then expect that referencing sd.bit in an outer select returns the value of 'bit'.

LSG
  • Could you please show more [complete code](https://stackoverflow.com/help/mcve) so that we can understand your problem? In your last question, do you really mean `structObjectInspectors`, or should it be `structFieldObjectInspectors`? I can't see a variable `structObjectInspectors` anywhere. – LuCio Jun 25 '18 at 07:20
  • Fine, I will try to add the rest of the code, although I do not think it is really relevant for this particular task. Also, Stack Overflow has a problem with long code: it keeps asking for a description of each piece before it lets me post, so I will have to omit some parts. `structObjectInspectors` does not appear in this code; I just meant that if 'moi' were a `structObjectInspector` instead of a `primitiveObjectInspector`, the task would be easy. – LSG Jun 25 '18 at 07:56

1 Answer

It's simply not possible. The information passed to the UDF, the ObjectInspectors, does not contain the column names. That's why you can see the output column names being changed to _col0, _col1, ... in the intermediate stages of a Hive explain plan. I am also quite annoyed by this and think it is an oversight in Hive.

A workaround would be to put your input into a struct and parse that.

For example, call step(named_struct('bit', bit)); you can then get the field name from the struct inside your UDF. It's not nearly as nice, though.
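The struct-based workaround could look roughly like the sketch below. It is a minimal illustration rather than a tested drop-in: the class name StructNameStep is hypothetical, it assumes the caller passes exactly one single-field struct such as named_struct('bit', bit), and it needs the Hive and Hadoop jars (for example hive-exec) on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

// Hypothetical name; assumes a call like step(named_struct('bit', bit)).
public class StructNameStep extends GenericUDF {
    private transient StructObjectInspector soi;
    private transient StructField field;
    private transient PrimitiveObjectInspector fieldOi;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments)
            throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentException("Usage: step(named_struct('name', value))");
        }
        if (arguments[0].getCategory() != ObjectInspector.Category.STRUCT) {
            throw new UDFArgumentTypeException(0, "Expected a struct argument");
        }
        soi = (StructObjectInspector) arguments[0];

        // Unlike a primitive inspector, a struct inspector exposes field names.
        List<? extends StructField> refs = soi.getAllStructFieldRefs();
        if (refs.size() != 1
                || refs.get(0).getFieldObjectInspector().getCategory()
                        != ObjectInspector.Category.PRIMITIVE) {
            throw new UDFArgumentTypeException(0,
                    "Expected a struct with exactly one primitive field");
        }
        field = refs.get(0);
        fieldOi = (PrimitiveObjectInspector) field.getFieldObjectInspector();

        // Reuse the input field name as the output field name.
        List<String> names = new ArrayList<>();
        List<ObjectInspector> inspectors = new ArrayList<>();
        names.add(field.getFieldName());
        inspectors.add(PrimitiveObjectInspectorFactory.writableStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(names, inspectors);
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object row = arguments[0].get();
        if (row == null) {
            return null;
        }
        // Pull the single field out of the struct and convert it to Text.
        Object raw = soi.getStructFieldData(row, field);
        Object value = fieldOi.getPrimitiveJavaObject(raw);
        return new Object[] { value == null ? null : new Text(value.toString()) };
    }

    @Override
    public String getDisplayString(String[] children) {
        return "step(" + String.join(", ", children) + ")";
    }
}
```

With a class like this registered as step, an outer select over step(named_struct('bit', bit)) AS sd should be able to reference sd.bit, because the output struct takes its field name from the input struct.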

Clemens Valiente