I have user agent strings stored as strings, for example:

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36 

I want to extract the browser from these user agent strings, so I used the ua-parser-java library.

The Hive UDF code is below:

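import java.io.IOException;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

import ua_parser.Client;
import ua_parser.Parser;

public class BrowserInfo extends UDF {

    // Built once and reused; constructing a Parser re-reads regexes.yaml.
    private Parser uaParser;

    public Text evaluate(Text input) {
        if (input == null) return null;

        if (uaParser == null) {
            try {
                uaParser = new Parser();
            } catch (IOException e) {
                // Fail with the real cause instead of a later NullPointerException.
                throw new RuntimeException("Could not load ua-parser regexes", e);
            }
        }

        Client c = uaParser.parse(input.toString());
        return new Text(c.userAgent.family);
    }
}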

It gives me the following exception:

Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text dhruv.udf.BrowserInfo.evaluate(org.apache.hadoop.io.Text)  
on object dhruv.udf.BrowserInfo@5379d8 of class dhruv.udf.BrowserInfo 
with arguments {"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)":org.apache.hadoop.io.Text} of size 1

I tried using String instead of Text but got the same exception. Outside Hive, this code works perfectly. UPDATE: The Hadoop and Hive logs contain no further detail about this.

Dhruv Kapatel

1 Answer

To resolve the error, you need to ensure a couple of things:

  1. regexes.yaml is present in the packaged .jar file, and the path to it in Parser.java is correct.
  2. All dependent jars are also packaged in the final .jar file.
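
One way to check the first point is to ask the class loader whether the resource is actually visible at runtime. The following is only a minimal sketch: the class name ResourceCheck and the default path /ua_parser/regexes.yaml are assumptions for illustration, so use whatever path your copy of Parser.java really loads.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of ua-parser or Hive.
public class ResourceCheck {

    // Returns true if the named resource can be opened from the classpath.
    static boolean isOnClasspath(String resourcePath) {
        try (InputStream in = ResourceCheck.class.getResourceAsStream(resourcePath)) {
            return in != null;
        } catch (IOException e) {
            // Closing the stream failed; treat the resource as unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default location; pass the real path as the first argument.
        String path = args.length > 0 ? args[0] : "/ua_parser/regexes.yaml";
        System.out.println(path + " on classpath: " + isOnClasspath(path));
    }
}
```

Running this with the same jar on the classpath that you register in Hive should show whether regexes.yaml was packaged. Listing the jar contents with jar tf is another quick check.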

Hope this helps.

sanket