-1

I'm currently using HiveServer2 with thrift.dll library. If I try to make a tuple from a thrift object, I expect that fields that were not set in the object, are marked null in tuple. However instead default values are put into the tuple. E.g.

HqlConnection con = new HqlConnection("localhost", 10001, HiveServer.HiveServer2);    
con.Open();   
HqlCommand createCommand1 = new HqlCommand("select id,name,age,DOB,marks from engineer_list", con);   
createCommand1.ExecuteNonQuery();   
HqlDataReader reader = createCommand1.ExecuteReader();

expected output should be

{(1,'John',24,2010-01-01 10:22:47,45.6), (2,null,null,null,null)}

but actual result was:

{(1,'John',24,2010-01-01 10:22:47,45.6), (2,,,0,0)}

When we send request to call numeric column which contain NULL value for a type (int, double, long, float..) means thrift.dll itself return as zero instead of null or DBNull.value.

For string, timestamp type mean thrift.dll itself return as empty string instead of null or DBNull.value.

By analyzing the source of thrift.dll, we trigger out the method for reading the column by following

Class: Thrift.Protocol.TBinaryProtocol

Method: ReadAll(buffer, offset, length) by default it return as zero.

Description: This will get stream of data from thriftServer port

For Int value following method are call and this will return as '0' if data contain 'null'

private byte[] i32in = new byte[4];
public override int ReadI32()
{
  ReadAll(i32in, 0, 4);
  return (int)(((i32in[0] & 0xff) << 24) | ((i32in[1] & 0xff) << 16) | ((i32in[2] & 0xff) << 8) | ((i32in[3] & 0xff)));
}
  • This sounds like it ought to be either posted to [the Thrift developer mailing list](http://thrift.apache.org/mailing) or [filed as a bug report](http://thrift.apache.org/developers) instead of being written up here on Stack Overflow. –  Aug 24 '15 at 10:58
  • Is there a question hidden somewhere? – m02ph3u5 Aug 24 '15 at 11:14
  • Hi, Actually my question is how to get 'null' value instead of default value of each datatype from thrift which connecting with hiveserver2? – ǨÅVËĔŊ RĀǞĴĄŅ Aug 24 '15 at 11:52

2 Answers2

0

Nothing to do with Thrift: if you store an empty STRING into Hive, then you will get an empty STRING when selecting the field. If you store explicitly a Null STRING then you will get a Null. That's the way a proper DBMS should work.

Unfortunately Oracle and SQL Server have their own silly rules for VARCHAR data type (i.e. in Oracle a Null can appear as an empty string and vice-versa, like a wave/particle in quantum physics; and with SQL Server an empty string is more or less a 1-character-long string with a blank).

If you need application-level handling of empty STRINGs then use a conditional query clause such as CASE WHEN duh ='' THEN CAST(Null AS STRING) ELSE duh END

[Edit] Never, ever rely on implicit conversions from STRING to numeric data types. And I agree that in that case Hive is even worse than other databases.

CASE WHEN duh is Null OR duh ='' THEN CAST(Null AS SMALLINT) ELSE cast(duh as SMALLINT) END

PS: casting Nulls is required in some versions of Hive (especially in V0.14 - regression from V0.13)

Samson Scharfrichter
  • 8,884
  • 1
  • 17
  • 36
0

check the nulls field. e.g. if the nulls[0] = 12. then its binary value is "1100", read from right to left, it means the last two values is acutally NULL value.

Winnie
  • 1