6

I am creating a HBASE table with a value of integer -17678. But when i retrieve it from pheonix it gives me a different positive value. RowKey is a composite rowkey and there is no problem with rowkey.

Hbase insertion:

public class test
{
public static void main(String args[])
{
        Configuration config = HBaseConfiguration.create();

          Connection connection = ConnectionFactory.createConnection(config);
          Table table = connection.getTable(TableName.valueOf("TEST"));
          Integer i=-17678;

          try
          {
          Put p = new Put(Bytes.toBytes("rowkey"));
          p.addColumn(Bytes.toBytes("test"),Bytes.toBytes("test"),Bytes.toBytes(i));
          table.put(p);
          }
          finally
          {
             table.close();
             connection.close();
          }

    }
}

Phoenix retrieval:

select CAST("Value" AS INTEGER) from TEST ;

+------------------------------------------+
|         TO_INTEGER(test."Value")         | 
+------------------------------------------+
| 2147465970                               | 
+------------------------------------------+

Anything wrong here? or a phoenix issue?

kliew
  • 3,073
  • 1
  • 14
  • 25
Vignesh I
  • 2,211
  • 2
  • 20
  • 40

2 Answers2

8

http://phoenix.apache.org/language/datatypes.html

The binary representation is a 4 byte integer with the sign bit flipped (so that negative values sorts before positive values).

So to convert from HBase serialization format to Phoenix format:

(-17678)10 = (11111111111111111011101011110010)2
=> (01111111111111111011101011110010)2 = (2147465970)10

Thus the output is as expected. You need to be aware of the binary representation when inserting data using HBase.

Direct HBase toByte to Phoenix reads is only possible with CHAR and UNSIGNED_* data types. You'd have to serialize appropriately for other data types. ie. setting i = 2147465970 when you mean to insert -17678.

I recommend using Phoenix to insert data. If you are worried about keeping your application light on dependencies, Phoenix offers a "thin" jdbc driver (4mb instead of 86mb).

https://phoenix.apache.org/server.html


If you absolutely must use HBase, you can serialize signed numbers by using a bitwise XOR.

For integers, you would want to XOR your i with a bitmask to flip the sign bit.

The bitmask to apply to a 4-byte Integer is:

(10000000000000000000000000000000)2 = (-2147483648)10

From http://ideone.com/anhgs5 , we get 2147465970. If you insert that using HBase, when you read using Phoenix, you will read -17678).

You will need a different bitmask for Bigint (shared bitmask with the date-time types), Smallint, Float, and Double.

kliew
  • 3,073
  • 1
  • 14
  • 25
  • Thanks. So when i query using the JDBC driver will i be getting the correct value? I don't want to insert the data using phoenix, hbase client API is the bet. – Vignesh I Mar 22 '16 at 06:17
  • It won't work as-is because the data was not serialized in the format expected by Phoenix. I updated my answer with an example on how to use HBase to insert data that can be properly deserialized by Phoenix. – kliew Mar 22 '16 at 06:51
  • So the only workaround would be insert the data using pheonix itself? – Vignesh I Mar 22 '16 at 06:53
  • I edited my original answer with a workaround using HBase. Does this solution work for you? – kliew Mar 22 '16 at 06:54
  • Basically you will want to replace `Bytes.toBytes(i)` with `i ^ -2147483648` in `addColumn`. `-2147483648` is the bitmask that will flip the sign bit when you XOR with the integer, and you need a different bitmask for each signed data type. – kliew Mar 22 '16 at 06:59
  • It did but the problem is i have to do the same for double,float,bigint. It would get complicated and suppose i use a hbase scan to read the data then i have to do a reverse to extract the integer that would not be best way to go i guess. Hence just wanted to know whether the only workaround is inserting using phoenix?(Other than the hbase workaround). – Vignesh I Mar 22 '16 at 07:00
  • Yes, other than manually manipulating the bits, the only other solution is to use Phoenix to insert the data. But then if you try to read the data (inserted by Phoenix) using HBase, it won't be the expected value for signed data types. – kliew Mar 22 '16 at 07:02
  • Oops. Thanks a lot for your time. – Vignesh I Mar 22 '16 at 07:03
1

Try something like :

//Add dependencies in your pom.xml
<dependency>
            <groupId>org.apache.phoenix</groupId>
            <artifactId>phoenix-core</artifactId>
            <version>${phoenix.core.version}</version>
            <scope>provided</scope>
</dependency>

And than:

//Import these classes. More types available
import org.apache.phoenix.schema.types.{PDecimal, PInteger, PVarchar}
//and replace your addColumn method by this one:
p.addColumn(Bytes.toBytes("test"),Bytes.toBytes("test"),PInteger.INSTANCE.toBytes(i));
//Or if you wish a Decimal
p.addColumn(Bytes.toBytes("test"),Bytes.toBytes("test"),PDecimal.INSTANCE.toBytes(i));
Moshe Slavin
  • 5,127
  • 5
  • 23
  • 38
Jorge
  • 11
  • 1