0

In my research we are using the ( FIX 5.0 SP2 ) protocol format and ( CME DataMine ) data. But I am having problems creating a table for this data (nested tables). For example the basic message format is given by msg#1, starting at (^A1128) and ending at (^A10). Now in the case of msg#2 we have the same basic message starting at (^A1128) and multiple messages starting at (^A1023) and ending at (^A346) .

        MSG#1                       MSG#2
---------------------------------------------------------------------
    ^A1128=9                ^A1128=9 
        ^A9=136                     ^A9=1417
        ^A35=X                      ^A35=X 
        ^A49=CME                    ^A49=CME 
        ^A34=578                    ^A34=731 
        ^A52=20130714180133577      ^A52=20130714190005961
        ^A75=20130715               ^A75=20130710 
        ^A268=1                     ^A268=15
        ^A279=0                     ^A279=0
        ^A22=8                      ^A22=8
        ^A48=111473                 ^A48=28112 
        ^A64=20130712               ^A83=2 
        ^A83=1                      ^A107=ESZ3
        ^A107=ESU4                  ^A269=0
        ^A269=6                     ^A270=166000
        ^A270=164350                ^A271=1 
        ^A273=180133000             ^A273=190005000
    ^A10=248                        ^A336=0
                                    ^A346=1
                                              ^A1023=1 
                                                   ^A279=0 
                                                   ^A22=8 
                                                   ^A48=28112 
                                                   ^A83=3 
                                                   ^A107=ESZ3
                                                   ^A269=0 
                                                   ^A270=165675 
                                                   ^A271=1 
                                                   ^A273=190005000
                                                   ^A336=0 
                                                   ^A346=1
                                      ...   ...   ...  ...
                                               ^A1023=5 
                                     ^A10=029

The data comes in sequence of lines without spaces and at the end of a message there is a ( \n ) character. Something like this,

^A1128=9^A9=157^A35=X^A49=CME^A34=406238^A52=20130715123719934^A75=20130715^A268=1^A279=1^A22=8^A48=28112^A83=49004^A107=ESZ3^A269=1^A270=166775^A271=186^A273=123719000^A336=2^A346=3^A1023=1^A10=210^A

Here is the code that I am using to create a table in Hive without success.

        CREATE EXTERNAL TABLE fix_map
        (tag MAP<INT, STRING>)
        ROW FORMAT DELIMITED
        COLLECTION ITEMS TERMINATED BY '1'
        FIELDS TERMINATED BY '2'
        MAP KEYS TERMINATED BY '='
        LOCATION '/user/data/';

Any suggestions would be greatly appreciated it.

Grant Birchmeier
  • 17,809
  • 11
  • 63
  • 98
jlroo
  • 141
  • 2
  • 9

1 Answers1

0

I didnt find a concrete answer for this questions I used a different approach to solve this problem. The first solution is to create two different tables, one for the main fix msg the other for the nested part then join the two tables. I used this code to create the first table:

        CREATE TABLE futures_fix(
          tag MAP<INT,STRING>
          )
        ROW FORMAT DELIMITED
        FIELDS TERMINATED BY '2'
        COLLECTION ITEMS TERMINATED BY '1'
        MAP KEYS TERMINATED BY '='
        LOCATION '/user/lcoation/data/';

The other approach is to use the fis2json (npm) tool from (SunGard Labs) this guys created a great tool to decoded FIX to json format ( fix2json ). After you decoded the data you will have more tools to make work with nested tables also the fi2json decoder will change all the tags to the correct names! Here is an example of the fix2json output.

     {
        "ApplVerID":"FIX50SP2",
        "BodyLength":"136",
        "MsgType":"MARKETDATAINCREMENTALREFRESH",
        "SenderCompID":"CME",
        "MsgSeqNum":"578",
        "SendingTime":"20130714180133577",
        "TradeDate":"20130715",
        "NoMDEntries":"1",
        "MDEntries":[
           {
              "MDUpdateAction":"NEW",
              "SecurityIDSource":"EXCHANGE SYMBOL",
              "SecurityID":"111473",
              "SettlDate":"20130712",
              "RptSeq":"1",
              "SecurityDesc":"ESU4",
              "MDEntryType":"SETTLEMENT PRICE",
              "MDEntryPx":"164350",
              "MDEntryTime":"180133000"
           }
        ],
        "CheckSum":"248"
     }
jlroo
  • 141
  • 2
  • 9