
This might be a very silly question, but I'm having a hard time figuring out a way to copy the data inside my BigSQL table to a .txt file on the local file system.

LazyBones
  • Not sure what BigSQL is, but if you use hive you can do `hive -e "select * from table" > output.txt`. That will dump the output of the select statement to `output.txt` – OneCricketeer Dec 11 '15 at 21:58
  • Our organization is using a BigInsights cluster, a service provided by IBM; BigSQL is just like Hive, but faster. – LazyBones Dec 11 '15 at 22:01
  • Ah, okay. Still, if you can use a CLI to query the table, you can direct the output to a file as I mentioned. A quick google search brought me to the [jqsh CLI](http://www-01.ibm.com/support/knowledgecenter/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.analyze.doc/doc/bsql_jsqsh.html). – OneCricketeer Dec 11 '15 at 22:10
  • `'select * from XYZ' > /tmp/pig_test/party.txt` didn't work _Error: [State: 42601][Code: -104]: An unexpected token ""select * from XYZ" > /" was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "".. SQLCODE=-104, SQLSTATE=42601, DRIVER=3.67.33_ – LazyBones Dec 11 '15 at 22:19
  • Well, you are missing a FROM between the * and table. Also, you have to run it from shell (bash, zsh, etc.), not the CLI. – OneCricketeer Dec 11 '15 at 22:22
  • Then it says command not found. I rectified the missing `from`. – LazyBones Dec 11 '15 at 22:24
  • Okay. Going back to my initial hive example (I know you aren't using Hive). From bash, I can do `hive -e "query"`, where the hive **command** **e**valuates (with `-e`) the query string. The `>` character is an output redirection for bash, not part of hive. If I were to type `select * from table > output.txt` in Hive (or any SQL equivalent), it would throw an error because "> output.txt" is not recognized as a valid query string (see the sketch after these comments). – OneCricketeer Dec 11 '15 at 22:30
  • I tried different combinations, `jsqsh -D` and `jsqsh -A`, based on the documentation (A stands for auto connect, D stands for database connection), but none worked. – LazyBones Dec 11 '15 at 22:37
  • Alright. Ignore the "> output.txt" for now. How do you normally query the data? – OneCricketeer Dec 11 '15 at 22:41
  • Same as with any SQL: `select * from xyz;`. – LazyBones Dec 11 '15 at 22:43
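
To make the redirection pattern from the comments concrete, here is a minimal sketch (assuming the Hive CLI is on your PATH; `xyz` is a placeholder table name):

hive -e "select * from xyz" > /tmp/output.txt

The `>` is interpreted by the shell, so this must be run from bash (or a similar shell), not from inside the Hive or jsqsh prompt.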

3 Answers


Depending on the size of the resulting data, you can use the EXPORT command to get the data into a single text file. The resulting file will end up on one node.

I used the following script as an example:

\connect bigsql
drop table if exists stack.issue2;

create hadoop table if not exists stack.issue2 (
f1 integer,
f2 integer,
f3 varchar(200),
f4 integer
)
stored as parquetfile;

insert into stack.issue2 (f1,f2,f3,f4) values (0,0,'Detroit',0);
insert into stack.issue2 (f1,f2,f3,f4) values (1,1,'Mt. Pleasant',1);
insert into stack.issue2 (f1,f2,f3,f4) values (2,2,'Marysville',2);
insert into stack.issue2 (f1,f2,f3,f4) values (3,3,'St. Clair',3);
insert into stack.issue2 (f1,f2,f3,f4) values (4,4,'Port Huron',4);

select * from stack.issue2;

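-- EXPORT via SYSPROC.ADMIN_CMD writes the result of the SELECT as a
-- comma-delimited (DEL) file on the server's local filesystem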
{ call sysproc.admin_cmd('export to /tmp/t1.unl of del select * from stack.issue2') };

\quit

Running the script:

jsqsh --autoconnect --input-file=./t1.sql --output-file=t1.out
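
Here, `--autoconnect` connects using the configured default connection, `--input-file` runs the script non-interactively, and `--output-file` captures the query output.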

Yields:

cat t1.out
+----+----+--------------+----+
| F1 | F2 | F3           | F4 |
+----+----+--------------+----+
|  0 |  0 | Detroit      |  0 |
|  2 |  2 | Marysville   |  2 |
|  3 |  3 | St. Clair    |  3 |
|  1 |  1 | Mt. Pleasant |  1 |
|  4 |  4 | Port Huron   |  4 |
+----+----+--------------+----+
+---------------+---------------+-------------+
| ROWS_EXPORTED | MSG_RETRIEVAL | MSG_REMOVAL |
+---------------+---------------+-------------+
|             5 | [NULL]        | [NULL]      |
+---------------+---------------+-------------+

and the exported file:

ls -la /tmp/t1.unl
-rw-r--r-- 1 bigsql hadoop 93 Mar  3 16:05 /tmp/t1.unl

cat /tmp/t1.unl
0,0,"Detroit",0
3,3,"St. Clair",3
2,2,"Marysville",2
1,1,"Mt. Pleasant",1
4,4,"Port Huron",4
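
(Note that the exported rows come back in a different order than they were inserted: the SELECT has no ORDER BY, so no particular ordering is guaranteed.)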
proksch_ibm

The beauty of Big SQL is that you can connect just like you would to a regular DB2 database and call EXPORT.

[bigsql@myhost ~]$ db2 "create hadoop table test1 ( i int, i2 int , i3 int)"
DB20000I  The SQL command completed successfully.
[bigsql@myhost ~]$ db2 "insert into test1 values (1,2,3), (4,5,6),(7,8,9),(0,1,2)"
DB20000I  The SQL command completed successfully.
[bigsql@myhost ~]$ db2 "export to output.del of del select * from test1"
SQL3104N  The Export utility is beginning to export data to file "output.del".

SQL3105N  The Export utility has finished exporting "4" rows.


Number of rows exported: 4

[bigsql@myhost ~]$ cat output.del
1,2,3
4,5,6
7,8,9
0,1,2
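
(The session above assumes an existing connection to the Big SQL database, e.g. via `db2 connect to bigsql`, run as the bigsql user on the head node.)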
iender

Another way to extract data via SQL (as CSV in this example) is as follows:

create hadoop table csv_tableName
  row format delimited fields terminated by ','
  location '/tmp/csv_tableName'
as select * from tableName;

You can then retrieve the file(s) from HDFS.
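
For example, a minimal sketch using the HDFS command-line client (the path matches the LOCATION clause above; `getmerge` concatenates all part files under the directory into a single local file):

hdfs dfs -getmerge /tmp/csv_tableName /tmp/csv_tableName.csv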