1

How do I populate the Transaction Processing Performance Council's TPC-DS database for SQL Server? I have downloaded the TPC-DS tool but there are few tutorials about how to use it.

vladimir
  • 13,428
  • 2
  • 44
  • 70
zli89
  • 143
  • 2
  • 6

3 Answers3

1

In case you are using windows, you gotta have visual studio 2005 or later. Unzip dsgen in the folder tools there is dsgen2.sln file, open it using visual studio and build the project, will generate tables for you, I've tried that and I loaded tables manually into sql server

Hawk
  • 5,060
  • 12
  • 49
  • 74
  • Thanks for your answer Haytham. I have generated the data and created tables in SQL Server. But I have some trouble loading data into database. Do you use "BULK insert" statement to load the data? – zli89 Mar 10 '13 at 14:42
  • I've used SQL server import and export wizard, after I've modified the extension of generated files from .dat into .txt, in SQL server wizard I chose from data source list: flat files source, you gotta do it for every file. However, I'm stuck at the next step, how to perform the test – Hawk Mar 11 '13 at 08:07
  • I tried your method but there's an error: The number of input columns for Destination - call_center.Inputs[Destination Input] cannot be zero. Did this error ever happen to you? BTW, in importing, I should uncheck "Column names in the first data row" and keep everything else as default, right? Thanks for your help. – zli89 Mar 11 '13 at 14:56
  • I did not understand the error. however, yes, you have to uncheck column names in the first data row, some columns need to make their fields more than 50 digits (which are the default) you'll get error message at the end regarding this indicating the column's name, so you gotta go back and modify this manually – Hawk Mar 12 '13 at 14:08
  • Hi Haytham, I moved a step forward thanks to your help. But now I have troubles generating executable queries from templates. Did you generate TPC-DS queries by dsqgen? – zli89 Mar 18 '13 at 15:44
  • Not yet, I am working on this step currently, I will let you know once I do it, and you let me know if you did it. – Hawk Mar 19 '13 at 17:17
  • Is there a way to generate mysql queries, current dialects are incompatible with mysql. – Kumar Deepak Jul 11 '15 at 13:42
1

Let's describe the base steps:

  1. Before go to the next steps double-check that the required TPC-DS Kit has not been already prepared for your DB

  2. Download TPC-DS Tools

  3. Build Tools as described in 'v2.11.0rc2\tools\How_To_Guide-DS-V2.0.0.docx' (I used VS2015)

  4. Create DB

Take the DB schema described in tpcds.sql and tpcds_ri.sql (they located in 'v2.11.0rc2\tools\'-folder), suit it to your DB if required.

  1. Generate data that be stored to database
# Windows
dsdgen.exe /scale 1 /dir .\tmp /suffix _001.dat

# Linux
dsdgen -scale 1 -dir /tmp -suffix _001.dat
  1. Upload data to DB
# example for ClickHouse

database_name=tpcds
ch_password=12345

for file_fullpath in /tmp/tpc-ds/*.dat; do
  filename=$(echo ${file_fullpath##*/})
  tablename=$(echo ${filename%_*})
  echo " - $(date +"%T"): start processing $file_fullpath (table: $tablename)"

  query="INSERT INTO $database_name.$tablename FORMAT CSV"
  cat $file_fullpath | clickhouse-client --format_csv_delimiter="|" --query="$query" --password $ch_password
done
  1. Generate queries
# Windows
set tmpl_lst_path="..\query_templates\templates.lst"
set tmpl_dir="..\query_templates"
set dialect_path="..\..\clickhouse-dialect"
set result_dir="..\queries"
set tmpl_name="query1.tpl"

dsqgen /input %tmpl_lst_path% /directory %tmpl_dir% /dialect %dialect_path% /output_dir %result_dir% /scale 1 /verbose y /template %tmpl_name%

# Linux
# see for example https://github.com/pingcap/tidb-bench/blob/master/tpcds/genquery.sh

To fix the error 'Substitution .. is used before being initialized' follow this fix.

vladimir
  • 13,428
  • 2
  • 44
  • 70
0

I've just succeeded in generating these queries. There are some tips may not the best but useful.

  1. cp ${...}/query_templates/* ${...}/tools/
  2. add define _END = ""; to each query.tpl
  3. ${...}/tools/dsqgen -INPUT templates.lst -OUTPUT_DIR /home/query99/
vladimir
  • 13,428
  • 2
  • 44
  • 70
  • 1
    About 2nd point: It doesn't need to edit each *tpl*-file - just modify dialect-file as described here (https://dba.stackexchange.com/a/97926/203059). – vladimir Mar 10 '20 at 18:32