β

TPC-H Benchmark

连顺科的个人页面 200 阅读

关于TCP-H benchmark

The TPC-H Benchmark is a popular one for comparing database vendors. It's possible to run the TPC-H data set on PostgreSQL without having a formal testing kit (although there is DBT-3, a work in progress to provide a full kit). The results have generally been disappointing, for reasons that aren't necessarily relevant in the real world. PostgreSQL is missing some of the things needed to do well on this benchmark, whereas proprietary database vendors are so focused on it they will "game" TPC-H runs (add optimizations specifically aimed at it) to make absolutely sure they do well.



安装

下载页面地址:http://www.tpc.org/tpch/specs.asp


安装数据库

Version::9.3.4
Admin_user:postgres


安装tpch

解压:

[root@tpch opt]# unzip tpch_2_16_0v1.zip

设置makefile:

[root@tpch opt]# cd tpch_2_15_0/dbgen/
[root@tpch dbgen]# cp makefile.suite makefile
[root@tpch dbgen]# vi makefile
CC      = gcc
DATABASE= ORACLE
MACHINE = LINUX
WORKLOAD = TPCH

编译:

[root@tpch dbgen]# make

生成测试数据文件:

[root@tpch dbgen]# ./dbgen -s 1
[root@tpch dbgen]# ls *.tbl
customer.tbl  lineitem.tbl  nation.tbl  orders.tbl  partsupp.tbl  part.tbl  region.tbl  supplier.tbl


修改数据脚本,转换为csv格式:

(去掉每行最后一个”|”号)

[root@tpch dbgen]# for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; echo $i; done;
customer.tbl
lineitem.tbl
nation.tbl
orders.tbl
partsupp.tbl
part.tbl
region.tbl
supplier.tbl

修改约束脚本dss.ri:

(1)去掉”CONNECT TO TPCD;”
(2)去掉对象前的”TPCD.”
(3)去掉外键名称
(4)去掉”COMMIT WORK;”

测试

创建测试库

[postgres@tpch ~]$ createdb tpch


创建表

[postgres@tpch ~]$ psql -f /opt/tpch_2_15_0/dbgen/dss.ddl tpch
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE


载入数据

[postgres@tpch ~]$ cat load.sql 
COPY nation FROM '/opt/tpch_2_15_0/dbgen/nation.csv' WITH (FORMAT csv, DELIMITER '|');
COPY region FROM '/opt/tpch_2_15_0/dbgen/region.csv' WITH (FORMAT csv, DELIMITER '|');
COPY part FROM '/opt/tpch_2_15_0/dbgen/part.csv' WITH (FORMAT csv, DELIMITER '|');
COPY supplier FROM '/opt/tpch_2_15_0/dbgen/supplier.csv' WITH (FORMAT csv, DELIMITER '|');
COPY partsupp FROM '/opt/tpch_2_15_0/dbgen/partsupp.csv' WITH (FORMAT csv, DELIMITER '|');
COPY customer FROM '/opt/tpch_2_15_0/dbgen/customer.csv' WITH (FORMAT csv, DELIMITER '|');
COPY orders FROM '/opt/tpch_2_15_0/dbgen/orders.csv' WITH (FORMAT csv, DELIMITER '|');
COPY lineitem FROM '/opt/tpch_2_15_0/dbgen/lineitem.csv' WITH (FORMAT csv, DELIMITER '|');
[postgres@tpch ~]$ psql -f load.sql tpch
COPY 25
COPY 5
COPY 200000
COPY 10000
COPY 800000
COPY 150000
COPY 1500000
COPY 6001215


添加约束

[postgres@tpch ~]$ psql -f /opt/tpch_2_15_0/dbgen/dss.ri tpch
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE


生成查询脚本

[root@tpch dbgen]# vim gen_query_sql.sh
#!/bin/sh
# generate queries from query templates with qgen
DIR=.
mkdir $DIR/finals
cp $DIR/queries/*.sql $DIR
for FILE in $(find $DIR -maxdepth 1 -name "[0-9]*.sql")
do
    DIGIT=$(echo $FILE | tr -cd '[[:digit:]]')
    ./qgen $DIGIT > $DIR/finals/$DIGIT.sql
done
rm *.sql
[root@tpch dbgen]# chmod +x gen_query_sql.sh 
[root@tpch dbgen]# ./gen_query_sql.sh 
[root@tpch dbgen]# ls finals/
.sql  12.sql  14.sql  16.sql  18.sql  1.sql   21.sql  2.sql  4.sql  6.sql  8.sql
.sql  13.sql  15.sql  17.sql  19.sql  20.sql  22.sql  3.sql  5.sql  7.sql  9.sql



参考文档

http://wiki.postgresql.org/wiki/TPC-H

http://www.tpc.org/tpch/

http://dsl.serc.iisc.ernet.in/projects/PICASSO/picasso_download/doc/Installation/tpch.htm

https://github.com/tvondra/pg_tpch

http://ifthiskills.me/?p=588

http://blog.csdn.net/leixingbang1989/article/details/8766047




作者:连顺科的个人页面
连顺科的博客
原文地址:TPC-H Benchmark, 感谢原作者分享。