Commit b5f5539

[opt](benchmark) update ssb, tpch, tpcds benchmark result (#3535)
## Versions

- [x] dev
- [x] 4.x
- [ ] 3.x
- [ ] 2.1

## Languages

- [x] Chinese
- [x] English

## Docs Checklist

- [ ] Checked by AI
- [ ] Test Cases Built
1 parent 882a532 commit b5f5539

12 files changed

Lines changed: 746 additions & 1520 deletions

docs/benchmark/ssb.md

Lines changed: 39 additions & 225 deletions
@@ -8,18 +8,18 @@
88

99
# Star Schema Benchmark
1010

11-
[Star Schema Benchmark(SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight performance test set in the data warehouse scenario. SSB provides a simplified star schema data based on [TPC-H](http://www.tpc.org/tpch/), which is mainly used to test the performance of multi-table JOIN query under star schema. In addition, the industry usually flattens SSB into a wide table model (Referred as: SSB flat) to test the performance of the query engine.
11+
[Star Schema Benchmark(SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight performance test set in the data warehouse scenario. SSB provides a simplified star schema data based on [TPC-H](http://www.tpc.org/tpch/), which is mainly used to test the performance of multi-table JOIN query under star schema.
1212

13-
This document mainly introduces the performance of Doris on the SSB 1000G test set.
13+
This document mainly introduces the performance of Doris on the SSB SF1000 test set.
1414

15-
We tested 13 queries on the SSB standard test dataset based on Apache Doris version 2.0.15.1.
15+
We tested 13 queries on the SSB standard test dataset based on Apache Doris.
1616

1717
## 1. Hardware Environment
1818

1919
| Hardware | Configuration Instructions |
2020
|--------------------|------------------------------------------|
21-
| Number of Machines | 4 Aliyun Virtual Machine (1FE,3BEs) |
22-
| CPU | Intel Xeon (Ice Lake) Platinum 8369B 32C |
21+
| Number of Machines | 4 [Aliyun g9i Virtual Machine](https://www.alibabacloud.com/help/en/ecs/user-guide/general-purpose-instance-families#g9i) (1FE,3BEs) |
22+
| CPU | Intel® Xeon® Granite Rapids 32C |
2323
| Memory | 128G |
2424
| Disk | Enterprise SSD (PL0) |
2525

@@ -28,8 +28,7 @@ We tested 13 queries on the SSB standard test dataset based on Apache Doris vers
2828
- Doris Deployed 3BEs and 1FE
2929
- Kernel Version: Linux version 5.15.0-101-generic
3030
- OS version: Ubuntu 20.04 LTS (Focal Fossa)
31-
- Doris software version: Apache Doris 2.0.15.1
32-
- JDK: openjdk version "1.8.0_352-352"
31+
- JDK: openjdk 17.0.2
3332

3433
## 3. Test Data Volume
3534

@@ -42,57 +41,35 @@ We tested 13 queries on the SSB standard test dataset based on Apache Doris vers
4241
| dates | 2,556 | Date |
4342
| lineorder_flat | 5,999,989,709 | Wide Table after Data Flattening |
4443

45-
## 4. SSB Flat Test Results
46-
47-
Here we use Apache Doris 2.0.15.1 for comparative testing. In the test, we use Query Time(ms) as the main performance indicator. The test results are as follows:
48-
49-
| Query | Doris 2.0.15.1 (ms) |
50-
|-----------|---------------------|
51-
| q1.1 | 80 |
52-
| q1.2 | 10 |
53-
| q1.3 | 110 |
54-
| q2.1 | 1680 |
55-
| q2.2 | 1210 |
56-
| q2.3 | 1060 |
57-
| q3.1 | 2010 |
58-
| q3.2 | 1560 |
59-
| q3.3 | 600 |
60-
| q3.4 | 10 |
61-
| q4.1 | 2380 |
62-
| q4.2 | 190 |
63-
| q4.3 | 120 |
64-
| **Total** | **11020** |
65-
66-
67-
## 5. Standard SSB Test Results
68-
69-
Here we use Apache Doris 2.0.15.1 for comparative testing. In the test, we use Query Time(ms) as the main performance indicator. The test results are as follows:
70-
71-
| Query | Doris 2.0.15.1 (ms) |
72-
|-----------|---------------------|
73-
| q1.1 | 330 |
74-
| q1.2 | 80 |
75-
| q1.3 | 80 |
76-
| q2.1 | 1780 |
77-
| q2.2 | 1970 |
78-
| q2.3 | 1510 |
79-
| q3.1 | 4000 |
80-
| q3.2 | 1720 |
81-
| q3.3 | 1510 |
82-
| q3.4 | 160 |
83-
| q4.1 | 4010 |
84-
| q4.2 | 840 |
85-
| q4.3 | 400 |
86-
| **Total** | **19390** |
87-
88-
## 6. Environment Preparation
44+
## 4. Standard SSB Test Results
45+
46+
In the test, we use Query Time(ms) as the main performance indicator. The test results are as follows:
47+
48+
| Query | Doris 2.1.11 (ms) | Doris 3.1.4 (ms) | Doris 4.0.5 (ms) | Doris 4.1.0 (ms) |
49+
|-----------|-------------------|------------------|------------------|------------------|
50+
| **Total** | **13270** | **11591** | **12495** | **10934** |
51+
| q1.1 | 140 | 179 | 151 | 126 |
52+
| q1.2 | 70 | 105 | 114 | 82 |
53+
| q1.3 | 70 | 96 | 107 | 79 |
54+
| q2.1 | 1520 | 1066 | 1263 | 1096 |
55+
| q2.2 | 1630 | 1425 | 1311 | 1293 |
56+
| q2.3 | 1250 | 1086 | 1199 | 1008 |
57+
| q3.1 | 2470 | 2020 | 2174 | 2142 |
58+
| q3.2 | 1450 | 1165 | 1484 | 1395 |
59+
| q3.3 | 870 | 847 | 1080 | 314 |
60+
| q3.4 | 130 | 167 | 148 | 68 |
61+
| q4.1 | 2860 | 2485 | 2517 | 2427 |
62+
| q4.2 | 520 | 597 | 563 | 563 |
63+
| q4.3 | 290 | 353 | 384 | 341 |
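
The **Total** row can be cross-checked against the per-query rows. The following is a minimal sketch, not part of ssb-tools: the helper name `ssb_total` is hypothetical, and the numbers are copied from the table above (columns in order: Doris 2.1.11, 3.1.4, 4.0.5, 4.1.0).

```shell
#!/bin/sh
# Hypothetical helper: sum the per-query times (ms) of one version column.
# Argument: 1 = Doris 2.1.11, 2 = Doris 3.1.4, 3 = Doris 4.0.5, 4 = Doris 4.1.0.
ssb_total() {
    idx=$1
    # Field 1 is the query name, so version column N is awk field N + 1.
    awk -v c="$idx" '{ sum += $(c + 1) } END { print sum }' <<'EOF'
q1.1 140 179 151 126
q1.2 70 105 114 82
q1.3 70 96 107 79
q2.1 1520 1066 1263 1096
q2.2 1630 1425 1311 1293
q2.3 1250 1086 1199 1008
q3.1 2470 2020 2174 2142
q3.2 1450 1165 1484 1395
q3.3 870 847 1080 314
q3.4 130 167 148 68
q4.1 2860 2485 2517 2427
q4.2 520 597 563 563
q4.3 290 353 384 341
EOF
}

# Print the total for each version column, one per line.
for col in 1 2 3 4; do
    ssb_total "$col"
done
```

The four printed sums should match the values in the **Total** row (13270, 11591, 12495, 10934).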
64+
65+
## 5. Environment Preparation
8966

9067
Please first refer to the [official documentation](../install/deploy-manually/separating-storage-compute-deploy-manually) to install and deploy Apache Doris and obtain a working Doris cluster (at least 1 FE and 1 BE; 1 FE and 3 BEs is recommended).
9168

9269

93-
## 7. Data Preparation
70+
## 6. Data Preparation
9471

95-
### 7.1 Download and Install the SSB Data Generation Tool.
72+
### 6.1 Download and Install the SSB Data Generation Tool.
9673

9774
Execute the following script to download and compile the [ssb-tools](https://github.com/apache/doris/tree/master/tools/ssb-tools) tool.
9875

@@ -102,7 +79,7 @@ sh bin/build-ssb-dbgen.sh
10279

10380
After successful installation, the `dbgen` binary will be generated under the `ssb-dbgen/` directory.
10481

105-
### 7.2 Generate SSB Test Set
82+
### 6.2 Generate SSB Test Set
10683

10784
Execute the following script to generate the SSB dataset:
10885

@@ -114,11 +91,11 @@ sh bin/gen-ssb-data.sh -s 1000
11491
>
11592
> Note 2: The data will be generated under the `ssb-data/` directory with the suffix `.tbl`. The total file size is about 600GB, and generation may take from a few minutes to an hour.
11693
>
117-
> Note 3: A standard test data set of 100G is generated by default.
94+
> Note 3: A standard test data set of SF100 is generated by default.
11895
119-
### 7.3 Create Table
96+
### 6.3 Create Table
12097

121-
#### 7.3.1 Prepare the `doris-cluster.conf` File.
98+
#### 6.3.1 Prepare the `doris-cluster.conf` File.
12299

123100
Before running the import script, write the FE's IP address, ports, and other connection information in the `doris-cluster.conf` file.
124101

@@ -141,23 +118,23 @@ export PASSWORD=''
141118
export DB='ssb'
142119
```
143120

144-
#### 7.3.2 Execute the Following Script to Generate and Create the SSB Table:
121+
#### 6.3.2 Execute the Following Script to Generate and Create the SSB Table:
145122

146123
```shell
147124
sh bin/create-ssb-tables.sh -s 1000
148125
```
149126

150127
Or copy the table creation statements in [create-ssb-tables.sql](https://github.com/apache/doris/blob/master/tools/ssb-tools/ddl/create-ssb-tables-sf1000.sql) and [create-ssb-flat-table.sql](https://github.com/apache/doris/blob/master/tools/ssb-tools/ddl/create-ssb-flat-tables-sf1000.sql) and then execute them in the MySQL client.
151128

152-
### 7.4 Import data
129+
### 6.4 Import data
153130

154131
Use the following command to import all data of the SSB test set, synthesize the SSB flat wide-table data, and import it into the table.
155132

156133
```shell
157134
sh bin/load-ssb-data.sh
158135
```
159136

160-
### 7.5 Checking Imported data
137+
### 6.5 Checking Imported data
161138

162139
```sql
163140
select count(*) from part;
@@ -168,176 +145,13 @@ select count(*) from lineorder;
168145
select count(*) from lineorder_flat;
169146
```
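
The counts returned by these statements can be compared with the row counts listed in the Test Data Volume table. The following is a minimal sketch under stated assumptions: the helper `check_rows` is hypothetical, the expected values apply only to data generated with `-s 1000`, and the commented `mysql` call assumes that `FE_HOST`, `FE_QUERY_PORT`, and `USER` are set in your environment alongside `DB`.

```shell
#!/bin/sh
# Hypothetical helper: compare an observed row count with the expected one.
# Thousands separators are stripped, so "5,999,989,709" matches "5999989709".
check_rows() {
    table=$1
    expected=$(printf '%s' "$2" | tr -d ',')
    actual=$(printf '%s' "$3" | tr -d ',')
    if [ "$expected" = "$actual" ]; then
        echo "OK $table $actual"
    else
        echo "MISMATCH $table expected=$expected actual=$actual"
        return 1
    fi
}

# Assumed usage against a live cluster (variable names are assumptions):
# actual=$(mysql -h"$FE_HOST" -P"$FE_QUERY_PORT" -u"$USER" -D"$DB" -N -e 'select count(*) from lineorder_flat')
# check_rows lineorder_flat 5,999,989,709 "$actual"

# Self-contained demonstration with the `dates` row count from the table.
check_rows dates 2,556 2556
```

A non-zero return status from `check_rows` signals a mismatch, so the helper can gate a later benchmark run in a script.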
170147

171-
### 7.6 Query Test
148+
### 6.6 Query Test
172149

173150
- SSB-Flat Query Statement: [ssb-flat-queries](https://github.com/apache/doris/tree/master/tools/ssb-tools/ssb-flat-queries)
174151
- Standard SSB Queries: [ssb-queries](https://github.com/apache/doris/tree/master/tools/ssb-tools/ssb-queries)
175152

176-
#### 7.6.1 SSB FLAT Test for SQL
177153

178-
```sql
179-
--Q1.1
180-
SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
181-
FROM lineorder_flat
182-
WHERE
183-
LO_ORDERDATE >= 19930101
184-
AND LO_ORDERDATE <= 19931231
185-
AND LO_DISCOUNT BETWEEN 1 AND 3
186-
AND LO_QUANTITY < 25;
187-
188-
--Q1.2
189-
SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
190-
FROM lineorder_flat
191-
WHERE
192-
LO_ORDERDATE >= 19940101
193-
AND LO_ORDERDATE <= 19940131
194-
AND LO_DISCOUNT BETWEEN 4 AND 6
195-
AND LO_QUANTITY BETWEEN 26 AND 35;
196-
197-
--Q1.3
198-
SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
199-
FROM lineorder_flat
200-
WHERE
201-
weekofyear(LO_ORDERDATE) = 6
202-
AND LO_ORDERDATE >= 19940101
203-
AND LO_ORDERDATE <= 19941231
204-
AND LO_DISCOUNT BETWEEN 5 AND 7
205-
AND LO_QUANTITY BETWEEN 26 AND 35;
206-
207-
--Q2.1
208-
SELECT
209-
SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR,
210-
P_BRAND
211-
FROM lineorder_flat
212-
WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
213-
GROUP BY YEAR, P_BRAND
214-
ORDER BY YEAR, P_BRAND;
215-
216-
--Q2.2
217-
SELECT
218-
SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR,
219-
P_BRAND
220-
FROM lineorder_flat
221-
WHERE
222-
P_BRAND >= 'MFGR#2221'
223-
AND P_BRAND <= 'MFGR#2228'
224-
AND S_REGION = 'ASIA'
225-
GROUP BY YEAR, P_BRAND
226-
ORDER BY YEAR, P_BRAND;
227-
228-
--Q2.3
229-
SELECT
230-
SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR,
231-
P_BRAND
232-
FROM lineorder_flat
233-
WHERE
234-
P_BRAND = 'MFGR#2239'
235-
AND S_REGION = 'EUROPE'
236-
GROUP BY YEAR, P_BRAND
237-
ORDER BY YEAR, P_BRAND;
238-
239-
--Q3.1
240-
SELECT
241-
C_NATION,
242-
S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR,
243-
SUM(LO_REVENUE) AS revenue
244-
FROM lineorder_flat
245-
WHERE
246-
C_REGION = 'ASIA'
247-
AND S_REGION = 'ASIA'
248-
AND LO_ORDERDATE >= 19920101
249-
AND LO_ORDERDATE <= 19971231
250-
GROUP BY C_NATION, S_NATION, YEAR
251-
ORDER BY YEAR ASC, revenue DESC;
252-
253-
--Q3.2
254-
SELECT
255-
C_CITY,
256-
S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR,
257-
SUM(LO_REVENUE) AS revenue
258-
FROM lineorder_flat
259-
WHERE
260-
C_NATION = 'UNITED STATES'
261-
AND S_NATION = 'UNITED STATES'
262-
AND LO_ORDERDATE >= 19920101
263-
AND LO_ORDERDATE <= 19971231
264-
GROUP BY C_CITY, S_CITY, YEAR
265-
ORDER BY YEAR ASC, revenue DESC;
266-
267-
--Q3.3
268-
SELECT
269-
C_CITY,
270-
S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR,
271-
SUM(LO_REVENUE) AS revenue
272-
FROM lineorder_flat
273-
WHERE
274-
C_CITY IN ('UNITED KI1', 'UNITED KI5')
275-
AND S_CITY IN ('UNITED KI1', 'UNITED KI5')
276-
AND LO_ORDERDATE >= 19920101
277-
AND LO_ORDERDATE <= 19971231
278-
GROUP BY C_CITY, S_CITY, YEAR
279-
ORDER BY YEAR ASC, revenue DESC;
280-
281-
--Q3.4
282-
SELECT
283-
C_CITY,
284-
S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR,
285-
SUM(LO_REVENUE) AS revenue
286-
FROM lineorder_flat
287-
WHERE
288-
C_CITY IN ('UNITED KI1', 'UNITED KI5')
289-
AND S_CITY IN ('UNITED KI1', 'UNITED KI5')
290-
AND LO_ORDERDATE >= 19971201
291-
AND LO_ORDERDATE <= 19971231
292-
GROUP BY C_CITY, S_CITY, YEAR
293-
ORDER BY YEAR ASC, revenue DESC;
294-
295-
--Q4.1
296-
SELECT (LO_ORDERDATE DIV 10000) AS YEAR,
297-
C_NATION,
298-
SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
299-
FROM lineorder_flat
300-
WHERE
301-
C_REGION = 'AMERICA'
302-
AND S_REGION = 'AMERICA'
303-
AND P_MFGR IN ('MFGR#1', 'MFGR#2')
304-
GROUP BY YEAR, C_NATION
305-
ORDER BY YEAR ASC, C_NATION ASC;
306-
307-
--Q4.2
308-
SELECT (LO_ORDERDATE DIV 10000) AS YEAR,
309-
S_NATION,
310-
P_CATEGORY,
311-
SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
312-
FROM lineorder_flat
313-
WHERE
314-
C_REGION = 'AMERICA'
315-
AND S_REGION = 'AMERICA'
316-
AND LO_ORDERDATE >= 19970101
317-
AND LO_ORDERDATE <= 19981231
318-
AND P_MFGR IN ('MFGR#1', 'MFGR#2')
319-
GROUP BY YEAR, S_NATION, P_CATEGORY
320-
ORDER BY
321-
YEAR ASC,
322-
S_NATION ASC,
323-
P_CATEGORY ASC;
324-
325-
--Q4.3
326-
SELECT (LO_ORDERDATE DIV 10000) AS YEAR,
327-
S_CITY,
328-
P_BRAND,
329-
SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
330-
FROM lineorder_flat
331-
WHERE
332-
S_NATION = 'UNITED STATES'
333-
AND LO_ORDERDATE >= 19970101
334-
AND LO_ORDERDATE <= 19981231
335-
AND P_CATEGORY = 'MFGR#14'
336-
GROUP BY YEAR, S_CITY, P_BRAND
337-
ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;
338-
```
339-
340-
#### 7.6.2 SSB Standard Test for SQL
154+
#### 6.6.1 SSB Standard Test for SQL
341155

342156
```sql
343157
--Q1.1
