```sql
SET @restore_cmd = REPLACE(@restore_filelist_tmpl, '%F', @backup_file);
INSERT INTO @files
EXECUTE (@restore_cmd);

SET @restore_cmd = REPLACE(REPLACE(@restore_database_tmpl, '%F', @backup_file), '%D', LEFT(@backup_file, CHARINDEX('.', @backup_file) - 1));
SET @restore_cur = CURSOR FAST_FORWARD FOR SELECT LogicalName, REVERSE(LEFT(REVERSE(PhysicalName), CHARINDEX('\', REVERSE(PhysicalName)) - 1)) FROM @files;
OPEN @restore_cur;
WHILE (1 = 1)
BEGIN
    FETCH FROM @restore_cur INTO @logical_name, @filename;
    IF @@FETCH_STATUS < 0 BREAK;

    SET @restore_cmd += REPLACE(REPLACE(@move_tmpl, '%L', @logical_name), '%F', @filename);
END;
EXECUTE (@restore_cmd);
END;
GO

CREATE OR ALTER PROCEDURE #create_data_sources
AS
BEGIN
    -- Create database master key (required for database scoped credentials used in the samples)
    IF NOT EXISTS (SELECT * FROM sys.databases WHERE name = DB_NAME() AND is_master_key_encrypted_by_server = 1)
```
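The `@restore_filelist_tmpl`, `@restore_database_tmpl`, and `@move_tmpl` templates are defined earlier in the script and do not appear in this hunk. As an illustration only, templates of roughly this shape would fit the `%F`/`%D`/`%L` substitution logic above; the paths and exact command text here are assumptions, not the sample's actual values:

```sql
-- Illustrative template shapes only; the real definitions live earlier in the script.
DECLARE @restore_filelist_tmpl nvarchar(max) =
    N'RESTORE FILELISTONLY FROM DISK = N''/var/opt/mssql/data/%F''';   -- %F: backup file name

DECLARE @restore_database_tmpl nvarchar(max) =
    N'RESTORE DATABASE [%D] FROM DISK = N''/var/opt/mssql/data/%F'' WITH REPLACE';  -- %D: database name

DECLARE @move_tmpl nvarchar(max) =
    N', MOVE N''%L'' TO N''/var/opt/mssql/data/%F''';  -- %L: logical file name; appended once per file
```

With templates like these, the cursor loop above appends one `MOVE` clause per file listed by `RESTORE FILELISTONLY` before executing the final `RESTORE DATABASE` command.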
samples/features/sql-big-data-cluster/data-virtualization/README.md
**Applies to: SQL Server 2019 big data cluster**

In SQL Server 2019 big data cluster, the storage pool consists of HDFS data nodes with SQL Server and Spark endpoints. The [storage-pool](storage-pool) folder contains SQL scripts that demonstrate how to query data residing in HDFS inside a big data cluster. The [hadoop](hadoop) folder contains SQL scripts that demonstrate how to query data residing in HDFS using the HADOOP data source, for operations that are not yet supported with the storage pool (for example, exporting data to HDFS).
# Data virtualization in SQL Server 2019 big data cluster

In SQL Server 2019 big data clusters, the SQL Server engine has gained the ability to natively read HDFS files, such as CSV and parquet files, by using SQL Server instances collocated on each of the HDFS data nodes to filter and aggregate data locally, in parallel, across all of the HDFS data nodes. Using the PolyBase v1 HADOOP data source, you can also manipulate ORC or RCFILE files inside the big data cluster.

## Query data in HDFS from SQL Server master using HADOOP data source

**Applies to:** SQL Server 2019 big data cluster

In SQL Server 2019 big data cluster, the storage pool consists of HDFS data nodes with SQL Server and Spark endpoints. In this example, you will create an external table in the SQL Server master instance that points to data in HDFS within the big data cluster, using the HADOOP data source. You will then join the data in the external table with high-value data in the master instance, or export data from the master instance to HDFS.
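The referenced scripts contain the actual object definitions. As a rough, hypothetical sketch of the pattern (the data source name, HDFS endpoint, paths, table, and column names below are assumptions for illustration, not the sample's real values):

```sql
-- Hypothetical sketch of the HADOOP external-table pattern; the real
-- definitions are in the referenced .sql scripts.
CREATE EXTERNAL DATA SOURCE SqlStoragePoolHadoop
WITH (TYPE = HADOOP, LOCATION = 'hdfs://nmnode-0-svc:9000');  -- endpoint is an assumption

CREATE EXTERNAL FILE FORMAT orc_file
WITH (FORMAT_TYPE = ORC);

CREATE EXTERNAL TABLE [web_clickstreams_hdfs]                  -- illustrative table name
    ([wcs_user_sk] BIGINT, [i_category_id] BIGINT)             -- illustrative columns
WITH (DATA_SOURCE = SqlStoragePoolHadoop,
      LOCATION = '/clickstream_data',                          -- illustrative HDFS path
      FILE_FORMAT = orc_file);

-- Join the external table with local, high-value data:
SELECT TOP 10 c.*, w.*
FROM [web_clickstreams_hdfs] AS w
JOIN [dbo].[customers] AS c ON w.wcs_user_sk = c.c_customer_sk;  -- local table is illustrative
```

Once the external table exists, it behaves like any other table in queries; PolyBase pushes filtering and aggregation down to the HDFS data nodes where possible.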
### Instructions

1. Connect to the HDFS/Knox gateway from Azure Data Studio using the SQL Server big data cluster connection type.

1. Run the [../../spark/spark-sql.ipynb](../../spark/spark-sql.ipynb/) notebook to generate the sample parquet file(s).

1. Connect to the SQL Server master instance.

1. Execute [web-clickstreams-hdfs-orc.sql](web-clickstreams-hdfs-orc.sql). This script demonstrates how to read ORC file(s) stored in HDFS.

1. Execute [product-reviews-hdfs-orc.sql](product-reviews-hdfs-orc.sql). This script demonstrates how to read ORC file(s) stored in HDFS.

1. Execute [inventory-hdfs-rcfile.sql](inventory-hdfs-rcfile.sql). This script demonstrates how to export data from SQL Server into HDFS in RCFILE format, using PolyBase v1 syntax.
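The export step relies on PolyBase export, which writes rows inserted into an external table out to HDFS. A minimal, hypothetical sketch of that mechanism (object names, the data source, and paths are assumptions; the real statements are in the referenced script):

```sql
-- PolyBase export must be enabled before INSERT into an external table works.
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;

CREATE EXTERNAL FILE FORMAT rc_file
WITH (FORMAT_TYPE = RCFILE,
      SERDE_METHOD = 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe');

CREATE EXTERNAL TABLE [inventory_hdfs]               -- illustrative name
    ([inv_item_sk] BIGINT, [inv_quantity_on_hand] INT)
WITH (DATA_SOURCE = SqlStoragePoolHadoop,            -- assumed HADOOP data source
      LOCATION = '/it/inventory',                    -- illustrative HDFS path
      FILE_FORMAT = rc_file);

-- Exporting is an ordinary INSERT ... SELECT into the external table:
INSERT INTO [inventory_hdfs]
SELECT inv_item_sk, inv_quantity_on_hand FROM [dbo].[inventory];  -- local source table is illustrative
```

The `INSERT ... SELECT` creates RCFILE data under the external table's HDFS location; the target directory is managed by PolyBase.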