1 | | -# Azure Arc Data Controller clusters |
2 | | - |
3 | | -Installation instructions for SQL Server 2019 big data clusters can be found [here](https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-guidance?view=sql-server-ver15). |
| 1 | +# Azure Arc Data Controller cluster |
4 | 2 |
|
5 | 3 | ## Samples Setup |
6 | | - |
7 | | -**Before you begin**, load the sample data into your big data cluster. For instructions, see [Load sample data into a SQL Server 2019 big data cluster](https://docs.microsoft.com/en-us/sql/big-data-cluster/tutorial-load-sample-data). |
8 | | - |
9 | | -## Executing the sample scripts |
10 | | -The scripts should be executed in a specific order to test the various features. Execute the scripts from each folder in below order: |
11 | | - |
12 | | -1. __[spark/data-loading/transform-csv-files.ipynb](spark/data-loading/transform-csv-files.ipynb)__ |
13 | | -1. __[data-virtualization/generic-odbc](data-virtualization/generic-odbc)__ |
14 | | -1. __[data-virtualization/hadoop](data-virtualization/hadoop)__ |
15 | | -1. __[data-virtualization/storage-pool](data-virtualization/storage-pool)__ |
16 | | -1. __[data-virtualization/oracle](data-virtualization/oracle)__ |
17 | | -1. __[data-pool](data-pool/)__ |
18 | | -1. __[machine-learning/sql/r](machine-learning/sql/r)__ |
19 | | -1. __[machine-learning/sql/python](machine-learning/sql/python)__ |
20 | | - |
21 | | -## __[data-pool](data-pool/)__ |
22 | | - |
23 | | -SQL Server 2019 big data cluster contains a data pool which consists of many SQL Server instances to store data & query in a scale-out manner. |
24 | | - |
25 | | -### Data ingestion using Spark |
26 | | -The sample script [data-pool/data-ingestion-spark.sql](data-pool/data-ingestion-spark.sql) shows how to perform data ingestion from Spark into data pool table(s). |
27 | | - |
28 | | -### Data ingestion using sql |
29 | | -The sample script [data-pool/data-ingestion-sql.sql](data-pool/data-ingestion-sql.sql) shows how to perform data ingestion from T-SQL into data pool table(s). |
30 | | - |
31 | | -## __[data-virtualization](data-virtualization/)__ |
32 | | - |
33 | | -SQL Server 2019 or SQL Server 2019 big data cluster can use PolyBase external tables to connect to other data sources. |
34 | | - |
35 | | -### External table over Generic ODBC data source |
36 | | -The [data-virtualization/generic-odbc](data-virtualization/generic-odbc) folder contains samples that demonstrate how to query data in MySQL & PostgreSQL using external tables and a generic ODBC data source. The generic ODBC data source can be used only in SQL Server 2019 on Windows. |
37 | | - |
38 | | -### External table over Hadoop |
39 | | -The [data-virtualization/hadoop](data-virtualization/hadoop) folder contains samples that demonstrate how to query data in HDFS using external tables. This demonstrates the functionality available from SQL Server 2016 using the HADOOP data source. |
40 | | - |
41 | | -### External table over Oracle |
42 | | -The [data-virtualization/oracle](data-virtualization/oracle) folder contains samples that demonstrate how to query data in Oracle using external tables. |
43 | | - |
44 | | -### External table over Storage Pool |
45 | | -SQL Server 2019 big data cluster contains a storage pool consisting of HDFS, Spark and SQL Server instances. The [data-virtualization/storage-pool](data-virtualization/storage-pool) folder contains samples that demonstrate how to query data in HDFS inside SQL Server 2019 big data cluster. |
46 | | - |
47 | | -## __[deployment](deployment/)__ |
48 | | - |
49 | | -The [deployment](deployment) folder contains the scripts for deploying a Kubernetes cluster for SQL Server 2019 big data cluster. |
50 | | - |
51 | | -## __[machine-learning](machine-learning/)__ |
52 | | - |
53 | | -SQL Server 2016 added support for executing R scripts from T-SQL. SQL Server 2017 added support for executing Python scripts from T-SQL. SQL Server 2019 adds support for executing Java code from T-SQL. SQL Server 2019 big data cluster adds support for executing Spark code inside the big data cluster. |
54 | | - |
55 | | -### SQL Server Machine Learning Services |
56 | | -The [machine-learning\sql](machine-learning\sql) folder contains the sample SQL scripts that show how to invoke R, Python, and Java code from T-SQL. |
57 | | - |
58 | | -### Spark Machine Learning |
59 | | -The [machine-learning\spark](machine-learning\spark) folder contains the Spark samples. |
| 4 | +Follow the instructions here: https://raw.githubusercontent.com/ananto-msft/sql-server-samples/master/samples/features/azure-arc-data-controller/deployment/kubeadm/ubuntu-single-node-vm/README.md