
Commit c61a478

Merge remote-tracking branch 'upstream/master'

2 parents 8dc164f + 6cf3543

5 files changed: 128 additions & 31 deletions

samples/features/sql-big-data-cluster/deployment/kubeadm/ubuntu/setup-k8s-master.sh

Lines changed: 1 addition & 1 deletion
@@ -10,5 +10,5 @@ sudo chown $(id -u):$(id -g) $HOME/.kube/config
 kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
 helm init
 kubectl apply -f rbac.yaml
-kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
 kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin --serviceaccount=kube-system:kubernetes-dashboard
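The one-line change above swaps the dashboard manifest that tracks the moving master branch for one pinned to the v1.10.1 release, so repeated cluster setups fetch the same manifest. A small, hypothetical Python sketch of that version-pinning check (the `is_pinned` helper is illustrative and not part of the repo):

```python
# Hypothetical helper: flag raw.githubusercontent.com manifest URLs that track
# a moving branch instead of a fixed release tag.
def is_pinned(manifest_url: str) -> bool:
    """Return True when the URL references a release tag rather than 'master'."""
    return "/master/" not in manifest_url

old = "https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml"
new = "https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml"

print(is_pinned(old), is_pinned(new))  # False True
```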
Lines changed: 13 additions & 15 deletions
@@ -1,29 +1,27 @@
 # SQL Server big data clusters
 
-The new built-in notebooks in Azure Data Studio enables data scientists and data engineers to run Python, R, Scala, or Spark SQL code against the cluster.
+SQL Server Big Data Clusters bundle Spark and HDFS together with SQL Server. The built-in notebooks in Azure Data Studio enable data scientists and data engineers to run Spark notebooks and jobs in Python, R, or Scala against the big data cluster. This folder contains sample Spark notebooks for SQL Server Big Data Clusters.
 
-## Instructions to open a notebook from Azure Data Studio and execute the commands
+## Folder contents
 
-1. Connect to the SQL Server Master instance in a big data cluster
+[PySpark Hello World](dataloading/hello_PySpark.ipynb)
 
-1. Right-click on the server name, select **Manage**, switch to **SQL Server Big Data Cluster** tab, and use open Notebook.
+[Scala Hello World](dataloading/hello_Scala.ipynb)
 
-1. Open the notebook in Azure Data Studio, wait for the “Kernel” and the target context (“Attach to”) to be populated.
+[SparkR Hello World](dataloading/hello_sparkR.ipynb)
 
-1. Run each cell in the Notebook sequentially.
+[Data Loading - Transforming CSV to Parquet](dataloading/transform-csv-files.ipynb)
 
-## __[data-loading](data-loading/)__
+[Data Transfer - Spark to SQL using the Spark JDBC connector](data-virtualization/spark_to_sql_jdbc.ipynb)
 
-This folder contains samples that show how to load data using Spark and query them using SQL statements.
+[Data Transfer - Spark to SQL using the MSSQL Spark connector](spark_to_sql/mssql_spark_connector.ipynb)
 
-[data-loading/transform-csv-files.ipynb](dataloading/transform-csv-files.ipynb/)
-
-This sample notebook shows how to transform CSV files in HDFS to parquet files.
+## Instructions on how to run in Azure Data Studio
 
-[dataloading/spark-sql.ipynb](dataloading/spark-sql.ipynb/)
+1. Open a sample notebook, for example [data-loading/transform-csv-files.ipynb](dataloading/transform-csv-files.ipynb).
 
-This sample notebook shows how to query hive tables created from Spark.
+2. From Azure Data Studio, connect to the SQL Server master instance in the big data cluster.
 
-## __[data-virtualization](data-virtualization/)__
+3. Right-click on the server name, select **Manage**, switch to the **SQL Server Big Data Cluster** tab, and open the notebook in Azure Data Studio. Wait for the “Kernel” and the target context (“Attach to”) to be populated. If required, set the relevant “Kernel” (e.g. **PySpark3**); **Attach to** should be the endpoint of your big data cluster.
 
-This folder contains samples that show how to integrate Spark with other data sources.
+4. Run each cell in the notebook sequentially.
