
Commit e67a4c8

fixed the readme doc
1 parent 41ec742 commit e67a4c8

1 file changed (6 additions & 25 deletions):

  • samples/features/sql-big-data-cluster/spark
```diff
@@ -1,17 +1,8 @@
 # SQL Server big data clusters
 
-The new built-in notebooks in Azure Data Studio enables data scientists and data engineers to run Python, R, or Scala code against the cluster.
+SQL Server Big Data cluster bundles Spark and HDFS together with SQL server. Azure Data Studio IDE provides built in notebooks that enables data scientists and data engineers to run Spark notebooks and job in Python, R, or Scala code against the Big Data Cluster. This folder contains spark sample notebook on using Spark in SQL server Big data cluster
 
-## Instructions to open a notebook from Azure Data Studio
-
-1. Connect to the SQL Server Master instance in a big data cluster
-
-1. Right-click on the server name, select **Manage**, switch to **SQL Server Big Data Cluster** tab, and use open Notebook
-
-## __[dataloading](dataloading/)__
-<<<<<<< HEAD
-
-This folder contains samples that show how to load data using Spark.
+## Folder contents
 
 [PySpark Hello World](dataloading/hello_PySpark.ipynb)
 
@@ -22,23 +13,13 @@ This folder contains samples that show how to load data using Spark.
 [DataLoading - Transforming CSV to Parquet](dataloading/transform-csv-files.ipynb/)
 
 [Data Transfer - Spark to SQL using JDBC ](spark_to_sql/spark_to_sql_jdbc.ipynb/)
-=======
-
-This folder contains samples that show how to load data using Spark.
-
-[dataloading/transform-csv-files.ipynb](dataloading/transform-csv-files.ipynb/)
->>>>>>> upstream/master
 
-## Instructions
+## Instructions on how to run in Azure Data Studio
 
 1. Download and save the notebook file [dataloading/transnform-csv-files.ipynb](dataloading/transform-csv-files.ipynb/) locally.
 
-<<<<<<< HEAD
-2. Open the notebook in Azure Data Studio, wait for the “Kernel” and the target context (“Attach to”) to be populated. Set the “Kernel” to **PySpark3** and **Attach to** needs to be the IP address of your big data cluster endpoint.
+2. From Azure Data Studio Connect to the SQL Server Master instance in a big data cluster.
 
-3. Run each cell in the Notebook sequentially.
-=======
-1. Open the notebook in Azure Data Studio, wait for the “Kernel” and the target context (“Attach to”) to be populated. Set the “Kernel” to **PySpark3** and **Attach to** needs to be the IP address of your big data cluster endpoint.
+3. Right-click on the server name, select **Manage**, switch to **SQL Server Big Data Cluster** tab, and open the notebook in Azure Data Studio. Wait for the “Kernel” and the target context (“Attach to”) to be populated. If required set the relevant “Kernel” ( e.g **PySpark3** ) and **Attach to** needs to be the IP address of your big data cluster endpoint.
 
-1. Run each cell in the Notebook sequentially.
->>>>>>> upstream/master
+4. Run each cell in the Notebook sequentially.
```
