`samples/features/sql-big-data-cluster/data-pool/README.md`

# Data pools in SQL Server 2019 big data cluster
SQL Server Big Data clusters provide scale-out compute and storage to improve the performance of analyzing any data. Data from a variety of sources can be ingested and distributed across data pool instances for analysis.
## Data ingestion using SQL stored procedure
In this example, we will insert the results of a SQL query into an external table stored in a data pool and then query it.
### Instructions
1. Connect to the SQL Server Master instance.
1. Execute the SQL script [data-ingestion-sql.sql](data-ingestion-sql.sql).
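The script follows the general data-pool ingestion pattern. The sketch below is illustrative only; the table and column names are hypothetical and not the exact contents of [data-ingestion-sql.sql](data-ingestion-sql.sql):

```sql
-- Illustrative sketch; table and column names are hypothetical.

-- The data pool is exposed through a built-in data source.
IF NOT EXISTS (SELECT 1 FROM sys.external_data_sources WHERE name = 'SqlDataPool')
    CREATE EXTERNAL DATA SOURCE SqlDataPool
    WITH (LOCATION = 'sqldatapool://controller-svc/default');

-- External table whose rows are distributed across the data pool instances.
CREATE EXTERNAL TABLE [web_clickstream_clicks_data_pool]
    (wcs_user_sk BIGINT, i_category_id BIGINT, clicks BIGINT)
WITH (DATA_SOURCE = SqlDataPool, DISTRIBUTION = ROUND_ROBIN);

-- Ingest the results of a SQL query into the data pool.
INSERT INTO [web_clickstream_clicks_data_pool]
SELECT wcs_user_sk, i_category_id, COUNT_BIG(*) AS clicks
FROM dbo.web_clickstreams
GROUP BY wcs_user_sk, i_category_id;
```

Once created, the external table can be queried from the master instance like any other table.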
## Data ingestion using Spark streaming
In this example, you are going to use Spark to read and transform data from HDFS and cache it in a data pool. Querying the external table created over this aggregated data in the data pool is much more efficient than querying the raw data directly every time.
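After the aggregated data is cached, the master instance reads it in parallel from the data pool instances. A minimal query sketch, assuming a hypothetical external table name:

```sql
-- Query the aggregate cached in the data pool instead of rescanning raw HDFS data.
-- The table name is hypothetical.
SELECT TOP 10 wcs_user_sk, SUM(clicks) AS total_clicks
FROM [web_clickstream_clicks_data_pool]
GROUP BY wcs_user_sk
ORDER BY total_clicks DESC;
```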
# Data virtualization in SQL Server 2019 big data cluster
In SQL Server 2019 big data clusters, the SQL Server engine has gained the ability to natively read HDFS files, such as CSV and Parquet files, by using SQL Server instances collocated on each of the HDFS data nodes to filter and aggregate data locally, in parallel across all of the HDFS data nodes. SQL Server 2019 also introduces new ODBC connectors to data sources like SQL Server, Oracle, MongoDB, and Teradata.
## Query data in HDFS from SQL Server master
In this example, you are going to create an external table in the SQL Server Master instance that points to data in HDFS within the SQL Server big data cluster. Then you will join the data in the external table with high-value data in the SQL Server Master instance.
### Instructions
1. Connect to the SQL Server Master instance.
1. Execute the SQL script [external-table-hdfs.sql](external-table-hdfs.sql).
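The script follows the standard pattern for external tables over the cluster's storage pool. The sketch below is illustrative only; the file path, schema, and table names are hypothetical, not the exact contents of [external-table-hdfs.sql](external-table-hdfs.sql):

```sql
-- Illustrative sketch; paths, schema, and names are hypothetical.

-- Built-in data source pointing at the cluster's storage pool (HDFS).
IF NOT EXISTS (SELECT 1 FROM sys.external_data_sources WHERE name = 'SqlStoragePool')
    CREATE EXTERNAL DATA SOURCE SqlStoragePool
    WITH (LOCATION = 'sqlhdfs://controller-svc/default');

-- Describe the CSV layout of the files in HDFS.
CREATE EXTERNAL FILE FORMAT csv_file
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', STRING_DELIMITER = '"', FIRST_ROW = 2));

-- External table over an HDFS directory.
CREATE EXTERNAL TABLE [product_reviews_hdfs]
    (pr_review_sk BIGINT, pr_item_sk BIGINT, pr_review_content NVARCHAR(4000))
WITH (DATA_SOURCE = SqlStoragePool,
      LOCATION = '/product_review_data',
      FILE_FORMAT = csv_file);

-- Join HDFS data with high-value data in the master instance.
SELECT r.pr_review_content, i.i_item_desc
FROM [product_reviews_hdfs] AS r
JOIN dbo.item AS i ON r.pr_item_sk = i.i_item_sk;
```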
## Query data in Oracle from SQL Server master
In this example, you are going to create an external table in the SQL Server Master instance over the inventory table that resides on an Oracle server.
**Before you begin**, you need an Oracle instance and credentials. Execute the SQL script [inventory-ora.sql](inventory-ora.sql) in Oracle to create the table and import the "inventory.csv" file created by the bootstrap sample database.
### Instructions
1. Connect to the SQL Server Master instance.
1. Execute the SQL script [external-table-oracle.sql](external-table-oracle.sql).
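The script uses the Oracle connector from the master instance. The sketch below is illustrative only; the server name, credential, and Oracle schema are hypothetical, not the exact contents of [external-table-oracle.sql](external-table-oracle.sql):

```sql
-- Illustrative sketch; server, credential, and Oracle object names are hypothetical.
-- Assumes a database master key already exists in this database.

-- Credential used to log in to the Oracle instance.
CREATE DATABASE SCOPED CREDENTIAL [OracleCredential]
WITH IDENTITY = 'oracle_user', SECRET = 'oracle_user_password';

-- Data source using the Oracle connector.
CREATE EXTERNAL DATA SOURCE [OracleServer]
WITH (LOCATION = 'oracle://oracleserver.contoso.com:1521',
      CREDENTIAL = [OracleCredential]);

-- External table over the remote inventory table.
CREATE EXTERNAL TABLE [inventory_ora]
    (inv_date_sk BIGINT, inv_item_sk BIGINT, inv_quantity_on_hand INT)
WITH (DATA_SOURCE = [OracleServer], LOCATION = '[XE].[ORA_USER].[INVENTORY]');

SELECT TOP 10 * FROM [inventory_ora];
```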
`samples/features/sql-big-data-cluster/machine-learning/README.md`

# Machine learning in SQL Server 2019 big data cluster
## SQL Server Machine Learning Services on SQL Master instance
In this example, we are building a machine learning model for a recommendation engine on an online store, using R and a logistic regression algorithm. The model is trained on existing users' online click patterns, their interest in other categories, and their demographics. It will then be used to predict whether a visitor is interested in a given item category, using the T-SQL PREDICT function.
### Instructions
1. Connect to the SQL Server Master instance.
1. Execute the SQL script [sql/book-click-prediction-r.sql](sql/book-click-prediction-r.sql).
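The script follows the usual Machine Learning Services pattern: train with `sp_execute_external_script`, serialize the model, then score with PREDICT. The sketch below is illustrative only; the table, column, and formula names are hypothetical, not the exact contents of the script:

```sql
-- Illustrative sketch; table and column names are hypothetical.
DECLARE @model VARBINARY(MAX);

-- Train a logistic regression in R and return the serialized model.
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        model <- rxLogit(clicked ~ category_interest + age + income,
                         data = InputDataSet)
        serialized <- rxSerializeModel(model, realtimeScoringOnly = TRUE)',
    @input_data_1 = N'SELECT clicked, category_interest, age, income
                      FROM dbo.user_clicks',
    @params = N'@serialized VARBINARY(MAX) OUTPUT',
    @serialized = @model OUTPUT;

-- Score new visitors with the native T-SQL PREDICT function.
SELECT p.clicked_Pred, v.*
FROM PREDICT(MODEL = @model, DATA = dbo.new_visitors AS v)
WITH (clicked_Pred FLOAT) AS p;
```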
## Machine learning using Spark
The new built-in notebooks in Azure Data Studio enable data scientists and data engineers to run Python, R, or Scala code against the cluster. This is a great way to explore the data and build machine learning models. Notebooks also facilitate collaboration between teammates working on a shared data set.
This sample builds a machine learning model using AdultCensusIncome.csv available [here](https://amldockerdatasets.azureedge.net/AdultCensusIncome.csv).
### Instructions
In this example, you are going to run sample notebooks that build a machine learning model over a public data set.
Follow the steps below to get up and running with the sample.
#### Upload the data for analysis
1. From Azure Data Studio, connect to the SQL Server big data cluster endpoint. Information about how you connect from Azure Data Studio can be found [here](https://docs.microsoft.com/en-us/sql/azure-data-studio/sql-server-2019-extension?view=sql-server-ver15).
2. Download the data from https://amldockerdatasets.azureedge.net/AdultCensusIncome.csv and save AdultCensusIncome.csv in a folder called spark_ml in HDFS.
#### Run notebook for data preparation
As a first step, we'll load the data, do some basic cleanup, and choose the features we want to build the machine learning model with. Finally, we'll split the data into training and test sets.
1. Download and save the notebook file [spark/1-data-prep.ipynb](spark/1-data-prep.ipynb) locally.
1. Open the notebook file in Azure Data Studio (right-click the SQL Server big data cluster server name -> **Manage** -> **Open Notebook**).
1. The training and test sets created will be stored as /spark_ml/AdultCensusIncomeTrain and /spark_ml/AdultCensusIncomeTest.
#### Run notebook to create a machine learning model and use it to predict
We'll now create the machine learning model, use it to predict results on the test set, and then save the model to a file.
1. Download and save the notebook (ipynb) file [spark/2-build-ml-model.ipynb](spark/2-build-ml-model.ipynb) locally.
1. Open the notebook file in Azure Data Studio (right-click the SQL Server big data cluster server name -> **Manage** -> **Open Notebook**).