|
97 | 97 | { |
98 | 98 | "cell_type": "markdown", |
99 | 99 | "source": [ |
100 | | - "# Install Python Packages at Runtime for use with PySpark\n", |
101 | | - "\n", |
102 | | - "The following code can be used to install packages on each executor node at runtime. \\\n", |
103 | | - "**Note**: This functionality is not available on a non-root BDC deployment (including OpenShift). This installation is temporary, and must be performed each time a new Spark session is invoked.\n", |
104 | | - "\n", |
105 | | - "If you want to use this from CU5 upwards, you must add two settings pre-deployment.\n", |
106 | | - "\n", |
107 | | - "In contron.json, add (under security):\n", |
108 | | - "\n", |
109 | | - "_\"allowRunAsRoot\": true_\n", |
110 | | - "\n", |
111 | | - "In BDC.json, add (under spec.services.spark.settings): \n", |
112 | | - "\n", |
113 | | - "_\"yarn-site.yarn.nodemanager.container-executor.class\": \"org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor\"_\n", |
114 | | - "\n", |
115 | | - "``` Python\n", |
116 | | - "import subprocess\n", |
117 | | - "import os\n", |
118 | | - "os.environ[\"XDG_CACHE_HOME\"]=\"/tmp\"\n", |
119 | | - "# Install TensorFlow\n", |
120 | | - "stdout = subprocess.check_output(\n", |
121 | | - " \"pip3 install tensorflow\",\n", |
122 | | - " stderr=subprocess.STDOUT,\n", |
123 | | - " shell=True).decode(\"utf-8\")\n", |
124 | | - "print(stdout)\n", |
125 | | - "```" |
| 100 | + "# Install Python Packages at Runtime for use with PySpark\r\n", |
| 101 | + "\r\n", |
| 102 | + "This capability changed significantly after SQL Server Big Data Clusters CU10.\r\n", |
| 103 | + "\r\n", |
| 104 | + "For more information on this scenario, refer to [Spark library management](https://docs.microsoft.com/sql/big-data-cluster/spark-install-packages?view=sql-server-ver15)\r\n" |
126 | 105 | ], |
127 | 106 | "metadata": { |
128 | 107 | "azdata_cell_guid": "07944b55-7266-4fcd-8e9b-9fd6cb8cfef5" |
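
For anyone who still needs the pre-CU10 workflow quoted in the removed cell above, the two pre-deployment settings slot into the deployment profiles roughly as follows. These are sketches showing only the relevant keys; real control.json and bdc.json files carry many other settings, and JSON permits no inline comments, so the caveats live here rather than in the snippets. In control.json:

```json
{
  "security": {
    "allowRunAsRoot": true
  }
}
```

And in bdc.json, the YARN executor setting sits under spec.services.spark.settings:

```json
{
  "spec": {
    "services": {
      "spark": {
        "settings": {
          "yarn-site.yarn.nodemanager.container-executor.class": "org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor"
        }
      }
    }
  }
}
```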
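On CU10 and later, the linked Spark library management article moves package installation to session creation instead of shelling out to pip3 from a running session. A minimal sketch of the first cell of a PySpark notebook, assuming the `spark.pypi.packages` and `spark.pypi.repository` session options described in that article (the package name and repository URL are illustrative):

```python
%%configure -f
{
    "conf": {
        "spark.pypi.packages": "tensorflow",
        "spark.pypi.repository": "https://pypi.org/simple"
    }
}
```

Because `%%configure` supplies options at session creation, it must run before any cell that touches Spark; packages requested this way reach every executor without the allowRunAsRoot workaround.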
|