Commit 4dbb298

Update to pyspark package requirements (#798)
* separate SP creation: separated Service Principal creation for AKS due to issues on some machines with integrated SP creation; fixed typo in README
* Update deploy-sql-big-data-aks.py: removed JSON output
* adjusted for CU5
* change back
* update to CU5
* Updated requirements for pyspark installation in CU5
1 parent ee4cfac commit 4dbb298

1 file changed

Lines changed: 12 additions & 1 deletion

File tree

samples/features/sql-big-data-cluster/spark/config-install/installpackage_Spark.ipynb

@@ -106,9 +106,20 @@
 "The following code can be used to install packages on each executor node at runtime. \\\n",
 "**Note**: This functionality is not available on a non-root BDC deployment (including OpenShift). This installation is temporary, and must be performed each time a new Spark session is invoked.\n",
 "\n",
+"To use this from CU5 onwards, you must add two settings before deployment.\n",
+"\n",
+"In control.json, add (under security):\n",
+"\n",
+"_\"allowRunAsRoot\": true_\n",
+"\n",
+"In BDC.json, add (under spec.services.spark.settings):\n",
+"\n",
+"_\"yarn-site.yarn.nodemanager.container-executor.class\": \"org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor\"_\n",
+"\n",
 "``` Python\n",
 "import subprocess\n",
-"\n",
+"import os\n",
+"os.environ[\"XDG_CACHE_HOME\"]=\"/tmp\"\n",
 "# Install TensorFlow\n",
 "stdout = subprocess.check_output(\n",
 " \"pip3 install tensorflow\",\n",

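The truncated Python cell in the diff can be rounded out into a self-contained sketch of the runtime-install pattern it shows. As an assumption for portability, this sketch invokes `sys.executable -m pip` instead of the notebook's bare `pip3`, and wraps the call in a hypothetical helper `install_package` so the package name is a parameter rather than hard-coded `tensorflow`.

```python
import os
import subprocess
import sys

# Point pip's cache at /tmp, which is writable for the Spark user on
# executor nodes (the reason the notebook cell sets XDG_CACHE_HOME).
os.environ["XDG_CACHE_HOME"] = "/tmp"


def install_package(package):
    """Install a pip package on the current node; return pip's combined output.

    Uses `sys.executable -m pip` rather than the notebook's bare `pip3`
    so the sketch also runs where pip3 is not on PATH.
    """
    return subprocess.check_output(
        [sys.executable, "-m", "pip", "install", package],
        stderr=subprocess.STDOUT,
    ).decode("utf-8")
```

Because the installation is temporary, such a call has to be repeated in every new Spark session, e.g. `install_package("tensorflow")` as the first cell of the notebook.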
0 commit comments
