+ "source": "# Configuration settings for scaling to larger data\n\n## Number and size of nodes in our Kubernetes cluster\nWe can control the number and size of nodes in our Kubernetes cluster via the `--node-vm-size` and `--node-count` switches of the `az aks create` command:\n\n`az aks create --name mycluster --resource-group myrg --generate-ssh-keys --node-vm-size Standard_DS14_v2 --node-count 3 --kubernetes-version 1.10.9`\n\nMore information is available [here](https://docs.microsoft.com/en-us/sql/big-data-cluster/deploy-on-aks?view=sqlallproducts-allversions#create-a-kubernetes-cluster).\n\n## Number of Spark pods\nWe can control the number of Spark pods via the CLUSTER_STORAGE_POOL_REPLICAS environment variable used by `mssqlctl create cluster`:\n\n`SET CLUSTER_STORAGE_POOL_REPLICAS=2`\n\n## YARN scheduler memory and cores\nWe can control the YARN scheduler memory and cores via the following environment variables used by `mssqlctl create cluster`:\n\n- YARN_SCHEDULER_MAX_MEMORY\n- YARN_SCHEDULER_MAX_VCORES\n- YARN_NODEMANAGER_RESOURCE_MEMORY\n- YARN_NODEMANAGER_RESOURCE_VCORES\n\nFurther information regarding mssqlctl environment variables is available [here](https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-guidance?view=sqlallproducts-allversions#define-environment-variables).\n\nIn CTP 2.5 and later, these environment variables are replaced by similarly named properties in a JSON file. See [Custom configurations](https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-guidance?view=sqlallproducts-allversions#customconfig).\n\n## Livy timeout\nThe Livy timeout sets a limit on the runtime of a cell in a PySpark3 Jupyter notebook. In SQL Server 2019 Big Data CTP 2.1, the Livy timeout defaults to 1 hour. In CTP 2.2, it defaults to 24 days.
We can modify this as follows:\n\n- Log in to the mssql-master-pool-0 pod using this command (requires permission to run kubectl):\n\n```\nkubectl exec -it mssql-master-pool-0 -n <your-cluster-name> -- /bin/bash\n```\n- To set the Livy timeout to 24 days, run the following command or edit /livy/conf/livy.conf accordingly:\n\n```\necho 'livy.server.session.timeout = 24d' >> /livy/conf/livy.conf\n```\n- Then restart the Livy server by running the following command:\n\n```\nsupervisorctl restart livy\n```",