Configure Spark Job Resource Allocation

Table of Contents

Number of instances and cores allocated
Memory allocation

For related topics, see Spark Operations.

Number of instances and cores allocated

To set the number of cores allocated for a job, add the following parameter keys and values in the Spark Settings field, found in the "advanced" job properties in the Fusion UI. If you define the job through the Fusion API, add them to the sparkConfig object instead.

Parameter Key	Example Value
`spark.executor.instances`	3
`spark.kubernetes.executor.request.cores`	3
`spark.executor.cores`	6
`spark.driver.cores`	1
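
As a rough illustration, the settings in the table above could be assembled for the sparkConfig object as follows. This is a minimal sketch: the list-of-key/value-pairs layout and the helper name to_spark_config are assumptions made for the example, so check the job schema of your Fusion version before relying on it.

```python
import json

# Example values taken from the table above; adjust them per job.
core_settings = {
    "spark.executor.instances": "3",
    "spark.kubernetes.executor.request.cores": "3",
    "spark.executor.cores": "6",
    "spark.driver.cores": "1",
}


def to_spark_config(settings):
    """Convert a dict into key/value entries (assumed sparkConfig layout)."""
    return [{"key": key, "value": value} for key, value in settings.items()]


# Hypothetical fragment of a job definition; only sparkConfig is named in the text.
job_fragment = {"sparkConfig": to_spark_config(core_settings)}
print(json.dumps(job_fragment, indent=2))
```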

If spark.kubernetes.executor.request.cores is unset (the default), Spark sets the number of CPUs for the executor pod to the value of spark.executor.cores. For example, if spark.executor.cores is 3, Spark allocates 3 CPUs for the executor pod and runs 3 tasks in parallel. To under-allocate the CPU for the executor pod and still run multiple tasks in parallel, set spark.kubernetes.executor.request.cores to a lower value than spark.executor.cores.

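The right ratio of spark.kubernetes.executor.request.cores to spark.executor.cores depends on whether the job is CPU-bound or I/O-bound. An I/O-bound job, whose tasks spend much of their time waiting, can typically tolerate a lower ratio than a CPU-bound job. Allocate more memory to the executor if more tasks are running in parallel on a single executor pod.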
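
The relationship between these settings can be sketched in a few lines of Python. The function names are hypothetical, the fallback values follow the defaults described in this section (two executors with one core each), and the sketch assumes each task uses one core.

```python
def parse_cpu(value):
    """Parse a Kubernetes-style CPU quantity such as "3", "0.5", or "500m"."""
    text = str(value).strip()
    if text.endswith("m"):
        # Millicpu notation: "500m" means half a CPU.
        return int(text[:-1]) / 1000
    return float(text)


def executor_cpu_plan(settings):
    """Summarize the CPU request and task parallelism implied by the settings."""
    executor_cores = int(settings.get("spark.executor.cores", 1))
    instances = int(settings.get("spark.executor.instances", 2))
    request = settings.get("spark.kubernetes.executor.request.cores")
    # When the request is unset, the pod requests as many CPUs as executor cores.
    cpu_request = parse_cpu(request) if request is not None else float(executor_cores)
    return {
        "cpu_request_per_pod": cpu_request,
        "parallel_tasks_per_pod": executor_cores,  # assumes one core per task
        "total_parallel_tasks": instances * executor_cores,
        "request_to_cores_ratio": cpu_request / executor_cores,
    }


plan = executor_cpu_plan({
    "spark.executor.instances": "3",
    "spark.kubernetes.executor.request.cores": "3",
    "spark.executor.cores": "6",
})
print(plan)
```

With the example values from the table, each executor pod requests 3 CPUs while running up to 6 tasks in parallel, a ratio of 0.5.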

If these settings are not specified, the job launches with a driver using one core and 3GB of memory, plus two executors, each using one core and 1GB of memory.

Memory allocation

The amount of memory allocated to the driver and executors is controlled on a per-job basis using the spark.executor.memory and spark.driver.memory parameters in the Spark Settings section of the job definition. This section is available in the Fusion UI, or as the sparkConfig object in the JSON definition of the job.

Parameter Key	Example Value
`spark.executor.memory`	6g
`spark.driver.memory`	2g
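
To get a rough sense of what a job asks of the cluster, the configured memory can be added up across the driver and all executors. The sketch below is only an estimate under stated assumptions: the helper names are hypothetical, only the m, g, and t size suffixes are handled, and the additional memory overhead that Spark adds to each pod request is not included.

```python
import re

# Mebibytes per unit for the size suffixes handled by this sketch.
_MIB_PER_UNIT = {"m": 1, "g": 1024, "t": 1024 * 1024}


def to_mib(value):
    """Convert a size string such as "6g" or "512m" to mebibytes."""
    match = re.fullmatch(r"(\d+)([mgt])b?", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    number, unit = match.groups()
    return int(number) * _MIB_PER_UNIT[unit]


def total_configured_mib(settings, instances):
    """Sum driver memory plus executor memory across all executor instances."""
    driver = to_mib(settings["spark.driver.memory"])
    executor = to_mib(settings["spark.executor.memory"])
    return driver + instances * executor


memory_settings = {"spark.executor.memory": "6g", "spark.driver.memory": "2g"}
total = total_configured_mib(memory_settings, instances=3)
print(f"{total / 1024:.1f} GiB configured across driver and executors")
```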
