Fusion Job REST Server APIs
A job configuration contains the following fields:

Field | Description |
---|---|
id | The job ID. Required field. |
script | The script to run. |
sparkConfig | parameter name:parameter value options to use for Spark configuration. |
shellOptions | parameter name:parameter value options to send to the Spark shell when the job is run. |
interpreterParams | parameter name:parameter value options to bind the key:value pairs to the Scala interpreter. |
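For illustration, here is a hypothetical job configuration using these fields; the ID, script, and parameter values are invented, and the exact shape of the parameter fields may differ by Fusion version:

```bash
# Write a hypothetical job configuration file; all values are invented.
cat > custom-spark-job.json <<'EOF'
{
  "id": "example-spark-job",
  "script": "val rdd = sc.textFile(\"/tmp/input.txt\"); println(rdd.count)",
  "sparkConfig": {
    "spark.cores.max": "4",
    "spark.executor.memory": "2g"
  },
  "shellOptions": {
    "--conf": "spark.ui.showConsoleProgress=false"
  },
  "interpreterParams": {
    "inputPath": "/tmp/input.txt"
  }
}
EOF
```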
Configuration properties can be changed through the Fusion endpoint api/configurations, which updates the stored value without restarting the service; existing jobs and SparkContexts are therefore not affected.
The Fusion endpoint api/configurations
returns all configured properties for that installation.
You can examine Spark default configurations in a Unix shell using the curl and grep utilities.
Here is an example that checks a local Fusion installation running on port FUSION_PORT:
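A minimal sketch, assuming basic authentication with the default admin user (substitute the credentials and port for your installation):

```bash
# Fetch all configured properties and filter for Spark-related entries.
# FUSION_PORT and the credentials are placeholders, not fixed values.
curl -u admin:password "http://localhost:${FUSION_PORT}/api/configurations" \
  | grep spark
```

The two tables below list the standard Spark properties (spark.*) and the Fusion-specific properties (fusion.spark.*) that such a check returns.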
Property | Description |
---|---|
spark.master.url | Unset by default. Specify this property only when using an external Spark cluster; when Fusion uses its own standalone Spark cluster, leave it unset. |
spark.cores.max | The maximum number of cores across the cluster assigned to the application. Unset by default, i.e., an unlimited number of cores. |
spark.executor.memory | Amount of memory assigned to each application’s executor. The default is 2G. |
spark.scheduler.mode | Controls how tasks are assigned to available resources. Can be either ‘FIFO’ or ‘FAIR’. Default value is ‘FAIR’. |
spark.dynamicAllocation.enabled | Boolean - whether or not to enable dynamic allocation of executors. Default value is ‘TRUE’. |
spark.shuffle.service.enabled | Boolean - whether or not to enable internal shuffle service for standalone Spark cluster. Default value is ‘TRUE’. |
spark.dynamicAllocation.executorIdleTimeout | Number of seconds after which idle executors are removed. Default value is ‘60s’. |
spark.dynamicAllocation.minExecutors | Number of executors to leave running even when idle. Default value is 0. |
spark.eventLog.enabled | Boolean - whether or not the event log is enabled. The event log stores job details and can be accessed after the application finishes. Default value is ‘TRUE’. |
spark.eventLog.dir | Directory that stores event logs. Default location is $FUSION_HOME/var/spark-eventlog. |
spark.eventLog.compress | Boolean - whether or not to compress event log data. Default value is ‘TRUE’. |
spark.logConf | Boolean - whether or not to log the effective SparkConf of new SparkContexts. Default value is ‘TRUE’. |
spark.deploy.recoveryMode | Recovery mode for the Spark master. Default value is ‘ZOOKEEPER’. |
spark.deploy.zookeeper.url | ZooKeeper connect string. Default value is $FUSION_ZK. |
spark.deploy.zookeeper.dir | ZooKeeper path. Default value is /lucid/spark. |
spark.worker.cleanup.enabled | Boolean - whether or not to periodically cleanup worker data. Default value is ‘TRUE’. |
spark.worker.cleanup.appDataTtl | Time-to-live in seconds for application data in worker directories. Default value is 86400 (24 hours). |
spark.deploy.retainedApplications | The maximum number of applications to show in the UI. Default value is 50. |
spark.deploy.retainedDrivers | The maximum number of completed drivers to show in the UI. Default value is 50. |
spark.worker.timeout | The maximum timeout in seconds allowed before a worker is considered lost. The default value is 30. |
spark.worker.memory | The maximum total heap allocated to all executors running on this worker. Defaults to the value of the executor memory heap (spark.executor.memory). |
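As noted above, configuration properties can be changed through api/configurations without restarting the service. The sketch below is a hypothetical update of one property; the HTTP method and payload shape are assumptions here and should be verified for your Fusion version:

```bash
# Hypothetical: set spark.executor.memory to 4g via the configurations
# endpoint. The PUT method and JSON string payload are assumptions.
curl -u admin:password -X PUT -H 'Content-Type: application/json' \
  "http://localhost:${FUSION_PORT}/api/configurations/spark.executor.memory" \
  -d '"4g"'
```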
Property | Description |
---|---|
fusion.spark.master.port | Spark master job submission port. Default value is 8766. |
fusion.spark.master.ui.port | Spark master web UI port. Default value is 8767. |
fusion.spark.idleTime | Maximum idle time in seconds, after which the application (i.e., its SparkContext) is shut down. Default value is 300. |
fusion.spark.executor.memory.min | Minimum executor memory in megabytes. Default value is 450 MB, which is sufficient for the Fusion components in application tasks to initialize themselves. |
fusion.spark.executor.memory.fraction | A floating-point number in the range (0.0, 1.0] indicating what portion of spark.executor.memory to allocate to this application. Default value is 1.0. |
fusion.spark.cores.fraction | A floating-point number in the range (0.0, 1.0] indicating what portion of spark.cores.max to allocate to this application. Default value is 1.0. |
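For illustration, here is a small worked example of how the fraction properties scale the cluster-wide limits down to a per-application allocation; the concrete values are hypothetical:

```bash
# Hypothetical values for one application's share of the global limits.
SPARK_EXECUTOR_MEMORY_MB=2048   # spark.executor.memory = 2G
MEMORY_FRACTION=0.5             # fusion.spark.executor.memory.fraction
SPARK_CORES_MAX=8               # spark.cores.max
CORES_FRACTION=0.25             # fusion.spark.cores.fraction

# 2048 MB * 0.5 = 1024 MB of executor memory for this application.
awk -v m="$SPARK_EXECUTOR_MEMORY_MB" -v f="$MEMORY_FRACTION" \
    'BEGIN { printf "executor memory per application: %d MB\n", m * f }'
# 8 cores * 0.25 = 2 cores for this application.
awk -v c="$SPARK_CORES_MAX" -v f="$CORES_FRACTION" \
    'BEGIN { printf "cores per application: %d\n", c * f }'
```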