As with any distributed computation system, taking full advantage of Dask distributed sometimes requires configuration. Some options can be passed as API parameters or as command-line options to the various Dask executables. Some options can also be set in the Dask configuration file.

User-wide configuration

Dask accepts some configuration options in a configuration file, which by default is a .dask/config.yaml file located in your home directory. The file path can be overridden using the DASK_CONFIG environment variable. Parsing this configuration file requires the pyyaml module; if pyyaml is not installed, the configuration file is ignored.
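The lookup rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not Dask's actual loading code: the helper names resolve_config_path and load_config are hypothetical, and the fallback to an empty dictionary is an assumption standing in for "the configuration file is ignored".

```python
import os
from pathlib import Path


def resolve_config_path(environ=None):
    """Return the DASK_CONFIG path if set, else ~/.dask/config.yaml.

    Hypothetical helper: it mirrors the rule described in the text,
    not Dask's real implementation.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("DASK_CONFIG")
    if override:
        return Path(override)
    return Path.home() / ".dask" / "config.yaml"


def load_config(path):
    """Parse the YAML file at `path`.

    Returns an empty dict when pyyaml is not installed or the file does
    not exist (assumption: this models "the file is ignored").
    """
    try:
        import yaml  # provided by the pyyaml package
    except ImportError:
        return {}
    if not path.is_file():
        return {}
    with path.open() as f:
        return yaml.safe_load(f) or {}
```

Calling load_config(resolve_config_path()) would then return the parsed options as a nested dictionary, or an empty one when nothing can be read.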

The file is written in the YAML format, which allows for a human-readable hierarchical key-value configuration. All keys in the configuration file are optional, though Dask will create a default configuration file for you on its first launch.

Here is a synopsis of the configuration file:

logging:
    distributed: info
    distributed.client: warning
    bokeh: critical

# Scheduler options
bandwidth: 100000000    # 100 MB/s estimated worker-worker bandwidth
allowed-failures: 3     # number of retries before a task is considered bad
pdb-on-err: False       # enter debug mode on scheduling error
transition-log-length: 100000

# Worker options
multiprocessing-method: forkserver

# Communication options
compression: auto
tcp-timeout: 30         # seconds delay before calling an unresponsive connection dead
default-scheme: tcp
require-encryption: False   # whether to require encryption on non-local comms
tls:
    ca-file: myca.pem
    scheduler:
        cert: mycert.pem
        key: mykey.pem
    worker:
        cert: mycert.pem
        key: mykey.pem
    client:
        cert: mycert.pem
        key: mykey.pem

# Bokeh web dashboard
bokeh-export-tool: False

The sections below review some of these options.

Communication options


compression

This key configures the desired compression scheme when transferring data over the network. The default value, “auto”, applies heuristics to try and select the best compression scheme for each piece of data.


default-scheme

The communication scheme used by default. You can override the default (“tcp”) here, but it is recommended to use explicit URIs for the various endpoints instead (for example tls:// if you want to enable TLS communications).


require-encryption

Whether to require that all non-local communications be encrypted. If true, Dask will refuse to establish any clear-text communications (for example over TCP without TLS), forcing you to use a secure transport such as TLS.


tcp-timeout

The default “timeout” on TCP sockets. If a remote endpoint is unresponsive (at the TCP layer, not at the distributed layer) for at least the specified number of seconds, the communication is considered closed. This helps detect endpoints that have been killed or have disconnected abruptly.


tls

This key configures TLS communications. Several sub-keys are recognized:

  • ca-file configures the CA certificate file used to authenticate and authorize all endpoints.
  • ciphers restricts allowed ciphers on TLS communications.

Each kind of endpoint has a dedicated endpoint sub-key: scheduler, worker and client. Each endpoint sub-key also supports several sub-keys:

  • cert configures the certificate file for the endpoint.
  • key configures the private key file for the endpoint.
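To make the nesting of the tls key concrete, here is a small sketch that looks up the files for one endpoint in an already parsed configuration. The function tls_files is a hypothetical helper written for this illustration; the dictionary simply mirrors the file names used in the synopsis.

```python
def tls_files(config, role):
    """Return (ca_file, cert, key) for one endpoint kind.

    `role` is one of "scheduler", "worker" or "client".
    Hypothetical helper: missing entries are returned as None.
    """
    tls = config.get("tls", {})
    endpoint = tls.get(role, {})
    return tls.get("ca-file"), endpoint.get("cert"), endpoint.get("key")


# Parsed form of the tls section from the synopsis above.
config = {
    "tls": {
        "ca-file": "myca.pem",
        "scheduler": {"cert": "mycert.pem", "key": "mykey.pem"},
        "worker": {"cert": "mycert.pem", "key": "mykey.pem"},
        "client": {"cert": "mycert.pem", "key": "mykey.pem"},
    }
}

ca_file, cert, key = tls_files(config, "worker")
```

The CA file is shared by all endpoints, while each endpoint kind carries its own certificate and private key.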

Scheduler options


allowed-failures

The number of retries before a “suspicious” task is considered bad. A task is considered “suspicious” if the worker died while executing it.
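The counting rule can be pictured with the following sketch. It is not the scheduler's code: the SuspicionTracker class is hypothetical, and the exact threshold (a task becomes bad once its count exceeds the allowed number) is an assumption about how "number of retries" is interpreted.

```python
from collections import Counter


class SuspicionTracker:
    """Hypothetical bookkeeping for tasks that were running on dead workers."""

    def __init__(self, allowed_failures=3):
        self.allowed_failures = allowed_failures
        self.suspicious = Counter()

    def worker_died(self, running_tasks):
        """Record a worker death and return the tasks now considered bad."""
        bad = []
        for key in running_tasks:
            self.suspicious[key] += 1
            # Assumption: bad once the count exceeds the allowed number.
            if self.suspicious[key] > self.allowed_failures:
                bad.append(key)
        return bad


tracker = SuspicionTracker(allowed_failures=3)
first_three = [tracker.worker_died(["task-x"]) for _ in range(3)]
fourth = tracker.worker_died(["task-x"])
```

Under this assumption the task is retried after each of the first three worker deaths and is reported as bad on the fourth.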


bandwidth

The estimated network bandwidth, in bytes per second, from worker to worker. This value is used to estimate the time it takes to ship data from one node to another, and to balance tasks and data accordingly.
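As a rough illustration of how such an estimate works, the transfer time is simply the data size divided by the bandwidth. The function name below is hypothetical, and the real scheduler may take further factors into account.

```python
def estimated_transfer_seconds(nbytes, bandwidth=100_000_000):
    """Estimate the seconds needed to move `nbytes` between two workers.

    `bandwidth` is in bytes per second; the default matches the
    100 MB/s value from the synopsis.
    """
    return nbytes / bandwidth


# Shipping 1 GB at the default 100 MB/s estimate.
seconds = estimated_transfer_seconds(1_000_000_000)
```

With the default value, moving 1 GB of data is estimated at 10 seconds, so a task whose computation is much shorter than that is better scheduled where its data already lives.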

Misc options


logging

This key configures the logging settings. There are two possible formats. The simple, recommended format configures the desired verbosity level for each logger. It also sets default values for several loggers such as distributed unless explicitly configured.
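In effect, the simple format is a mapping from logger names to level names. The sketch below applies such a mapping with Python's standard logging module; the apply_levels helper is hypothetical and does not reproduce the default values Dask sets for loggers that are left unconfigured.

```python
import logging

# Parsed form of the logging section from the synopsis.
simple_config = {
    "distributed": "info",
    "distributed.client": "warning",
    "bokeh": "critical",
}


def apply_levels(levels):
    """Set each named logger to the requested verbosity level."""
    for name, level in levels.items():
        # setLevel accepts upper-case level names such as "INFO".
        logging.getLogger(name).setLevel(level.upper())


apply_levels(simple_config)
```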

A more extended format is possible following the logging module’s Configuration dictionary schema. To enable this extended format, there must be a version sub-key as mandated by the schema. The extended format does not set any default values.
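For comparison, a dictionary following that schema can be passed to logging.config.dictConfig from the standard library. The handler and logger entries below are illustrative choices, not values taken from Dask; only the version key is required by the schema.

```python
import logging
import logging.config

extended_config = {
    "version": 1,  # mandated by the dictionary schema
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler", "level": "DEBUG"},
    },
    "loggers": {
        "distributed": {"level": "INFO", "handlers": ["console"]},
        "distributed.client": {"level": "WARNING"},
    },
}

logging.config.dictConfig(extended_config)
```

Because this format sets no defaults, every logger and handler you rely on has to be spelled out in the dictionary.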


Python’s logging module uses a hierarchical logger tree. For example, configuring the logging level for the distributed logger will also affect its children such as distributed.scheduler, unless explicitly overridden.
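The inheritance can be observed directly with the standard logging module. This short sketch uses only standard library calls; the logger names are the ones mentioned above.

```python
import logging

parent = logging.getLogger("distributed")
child = logging.getLogger("distributed.scheduler")

# The child has no level of its own, so it inherits from its parent.
parent.setLevel(logging.WARNING)
inherited = child.getEffectiveLevel()

# An explicit setting on the child overrides the inherited level.
child.setLevel(logging.DEBUG)
overridden = child.getEffectiveLevel()
```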


logging-file-config

As an alternative to the two logging settings formats discussed above, you can specify a logging config file. Its format adheres to the logging module’s Configuration file format.


The configuration options logging-file-config and logging are mutually exclusive.