Changelog

Note

Changelog entries for Distributed are now included in the Dask changelog.

2023.9.3

Released on September 29, 2023

Highlights

Reduce memory consumption during merge and shuffle graph optimizations

Previously there would be a large memory spike when optimizing task graphs for shuffling and merge operations (see GH#8196 for an example). This release removes that memory spike.

See GH#8197 from Patrick Hoefler for more details.

Quiet JupyterLab shutdown

Previously, when running Jupyter on a scheduler (e.g. with the --jupyter CLI flag), an error was raised when the notebook server was shut down from the web application. This release ensures no error is raised and the shutdown process is clean.

See GH#8220 from Thomas Grainger for details.

Additional changes

2023.9.2

Released on September 15, 2023

Highlights

Reduce memory footprint of P2P shuffling

Significantly reduced the peak and average memory used by P2P shuffling (up to a 2x reduction). This change also raises the minimum supported pyarrow version for P2P shuffling to pyarrow 12.

See GH#8157 from Hendrik Makait for details.

Improved plugin API

Two plugin changes have been introduced to provide a more consistent and convenient plugin UX:

  1. Plugins must now inherit from WorkerPlugin, SchedulerPlugin, or NannyPlugin base classes. Old-style plugins that don’t inherit from a base class will still work, but with a deprecation warning.

  2. A new Client.register_plugin() method has been introduced in favor of the previous Client.register_worker_plugin() and Client.register_scheduler_plugin() methods. All plugins should now be registered using the centralized Client.register_plugin() method.

from dask.distributed import Client, WorkerPlugin, SchedulerPlugin

class MySchedulerPlugin(SchedulerPlugin):      # Inherits from SchedulerPlugin
    def start(self, scheduler):
        print("Hello from the scheduler!")

class MyWorkerPlugin(WorkerPlugin):            # Inherits from WorkerPlugin
    def setup(self, worker):
        print(f"Hello from Worker {worker}!")

client = Client()                              # Connect to a cluster

client.register_plugin(MySchedulerPlugin())    # Single method to register both types of plugins
client.register_plugin(MyWorkerPlugin())

See GH#8169 and GH#8150 from Hendrik Makait for details.

Emit deprecation warnings for configuration option renames

When a Dask configuration option that has been renamed is used, users will now get a deprecation warning pointing them to the new name.
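Internally this amounts to a lookup table from old key names to new ones, consulted whenever a key is resolved. A minimal stdlib-only sketch of the pattern (the key names and the `canonical_key` helper below are hypothetical illustrations, not Dask's actual options or API):

```python
import warnings

# Hypothetical mapping of renamed configuration keys to their new names;
# in Dask the real table lives inside the config module.
RENAMED_KEYS = {
    "scheduler.old-option": "scheduler.new-option",
}

def canonical_key(key):
    """Return the current name for a config key, warning if it was renamed."""
    if key in RENAMED_KEYS:
        new = RENAMED_KEYS[key]
        warnings.warn(
            f"Configuration key {key!r} has been renamed to {new!r}",
            DeprecationWarning,
        )
        return new
    return key
```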

See GH#8179 from crusaderky for details.

Additional changes

2023.9.1

Released on September 6, 2023

Enhancements

Deprecations

Maintenance

2023.9.0

Released on September 1, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.8.1

Released on August 18, 2023

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2023.8.0

Released on August 4, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.7.1

Released on July 20, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.7.0

Released on July 7, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.6.1

Released on June 26, 2023

Enhancements

Bug Fixes

Maintenance

2023.6.0

Released on June 9, 2023

Enhancements

Bug Fixes

Maintenance

2023.5.1

Released on May 26, 2023

Note

This release drops support for Python 3.8. As of this release Dask supports Python 3.9, 3.10, and 3.11. See this community issue for more details.

Enhancements

Bug Fixes

Maintenance

2023.5.0

Released on May 12, 2023

Enhancements

  • Client.upload_file send to both Workers and Scheduler and rename scratch directory (GH#7802) Miles

  • Allow dashboard to be used with bokeh prereleases (GH#7814) James Bourbeau

Bug Fixes

Maintenance

2023.4.1

Released on April 28, 2023

Enhancements

Bug Fixes

Maintenance

2023.4.0

Released on April 14, 2023

Note

With this release we are making a change that requires the Dask scheduler to have the same software and hardware capabilities as the client and workers.

It has always been recommended that your client and workers have a consistent software and hardware environment so that data structures and dependencies can be pickled and passed between them. However, recent changes to the Dask scheduler mean that the scheduler now also requires the same consistent environment as everything else.

Enhancements

Bug Fixes

Maintenance

2023.3.2.1

Released on April 5, 2023

Bug Fixes

2023.3.2

Released on March 24, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.3.1

Released on March 10, 2023

Enhancements

Bug Fixes

Documentation

  • Add notes to Client.submit, Client.map, and Client.scatter with the description of the current task graph resolution algorithm limitations (GH#7588) Eugene Druzhynin

Maintenance

2023.3.0

Released on March 1, 2023

Bug Fixes

Maintenance

2023.2.1

Released on February 24, 2023

Enhancements

Bug Fixes

Maintenance

2023.2.0

Released on February 10, 2023

Enhancements

Maintenance

2023.1.1

Released on January 27, 2023

Enhancements

Bug Fixes

Documentation

Maintenance

2023.1.0

Released on January 13, 2023

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.12.1

Released on December 16, 2022

Enhancements

Bug Fixes

Maintenance

2022.12.0

Released on December 2, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.11.1

Released on November 18, 2022

Enhancements

Documentation

Maintenance

2022.11.0

Released on November 15, 2022

Note

This release changes the default scheduling mode to use queuing. This will significantly reduce cluster memory use in most cases, and generally improve stability and performance. Learn more here and please provide any feedback on this discussion.

In rare cases, this could make some workloads slower. See the documentation for more information, and how to switch back to the old mode.
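For workloads that regress, queuing can be disabled via the distributed.scheduler.worker-saturation setting; raising it to infinity restores the previous scheduling mode (option name as of this release; verify against the documentation for your version):

```yaml
# e.g. in ~/.config/dask/distributed.yaml
distributed:
  scheduler:
    worker-saturation: inf   # default is 1.1; "inf" disables queuing
```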

New Features

Enhancements

Documentation

Bug Fixes

Maintenance

2022.10.2

Released on October 31, 2022

This was a hotfix release

2022.10.1

Released on October 28, 2022

New Features

Enhancements

Documentation

Bug Fixes

Maintenance

2022.10.0

Released on October 14, 2022

Note

This release deprecates dask-scheduler, dask-worker, and dask-ssh CLIs in favor of dask scheduler, dask worker, and dask ssh, respectively. The old-style CLIs will continue to work for a time, but will be removed in a future release.

As part of this migration the --reconnect, --nprocs, --bokeh, --bokeh-port CLI options have also been removed for both the old- and new-style CLIs. These options had already previously been deprecated.
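The mapping is mechanical: each dash-joined command becomes a subcommand of the single dask entry point. As an illustrative sketch (the `migrate` helper below is hypothetical, not part of any Dask CLI):

```shell
# Rewrite an old-style entry point to the new subcommand form,
# e.g. "dask-worker ..." -> "dask worker ...".
migrate() {
  printf '%s\n' "$1" | sed -E 's/^dask-(scheduler|worker|ssh)/dask \1/'
}

migrate "dask-worker tcp://scheduler:8786 --nworkers 4"
```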

Enhancements

Bug Fixes

Maintenance

2022.9.2

Released on September 30, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.9.1

Released on September 16, 2022

Enhancements

Bug Fixes

Maintenance

2022.9.0

Released on September 2, 2022

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.8.1

Released on August 19, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.8.0

Released on August 5, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.7.1

Released on July 22, 2022

New Features

Enhancements

Bug Fixes

Maintenance

2022.7.0

Released on July 8, 2022

Enhancements

Bug Fixes

Maintenance

2022.6.1

Released on June 24, 2022

Highlights

This release includes the Worker State Machine refactor. The worker's state is now encapsulated in its own synchronous class. Pulling all the state out into a separate class allows us to write targeted unit tests without invoking any concurrent or asynchronous code.

See GH#5736 for more information.

Enhancements

Bug Fixes

Deprecations

Maintenance

2022.6.0

Released on June 10, 2022

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.05.2

Released on May 26, 2022

Enhancements

Bug Fixes

Maintenance

2022.05.1

Released on May 24, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.05.0

Released on May 2, 2022

Highlights

This is a bugfix release for this issue.

Enhancements

Bug Fixes

2022.04.2

Released on April 29, 2022

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.04.1

Released on April 15, 2022

New Features

Enhancements

Bug Fixes

Maintenance

2022.04.0

Released on April 1, 2022

Note

This is the first release with support for Python 3.10

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.03.0

Released on March 18, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.02.1

Released on February 25, 2022

New Features

Enhancements

Bug Fixes

  • Avoid deadlock when two tasks are concurrently waiting for an unresolved ActorFuture (GH#5709) Thomas Grainger

Deprecations

Documentation

Maintenance

2022.02.0

Released on February 11, 2022

Note

This is the last release with support for Python 3.7

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.01.1

Released on January 28, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.01.0

Released on January 14, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2021.12.0

Released on December 10, 2021

Enhancements

Bug fixes

Documentation

Maintenance

2021.11.2

Released on November 19, 2021

2021.11.1

Released on November 8, 2021

2021.11.0

Released on November 5, 2021

2021.10.0

Released on October 22, 2021

Note

This release fixed a potential security vulnerability relating to single-machine Dask clusters. Clusters started with dask.distributed.LocalCluster or dask.distributed.Client() (which defaults to using LocalCluster) would mistakenly configure their respective Dask workers to listen on external interfaces (typically with a randomly selected high port) rather than only on localhost. A Dask cluster created using this method AND running on a machine that has these ports exposed could be used by a sophisticated attacker to enable remote code execution. Users running on machines with standard firewalls in place should not be affected. This vulnerability is documented in CVE-2021-42343, and is fixed in this release (GH#5427). Thanks to Jean-Pierre van Riel for discovering and reporting the issue.

2021.09.1

Released on September 21, 2021

2021.09.0

Released on September 3, 2021

2021.08.1

Released on August 20, 2021

2021.08.0

Released on August 13, 2021

2021.07.2

Released on July 30, 2021

2021.07.1

Released on July 23, 2021

2021.07.0

Released on July 9, 2021

2021.06.2

Released on June 22, 2021

2021.06.1

Released on June 18, 2021

2021.06.0

Released on June 4, 2021

2021.05.1

Released on May 28, 2021

2021.05.0

Released on May 14, 2021

2021.04.1

Released on April 23, 2021

2021.04.0

Released on April 2, 2021

2021.03.1

Released on March 26, 2021

2021.03.0

Released on March 5, 2021

Note

This is the first release with support for Python 3.9 and the last release with support for Python 3.6

2021.02.0

Released on February 5, 2021

2021.01.1

Released on January 22, 2021

2021.01.0

Released on January 15, 2021

2020.12.0

Released on December 10, 2020

Highlights

  • Switched to CalVer for versioning scheme.

  • The scheduler can now receive Dask HighLevelGraphs instead of raw dictionary task graphs. This allows for much more efficient communication of task graphs from the client to the scheduler.

  • Added support for using custom Layer-level annotations like priority, retries, etc. with the dask.annotate context manager.

  • Updated minimum supported version of Dask to 2020.12.0.

  • Added many type annotations and updates to allow for gradually Cythonizing the scheduler.
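The annotation mechanism behaves like a stack of context-manager scopes whose key/value pairs are attached to the graph layers built inside them. A simplified, stdlib-only stand-in for the pattern (the real API is dask.annotate; everything below is illustrative):

```python
import contextlib

# Stack of active annotation scopes; layers created inside a scope
# would pick up the merged annotations from this stack.
_annotations = []

@contextlib.contextmanager
def annotate(**annotations):
    """Push a scope of annotations for the duration of the block."""
    _annotations.append(annotations)
    try:
        yield
    finally:
        _annotations.pop()

def current_annotations():
    """Merge the active scopes, innermost values winning."""
    merged = {}
    for scope in _annotations:
        merged.update(scope)
    return merged
```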

All changes

2.30.1 - 2020-11-03

2.30.0 - 2020-10-06

2.29.0 - 2020-10-02

2.28.0 - 2020-09-25

2.27.0 - 2020-09-18

2.26.0 - 2020-09-11

2.25.0 - 2020-08-28

2.24.0 - 2020-08-22

  • Move toolbar to above and fix y axis (#4043) Julia Signell

  • Make behavior clearer for how to get worker dashboard (#4047) Julia Signell

  • Worker dashboard clean up (#4046) Julia Signell

  • Add a default argument to the datasets and a possibility to override datasets (#4052) Nils Braun

  • Discover HTTP endpoints (#3744) Martin Durant

2.23.0 - 2020-08-14

2.22.0 - 2020-07-31

2.21.0 - 2020-07-17

2.20.0 - 2020-07-02

2.19.0 - 2020-06-19

2.18.0 - 2020-06-05

2.17.0 - 2020-05-26

2.16.0 - 2020-05-08

2.15.2 - 2020-05-01

2.15.1 - 2020-04-28

2.15.0 - 2020-04-24

2.14.0 - 2020-04-03

2.13.0 - 2020-03-25

2.12.0 - 2020-03-06

2.11.0 - 2020-02-19

2.10.0 - 2020-01-28

2.9.3 - 2020-01-17

2.9.2 - 2020-01-16

2.9.1 - 2019-12-27

2.9.0 - 2019-12-06

2.8.1 - 2019-11-22

2.8.0 - 2019-11-14

2.7.0 - 2019-11-08

This release drops support for Python 3.5

2.6.0 - 2019-10-15

2.5.2 - 2019-10-04

2.5.1 - 2019-09-27

2.5.0 - 2019-09-27

2.4.0 - 2019-09-13

2.3.2 - 2019-08-23

2.3.1 - 2019-08-22

2.3.0 - 2019-08-16

2.2.0 - 2019-07-31

2.1.0 - 2019-07-08

2.0.1 - 2019-06-26

We neglected to include python_requires= in our setup.py file, resulting in confusion for Python 2 users who erroneously received packages for 2.0.0. This is fixed in 2.0.1, and we have removed the 2.0.0 files from PyPI.

2.0.0 - 2019-06-25

1.28.1 - 2019-05-13

This is a small bugfix release due to a config change upstream.

1.28.0 - 2019-05-08

1.27.1 - 2019-04-29

1.27.0 - 2019-04-12

1.26.1 - 2019-03-29

1.26.0 - 2019-02-25

1.25.3 - 2019-01-31

1.25.2 - 2019-01-04

1.25.1 - 2018-12-15

1.25.0 - 2018-11-28

1.24.2 - 2018-11-15

1.24.1 - 2018-11-09

1.24.0 - 2018-10-26

1.23.3 - 2018-10-05

1.23.2 - 2018-09-17

1.23.1 - 2018-09-06

1.23.0 - 2018-08-30

1.22.1 - 2018-08-03

1.22.0 - 2018-06-14

1.21.8 - 2018-05-03

1.21.7 - 2018-05-02

1.21.6 - 2018-04-06

1.21.5 - 2018-03-31

1.21.4 - 2018-03-21

1.21.3 - 2018-03-08

1.21.2 - 2018-03-05

1.21.1 - 2018-02-22

1.21.0 - 2018-02-09

1.20.2 - 2017-12-07

1.20.1 - 2017-11-26

1.20.0 - 2017-11-17

1.19.3 - 2017-10-16

  • Handle None case in profile.identity (GH#1456)

  • Asyncio rewrite (GH#1458)

  • Add rejoin function partner to secede (GH#1462)

  • Nested compute (GH#1465)

  • Use LooseVersion when comparing Bokeh versions (GH#1470)

1.19.2 - 2017-10-06

  • as_completed doesn’t block on cancelled futures (GH#1436)

  • Notify waiting threads/coroutines on cancellation (GH#1438)

  • Set Future(inform=True) as default (GH#1437)

  • Rename Scheduler.transition_story to story (GH#1445)

  • Future uses default client by default (GH#1449)

  • Add keys= keyword to Client.call_stack (GH#1446)

  • Add get_current_task to worker (GH#1444)

  • Ensure that Client remains asynchronous before ioloop starts (GH#1452)

  • Remove “click for worker page” in bokeh plot (GH#1453)

  • Add Client.current() (GH#1450)

  • Clean handling of restart timeouts (GH#1442)

1.19.1 - 2017-09-25

  • Fix tool issues with TaskStream plot (GH#1425)

  • Move profile module to top level (GH#1423)

1.19.0 - 2017-09-24

  • Avoid storing messages in message log (GH#1361)

  • fileConfig does not disable existing loggers (GH#1380)

  • Offload upload_file disk I/O to separate thread (GH#1383)

  • Add missing SSLContext (GH#1385)

  • Collect worker thread information from sys._current_frames (GH#1387)

  • Add nanny timeout (GH#1395)

  • Restart worker if memory use goes above 95% (GH#1397)

  • Track workers memory use with psutil (GH#1398)

  • Track scheduler delay times in workers (GH#1400)

  • Add time slider to profile plot (GH#1403)

  • Change memory-limit keyword to refer to maximum number of bytes (GH#1405)

  • Add cancel(force=) keyword (GH#1408)

1.18.2 - 2017-09-02

  • Silently pass on cancelled futures in as_completed (GH#1366)

  • Fix unicode keys error in Python 2 (GH#1370)

  • Support numeric worker names

  • Add dask-mpi executable (GH#1367)

1.18.1 - 2017-08-25

  • Clean up forgotten keys in fire-and-forget workloads (GH#1250)

  • Handle missing extensions (GH#1263)

  • Allow recreate_exception on persisted collections (GH#1253)

  • Add asynchronous= keyword to blocking client methods (GH#1272)

  • Restrict to horizontal panning in bokeh plots (GH#1274)

  • Rename client.shutdown to client.close (GH#1275)

  • Avoid blocking on event loop (GH#1270)

  • Avoid cloudpickle errors for Client.get_versions (GH#1279)

  • Yield on Tornado IOStream.write futures (GH#1289)

  • Assume async behavior if inside a sync statement (GH#1284)

  • Avoid error messages on closing (GH#1297), (GH#1296) (GH#1318) (GH#1319)

  • Add timeout= keyword to get_client (GH#1290)

  • Respect timeouts when restarting (GH#1304)

  • Clean file descriptor and memory leaks in tests (GH#1317)

  • Deprecate Executor (GH#1302)

  • Add timeout to ThreadPoolExecutor.shutdown (GH#1330)

  • Clean up AsyncProcess handling (GH#1324)

  • Allow unicode keys in Python 2 scheduler (GH#1328)

  • Avoid leaking stolen data (GH#1326)

  • Improve error handling on failed nanny starts (GH#1337), (GH#1331)

  • Make Adaptive more flexible

  • Support --contact-address and --listen-address in worker (GH#1278)

  • Remove old dworker, dscheduler executables (GH#1355)

  • Exit workers if nanny process fails (GH#1345)

  • Auto pep8 and flake (GH#1353)

1.18.0 - 2017-07-08

1.17.1 - 2017-06-14

  • Remove Python 3.4 testing from travis-ci (GH#1157)

  • Remove ZMQ Support (GH#1160)

  • Fix memoryview nbytes issue in Python 2.7 (GH#1165)

  • Re-enable counters (GH#1168)

  • Improve scheduler.restart (GH#1175)

1.17.0 - 2017-06-09

  • Reevaluate worker occupancy periodically during scheduler downtime (GH#1038) (GH#1101)

  • Add AioClient asyncio-compatible client API (GH#1029) (GH#1092) (GH#1099)

  • Update Keras serializer (GH#1067)

  • Support TLS/SSL connections for security (GH#866) (GH#1034)

  • Always create new worker directory when passed --local-directory (GH#1079)

  • Support pre-scattering data when using joblib frontend (GH#1022)

  • Make workers more robust to failure of sizeof function (GH#1108) and writing to disk (GH#1096)

  • Add is_empty and update methods to as_completed (GH#1113)

  • Remove _get coroutine and replace with get(..., sync=False) (GH#1109)

  • Improve API compatibility with async/await syntax (GH#1115) (GH#1124)

  • Add distributed Queues (GH#1117) and shared Variables (GH#1128) to enable inter-client coordination

  • Support direct client-to-worker scattering and gathering (GH#1130) as well as performance enhancements when scattering data

  • Style improvements for bokeh web dashboards (GH#1126) (GH#1141) as well as a removal of the external bokeh process

  • HTML reprs for Future and Client objects (GH#1136)

  • Support nested collections in client.compute (GH#1144)

  • Use normal client API in asynchronous mode (GH#1152)

  • Remove old distributed.collections submodule (GH#1153)

1.16.3 - 2017-05-05

  • Add bokeh template files to MANIFEST (GH#1063)

  • Don’t set worker_client.get as default get (GH#1061)

  • Clean up logging on Client().shutdown() (GH#1055)

1.16.2 - 2017-05-03

  • Support async with Client syntax (GH#1053)

  • Use internal bokeh server for default diagnostics server (GH#1047)

  • Improve styling of bokeh plots when empty (GH#1046) (GH#1037)

  • Support efficient serialization for sparse arrays (GH#1040)

  • Prioritize newly arrived work in worker (GH#1035)

  • Prescatter data with joblib backend (GH#1022)

  • Make client.restart more robust to worker failure (GH#1018)

  • Support preloading a module or script in dask-worker or dask-scheduler processes (GH#1016)

  • Specify network interface in command line interface (GH#1007)

  • Client.scatter supports a single element (GH#1003)

  • Use blosc compression on all memoryviews passing through comms (GH#998)

  • Add concurrent.futures-compatible Executor (GH#997)

  • Add as_completed.batches method and return results (GH#994) (GH#971)

  • Allow worker_clients to optionally stay within the thread pool (GH#993)

  • Add bytes-stored and tasks-processing diagnostic histograms (GH#990)

  • Run supports non-msgpack-serializable results (GH#965)

1.16.1 - 2017-03-22

  • Use inproc transport in LocalCluster (GH#919)

  • Add structured and queryable cluster event logs (GH#922)

  • Use connection pool for inter-worker communication (GH#935)

  • Robustly shut down spawned worker processes at shutdown (GH#928)

  • Worker death timeout (GH#940)

  • More visual reporting of exceptions in progressbar (GH#941)

  • Render disk and serialization events to task stream visual (GH#943)

  • Support async for / await protocol (GH#952)

  • Ensure random generators are re-seeded in worker processes (GH#953)

  • Upload sourcecode as zip module (GH#886)

  • Replay remote exceptions in local process (GH#894)

1.16.0 - 2017-02-24

  • First come first served priorities on client submissions (GH#840)

  • Can specify Bokeh internal ports (GH#850)

  • Allow stolen tasks to return from either worker (GH#853), (GH#875)

  • Add worker resource constraints during execution (GH#857)

  • Send small data through Channels (GH#858)

  • Better estimates for SciPy sparse matrix memory costs (GH#863)

  • Avoid stealing long running tasks (GH#873)

  • Maintain fortran ordering of NumPy arrays (GH#876)

  • Add --scheduler-file keyword to dask-scheduler (GH#877)

  • Add serializer for Keras models (GH#878)

  • Support uploading modules from zip files (GH#886)

  • Improve titles of Bokeh dashboards (GH#895)

1.15.2 - 2017-01-27

  • Fix a bug where arrays with large dtypes or shapes were being improperly compressed (GH#830 GH#832 GH#833)

  • Extend as_completed to accept new futures during iteration (GH#829)

  • Add --nohost keyword to dask-ssh startup utility (GH#827)

  • Support scheduler shutdown of remote workers, useful for adaptive clusters (GH#811 GH#816 GH#821)

  • Add Client.run_on_scheduler method for running debug functions on the scheduler (GH#808)

1.15.1 - 2017-01-11

  • Make compatible with Bokeh 0.12.4 (GH#803)

  • Avoid compressing arrays if not helpful (GH#777)

  • Optimize inter-worker data transfer (GH#770) (GH#790)

  • Add --local-directory keyword to worker (GH#788)

  • Enable workers to arrive to the cluster with their own data. Useful if a worker leaves and comes back (GH#785)

  • Resolve thread safety bug when using local_client (GH#802)

  • Resolve scheduling issues in worker (GH#804)

1.15.0 - 2017-01-02

  • Major Worker refactor (GH#704)

  • Major Scheduler refactor (GH#717) (GH#722) (GH#724) (GH#742) (GH#743)

  • Add check (default is False) option to Client.get_versions to raise if the versions don’t match on client, scheduler & workers (GH#664)

  • Future.add_done_callback executes in separate thread (GH#656)

  • Clean up numpy serialization (GH#670)

  • Support serialization of Tornado v4.5 coroutines (GH#673)

  • Use CPickle instead of Pickle in Python 2 (GH#684)

  • Use Forkserver rather than Fork on Unix in Python 3 (GH#687)

  • Support abstract resources for per-task constraints (GH#694) (GH#720) (GH#737)

  • Add TCP timeouts (GH#697)

  • Add embedded Bokeh server to workers (GH#709) (GH#713) (GH#738)

  • Add embedded Bokeh server to scheduler (GH#724) (GH#736) (GH#738)

  • Add more precise timers for Windows (GH#713)

  • Add Versioneer (GH#715)

  • Support inter-client channels (GH#729) (GH#749)

  • Scheduler Performance improvements (GH#740) (GH#760)

  • Improve load balancing and work stealing (GH#747) (GH#754) (GH#757)

  • Run Tornado coroutines on workers

  • Avoid slow sizeof call on Pandas dataframes (GH#758)

1.14.3 - 2016-11-13

  • Remove custom Bokeh export tool that implicitly relied on nodejs (GH#655)

  • Clean up scheduler logging (GH#657)

1.14.2 - 2016-11-11

  • Support more numpy dtypes in custom serialization, (GH#627), (GH#630), (GH#636)

  • Update Bokeh plots (GH#628)

  • Improve spill to disk heuristics (GH#633)

  • Add Export tool to Task Stream plot

  • Reverse frame order in loads for very many frames (GH#651)

  • Add timeout when waiting on write (GH#653)

1.14.0 - 2016-11-03

  • Add Client.get_versions() function to return software and package information from the scheduler, workers, and client (GH#595)

  • Improved windows support (GH#577) (GH#590) (GH#583) (GH#597)

  • Clean up rpc objects explicitly (GH#584)

  • Normalize collections against known futures (GH#587)

  • Add key= keyword to map to specify keynames (GH#589)

  • Custom data serialization (GH#606)

  • Refactor the web interface (GH#608) (GH#615) (GH#621)

  • Allow user-supplied Executor in Worker (GH#609)

  • Pass Worker kwargs through LocalCluster

1.13.3 - 2016-10-15

  • Schedulers can retire workers cleanly

  • Add Future.add_done_callback for concurrent.futures compatibility

  • Update web interface to be consistent with Bokeh 0.12.3

  • Close streams explicitly, avoiding race conditions and supporting more robust restarts on Windows.

  • Improved shuffled performance for dask.dataframe

  • Add adaptive allocation cluster manager

  • Reduce administrative overhead when dealing with many workers

  • dask-ssh --log-directory . no longer errors

  • Microperformance tuning for the scheduler

1.13.2

  • Revert dask_worker to use fork rather than subprocess by default

  • Scatter retains type information

  • Bokeh always uses subprocess rather than spawn

1.13.1

  • Fix critical Windows error with dask_worker executable

1.13.0

  • Rename Executor to Client (GH#492)

  • Add --memory-limit option to dask-worker, enabling spill-to-disk behavior when running out of memory (GH#485)

  • Add --pid-file option to dask-worker and --dask-scheduler (GH#496)

  • Add upload_environment function to distribute conda environments. This is experimental, undocumented, and may change without notice. (GH#494)

  • Add workers= keyword argument to Client.compute and Client.persist, supporting location-restricted workloads with Dask collections (GH#484)

  • Add optional dask_worker= keyword to client.run functions that gets provided the worker or nanny object

  • Add nanny=False keyword to Client.run, allowing for the execution of arbitrary functions on the nannies as well as normal workers

1.12.2

This release adds some new features and removes dead code

  • Publish and share datasets on the scheduler between many clients (GH#453). See Publish Datasets.

  • Launch tasks from other tasks (experimental) (GH#471). See Launch Tasks from Tasks.

  • Remove unused code, notably the Center object and older client functions (GH#478)

  • Executor() and LocalCluster() are now robust to Bokeh’s absence (GH#481)

  • Removed s3fs and boto3 from requirements. These have moved to Dask.

1.12.1

This release is largely a bugfix release, recovering from the previous large refactor.

  • Fixes from previous refactor
    • Ensure idempotence across clients

    • Stress test losing scattered data permanently

  • IPython fixes
    • Add start_ipython_scheduler method to Executor

    • Add %remote magic for workers

    • Clean up code and tests

  • Pool connects to maintain reuse and reduce number of open file handles

  • Re-implement work stealing algorithm

  • Support cancellation of tuple keys, such as occur in dask.arrays

  • Start synchronizing against worker data that may be superfluous

  • Improve bokeh plots styling
    • Add memory plot tracking number of bytes

    • Make the progress bars more compact and align colors

    • Add workers/ page with workers table, stacks/processing plot, and memory

  • Add this release notes document

1.12.0

This release was largely a refactoring release. Internals were changed significantly without many new features.

  • Major refactor of the scheduler to use transitions system

  • Tweak protocol to traverse down complex messages in search of large bytestrings

  • Add dask-submit and dask-remote

  • Refactor HDFS writing to align with changes in the dask library

  • Executor reconnects to scheduler on broken connection or failed scheduler

  • Support sklearn.external.joblib as well as normal joblib