Software:Zstandard

Short description: Lossless compression algorithm
Zstandard
Original author(s): Yann Collet
Developer(s): Yann Collet, Nick Terrell, Przemysław Skibiński[1]
Initial release: 23 January 2015
Stable release: 1.5.0 / 14 May 2021[2]
Written in: C
Operating system: Cross-platform
Platform: Portable
Type: Data compression
License: Dual: BSD License, GPLv2

Zstandard (or zstd) is a lossless data compression algorithm developed by Yann Collet at Facebook. Zstd is the reference implementation in C. Version 1 of this implementation was released as open-source software on 31 August 2016.[3][4]

Features

Zstandard was designed to give a compression ratio comparable to that of the DEFLATE algorithm (developed in 1991 and used in the original ZIP and gzip programs), but faster, especially for decompression. It is tunable with compression levels ranging from negative 7 (fastest)[5] to 22 (slowest in compression speed, but best compression ratio).
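As a rough illustration of how this level range maps onto the command-line tool, the sketch below turns a numeric level into zstd flags. The helper name zstd_level_args is hypothetical, and the bounds (-7 to 22) are simply the ones quoted above. It assumes the CLI conventions that negative levels are requested with --fast=#, and that levels 20 to 22 additionally require --ultra.

```python
def zstd_level_args(level: int) -> list[str]:
    """Translate a numeric level into zstd CLI flags (hypothetical helper).

    Bounds follow the range quoted in the article (-7 .. 22); level 0 is
    rejected here because this sketch only models explicit levels.
    """
    if level == 0 or not -7 <= level <= 22:
        raise ValueError(f"level {level} is outside the range used in this sketch")
    if level < 0:
        # Negative ("fast") levels are spelled --fast=N on the command line.
        return [f"--fast={-level}"]
    if level >= 20:
        # Levels 20-22 need --ultra because they use more memory.
        return ["--ultra", f"-{level}"]
    return [f"-{level}"]


if __name__ == "__main__":
    for lvl in (-7, 3, 19, 22):
        print(lvl, zstd_level_args(lvl))
```

The resulting list could be appended to a ["zstd", ...] argument vector for subprocess.run, assuming the zstd binary is installed.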

The zstd package includes parallel (multi-threaded) implementations of both compression and decompression[citation needed]. Starting from version 1.3.2 (October 2017), zstd optionally implements very long range search and deduplication (--long, 128 MiB window) similar to rzip or lrzip.[6]

Compression speed can vary by a factor of 20 or more between the fastest and slowest levels, while decompression is uniformly fast, varying by less than 20% between the fastest and slowest levels.[7] The Zstandard command-line tool has an "adaptive" (--adapt) mode that varies the compression level depending on I/O conditions, mainly how fast it can write the output.

Zstd at its maximum compression level gives a compression ratio close to that of lzma, lzham, and ppmx, and performs better than lza or bzip2.[8][9] Zstandard reaches the current Pareto frontier, as it decompresses faster than any other currently available algorithm with a similar or better compression ratio.[10][11]

Dictionaries can have a large impact on the compression ratio of small files, so Zstandard can use a user-provided compression dictionary. It also offers a training mode, able to generate a dictionary from a set of samples.[12][13] In particular, one dictionary can be loaded to process large sets of files with redundancy between files, but not necessarily within each file, e.g., log files.
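The effect of a shared dictionary on small inputs can be demonstrated without any Zstandard bindings. The sketch below is a stand-in, not Zstandard itself: it uses the preset-dictionary feature (zdict) of Python's standard zlib module, which implements DEFLATE. The sample record and dictionary contents are made up for illustration; a trained Zstandard dictionary is built differently, but the principle of priming the compressor with content common to many small files is the same.

```python
import zlib

# Made-up example: text that many small log records are assumed to share.
SHARED = b'{"level":"info","service":"auth","msg":"user login ok","user_id":'
RECORD = SHARED + b"1042}"


def deflate_with_dict(payload: bytes, dictionary: bytes = b"") -> bytes:
    """Compress payload with DEFLATE, optionally primed with a preset dictionary."""
    if dictionary:
        comp = zlib.compressobj(level=9, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(payload) + comp.flush()


def inflate_with_dict(blob: bytes, dictionary: bytes = b"") -> bytes:
    """Decompress; the same dictionary must be supplied as at compression time."""
    decomp = zlib.decompressobj(zdict=dictionary) if dictionary else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    plain = deflate_with_dict(RECORD)
    primed = deflate_with_dict(RECORD, SHARED)
    print("raw bytes:        ", len(RECORD))
    print("without dict:     ", len(plain))
    print("with shared dict: ", len(primed))
```

Because the record is too short to contain much internal redundancy, almost all of the saving comes from matches against the dictionary, which is the situation described above for sets of similar small files.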

Design

Zstandard combines a dictionary-matching stage (LZ77) with a large search window and a fast entropy coding stage, using both Finite State Entropy (a fast tabled version of ANS, tANS, used for entries in the Sequences section), and Huffman coding (used for entries in the Literals section).[14]
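The split between literals and sequences can be illustrated with a toy LZ77 parser. This is a minimal sketch, not the Zstandard match finder or its block format: it uses a naive greedy search, and the function names, the min_match threshold, and the (literal_length, offset, match_length) tuple layout are assumptions chosen for readability. In a real encoder the literal bytes would then go to Huffman coding and the sequence fields to FSE.

```python
def lz77_parse(data: bytes, window: int = 1 << 16, min_match: int = 3):
    """Greedy toy LZ77: split data into a literal stream and a list of sequences.

    Each sequence is (literal_length, offset, match_length): emit that many
    pending literals, then copy match_length bytes from offset bytes back.
    """
    literals = bytearray()
    sequences = []
    pending = 0  # literals emitted since the previous match
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):
            k = 0
            # Matches may overlap the current position (j + k can pass i).
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            sequences.append((pending, best_off, best_len))
            pending = 0
            i += best_len
        else:
            literals.append(data[i])
            pending += 1
            i += 1
    return bytes(literals), sequences


def lz77_rebuild(literals: bytes, sequences) -> bytes:
    """Inverse of lz77_parse: replay literals and back-references."""
    out = bytearray()
    pos = 0
    for lit_len, offset, match_len in sequences:
        out += literals[pos:pos + lit_len]
        pos += lit_len
        for _ in range(match_len):
            out.append(out[-offset])  # byte-wise copy handles overlapping matches
    out += literals[pos:]  # trailing literals after the last match
    return bytes(out)


if __name__ == "__main__":
    lits, seqs = lz77_parse(b"abcabcabcabcxyz")
    print(lits, seqs)
```

The quadratic search is only acceptable for tiny inputs; practical implementations use hash tables or similar structures to find candidate matches in a large window.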

Because of the way that FSE carries over state between symbols, decompression involves processing symbols within the Sequences section of each block in reverse order (from last to first).
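The last-in, first-out behaviour of ANS-style coders can be seen in a very small model. The sketch below is not FSE or tANS: it is a toy range-variant ANS (rANS) that keeps the whole state in one arbitrary-precision Python integer and skips renormalisation. The symbol frequencies and function names are made up for illustration. It only shows that symbols come back out in the reverse of the order in which they were pushed.

```python
FREQ = {"a": 3, "b": 1}  # made-up symbol frequencies


def cumulative(freq):
    """Return (start offset per symbol, total frequency)."""
    start, total = {}, 0
    for sym, f in freq.items():
        start[sym] = total
        total += f
    return start, total


def ans_push(state, sym, freq, start, total):
    """Fold one symbol into the state (encoding step)."""
    f = freq[sym]
    return (state // f) * total + start[sym] + (state % f)


def ans_pop(state, freq, start, total):
    """Extract the most recently pushed symbol and the previous state."""
    slot = state % total
    for sym, f in freq.items():
        if start[sym] <= slot < start[sym] + f:
            return sym, f * (state // total) + slot - start[sym]
    raise ValueError("slot not covered by any symbol")


def encode(message, freq):
    start, total = cumulative(freq)
    state = 1
    for sym in message:
        state = ans_push(state, sym, freq, start, total)
    return state


def decode_lifo(state, count, freq):
    """Pop count symbols; they arrive last-encoded first."""
    start, total = cumulative(freq)
    out = []
    for _ in range(count):
        sym, state = ans_pop(state, freq, start, total)
        out.append(sym)
    return out


if __name__ == "__main__":
    msg = list("abaab")
    state = encode(msg, FREQ)
    print(decode_lifo(state, len(msg), FREQ))  # reverse of msg
```

An encoder and decoder therefore have to agree on which side reverses the order, which is why the direction of processing matters in formats built on this family of coders.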

Usage

Zstandard
Filename extension: .zst[15]
Internet media type: application/zstd[15]
Magic number: 28 b5 2f fd[15]
Type of format: Data compression
Standard: RFC 8478
Website: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
Zstandard Dictionary
Magic number: 37 a4 30 ec[15]
Standard: RFC 8478
Website: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#dictionary-format
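The two magic numbers listed above are enough to tell a Zstandard frame from a Zstandard dictionary by looking at the first four bytes. The following is a minimal sketch; the function name sniff_zstd and its return strings are hypothetical, and it deliberately ignores other cases such as skippable frames.

```python
ZSTD_FRAME_MAGIC = bytes.fromhex("28b52ffd")  # bytes as they appear on disk
ZSTD_DICT_MAGIC = bytes.fromhex("37a430ec")


def sniff_zstd(header: bytes) -> str:
    """Classify data by its first four bytes: 'frame', 'dictionary', or 'unknown'."""
    if header.startswith(ZSTD_FRAME_MAGIC):
        return "frame"
    if header.startswith(ZSTD_DICT_MAGIC):
        return "dictionary"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first four bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff_zstd(fh.read(4))


if __name__ == "__main__":
    print(sniff_zstd(bytes.fromhex("28b52ffd") + b"rest of frame"))
    print(sniff_zstd(b"\x1f\x8b\x08\x00"))  # a gzip header, for contrast
```

Read as a little-endian 32-bit integer, the frame magic is 0xFD2FB528, which is how the value is usually written in the format specification.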

The Linux kernel has included Zstandard since November 2017 (version 4.14) as a compression method for the btrfs and squashfs filesystems.[16][17][18]

In 2017, Allan Jude integrated Zstandard into the FreeBSD kernel,[19] and it was subsequently added as a compressor option for core dumps (of both user programs and kernel panics). It was also used to create a proof-of-concept OpenZFS compression method,[7] which was integrated in 2020.[20]

The AWS Redshift and RocksDB databases include support for field compression using Zstandard.[21]

In March 2018, Canonical tested[22] the use of zstd as a deb package compression method by default for the Ubuntu Linux distribution. Compared with xz compression of deb packages, zstd at level 19 decompresses significantly faster, but at the cost of 6% larger package files. Debian developer Ian Jackson favored waiting several years before official adoption.[23][24][25]

In 2018 the algorithm was published as RFC 8478, which also defines an associated media type "application/zstd", filename extension "zst", and HTTP content encoding "zstd".[15]

Arch Linux added support for zstd as a package compression method in October 2019 with the release of the pacman 5.2 package manager,[26] and in January 2020 switched from xz to zstd for the packages in the official repository. Arch compresses packages with zstd -c -T0 --ultra -20 -. Compared to xz, the combined size of all compressed packages increased by 0.8%, decompression is 14 times faster, decompression memory increased by 50 MiB when using multiple threads, and compression memory increased but scales with the number of threads used.[27][28][29]

Fedora added Zstandard support to RPM in May 2018 (Fedora release 28), and used it for packaging the release in October 2019 (Fedora 31).[30]

A full implementation of the algorithm, with an option to choose the compression level, is used in the .NSZ / .XCZ[31] file formats, developed by the homebrew community for the Nintendo Switch hybrid game console.[32]

7-Zip ZS, a port of 7-Zip FM with Zstandard (and other formats) support, is developed by Tino Reichardt.[33]

Modern7z, a Zstandard (and other formats) plugin for 7-Zip FM is developed by Denis Anisimov (TC4shell).[34]

7-Zip-zstd is a fork of 7-Zip with support for Zstandard.[35]

License

The reference implementation is published on GitHub under the BSD license.[36] From version 1.0, it carried an additional Grant of Patent Rights.[37]

From version 1.3.1,[38] this patent grant was dropped and the license was changed to a BSD + GPLv2 dual license.[39]

References

  1. "Contributors to facebook/zstd". https://github.com/facebook/zstd/graphs/contributors. 
  2. "Releases - facebook/zstd". https://github.com/facebook/zstd/releases. 
  3. Sergio De Simone (2016-09-02). "Facebook Open-Sources New Compression Algorithm Outperforming Zlib". InfoQ. https://www.infoq.com/news/2016/09/facebook-zstandard-compression. 
  4. "Life imitates satire: Facebook touts zlib killer just like Silicon Valley's Pied Piper". The Register. 2016-08-31. https://www.theregister.co.uk/2016/08/31/facebook_open_source_database/. 
  5. "Faster compression levels". https://github.com/facebook/zstd/releases/tag/v1.3.4. 
  6. "Command Line Interface for Zstandard library" (in en). 28 October 2021. https://github.com/facebook/zstd/blob/dev/programs/README.md. 
  7. "ZStandard in ZFS". 2017. http://www.open-zfs.org/w/images/b/b3/03-OpenZFS_2017_-_ZStandard_in_ZFS.pdf. 
  8. Matt Mahoney. "Silesia Open Source Compression Benchmark". http://mattmahoney.net/dc/silesia.html. 
  9. Matt Mahoney (2016-08-29). "Large Text Compression Benchmark, .2157 zstd". http://mattmahoney.net/dc/text.html. 
  10. TurboBench: Static/Dynamic web content compression benchmark, PowTurbo, https://sites.google.com/site/powturbo/home/web-compression 
  11. Matt Mahoney, Silesia Open Source Compression Benchmark, http://mattmahoney.net/dc/silesia.html 
  12. "Facebook developers report massive speedups and compression ratio improvements when using dictionaries". https://indico.fnal.gov/event/15154/contribution/5/material/slides/0.pdf. 
  13. "Smaller and faster data compression with Zstandard". Facebook. 31 August 2016. https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/. 
  14. "facebook/zstd". 28 October 2021. https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md#entropy-encoding. 
  15. Collet, Yann (October 2018), Kucherawy, Murray S., ed., Zstandard Compression and the application/zstd Media Type, Internet Engineering Task Force Request for Comments, doi:10.17487/RFC8478, RFC 8478, https://tools.ietf.org/html/rfc8478, retrieved 7 October 2020 
  16. "The rest of the 4.14 merge window". LWN.net. https://lwn.net/Articles/733846/. 
  17. "Linux_4.14 - Linux Kernel Newbies". Kernelnewbies.org. https://kernelnewbies.org/Linux_4.14. 
  18. "Zstd Compression For Btrfs & Squashfs Set For Linux 4.14, Already Used Within Facebook - Phoronix". https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.14-Zstd-Pull. 
  19. https://github.com/freebsd/freebsd/commit/28ef16535cde21eeeaf75f6006b3a77952b3b51
  20. "Add ZSTD support to ZFS · openzfs/ZFS@10b3c7f". https://github.com/openzfs/zfs/commit/10b3c7f5e424f54b3ba82dbf1600d866e64ec0a0. 
  21. "Zstandard Encoding - Amazon Redshift". 20 April 2019. https://docs.aws.amazon.com/redshift/latest/dg/zstd-encoding.html. 
  22. Larabel, Michael (12 March 2018). "Canonical Working On Zstd-Compressed Debian Packages For Ubuntu". Phoronix Media. https://www.phoronix.com/scan.php?page=news_item&px=Ubuntu-Zstd-Deb-Packages. "The developers at Canonical are considering a feature freeze exception to get this newly-developed Zstd Apt/Dpkg support in Ubuntu 18.04 LTS. In doing so, they mention they would be looking at enabling Zstd compression for packages by default in Ubuntu 18.10." 
  23. "New Ubuntu Installs Could Be Speed Up by 10% with the Zstd Compression Algorithm". Mar 12, 2018. https://news.softpedia.com/news/new-ubuntu-installs-could-be-speed-up-by-10-with-the-zstd-compression-algorithm-520177.shtml. 
  24. "Canonical Working On Zstd-Compressed Debian Packages For Ubuntu" (in en). 12 March 2018. https://www.phoronix.com/scan.php?page=news_item&px=Ubuntu-Zstd-Deb-Packages. 
  25. Jackson, Ian (2018-04-27). "RFC: Support for zstd in .deb packages?". debian-devel (Mailing list).
  26. "Arch Linux Nears Roll-Out of ZSTD Compressed Packages for Faster Pacman Installs - Phoronix". https://www.phoronix.com/scan.php?page=news_item&px=Arch-Linux-Pacman-Zstd-Near. 
  27. Broda, Robin (2020-01-04). "Now using Zstandard instead of xz for package compression". https://www.archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/. 
  28. Broda, Robin (2019-03-25). "RFC: (devtools) Changing default compression method to zstd". arch-dev-public (Mailing list).
  29. Broda, Robin; Polyak, Levente (2019-12-27). "makepkg.conf: change default compression method to zstd". https://git.archlinux.org/devtools.git/commit/?id=bcda211dd86b3bf54a9bc40d2e19f1aad4bbfbb8. 
  30. "Changes/Switch RPMS to ZSTD compression - Fedora Project Wiki". https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression. 
  31. "RELEASE - nsZip - NSP compressor/decompressor to reduce storage" (in en-US). https://gbatemp.net/threads/nszip-nsp-compressor-decompressor-to-reduce-storage.530313/. 
  32. Bosshard, Nico (2019-10-31), nsZip is a tool to compress/decompress Nintendo Switch games using the here specified NSZ file format: nicoboss/nsZip, https://github.com/nicoboss/nsZip, retrieved 2019-11-03 
  33. "Milkys Homepage - 7-Zip with support for Zstandard, Brotli, Lz4, Lz5 and Lizard Compression". https://mcmilk.de/projects/7-Zip-zstd. 
  34. "Modern7z". https://www.tc4shell.com/en/7zip/modern7z. 
  35. "README". 28 October 2021. https://github.com/mcmilk/7-Zip-zstd. 
  36. "Facebook open sources Zstandard data compression algorithm, aims to replace technology behind Zip". ZDnet. August 31, 2016. http://www.zdnet.com/article/facebook-open-sources-zstandard-data-compression-algorithm-aims-to-replace-technology-behind-zip/. 
  37. zstd/PATENTS "Additional Grant of Patent Rights Version 2", Facebook
  38. "Zstd v1.3.1 release", GitHub "facebook/zstd"
  39. "New license", GitHub "facebook/zstd"
