Original author(s): Divon Lan
Initial release: 2020
Written in: C
Platform: Linux, Mac, Windows, others
License: Genozip non-commercial license

Genozip[1][2][3] is a proprietary universal compressor for genomic files. It is optimized to compress FASTQ, SAM/BAM/CRAM, VCF/BCF, FASTA, GVF, PHYLIP and 23andMe files, but it can also compress any other file, including non-genomic files.

Genozip works by segmenting a source file into its individual data contexts, applying context-specific algorithms to exploit correlations between values within the same context or between contexts, and finally applying the appropriate compression codec to each context.[2][4]
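The three stages described above (segmenting, context-specific transformation, per-context codec selection) can be illustrated with a minimal Python sketch. It is not Genozip's implementation, which is written in C and far more elaborate. Everything here is an assumption made for illustration: the function names, the choice of three FASTQ contexts, the requirement that read names look like `prefix.N`, the delta transformation on those names, and the use of Python's standard `zlib`, `bz2` and `lzma` modules as stand-in codecs.

```python
import bz2
import lzma
import zlib

# Stand-in codecs from the Python standard library (an assumption for this
# sketch; these are not Genozip's actual codecs).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def segment_fastq(text):
    """Stage 1: split 4-line FASTQ records into separate data contexts."""
    contexts = {"name": [], "seq": [], "qual": []}
    lines = text.strip().split("\n")
    for i in range(0, len(lines), 4):
        contexts["name"].append(lines[i][1:])   # drop the leading '@'
        contexts["seq"].append(lines[i + 1])
        contexts["qual"].append(lines[i + 3])   # lines[i + 2] is the '+' line
    return contexts


def delta_encode_names(names):
    """Stage 2: replace the numeric suffix of 'prefix.N' names with its
    difference from the previous name's suffix (illustrative only)."""
    encoded, prev = [], 0
    for name in names:
        prefix, _, num = name.rpartition(".")
        encoded.append(f"{prefix}.{int(num) - prev}")
        prev = int(num)
    return encoded


def delta_decode_names(encoded):
    """Inverse of delta_encode_names."""
    names, prev = [], 0
    for item in encoded:
        prefix, _, delta = item.rpartition(".")
        prev += int(delta)
        names.append(f"{prefix}.{prev}")
    return names


def compress_context(values):
    """Stage 3: try every codec on one context and keep the smallest output."""
    raw = "\n".join(values).encode("ascii")
    candidates = ((name, enc(raw)) for name, (enc, _) in CODECS.items())
    return min(candidates, key=lambda pair: len(pair[1]))


def decompress_context(codec_name, blob):
    """Decode one context with the codec that was recorded for it."""
    _, dec = CODECS[codec_name]
    return dec(blob).decode("ascii").split("\n")


def compress(text):
    """Run all three stages; returns {context: (codec_name, bytes)}."""
    contexts = segment_fastq(text)
    contexts["name"] = delta_encode_names(contexts["name"])
    return {ctx: compress_context(vals) for ctx, vals in contexts.items()}


def decompress(compressed):
    """Rebuild the per-context value lists from compress() output."""
    contexts = {ctx: decompress_context(*pair) for ctx, pair in compressed.items()}
    contexts["name"] = delta_decode_names(contexts["name"])
    return contexts
```

In this sketch, sequential read names become a run of identical strings after delta encoding, which a general-purpose codec then compresses more easily. Recording the chosen codec per context lets each stream be decoded independently.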

Genozip is designed to be extensible: it may be extended by adding new segmenters (to support compressing additional file formats), context-specific algorithms, and/or codecs.[1]

Genozip is the first universal compressor of genomic file formats (i.e. able to compress all common genomic file formats), and as such it is frequently cited and used as a benchmark in research papers on the compression of genomic data.[5][6][7][3]


