
pigz, unpigz

The name stands for "a parallel implementation of gzip for modern multi-processor, multi-core machines".

pigz, which stands for parallel implementation of gzip, is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data. pigz was written by Mark Adler, and uses the zlib and pthread libraries.

Manual

SYNOPSIS

pigz [ −cdfhikKlLmMnNqrRtz0…9,11 ] [ -b blocksize ] [ -p threads ] [ -S suffix ] [ name … ]
unpigz [ −cfhikKlLmMnNqrRtz ] [ -b blocksize ] [ -p threads ] [ -S suffix ] [ name … ]

DESCRIPTION

Pigz compresses using threads to make use of multiple processors and cores. The input is broken up into 128 KB chunks with each compressed in parallel. The individual check value for each chunk is also calculated in parallel. The compressed data is written in order to the output, and a combined check value is calculated from the individual check values.

The compressed data format generated is in the gzip, zlib, or single-entry zip format using the deflate compression method. The compression produces partial raw deflate streams which are concatenated by a single write thread and wrapped with the appropriate header and trailer, where the trailer contains the combined check value.
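As a sketch of that wrapping, the following Python snippet (an illustration, not pigz's actual code) builds a minimal gzip member by hand: a raw deflate stream framed by the 10-byte gzip header and the 8-byte trailer that carries the CRC-32 check value and the input length:

```python
import gzip
import struct
import zlib

payload = b"example payload " * 100

# raw deflate stream (no zlib/gzip framing), as pigz's compress threads produce
co = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
raw = co.compress(payload) + co.flush()

# 10-byte gzip header: magic, CM=deflate, no flags, mtime 0, XFL 0, OS=Unix
header = bytes([0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0x03])
# 8-byte trailer: CRC-32 check value and input length modulo 2**32, little-endian
trailer = struct.pack("<II", zlib.crc32(payload), len(payload) & 0xFFFFFFFF)

blob = header + raw + trailer
assert gzip.decompress(blob) == payload  # a valid gzip member
```

The stdlib gzip module accepts the hand-built member, which is all "wrapped with the appropriate header and trailer" amounts to.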

Each partial raw deflate stream is terminated by an empty stored block (using the Z_SYNC_FLUSH option of zlib), in order to end that partial bit stream at a byte boundary. That allows the partial streams to be concatenated simply as sequences of bytes. This adds a very small four to five byte overhead to the output for each input chunk.
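The chunked scheme can be mimicked in a few lines of Python with zlib and a thread pool. This is a simplified sketch: it skips the preset dictionary, so it corresponds to pigz -i rather than the default behaviour, while the chunk size and level match the pigz defaults:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 128 * 1024  # pigz's default input block size

def deflate_chunk(job):
    """Compress one chunk as a partial raw deflate stream."""
    chunk, is_last = job
    co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
    out = co.compress(chunk)
    # non-final chunks end with an empty stored block (Z_SYNC_FLUSH) so the
    # partial streams can be concatenated at byte boundaries; the last chunk
    # is finished normally so the combined stream is complete
    out += co.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return out

data = bytes(range(256)) * 2048  # ~512 KiB of sample input
chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
jobs = [(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]

with ThreadPoolExecutor() as pool:            # compress chunks in parallel
    parts = list(pool.map(deflate_chunk, jobs))

stream = b"".join(parts)                      # written in input order
assert zlib.decompress(stream, -zlib.MAX_WBITS) == data
```

Because zlib releases the GIL while compressing, the threads here genuinely run in parallel, which is the same reason pigz scales across cores.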

The default input block size is 128K, but can be changed with the -b option. The number of compress threads is set by default to the number of online processors, which can be changed using the -p option. Specifying -p 1 avoids the use of threads entirely.

The input blocks, while compressed independently, have the last 32K of the previous block loaded as a preset dictionary to preserve the compression effectiveness of deflating in a single thread. This can be turned off using the -i or --independent option, so that the blocks can be decompressed independently for partial error recovery or for random access. This also inserts an extra empty block to flag independent blocks by prefacing each with the nine-byte sequence (in hex): 00 00 FF FF 00 00 00 FF FF.
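The effect of the preset dictionary can be seen directly with zlib (an illustrative sketch, using raw deflate and the 32 KiB window as pigz does):

```python
import zlib

prev_block = b"x" * 1000 + b"a distinctive shared phrase. "
next_block = b"a distinctive shared phrase. " * 4

# default pigz behaviour: prime the block with the previous block's last 32 KiB
co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS,
                      zdict=prev_block[-32768:])
with_dict = co.compress(next_block) + co.flush()

# pigz -i behaviour: compress the block with no shared history
co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
without = co.compress(next_block) + co.flush()

# the primed block compresses at least as well, since its first occurrence
# of the phrase becomes a back-reference into the dictionary
assert len(with_dict) <= len(without)
```

This is the single-thread effectiveness the paragraph above refers to: with the dictionary, matches can reach back into the previous block just as they would in one continuous deflate stream.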

Decompression can’t be parallelized, at least not without specially prepared deflate streams for that purpose. As a result, pigz uses a single thread (the main thread) for decompression, but will create three other threads for reading, writing, and check calculation, which can speed up decompression under some circumstances. Parallel decompression can be turned off by specifying one process (-dp 1 or -tp 1).

All options on the command line are processed before any names are processed. If no names are provided on the command line, or if "-" is given as a name (but not after "--"), then the input is taken from stdin. If the GZIP or PIGZ environment variables are set, then options are taken from their values before any command line options are processed, first from GZIP, then from PIGZ.

Compressed files can be restored to their original form using pigz -d or unpigz.

OPTIONS

-# --fast --best

Regulate the speed of compression using the specified digit #, where −1 or −−fast indicates the fastest compression method (less compression) and −9 or −−best indicates the
slowest compression method (best compression). -0 is no compression. −11 gives a few
percent better compression at a severe cost in execution time, using the zopfli algorithm
by Jyrki Alakuijala. The default is −6.
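Levels 1 through 9 correspond to zlib's compression levels (level 11 instead invokes the separate zopfli algorithm, which is not part of zlib). The usual size/speed trade-off can be illustrated with zlib directly:

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog. " * 4000

fastest = zlib.compress(data, 1)   # like pigz -1 / --fast
best    = zlib.compress(data, 9)   # like pigz -9 / --best
stored  = zlib.compress(data, 0)   # like pigz -0: no compression

# higher levels never compress worse; level 0 only adds framing overhead
assert len(best) <= len(fastest) < len(stored)
assert len(stored) > len(data)
```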

-A --alias xxx

Use xxx as the name for any --zip entry from stdin (the default name is “-”).

-b --blocksize mmm

Set compression block size to mmmK (default 128 KiB).

-c --stdout --to-stdout

Write all processed output to stdout (won’t delete).

-C --comment ccc

Include the provided comment in the gzip header or zip central file header.

-d --decompress --uncompress

Decompress the compressed input.

-f --force

Force overwrite, compress .gz, links, and to terminal.

-h --help

Display a help screen and quit.

-H --huffman

Compress using the Huffman-only strategy.

-i --independent

Compress blocks independently for damage recovery.

-k --keep

Do not delete original file after processing.

-K --zip

Compress to PKWare zip (.zip) single entry format.

-l --list

List the contents of the compressed input.

-L --license

Display the pigz license and quit.

-m --no-time

Do not store or restore the modification time. -Nm will store or restore the name, but not
the modification time. Note that the order of the options is important.

-M --time

Store or restore the modification time. -nM will store or restore the modification time, but
not the name. Note that the order of the options is important.

-n --no-name

Do not store or restore the file name or the modification time. This is the default when
decompressing. When the file name is not restored from the header, the name of the
compressed file with the suffix stripped is the name of the decompressed file. When the
modification time is not restored from the header, the modification time of the compressed file is used (not the current time).

-N --name

Store or restore both the file name and the modification time. This is the default when
compressing.

-p --processes n

Allow up to n processes (default is the number of online processors).

-q --quiet --silent

Print no messages, even on error.

-r --recursive

Process the contents of all subdirectories.

-R --rsyncable

Input-determined block locations for rsync.

-S --suffix .sss

Use suffix .sss instead of .gz (for compression).

-t --test

Test the integrity of the compressed input.

-U --rle

Compress using the run length encoding strategy.

-v --verbose

Provide more verbose output.

-V --version

Show the version of pigz. -vV also shows the zlib version.

-z --zlib

Compress to zlib (.zz) instead of gzip format.

All arguments after "--" are treated as file names (for names that start with "-").
These options are unique to the -11 compression level:

-F --first

Do iterations first, before block split (default is last).

-I --iterations n

Number of iterations for optimization (default 15).

-J --maxsplits n

Maximum number of split blocks (default 15).

-O --oneblock

Do not split into smaller blocks (default is block splitting).

Install

  1. sudo apt install pigz
  2. Or build from source.

Demos

pigz [options] [files …]
-0 to -9, -11 : compression level
-p n : number of compression threads (default is the number of online processors)
-k : keep the original file after compression

Compress a file; this produces filename.gz:

pigz -9 -k filename    # uses all available CPU cores by default

pigz -6 -p 10 -k filename

pigz -5 -k -p 40 src.tar

tar cvf - test/ | pigz -9 -p 40 -f > test.tar.gz

tar cf - test/ | pigz -9 -f > test.tar.gz

Decompress a file:

gzip -d filename.gz

gunzip filename.gz

pigz -d filename.gz

unpigz filename.gz
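Because pigz writes standard gzip format, all of the commands above can decompress each other's output, and any gzip library can read pigz files. One related property worth knowing: gzip (and therefore pigz) files may be concatenated, and the result is a valid multi-member stream. A quick check using Python's stdlib gzip module (not pigz itself):

```python
import gzip

# concatenated gzip members decompress to the concatenated data,
# which is why `cat a.gz b.gz | unpigz` yields the joined contents
part1 = gzip.compress(b"part one ")
part2 = gzip.compress(b"part two")
assert gzip.decompress(part1 + part2) == b"part one part two"
```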

Copyright notice: this is an original post by the author, licensed under CC 4.0 BY-SA; when reposting, include the original source link and this notice.
Original link: https://blog.csdn.net/lianshaohua/article/details/127121507
