123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664 |
- LZMA SDK 4.57
- -------------
- LZMA SDK Copyright (C) 1999-2007 Igor Pavlov
- LZMA SDK provides the documentation, samples, header files, libraries,
- and tools you need to develop applications that use LZMA compression.
- LZMA is default and general compression method of 7z format
- in 7-Zip compression program (www.7-zip.org). LZMA provides high
- compression ratio and very fast decompression.
- LZMA is an improved version of famous LZ77 compression algorithm.
- It was improved in way of maximum increasing of compression ratio,
- keeping high decompression speed and low memory requirements for
- decompressing.
- LICENSE
- -------
- LZMA SDK is available under any of the following licenses:
- 1) GNU Lesser General Public License (GNU LGPL)
- 2) Common Public License (CPL)
- 3) Simplified license for unmodified code (read SPECIAL EXCEPTION)
- 4) Proprietary license
- It means that you can select one of these four options and follow rules of that license.
- 1,2) GNU LGPL and CPL licenses are pretty similar and both these
- licenses are classified as
- - "Free software licenses" at http://www.gnu.org/
- - "OSI-approved" at http://www.opensource.org/
- 3) SPECIAL EXCEPTION
- Igor Pavlov, as the author of this code, expressly permits you
- to statically or dynamically link your code (or bind by name)
- to the files from LZMA SDK without subjecting your linked
- code to the terms of the CPL or GNU LGPL.
- Any modifications or additions to files from LZMA SDK, however,
- are subject to the GNU LGPL or CPL terms.
- SPECIAL EXCEPTION allows you to use LZMA SDK in applications with closed code,
- while you keep LZMA SDK code unmodified.
- SPECIAL EXCEPTION #2: Igor Pavlov, as the author of this code, expressly permits
- you to use this code under the same terms and conditions contained in the License
- Agreement you have for any previous version of LZMA SDK developed by Igor Pavlov.
- SPECIAL EXCEPTION #2 allows owners of proprietary licenses to use latest version
- of LZMA SDK as update for previous versions.
- SPECIAL EXCEPTION #3: Igor Pavlov, as the author of this code, expressly permits
- you to use code of the following files:
- BranchTypes.h, LzmaTypes.h, LzmaTest.c, LzmaStateTest.c, LzmaAlone.cpp,
- LzmaAlone.cs, LzmaAlone.java
- as public domain code.
- 4) Proprietary license
- LZMA SDK also can be available under a proprietary license which
- can include:
- 1) Right to modify code without subjecting modified code to the
- terms of the CPL or GNU LGPL
- 2) Technical support for code
- To request such proprietary license or any additional consultations,
- send email message from that page:
- http://www.7-zip.org/support.html
- You should have received a copy of the GNU Lesser General Public
- License along with this library; if not, write to the Free Software
- Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
- You should have received a copy of the Common Public License
- along with this library.
- LZMA SDK Contents
- -----------------
- LZMA SDK includes:
- - C++ source code of LZMA compressing and decompressing
- - ANSI-C compatible source code for LZMA decompressing
- - C# source code for LZMA compressing and decompressing
- - Java source code for LZMA compressing and decompressing
- - Compiled file->file LZMA compressing/decompressing program for Windows system
- ANSI-C LZMA decompression code was ported from original C++ sources to C.
- Also it was simplified and optimized for code size.
- But it is fully compatible with LZMA from 7-Zip.
- UNIX/Linux version
- ------------------
- To compile C++ version of file->file LZMA, go to directory
- C/7zip/Compress/LZMA_Alone
- and type "make" or "make clean all" to recompile all.
- In some UNIX/Linux versions you must compile LZMA with static libraries.
- To compile with static libraries, change string in makefile
- LIB = -lm
- to string
- LIB = -lm -static
- Files
- ---------------------
- C - C source code
- CPP - CPP source code
- CS - C# source code
- Java - Java source code
- lzma.txt - LZMA SDK description (this file)
- 7zFormat.txt - 7z Format description
- 7zC.txt - 7z ANSI-C Decoder description (this file)
- methods.txt - Compression method IDs for .7z
- LGPL.txt - GNU Lesser General Public License
- CPL.html - Common Public License
- lzma.exe - Compiled file->file LZMA encoder/decoder for Windows
- history.txt - history of the LZMA SDK
- Source code structure
- ---------------------
- C - C files
- Compress - files related to compression/decompression
- Lz - files related to LZ (Lempel-Ziv) compression algorithm
- Lzma - ANSI-C compatible LZMA decompressor
- LzmaDecode.h - interface for LZMA decoding on ANSI-C
- LzmaDecode.c - LZMA decoding on ANSI-C (new fastest version)
- LzmaDecodeSize.c - LZMA decoding on ANSI-C (old size-optimized version)
- LzmaTest.c - test application that decodes LZMA encoded file
- LzmaTypes.h - basic types for LZMA Decoder
- LzmaStateDecode.h - interface for LZMA decoding (State version)
- LzmaStateDecode.c - LZMA decoding on ANSI-C (State version)
- LzmaStateTest.c - test application (State version)
- Branch - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
- Archive - files related to archiving
- 7z_C - 7z ANSI-C Decoder
- CPP -- CPP files
- Common - common files for C++ projects
- Windows - common files for Windows related code
- 7zip - files related to 7-Zip Project
- Common - common files for 7-Zip
- Compress - files related to compression/decompression
- LZ - files related to LZ (Lempel-Ziv) compression algorithm
- Copy - Copy coder
- RangeCoder - Range Coder (special code of compression/decompression)
- LZMA - LZMA compression/decompression on C++
- LZMA_Alone - file->file LZMA compression/decompression
- Branch - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
- Archive - files related to archiving
- Common - common files for archive handling
- 7z - 7z C++ Encoder/Decoder
- Bundles - Modules that are bundles of other modules
-
- Alone7z - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
- Format7zR - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
- Format7zExtractR - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.
- UI - User Interface files
-
- Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
- Common - Common UI files
- Console - Code for console archiver
- CS - C# files
- 7zip
- Common - some common files for 7-Zip
- Compress - files related to compression/decompression
- LZ - files related to LZ (Lempel-Ziv) compression algorithm
- LZMA - LZMA compression/decompression
- LzmaAlone - file->file LZMA compression/decompression
- RangeCoder - Range Coder (special code of compression/decompression)
- Java - Java files
- SevenZip
- Compression - files related to compression/decompression
- LZ - files related to LZ (Lempel-Ziv) compression algorithm
- LZMA - LZMA compression/decompression
- RangeCoder - Range Coder (special code of compression/decompression)
- C/C++ source code of LZMA SDK is part of 7-Zip project.
- You can find ANSI-C LZMA decompressing code at folder
- C/7zip/Compress/Lzma
- 7-Zip doesn't use that ANSI-C LZMA code and that code was developed
- specially for this SDK. And files from C/7zip/Compress/Lzma do not need
- files from other directories of SDK for compiling.
- 7-Zip source code can be downloaded from 7-Zip's SourceForge page:
- http://sourceforge.net/projects/sevenzip/
- LZMA features
- -------------
- - Variable dictionary size (up to 1 GB)
- - Estimated compressing speed: about 1 MB/s on 1 GHz CPU
- - Estimated decompressing speed:
- - 8-12 MB/s on 1 GHz Intel Pentium 3 or AMD Athlon
- - 500-1000 KB/s on 100 MHz ARM, MIPS, PowerPC or other simple RISC
- - Small memory requirements for decompressing (8-32 KB + DictionarySize)
- - Small code size for decompressing: 2-8 KB (depending from
- speed optimizations)
- LZMA decoder uses only integer operations and can be
- implemented in any modern 32-bit CPU (or on 16-bit CPU with some conditions).
- Some critical operations that affect to speed of LZMA decompression:
- 1) 32*16 bit integer multiply
- 2) Misspredicted branches (penalty mostly depends from pipeline length)
- 3) 32-bit shift and arithmetic operations
- Speed of LZMA decompressing mostly depends from CPU speed.
- Memory speed has no big meaning. But if your CPU has small data cache,
- overall weight of memory speed will slightly increase.
- How To Use
- ----------
- Using LZMA encoder/decoder executable
- --------------------------------------
- Usage: LZMA <e|d> inputFile outputFile [<switches>...]
- e: encode file
- d: decode file
- b: Benchmark. There are two tests: compressing and decompressing
- with LZMA method. Benchmark shows rating in MIPS (million
- instructions per second). Rating value is calculated from
- measured speed and it is normalized with AMD Athlon 64 X2 CPU
- results. Also Benchmark checks possible hardware errors (RAM
- errors in most cases). Benchmark uses these settings:
- (-a1, -d21, -fb32, -mfbt4). You can change only -d. Also you
- can change number of iterations. Example for 30 iterations:
- LZMA b 30
- Default number of iterations is 10.
- <Switches>
-
- -a{N}: set compression mode 0 = fast, 1 = normal
- default: 1 (normal)
- d{N}: Sets Dictionary size - [0, 30], default: 23 (8MB)
- The maximum value for dictionary size is 1 GB = 2^30 bytes.
- Dictionary size is calculated as DictionarySize = 2^N bytes.
- For decompressing file compressed by LZMA method with dictionary
- size D = 2^N you need about D bytes of memory (RAM).
- -fb{N}: set number of fast bytes - [5, 273], default: 128
- Usually big number gives a little bit better compression ratio
- and slower compression process.
- -lc{N}: set number of literal context bits - [0, 8], default: 3
- Sometimes lc=4 gives gain for big files.
- -lp{N}: set number of literal pos bits - [0, 4], default: 0
- lp switch is intended for periodical data when period is
- equal 2^N. For example, for 32-bit (4 bytes)
- periodical data you can use lp=2. Often it's better to set lc0,
- if you change lp switch.
- -pb{N}: set number of pos bits - [0, 4], default: 2
- pb switch is intended for periodical data
- when period is equal 2^N.
- -mf{MF_ID}: set Match Finder. Default: bt4.
- Algorithms from hc* group doesn't provide good compression
- ratio, but they often works pretty fast in combination with
- fast mode (-a0).
- Memory requirements depend from dictionary size
- (parameter "d" in table below).
- MF_ID Memory Description
- bt2 d * 9.5 + 4MB Binary Tree with 2 bytes hashing.
- bt3 d * 11.5 + 4MB Binary Tree with 3 bytes hashing.
- bt4 d * 11.5 + 4MB Binary Tree with 4 bytes hashing.
- hc4 d * 7.5 + 4MB Hash Chain with 4 bytes hashing.
- -eos: write End Of Stream marker. By default LZMA doesn't write
- eos marker, since LZMA decoder knows uncompressed size
- stored in .lzma file header.
- -si: Read data from stdin (it will write End Of Stream marker).
- -so: Write data to stdout
- Examples:
- 1) LZMA e file.bin file.lzma -d16 -lc0
- compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
- and 0 literal context bits. -lc0 allows to reduce memory requirements
- for decompression.
- 2) LZMA e file.bin file.lzma -lc0 -lp2
- compresses file.bin to file.lzma with settings suitable
- for 32-bit periodical data (for example, ARM or MIPS code).
- 3) LZMA d file.lzma file.bin
- decompresses file.lzma to file.bin.
- Compression ratio hints
- -----------------------
- Recommendations
- ---------------
- To increase compression ratio for LZMA compressing it's desirable
- to have aligned data (if it's possible) and also it's desirable to locate
- data in such order, where code is grouped in one place and data is
- grouped in other place (it's better than such mixing: code, data, code,
- data, ...).
- Using Filters
- -------------
- You can increase compression ratio for some data types, using
- special filters before compressing. For example, it's possible to
- increase compression ratio on 5-10% for code for those CPU ISAs:
- x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.
- You can find C/C++ source code of such filters in folder "7zip/Compress/Branch"
- You can check compression ratio gain of these filters with such
- 7-Zip commands (example for ARM code):
- No filter:
- 7z a a1.7z a.bin -m0=lzma
- With filter for little-endian ARM code:
- 7z a a2.7z a.bin -m0=bc_arm -m1=lzma
- With filter for big-endian ARM code (using additional Swap4 filter):
- 7z a a3.7z a.bin -m0=swap4 -m1=bc_arm -m2=lzma
- It works in such manner:
- Compressing = Filter_encoding + LZMA_encoding
- Decompressing = LZMA_decoding + Filter_decoding
- Compressing and decompressing speed of such filters is very high,
- so it will not increase decompressing time too much.
- Moreover, it reduces decompression time for LZMA_decoding,
- since compression ratio with filtering is higher.
- These filters convert CALL (calling procedure) instructions
- from relative offsets to absolute addresses, so such data becomes more
- compressible. Source code of these CALL filters is pretty simple
- (about 20 lines of C++), so you can convert it from C++ version yourself.
- For some ISAs (for example, for MIPS) it's impossible to get gain from such filter.
- LZMA compressed file format
- ---------------------------
- Offset Size Description
- 0 1 Special LZMA properties for compressed data
- 1 4 Dictionary size (little endian)
- 5 8 Uncompressed size (little endian). -1 means unknown size
- 13 Compressed data
- ANSI-C LZMA Decoder
- ~~~~~~~~~~~~~~~~~~~
- To compile ANSI-C LZMA Decoder you can use one of the following files sets:
- 1) LzmaDecode.h + LzmaDecode.c + LzmaTest.c (fastest version)
- 2) LzmaDecode.h + LzmaDecodeSize.c + LzmaTest.c (old size-optimized version)
- 3) LzmaStateDecode.h + LzmaStateDecode.c + LzmaStateTest.c (zlib-like interface)
- Memory requirements for LZMA decoding
- -------------------------------------
- LZMA decoder doesn't allocate memory itself, so you must
- allocate memory and send it to LZMA.
- Stack usage of LZMA decoding function for local variables is not
- larger than 200 bytes.
- How To decompress data
- ----------------------
- LZMA Decoder (ANSI-C version) now supports 5 interfaces:
- 1) Single-call Decompressing
- 2) Single-call Decompressing with input stream callback
- 3) Multi-call Decompressing with output buffer
- 4) Multi-call Decompressing with input callback and output buffer
- 5) Multi-call State Decompressing (zlib-like interface)
- Variant-5 is similar to Variant-4, but Variant-5 doesn't use callback functions.
- Decompressing steps
- -------------------
- 1) read LZMA properties (5 bytes):
- unsigned char properties[LZMA_PROPERTIES_SIZE];
- 2) read uncompressed size (8 bytes, little-endian)
- 3) Decode properties:
- CLzmaDecoderState state; /* it's 24-140 bytes structure, if int is 32-bit */
- if (LzmaDecodeProperties(&state.Properties, properties, LZMA_PROPERTIES_SIZE) != LZMA_RESULT_OK)
- return PrintError(rs, "Incorrect stream properties");
- 4) Allocate memory block for internal Structures:
- state.Probs = (CProb *)malloc(LzmaGetNumProbs(&state.Properties) * sizeof(CProb));
- if (state.Probs == 0)
- return PrintError(rs, kCantAllocateMessage);
- LZMA decoder uses array of CProb variables as internal structure.
- By default, CProb is unsigned_short. But you can define _LZMA_PROB32 to make
- it unsigned_int. It can increase speed on some 32-bit CPUs, but memory
- usage will be doubled in that case.
- 5) Main Decompressing
- You must use one of the following interfaces:
- 5.1 Single-call Decompressing
- -----------------------------
- When to use: RAM->RAM decompressing
- Compile files: LzmaDecode.h, LzmaDecode.c
- Compile defines: no defines
- Memory Requirements:
- - Input buffer: compressed size
- - Output buffer: uncompressed size
- - LZMA Internal Structures (~16 KB for default settings)
- Interface:
- int res = LzmaDecode(&state,
- inStream, compressedSize, &inProcessed,
- outStream, outSize, &outProcessed);
- 5.2 Single-call Decompressing with input stream callback
- --------------------------------------------------------
- When to use: File->RAM or Flash->RAM decompressing.
- Compile files: LzmaDecode.h, LzmaDecode.c
- Compile defines: _LZMA_IN_CB
- Memory Requirements:
- - Buffer for input stream: any size (for example, 16 KB)
- - Output buffer: uncompressed size
- - LZMA Internal Structures (~16 KB for default settings)
- Interface:
- typedef struct _CBuffer
- {
- ILzmaInCallback InCallback;
- FILE *File;
- unsigned char Buffer[kInBufferSize];
- } CBuffer;
- int LzmaReadCompressed(void *object, const unsigned char **buffer, SizeT *size)
- {
- CBuffer *bo = (CBuffer *)object;
- *buffer = bo->Buffer;
- *size = MyReadFile(bo->File, bo->Buffer, kInBufferSize);
- return LZMA_RESULT_OK;
- }
- CBuffer g_InBuffer;
- g_InBuffer.File = inFile;
- g_InBuffer.InCallback.Read = LzmaReadCompressed;
- int res = LzmaDecode(&state,
- &g_InBuffer.InCallback,
- outStream, outSize, &outProcessed);
- 5.3 Multi-call decompressing with output buffer
- -----------------------------------------------
- When to use: RAM->File decompressing
- Compile files: LzmaDecode.h, LzmaDecode.c
- Compile defines: _LZMA_OUT_READ
- Memory Requirements:
- - Input buffer: compressed size
- - Buffer for output stream: any size (for example, 16 KB)
- - LZMA Internal Structures (~16 KB for default settings)
- - LZMA dictionary (dictionary size is encoded in stream properties)
-
- Interface:
- state.Dictionary = (unsigned char *)malloc(state.Properties.DictionarySize);
- LzmaDecoderInit(&state);
- do
- {
- LzmaDecode(&state,
- inBuffer, inAvail, &inProcessed,
- g_OutBuffer, outAvail, &outProcessed);
- inAvail -= inProcessed;
- inBuffer += inProcessed;
- }
- while you need more bytes
- see LzmaTest.c for more details.
- 5.4 Multi-call decompressing with input callback and output buffer
- ------------------------------------------------------------------
- When to use: File->File decompressing
- Compile files: LzmaDecode.h, LzmaDecode.c
- Compile defines: _LZMA_IN_CB, _LZMA_OUT_READ
- Memory Requirements:
- - Buffer for input stream: any size (for example, 16 KB)
- - Buffer for output stream: any size (for example, 16 KB)
- - LZMA Internal Structures (~16 KB for default settings)
- - LZMA dictionary (dictionary size is encoded in stream properties)
-
- Interface:
- state.Dictionary = (unsigned char *)malloc(state.Properties.DictionarySize);
-
- LzmaDecoderInit(&state);
- do
- {
- LzmaDecode(&state,
- &bo.InCallback,
- g_OutBuffer, outAvail, &outProcessed);
- }
- while you need more bytes
- see LzmaTest.c for more details:
- 5.5 Multi-call State Decompressing (zlib-like interface)
- ------------------------------------------------------------------
- When to use: file->file decompressing
- Compile files: LzmaStateDecode.h, LzmaStateDecode.c
- Compile defines:
- Memory Requirements:
- - Buffer for input stream: any size (for example, 16 KB)
- - Buffer for output stream: any size (for example, 16 KB)
- - LZMA Internal Structures (~16 KB for default settings)
- - LZMA dictionary (dictionary size is encoded in stream properties)
-
- Interface:
- state.Dictionary = (unsigned char *)malloc(state.Properties.DictionarySize);
-
- LzmaDecoderInit(&state);
- do
- {
- res = LzmaDecode(&state,
- inBuffer, inAvail, &inProcessed,
- g_OutBuffer, outAvail, &outProcessed,
- finishDecoding);
- inAvail -= inProcessed;
- inBuffer += inProcessed;
- }
- while you need more bytes
- see LzmaStateTest.c for more details:
- 6) Free all allocated blocks
- Note
- ----
- LzmaDecodeSize.c is size-optimized version of LzmaDecode.c.
- But compiled code of LzmaDecodeSize.c can be larger than
- compiled code of LzmaDecode.c. So it's better to use
- LzmaDecode.c in most cases.
- EXIT codes
- -----------
- LZMA decoder can return one of the following codes:
- #define LZMA_RESULT_OK 0
- #define LZMA_RESULT_DATA_ERROR 1
- If you use callback function for input data and you return some
- error code, LZMA Decoder also returns that code.
- LZMA Defines
- ------------
- _LZMA_IN_CB - Use callback for input data
- _LZMA_OUT_READ - Use read function for output data
- _LZMA_LOC_OPT - Enable local speed optimizations inside code.
- _LZMA_LOC_OPT is only for LzmaDecodeSize.c (size-optimized version).
- _LZMA_LOC_OPT doesn't affect LzmaDecode.c (speed-optimized version)
- and LzmaStateDecode.c
- _LZMA_PROB32 - It can increase speed on some 32-bit CPUs,
- but memory usage will be doubled in that case
- _LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler
- and long is 32-bit.
- _LZMA_SYSTEM_SIZE_T - Define it if you want to use system's size_t.
- You can use it to enable 64-bit sizes supporting
- C++ LZMA Encoder/Decoder
- ~~~~~~~~~~~~~~~~~~~~~~~~
- C++ LZMA code use COM-like interfaces. So if you want to use it,
- you can study basics of COM/OLE.
- By default, LZMA Encoder contains all Match Finders.
- But for compressing it's enough to have just one of them.
- So for reducing size of compressing code you can define:
- #define COMPRESS_MF_BT
- #define COMPRESS_MF_BT4
- and it will use only bt4 match finder.
- ---
- http://www.7-zip.org
- http://www.7-zip.org/support.html
|