- This is a patched version of zlib, modified to use
- Pentium-Pro-optimized assembly code in the deflation algorithm. The
- files changed/added by this patch are:
- README.686
- match.S
- The speedup that this patch provides varies, depending on whether the
- compiler used to build the original version of zlib falls afoul of the
- PPro's speed traps. My own tests show a speedup of around 10-20% at
- the default compression level, and 20-30% using -9, against a version
- compiled using gcc 2.7.2.3. Your mileage may vary.
- Note that this code has been tailored for the PPro/PII in particular,
- and will not perform particularly well on a Pentium.
- If you are using an assembler other than GNU as, you will have to
- translate match.S to use your assembler's syntax. (Have fun.)
- Brian Raiter
- breadbox@muppetlabs.com
- April, 1998
- Added for zlib 1.1.3:
- The patches come from
- http://www.muppetlabs.com/~breadbox/software/assembly.html
- To compile zlib with this asm file, copy match.S to the zlib directory,
- then run:
- CFLAGS="-O3 -DASMV" ./configure
- make OBJA=match.o
- Update:
- I've been ignoring these assembly routines for years, believing that
- gcc's generated code had caught up with them sometime around gcc 2.95
- and the major rearchitecting of the Pentium 4. However, I recently
- learned that, despite what I believed, this code still has some life
- in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
- faster than the code produced by gcc 4.1.
- In acknowledgement of its continuing usefulness, I've altered the
- license to match that of the rest of zlib. Share and Enjoy!
- Brian Raiter
- breadbox@muppetlabs.com
- April, 2007