| <?xml version="1.0"?> |
| <!-- |
| |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| |
| --> |
| <document> |
| <properties> |
| <title>Commons Compress Examples</title> |
| <author email="dev@commons.apache.org">Commons Documentation Team</author> |
| </properties> |
| <body> |
| <section name="Examples"> |
| |
| <subsection name="Archivers and Compressors"> |
| <p>Commons Compress calls all formats that compress a single |
| stream of data compressor formats while all formats that |
| collect multiple entries inside a single (potentially |
| compressed) archive are archiver formats.</p> |
| |
| <p>The supported compressor formats are gzip, bzip2, xz, lzma, |
| Pack200, DEFLATE, Snappy, LZ4 and Z. The archiver formats are 7z, ar, arj, |
| cpio, dump, tar and zip. Pack200 is a special case as it can |
| only compress JAR files.</p> |
| |
| <p>We currently only provide read support for arj, |
| dump and Z. arj support is limited to uncompressed archives. 7z can read |
| archives with many of the compression and encryption algorithms |
| supported by 7z but doesn't support encryption when writing |
| archives.</p> |
| </subsection> |
| |
| <subsection name="Common Notes"> |
| <p>The stream classes all wrap around streams provided by the |
| calling code and they work on them directly without any |
| additional buffering. On the other hand most of them will |
| benefit from buffering so it is highly recommended that |
| users wrap their stream |
| in <code>Buffered<em>(In|Out)</em>putStream</code>s before |
| using the Commons Compress API.</p> |
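| <p>A minimal sketch of this advice using only JDK classes. The |
| class <code>BufferedCopy</code> and its <code>copy</code> method are |
| hypothetical names, not part of Commons Compress. In real code the |
| buffered streams would be handed to a Commons Compress stream instead |
| of being copied directly.</p> |
| <source><![CDATA[ |
```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Commons Compress.
final class BufferedCopy {

    // Copies everything from rawIn to rawOut through buffered wrappers and
    // returns the number of bytes copied. Both streams are closed afterwards.
    static long copy(InputStream rawIn, OutputStream rawOut) {
        try (InputStream in = new BufferedInputStream(rawIn);
             OutputStream out = new BufferedOutputStream(rawOut)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            return total;
        } catch (IOException e) {
            // Rethrown unchecked only to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }
}
```
| ]]></source> |
| <p>The try-with-resources statement closes the wrappers, which flushes |
| the output buffer and closes the underlying streams as well.</p> |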
| |
| </subsection> |
| |
| <subsection name="Factories"> |
| |
| <p>Compress provides factory methods to create input/output |
| streams based on the names of the compressor or archiver |
| format as well as factory methods that try to guess the |
| format of an input stream.</p> |
| |
| <p>To create a compressor writing to a given output by using |
| the algorithm name:</p> |
| <source><![CDATA[ |
| CompressorOutputStream gzippedOut = new CompressorStreamFactory() |
| .createCompressorOutputStream(CompressorStreamFactory.GZIP, myOutputStream); |
| ]]></source> |
| |
| <p>Make the factory guess the input format for a given |
| archiver stream:</p> |
| <source><![CDATA[ |
| ArchiveInputStream input = new ArchiveStreamFactory() |
| .createArchiveInputStream(originalInput); |
| ]]></source> |
| |
| <p>Make the factory guess the input format for a given |
| compressor stream:</p> |
| <source><![CDATA[ |
| CompressorInputStream input = new CompressorStreamFactory() |
| .createCompressorInputStream(originalInput); |
| ]]></source> |
| |
| <p>Note that there is no way to detect the lzma format, so only |
| the two-arg version of |
| <code>createCompressorInputStream</code> can be used for it. Prior |
| to Compress 1.9 the .Z format was not auto-detected |
| either.</p> |
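| <p>To illustrate what guessing a format involves, the following |
| sketch peeks at the leading bytes of a stream and compares them with |
| the well-known signatures of gzip, bzip2 and xz. It is not the |
| detection code used by Commons Compress. The class |
| <code>SignatureSniffer</code> is a hypothetical name and only three |
| formats are covered.</p> |
| <source><![CDATA[ |
```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration, not the Commons Compress implementation.
final class SignatureSniffer {

    // Returns "gzip", "bzip2", "xz" or "unknown". The stream is reset
    // afterwards, so it must support mark/reset.
    static String guess(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("mark/reset is required to peek at the signature");
        }
        byte[] head = new byte[6];
        int len = 0;
        try {
            in.mark(head.length);
            while (len < head.length) {
                int n = in.read(head, len, head.length - len);
                if (n < 0) {
                    break; // stream is shorter than the longest signature
                }
                len += n;
            }
            in.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(head, len, 0x1f, 0x8b)) {
            return "gzip";
        }
        if (startsWith(head, len, 'B', 'Z', 'h')) {
            return "bzip2";
        }
        if (startsWith(head, len, 0xfd, '7', 'z', 'X', 'Z', 0x00)) {
            return "xz";
        }
        return "unknown";
    }

    // True if the first bytes of head match the given unsigned byte values.
    private static boolean startsWith(byte[] head, int len, int... signature) {
        if (len < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((head[i] & 0xff) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```
| ]]></source> |
| <p>A format without a distinctive signature cannot be recognized this |
| way, which is why its name has to be passed explicitly.</p> |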
| |
| </subsection> |
| |
| <subsection name="Unsupported Features"> |
| <p>Many of the supported formats have developed different |
| dialects and extensions and some formats allow for features |
| (not yet) supported by Commons Compress.</p> |
| |
| <p>The <code>ArchiveInputStream</code> class provides a method |
| <code>canReadEntryData</code> that will return false if |
| Commons Compress can detect that an archive uses a feature |
| that is not supported by the current implementation. If it |
| returns false you should not try to read the entry but skip |
| over it.</p> |
| |
| </subsection> |
| |
| <subsection name="Concatenated Streams"> |
| <p>For the bzip2, gzip and xz formats a single compressed file |
| may actually consist of several streams that will be |
| concatenated by the command line utilities when decompressing |
| them. Starting with Commons Compress 1.4 the |
| <code>*CompressorInputStream</code>s for these formats support |
| concatenating streams as well, but they won't do so by |
| default. You must use the two-arg constructor and explicitly |
| enable the support.</p> |
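| <p>The following sketch shows what such a concatenated file looks |
| like. It uses only the JDK's <code>java.util.zip</code> classes as a |
| stand-in, not the Commons Compress streams. The class |
| <code>ConcatenatedGzipDemo</code> and its methods are hypothetical |
| names. The JDK's <code>GZIPInputStream</code> is assumed to continue |
| past the end of the first stream, as current JDKs do.</p> |
| <source><![CDATA[ |
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration built on java.util.zip only.
final class ConcatenatedGzipDemo {

    // Compresses text into one complete, self-contained gzip stream.
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Appends two gzip streams byte for byte, like "cat a.gz b.gz > both.gz".
    static byte[] concat(byte[] first, byte[] second) {
        byte[] both = new byte[first.length + second.length];
        System.arraycopy(first, 0, both, 0, first.length);
        System.arraycopy(second, 0, both, first.length, second.length);
        return both;
    }

    // Decompresses data until the decompressor reports end of input.
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream text = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                text.write(buffer, 0, n);
            }
            return new String(text.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```
| ]]></source> |
| <p>A decompressor that stops after the first stream would return |
| only the first part of the text, which is what the Commons Compress |
| streams do unless concatenation support is enabled.</p> |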
| </subsection> |
| |
| <subsection name="ar"> |
| |
| <p>In addition to the information stored |
| in <code>ArchiveEntry</code> an <code>ArArchiveEntry</code> |
| stores information about the owner user and group as well as |
| Unix permissions.</p> |
| |
| <p>Adding an entry to an ar archive:</p> |
| <source><![CDATA[ |
| ArArchiveEntry entry = new ArArchiveEntry(name, size); |
| arOutput.putArchiveEntry(entry); |
| arOutput.write(contentOfEntry); |
| arOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p>Reading entries from an ar archive:</p> |
| <source><![CDATA[ |
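| ArArchiveEntry entry = (ArArchiveEntry) arInput.getNextEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = arInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |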
| ]]></source> |
| |
| <p>Traditionally the AR format doesn't allow file names longer |
| than 16 characters. There are two variants that circumvent |
| this limitation in different ways, the GNU/SVR4 and the BSD |
| variant. Commons Compress 1.0 to 1.2 can only read archives |
| using the GNU/SVR4 variant; support for the BSD variant was |
| added in Commons Compress 1.3. Commons Compress 1.3 |
| also optionally supports writing archives with file names |
| longer than 16 characters using the BSD dialect. Writing |
| the GNU/SVR4 dialect is not supported.</p> |
| |
| <p>It is not possible to detect the end of an AR archive in a |
| reliable way so <code>ArArchiveInputStream</code> will read |
| until it reaches the end of the stream or fails to parse the |
| stream's content as AR entries.</p> |
| |
| </subsection> |
| |
| <subsection name="cpio"> |
| |
| <p>In addition to the information stored |
| in <code>ArchiveEntry</code> a <code>CpioArchiveEntry</code> |
| stores various attributes including information about the |
| original owner and permissions.</p> |
| |
| <p>The cpio package supports the "new portable" as well as the |
| "old" format of CPIO archives in their binary, ASCII and |
| "with CRC" variants.</p> |
| |
| <p>Adding an entry to a cpio archive:</p> |
| <source><![CDATA[ |
| CpioArchiveEntry entry = new CpioArchiveEntry(name, size); |
| cpioOutput.putArchiveEntry(entry); |
| cpioOutput.write(contentOfEntry); |
| cpioOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p>Reading entries from a cpio archive:</p> |
| <source><![CDATA[ |
| CpioArchiveEntry entry = cpioInput.getNextCPIOEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = cpioInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |
| ]]></source> |
| |
| <p>Traditionally CPIO archives are written in blocks of 512 |
| bytes - the block size is a configuration parameter of the |
| <code>Cpio*Stream</code>'s constructors. Starting with version |
| 1.5 <code>CpioArchiveInputStream</code> will consume the |
| padding written to fill the current block when the end of the |
| archive is reached. Unfortunately many CPIO implementations |
| use larger block sizes so there may be more zero-byte padding |
| left inside the original input stream after the archive has |
| been consumed completely.</p> |
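| <p>The padding arithmetic can be worked through in a few lines. This |
| is only a sketch: <code>CpioPadding</code> and its methods are |
| hypothetical names, and it assumes the reader consumes padding up to |
| the next multiple of its own block size while the writer's block size |
| is a multiple of the reader's.</p> |
| <source><![CDATA[ |
```java
// Hypothetical helper that only illustrates the block arithmetic.
final class CpioPadding {

    // Number of zero bytes needed to extend archiveSize bytes to a
    // multiple of blockSize.
    static long paddingFor(long archiveSize, int blockSize) {
        if (archiveSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (blockSize - archiveSize % blockSize) % blockSize;
    }

    // Padding bytes that remain in the stream when the writer padded to
    // writerBlockSize but the reader only consumed up to readerBlockSize.
    static long leftoverPadding(long archiveSize, int writerBlockSize, int readerBlockSize) {
        if (writerBlockSize % readerBlockSize != 0) {
            throw new IllegalArgumentException("writer block size must be a multiple of the reader's");
        }
        return paddingFor(archiveSize, writerBlockSize) - paddingFor(archiveSize, readerBlockSize);
    }
}
```
| ]]></source> |
| <p>For a 1300 byte archive, a reader configured for 512 byte blocks |
| consumes 236 bytes of padding. If the writer had padded to 5120 byte |
| blocks instead (an assumed example value), another 3584 zero bytes |
| would be left in the input stream.</p> |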
| |
| </subsection> |
| |
| <subsection name="dump"> |
| |
| <p>In addition to the information stored |
| in <code>ArchiveEntry</code> a <code>DumpArchiveEntry</code> |
| stores various attributes including information about the |
| original owner and permissions.</p> |
| |
| <p>As of Commons Compress 1.3 only dump archives using the |
| new-fs format - this is the most common variant - are |
| supported. Right now this library supports uncompressed and |
| ZLIB compressed archives and cannot write archives at |
| all.</p> |
| |
| <p>Reading entries from a dump archive:</p> |
| <source><![CDATA[ |
| DumpArchiveEntry entry = dumpInput.getNextDumpEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = dumpInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |
| ]]></source> |
| |
| <p>Prior to version 1.5 <code>DumpArchiveInputStream</code> |
| would close the original input once it had read the last |
| record. Starting with version 1.5 it will not close the |
| stream implicitly.</p> |
| |
| </subsection> |
| |
| <subsection name="tar"> |
| |
| <p>The TAR package has a <a href="tar.html">dedicated |
| documentation page</a>.</p> |
| |
| <p>Adding an entry to a tar archive:</p> |
| <source><![CDATA[ |
| TarArchiveEntry entry = new TarArchiveEntry(name); |
| entry.setSize(size); |
| tarOutput.putArchiveEntry(entry); |
| tarOutput.write(contentOfEntry); |
| tarOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p>Reading entries from a tar archive:</p> |
| <source><![CDATA[ |
| TarArchiveEntry entry = tarInput.getNextTarEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = tarInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |
| ]]></source> |
| </subsection> |
| |
| <subsection name="zip"> |
| <p>The ZIP package has a <a href="zip.html">dedicated |
| documentation page</a>.</p> |
| |
| <p>Adding an entry to a zip archive:</p> |
| <source><![CDATA[ |
| ZipArchiveEntry entry = new ZipArchiveEntry(name); |
| entry.setSize(size); |
| zipOutput.putArchiveEntry(entry); |
| zipOutput.write(contentOfEntry); |
| zipOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p><code>ZipArchiveOutputStream</code> can use some internal |
| optimizations exploiting <code>SeekableByteChannel</code> if it |
| knows it is writing to a seekable output rather than a non-seekable |
| stream. If you are writing to a file, you should use the |
| constructor that accepts a <code>File</code> or |
| <code>SeekableByteChannel</code> argument rather |
| than the one using an <code>OutputStream</code> or the |
| factory method in <code>ArchiveStreamFactory</code>.</p> |
| |
| <p>Reading entries from a zip archive:</p> |
| <source><![CDATA[ |
| ZipArchiveEntry entry = zipInput.getNextZipEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the size is known and fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = zipInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |
| ]]></source> |
| |
| <p>Reading entries from a zip archive using the |
| recommended <code>ZipFile</code> class:</p> |
| <source><![CDATA[ |
| ZipArchiveEntry entry = zipFile.getEntry(name); |
| try (InputStream content = zipFile.getInputStream(entry)) { |
|     byte[] buffer = new byte[8192]; |
|     int n; |
|     while ((n = content.read(buffer)) != -1) { |
|         // process the first n bytes of buffer |
|     } |
| } |
| ]]></source> |
| |
| <p>Reading entries from an in-memory zip archive using |
| <code>SeekableInMemoryByteChannel</code> and the <code>ZipFile</code> class:</p> |
| <source><![CDATA[ |
| byte[] inputData; // zip archive contents |
| SeekableInMemoryByteChannel inMemoryByteChannel = new SeekableInMemoryByteChannel(inputData); |
| ZipFile zipFile = new ZipFile(inMemoryByteChannel); |
| ZipArchiveEntry archiveEntry = zipFile.getEntry("entryName"); |
| InputStream inputStream = zipFile.getInputStream(archiveEntry); |
| inputStream.read(); // read data from the input stream |
| ]]></source> |
| |
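| <p>Creating a zip file with multiple threads. A simple |
| implementation might look like this:</p> |
| |
| <source><![CDATA[ |
| import java.io.File; |
| import java.io.IOException; |
| import java.util.concurrent.ExecutionException; |
| import org.apache.commons.compress.archivers.zip.ParallelScatterZipCreator; |
| import org.apache.commons.compress.archivers.zip.ScatterZipOutputStream; |
| import org.apache.commons.compress.archivers.zip.ZipArchiveEntry; |
| import org.apache.commons.compress.archivers.zip.ZipArchiveEntryRequest; |
| import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream; |
| import org.apache.commons.compress.parallel.InputStreamSupplier; |
| |
| public class ScatterSample { |
| |
|     // Compresses regular entries in parallel |
|     ParallelScatterZipCreator scatterZipCreator = new ParallelScatterZipCreator(); |
|     // Directory entries are collected separately and written first |
|     ScatterZipOutputStream dirs = ScatterZipOutputStream.fileBased(File.createTempFile("scatter-dirs", "tmp")); |
| |
|     // Declared because the field initializer above may throw IOException |
|     public ScatterSample() throws IOException { |
|     } |
| |
|     public void addEntry(ZipArchiveEntry zipArchiveEntry, InputStreamSupplier streamSupplier) throws IOException { |
|         if (zipArchiveEntry.isDirectory() && !zipArchiveEntry.isUnixSymlink()) { |
|             dirs.addArchiveEntry(ZipArchiveEntryRequest.createZipArchiveEntryRequest(zipArchiveEntry, streamSupplier)); |
|         } else { |
|             scatterZipCreator.addArchiveEntry(zipArchiveEntry, streamSupplier); |
|         } |
|     } |
| |
|     public void writeTo(ZipArchiveOutputStream zipArchiveOutputStream) |
|             throws IOException, ExecutionException, InterruptedException { |
|         dirs.writeTo(zipArchiveOutputStream); |
|         dirs.close(); |
|         scatterZipCreator.writeTo(zipArchiveOutputStream); |
|     } |
| } |
| ]]></source> |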
| </subsection> |
| |
| <subsection name="jar"> |
| <p>In general, JAR archives are ZIP files, so the JAR package |
| supports all options provided by the ZIP package.</p> |
| |
| <p>To be interoperable JAR archives should always be created |
| using the UTF-8 encoding for file names (which is the |
| default).</p> |
| |
| <p>Archives created using <code>JarArchiveOutputStream</code> |
| will implicitly add a <code>JarMarker</code> extra field to |
| the very first archive entry of the archive which will make |
| Solaris recognize them as Java archives and allow them to |
| be used as executables.</p> |
| |
| <p>Note that <code>ArchiveStreamFactory</code> doesn't |
| distinguish ZIP archives from JAR archives, so if you use |
| the one-argument <code>createArchiveInputStream</code> |
| method on a JAR archive, it will still return the more |
| generic <code>ZipArchiveInputStream</code>.</p> |
| |
| <p>The <code>JarArchiveEntry</code> class contains fields for |
| certificates and attributes that are planned to be supported |
| in the future but are not supported as of Compress 1.0.</p> |
| |
| <p>Adding an entry to a jar archive:</p> |
| <source><![CDATA[ |
| JarArchiveEntry entry = new JarArchiveEntry(name); |
| entry.setSize(size); |
| jarOutput.putArchiveEntry(entry); |
| jarOutput.write(contentOfEntry); |
| jarOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p>Reading entries from a jar archive:</p> |
| <source><![CDATA[ |
| JarArchiveEntry entry = jarInput.getNextJarEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the size is known and fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = jarInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |
| ]]></source> |
| </subsection> |
| |
| <subsection name="7z"> |
| |
| <p>Note that Commons Compress currently only supports a subset |
| of compression and encryption algorithms used for 7z archives. |
| For writing only uncompressed entries, LZMA, LZMA2, BZIP2 and |
| Deflate are supported - reading also supports |
| AES-256/SHA-256.</p> |
| |
| <p>Multipart archives are not supported at all.</p> |
| |
| <p>7z archives can use multiple compression and encryption |
| methods as well as filters combined as a pipeline of methods |
| for its entries. Prior to Compress 1.8 you could only specify |
| a single method when creating archives - reading archives |
| using more than one method has been possible before. Starting |
| with Compress 1.8 it is possible to configure the full |
| pipeline using the <code>setContentMethods</code> method of |
| <code>SevenZOutputFile</code>. Methods are specified in the |
| order they appear inside the pipeline when creating the |
| archive. You can also specify certain parameters for some of |
| the methods - see the Javadocs of |
| <code>SevenZMethodConfiguration</code> for details.</p> |
| |
| <p>When reading entries from an archive the |
| <code>getContentMethods</code> method of |
| <code>SevenZArchiveEntry</code> will properly represent the |
| compression/encryption/filter methods but may fail to |
| determine the configuration options used. As of Compress 1.8 |
| only the dictionary size used for LZMA2 can be read.</p> |
| |
| <p>Currently solid compression - compressing multiple files |
| as a single block to benefit from patterns repeating across |
| files - is only supported when reading archives. This also |
| means compression ratio will likely be worse when using |
| Commons Compress compared to the native 7z executable.</p> |
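| <p>The effect of solid compression can be made visible with a small |
| experiment. The sketch below uses the JDK's <code>Deflater</code> as a |
| stand-in for the methods used by 7z and compares compressing two |
| similar "files" separately with compressing them as one block. |
| <code>SolidCompressionDemo</code>, its methods and the generated |
| sample data are all hypothetical.</p> |
| <source><![CDATA[ |
```java
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical illustration; DEFLATE stands in for the 7z methods.
final class SolidCompressionDemo {

    // Size in bytes of data after DEFLATE compression.
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] chunk = new byte[4096];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(chunk);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    // A 2001 byte "file": 2000 bytes that are hard to compress on their
    // own but identical in every file, followed by one marker byte.
    static byte[] sampleFile(byte marker) {
        byte[] file = new byte[2001];
        new Random(42).nextBytes(file); // fixed seed, same payload each time
        file[2000] = marker;
        return file;
    }

    // Both files compressed independently, as non-solid archives do.
    static int separateSize() {
        return compressedSize(sampleFile((byte) 'A')) + compressedSize(sampleFile((byte) 'B'));
    }

    // Both files compressed as one block, as solid archives do.
    static int solidSize() {
        byte[] a = sampleFile((byte) 'A');
        byte[] b = sampleFile((byte) 'B');
        byte[] block = new byte[a.length + b.length];
        System.arraycopy(a, 0, block, 0, a.length);
        System.arraycopy(b, 0, block, a.length, b.length);
        return compressedSize(block);
    }
}
```
| ]]></source> |
| <p>In the solid case the second file can be encoded as references |
| to the first one, so <code>solidSize()</code> comes out smaller than |
| <code>separateSize()</code>.</p> |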
| |
| <p>Reading or writing requires a |
| <code>SeekableByteChannel</code> that will be obtained |
| transparently when reading from or writing to a file. The |
| class |
| <code>org.apache.commons.compress.utils.SeekableInMemoryByteChannel</code> |
| allows you to read from or write to an in-memory archive.</p> |
| |
| <p>Adding an entry to a 7z archive:</p> |
| <source><![CDATA[ |
| SevenZOutputFile sevenZOutput = new SevenZOutputFile(file); |
| SevenZArchiveEntry entry = sevenZOutput.createArchiveEntry(fileToArchive, name); |
| sevenZOutput.putArchiveEntry(entry); |
| sevenZOutput.write(contentOfEntry); |
| sevenZOutput.closeArchiveEntry(); |
| ]]></source> |
| |
| <p>Uncompressing a given 7z archive (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| SevenZFile sevenZFile = new SevenZFile(new File("archive.7z")); |
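| SevenZArchiveEntry entry = sevenZFile.getNextEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = sevenZFile.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of entry data |
|     offset += n; |
| } |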
| ]]></source> |
| |
| <p>Uncompressing a given in-memory 7z archive:</p> |
| <source><![CDATA[ |
| byte[] inputData; // 7z archive contents |
| SeekableInMemoryByteChannel inMemoryByteChannel = new SeekableInMemoryByteChannel(inputData); |
| SevenZFile sevenZFile = new SevenZFile(inMemoryByteChannel); |
| SevenZArchiveEntry entry = sevenZFile.getNextEntry(); |
| sevenZFile.read(); // read current entry's data |
| ]]></source> |
| </subsection> |
| |
| <subsection name="arj"> |
| |
| <p>Note that Commons Compress doesn't support compressed, |
| encrypted or multi-volume ARJ archives, yet.</p> |
| |
| <p>Uncompressing a given arj archive (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
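| ArjArchiveEntry entry = arjInput.getNextEntry(); |
| byte[] content = new byte[(int) entry.getSize()]; // assumes the entry fits into an array |
| int offset = 0; |
| while (offset < content.length) { |
|     int n = arjInput.read(content, offset, content.length - offset); |
|     if (n < 0) break; // premature end of stream |
|     offset += n; |
| } |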
| ]]></source> |
| </subsection> |
| |
| <subsection name="bzip2"> |
| |
| <p>Note that <code>BZip2CompressorOutputStream</code> keeps |
| hold of some big data structures in memory. While it is |
| recommended for <em>any</em> stream that you close it as soon as |
| you no longer need it, this is even more important |
| for <code>BZip2CompressorOutputStream</code>.</p> |
| |
| <p>Uncompressing a given bzip2 compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.bz2")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| BZip2CompressorInputStream bzIn = new BZip2CompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = bzIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| bzIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using bzip2 (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.bz2")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| BZip2CompressorOutputStream bzOut = new BZip2CompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| bzOut.write(buffer, 0, n); |
| } |
| bzOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="gzip"> |
| |
| <p>The implementation of the DEFLATE/INFLATE code used by this |
| package is provided by the <code>java.util.zip</code> package |
| of the Java class library.</p> |
| |
| <p>Uncompressing a given gzip compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.gz")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| GzipCompressorInputStream gzIn = new GzipCompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = gzIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| gzIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using gzip (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.gz")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| gzOut.write(buffer, 0, n); |
| } |
| gzOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="Pack200"> |
| |
| <p>The Pack200 package has a <a href="pack200.html">dedicated |
| documentation page</a>.</p> |
| |
| <p>The implementation of this package is provided by |
| the <code>java.util.jar.Pack200</code> class of the Java class |
| library.</p> |
| |
| <p>Uncompressing a given pack200 compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.pack")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.jar")); |
| Pack200CompressorInputStream pIn = new Pack200CompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = pIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| pIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given jar using pack200 (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.jar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.pack")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| Pack200CompressorOutputStream pOut = new Pack200CompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| pOut.write(buffer, 0, n); |
| } |
| pOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="XZ"> |
| |
| <p>The implementation of this package is provided by the |
| public domain <a href="http://tukaani.org/xz/java.html">XZ |
| for Java</a> library.</p> |
| |
| <p>When you try to open an XZ stream for reading using |
| <code>CompressorStreamFactory</code>, Commons Compress will |
| check whether the XZ for Java library is available. Starting |
| with Compress 1.9 the result of this check will be cached |
| unless Compress finds OSGi classes in its classpath. You can |
| use <code>XZUtils#setCacheXZAvailability</code> to override |
| this default behavior.</p> |
| |
| <p>Uncompressing a given XZ compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.xz")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| XZCompressorInputStream xzIn = new XZCompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = xzIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| xzIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using XZ (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.xz")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| XZCompressorOutputStream xzOut = new XZCompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| xzOut.write(buffer, 0, n); |
| } |
| xzOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="Z"> |
| |
| <p>Uncompressing a given Z compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.Z")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| ZCompressorInputStream zIn = new ZCompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = zIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| zIn.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="lzma"> |
| |
| <p>The implementation of this package is provided by the |
| public domain <a href="http://tukaani.org/xz/java.html">XZ |
| for Java</a> library.</p> |
| |
| <p>Uncompressing a given lzma compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.lzma")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| LZMACompressorInputStream lzmaIn = new LZMACompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = lzmaIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| lzmaIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using lzma (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.lzma")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| LZMACompressorOutputStream lzOut = new LZMACompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| lzOut.write(buffer, 0, n); |
| } |
| lzOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="DEFLATE"> |
| |
| <p>The implementation of the DEFLATE/INFLATE code used by this |
| package is provided by the <code>java.util.zip</code> package |
| of the Java class library.</p> |
| |
| <p>Uncompressing a given DEFLATE compressed file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("some-file")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| DeflateCompressorInputStream defIn = new DeflateCompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = defIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| defIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using DEFLATE (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("some-file")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| DeflateCompressorOutputStream defOut = new DeflateCompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| defOut.write(buffer, 0, n); |
| } |
| defOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="Snappy"> |
| |
| <p>There are two different "formats" used for <a |
| href="https://github.com/google/snappy/">Snappy</a>. One |
| contains only the raw compressed data while the other provides a |
| higher level "framing format" - Commons Compress offers two |
| different stream classes for reading either format.</p> |
| |
| <p>Starting with 1.12 we've added support for different |
| dialects of the framing format that can be specified when |
| constructing the stream. The <code>STANDARD</code> dialect |
| follows the "framing format" specification while the |
| <code>IWORK_ARCHIVE</code> dialect can be used to parse IWA |
| files that are part of Apple's iWork 13 format. If no dialect |
| has been specified, <code>STANDARD</code> is used. Only the |
| <code>STANDARD</code> format can be detected by |
| <code>CompressorStreamFactory</code>.</p> |
| |
| <p>Uncompressing a given framed Snappy file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.sz")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| FramedSnappyCompressorInputStream zIn = new FramedSnappyCompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = zIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| zIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using framed Snappy (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.sz")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| FramedSnappyCompressorOutputStream snOut = new FramedSnappyCompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| snOut.write(buffer, 0, n); |
| } |
| snOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="LZ4"> |
| |
| <p>There are two different "formats" used for <a |
| href="http://lz4.github.io/lz4/">lz4</a>. The format called |
| "block format" only contains the raw compressed data while the |
| other provides a higher level "frame format" - Commons |
| Compress offers two different stream classes for reading or |
| writing either format.</p> |
| |
| <p>Uncompressing a given frame LZ4 file (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream fin = Files.newInputStream(Paths.get("archive.tar.lz4")); |
| BufferedInputStream in = new BufferedInputStream(fin); |
| OutputStream out = Files.newOutputStream(Paths.get("archive.tar")); |
| FramedLZ4CompressorInputStream zIn = new FramedLZ4CompressorInputStream(in); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = zIn.read(buffer))) { |
| out.write(buffer, 0, n); |
| } |
| out.close(); |
| zIn.close(); |
| ]]></source> |
| |
| <p>Compressing a given file using the LZ4 frame format (you would |
| certainly add exception handling and make sure all streams |
| get closed properly):</p> |
| <source><![CDATA[ |
| InputStream in = Files.newInputStream(Paths.get("archive.tar")); |
| OutputStream fout = Files.newOutputStream(Paths.get("archive.tar.lz4")); |
| BufferedOutputStream out = new BufferedOutputStream(fout); |
| FramedLZ4CompressorOutputStream lzOut = new FramedLZ4CompressorOutputStream(out); |
| final byte[] buffer = new byte[buffersize]; |
| int n = 0; |
| while (-1 != (n = in.read(buffer))) { |
| lzOut.write(buffer, 0, n); |
| } |
| lzOut.close(); |
| in.close(); |
| ]]></source> |
| |
| </subsection> |
| |
| <subsection name="Extending Commons Compress"> |
| |
| <p> |
| Starting with release 1.13, it is possible to add compressor and archiver stream implementations using |
| Java's <a href="https://docs.oracle.com/javase/7/docs/api/java/util/ServiceLoader.html">ServiceLoader</a> |
| mechanism. |
| </p> |
| </subsection> |
| |
| <subsection name="Extending Commons Compress Compressors"> |
| |
| <p> |
| To provide your own compressor, you must make available on the classpath a file called |
| <code>META-INF/services/org.apache.commons.compress.compressors.CompressorStreamProvider</code>. |
| </p> |
| <p> |
| This file MUST contain one fully-qualified class name per line. |
| </p> |
| <p> |
| For example: |
| </p> |
| <pre>org.apache.commons.compress.compressors.TestCompressorStreamProvider</pre> |
| <p> |
| This class MUST implement the Commons Compress interface |
| <a href="apidocs/org/apache/commons/compress/compressors/CompressorStreamProvider.html">org.apache.commons.compress.compressors.CompressorStreamProvider</a>. |
| </p> |
| </subsection> |
| |
| <subsection name="Extending Commons Compress Archivers"> |
| |
| <p> |
| To provide your own archiver, you must make available on the classpath a file called |
| <code>META-INF/services/org.apache.commons.compress.archivers.ArchiveStreamProvider</code>. |
| </p> |
| <p> |
| This file MUST contain one fully-qualified class name per line. |
| </p> |
| <p> |
| For example: |
| </p> |
| <pre>org.apache.commons.compress.archivers.TestArchiveStreamProvider</pre> |
| <p> |
| This class MUST implement the Commons Compress interface |
| <a href="apidocs/org/apache/commons/compress/archivers/ArchiveStreamProvider.html">org.apache.commons.compress.archivers.ArchiveStreamProvider</a>. |
| </p> |
| </subsection> |
| |
| </section> |
| </body> |
| </document> |