Java ships with ZipOutputStream for creating zip files and GZIPOutputStream for Gzip compression, but the JDK has no built-in API for creating a tar.gz file.
To create a .tar.gz file in Java, we can use Apache Commons Compress, which is still actively developed.
pom.xml
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.20</version>
</dependency>
Notes
- The tar is for collecting files into one archive file, aka tarball, and generally has the suffix .tar
- The Gzip is for compressing files to save space and generally has the suffix .gz
- The tar.gz or .tgz means group all files into one archive file, and compress it using Gzip.
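To make the last two notes concrete, here is a minimal sketch that uses only the JDK's java.util.zip classes to Gzip a single stream of bytes and read it back. The class name GzipRoundTrip and the sample text are made up for illustration. It shows that Gzip works on one byte stream and knows nothing about file names or folders, which is why the files are first grouped with tar.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    // compress one stream of bytes; there is no concept of entries or file names here
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    // decompress the stream back into the original bytes
    public static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "hello hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(raw);
        // a gzip stream starts with the magic bytes 0x1f 0x8b
        System.out.printf("magic: %02x %02x%n", compressed[0], compressed[1]);
        System.out.println(new String(gunzip(compressed), StandardCharsets.UTF_8));
    }
}
```

The readAllBytes call assumes Java 9 or later.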
The following code snippet creates a tar.gz file.
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

//...

try (OutputStream fOut = Files.newOutputStream(Paths.get("output.tar.gz"));
     BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
     GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
     TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

    TarArchiveEntry tarEntry = new TarArchiveEntry(file, fileName);

    tOut.putArchiveEntry(tarEntry);

    // copy file to TarArchiveOutputStream
    Files.copy(path, tOut);

    tOut.closeArchiveEntry();

    tOut.finish();
}
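The streams in the snippet above are nested: the tar stream writes into the Gzip stream, which writes into the buffered file stream. A try-with-resources statement closes its resources in the reverse order of declaration, so the outermost wrapper is closed before the streams it writes into. The sketch below demonstrates only that ordering, using stand-in resources instead of real streams; the class CloseOrderDemo and the resource names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {

    // records the order in which the stand-in resources are closed
    private static final List<String> closed = new ArrayList<>();

    // a stand-in resource that only remembers that it was closed
    private static AutoCloseable named(String name) {
        return () -> closed.add(name);
    }

    public static List<String> run() throws Exception {
        closed.clear();
        // declared in the same order as the file, gzip and tar streams
        try (AutoCloseable file = named("file");
             AutoCloseable gzip = named("gzip");
             AutoCloseable tar = named("tar")) {
            // archive entries would be written here
        }
        return new ArrayList<>(closed);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints [tar, gzip, file]
    }
}
```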
The following code snippet decompresses a .tar.gz file.
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

//...

try (InputStream fi = Files.newInputStream(Paths.get("input.tar.gz"));
     BufferedInputStream bi = new BufferedInputStream(fi);
     GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
     TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {

    ArchiveEntry entry;
    while ((entry = ti.getNextEntry()) != null) {

        // create the target path; remember to check for the zip slip attack
        Path newPath = filename(entry, targetDir);

        // checking

        // copy TarArchiveInputStream to newPath
        Files.copy(ti, newPath);
    }
}
1. Add two files to tar.gz
This example shows how to add two files into a tar.gz file.
TarGzipExample1.java
package com.favtuts.io.howto.compress;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class TarGzipExample {

    public static void main(String[] args) {

        try {

            Path path1 = Paths.get("/home/tvt/workspace/favtuts/sitemap.xml");
            Path path2 = Paths.get("/home/tvt/workspace/favtuts/file.txt");
            Path output = Paths.get("/home/tvt/workspace/favtuts/output.tar.gz");

            List<Path> paths = Arrays.asList(path1, path2);
            createTarGzipFiles(paths, output);

        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Done");
    }

    // tar.gz a few files
    public static void createTarGzipFiles(List<Path> paths, Path output)
        throws IOException {

        try (OutputStream fOut = Files.newOutputStream(output);
             BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
             GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
             TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

            for (Path path : paths) {

                if (!Files.isRegularFile(path)) {
                    throw new IOException("Only regular files are supported!");
                }

                // the second argument is the entry name inside the archive
                TarArchiveEntry tarEntry = new TarArchiveEntry(
                        path.toFile(),
                        path.getFileName().toString());

                tOut.putArchiveEntry(tarEntry);

                // copy file to TarArchiveOutputStream
                Files.copy(path, tOut);

                tOut.closeArchiveEntry();
            }

            tOut.finish();
        }
    }
}
Output: sitemap.xml and file.txt are added to one archive file, output.tar, which is then compressed with Gzip; the result is output.tar.gz.
$ tar -tvf /home/tvt/workspace/favtuts/output.tar.gz
-rw-r--r-- 0/0 260295 2022-05-20 16:15 sitemap.xml
-rw-r--r-- 0/0 24 2022-05-06 17:14 file.txt
2. Add a directory to tar.gz
This example adds a directory, including its files and sub-directories, to one archive file and compresses it with Gzip into a .tar.gz file.
The idea is to use Files.walkFileTree to walk the file tree and add the files one by one to the TarArchiveOutputStream.
TarGzipExample2.java
package com.favtuts.io.howto.compress;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

import java.io.*;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Arrays;
import java.util.List;

public class TarGzipExample {

    public static void main(String[] args) {

        try {

            // tar.gz a folder
            Path source = Paths.get("/home/tvt/workspace/favtuts/test");
            createTarGzipFolder(source);

        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Done");
    }

    // generate .tar.gz file at the current working directory
    // tar.gz a folder
    public static void createTarGzipFolder(Path source) throws IOException {

        if (!Files.isDirectory(source)) {
            throw new IOException("Please provide a directory.");
        }

        // use the folder name as the tar.gz file name
        String tarFileName = source.getFileName().toString() + ".tar.gz";

        try (OutputStream fOut = Files.newOutputStream(Paths.get(tarFileName));
             BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
             GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
             TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

            Files.walkFileTree(source, new SimpleFileVisitor<>() {

                @Override
                public FileVisitResult visitFile(Path file,
                                        BasicFileAttributes attributes) {

                    // only copy files, no symbolic links
                    if (attributes.isSymbolicLink()) {
                        return FileVisitResult.CONTINUE;
                    }

                    // entry name: the file path relative to the source folder
                    Path targetFile = source.relativize(file);

                    try {
                        TarArchiveEntry tarEntry = new TarArchiveEntry(
                                file.toFile(), targetFile.toString());

                        tOut.putArchiveEntry(tarEntry);

                        Files.copy(file, tOut);

                        tOut.closeArchiveEntry();

                        System.out.printf("file : %s%n", file);

                    } catch (IOException e) {
                        System.err.printf("Unable to tar.gz : %s%n%s%n", file, e);
                    }

                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) {
                    System.err.printf("Unable to tar.gz : %s%n%s%n", file, exc);
                    return FileVisitResult.CONTINUE;
                }

            });

            tOut.finish();
        }
    }
}
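The entry names in the example above come from source.relativize(file), so the archive stores paths relative to the folder being archived. The following sketch previews those names without creating an archive. It is an illustration only: the class EntryNameDemo and the method entryNames are hypothetical, it uses Files.walk instead of Files.walkFileTree to keep it short, and it converts the platform separator to '/' on the assumption that '/' is the separator wanted inside the archive.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EntryNameDemo {

    // returns the names that the regular files under 'source' would get as archive entries
    public static List<String> entryNames(Path source) throws IOException {
        try (Stream<Path> walk = Files.walk(source)) {
            return walk.filter(Files::isRegularFile)
                    // same idea as source.relativize(file) in the visitor
                    .map(source::relativize)
                    // assumption: use '/' inside the archive on every platform
                    .map(p -> p.toString().replace(File.separatorChar, '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // build a small throwaway folder to walk
        Path source = Files.createTempDirectory("test");
        Files.createDirectories(source.resolve("sub"));
        Files.write(source.resolve("a.txt"), new byte[]{1});
        Files.write(source.resolve("sub").resolve("b.txt"), new byte[]{2});

        System.out.println(entryNames(source)); // prints [a.txt, sub/b.txt]
    }
}
```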
3. Add String to tar.gz
This example wraps a String in a ByteArrayInputStream and writes it to the TarArchiveOutputStream directly. In other words, it creates an archive entry from in-memory data without first saving a file to the local disk.
public static void createTarGzipFilesOnDemand() throws IOException {

    String data1 = "Test data 1";
    String fileName1 = "111.txt";

    String data2 = "Test data 2 3 4";
    String fileName2 = "folder/222.txt";

    String outputTarGzip = "/home/tvt/workspace/favtuts/output.tar.gz";

    try (OutputStream fOut = Files.newOutputStream(Paths.get(outputTarGzip));
         BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
         GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
         TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

        createTarArchiveEntry(fileName1, data1, tOut);
        createTarArchiveEntry(fileName2, data2, tOut);

        tOut.finish();
    }
}

private static void createTarArchiveEntry(String fileName,
                                          String data,
                                          TarArchiveOutputStream tOut)
                                          throws IOException {

    byte[] dataInBytes = data.getBytes();

    // create a byte[] input stream
    ByteArrayInputStream baOut1 = new ByteArrayInputStream(dataInBytes);

    TarArchiveEntry tarEntry = new TarArchiveEntry(fileName);

    // the entry size (in bytes) must be set up front, otherwise writing the data fails
    tarEntry.setSize(dataInBytes.length);
    // tarEntry.setSize(baOut1.available()); alternative

    tOut.putArchiveEntry(tarEntry);

    // copy ByteArrayInputStream to TarArchiveOutputStream
    byte[] buffer = new byte[1024];
    int len;
    while ((len = baOut1.read(buffer)) > 0) {
        tOut.write(buffer, 0, len);
    }

    tOut.closeArchiveEntry();
}
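The size passed to setSize is a number of bytes, not a number of characters, which is why the code above uses the length of the byte array rather than data.length(). The two differ as soon as the text contains non-ASCII characters. The sketch below shows the difference; the class EntrySizeDemo and the method entrySize are hypothetical, and it fixes the charset to UTF-8, whereas data.getBytes() without an argument uses the JVM default charset.

```java
import java.nio.charset.StandardCharsets;

public class EntrySizeDemo {

    // the value to pass to setSize when the text is written as UTF-8 bytes
    public static long entrySize(String data) {
        return data.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "Test data 1";
        String accented = "caf\u00e9"; // "cafe" with an acute accent on the e

        // prints 11 chars, 11 bytes
        System.out.println(ascii.length() + " chars, " + entrySize(ascii) + " bytes");
        // prints 4 chars, 5 bytes
        System.out.println(accented.length() + " chars, " + entrySize(accented) + " bytes");
    }
}
```

If the same explicit charset is used for both getBytes and setSize, the declared size always matches the bytes that are written.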
4. Decompress file – tar.gz
This example shows how to decompress and extract a tar.gz file, and it also guards against the zip slip vulnerability.
TarGzipExample4.java
package com.favtuts.io.howto.compress;

import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

import java.io.*;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Arrays;
import java.util.List;

public class TarGzipExample {

    public static void main(String[] args) {

        try {

            // decompress .tar.gz
            Path source = Paths.get("/home/tvt/workspace/favtuts/output.tar.gz");
            Path target = Paths.get("/home/tvt/workspace/favtuts/tartest");
            decompressTarGzipFile(source, target);

        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println("Done");
    }

    public static void decompressTarGzipFile(Path source, Path target)
        throws IOException {

        if (Files.notExists(source)) {
            throw new IOException("File doesn't exist!");
        }

        try (InputStream fi = Files.newInputStream(source);
             BufferedInputStream bi = new BufferedInputStream(fi);
             GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
             TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {

            ArchiveEntry entry;
            while ((entry = ti.getNextEntry()) != null) {

                // create a new path, zip slip validate
                Path newPath = zipSlipProtect(entry, target);

                if (entry.isDirectory()) {
                    Files.createDirectories(newPath);
                } else {

                    // check parent folder again
                    Path parent = newPath.getParent();
                    if (parent != null) {
                        if (Files.notExists(parent)) {
                            Files.createDirectories(parent);
                        }
                    }

                    // copy TarArchiveInputStream to Path newPath
                    Files.copy(ti, newPath, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    private static Path zipSlipProtect(ArchiveEntry entry, Path targetDir)
        throws IOException {

        Path targetDirResolved = targetDir.resolve(entry.getName());

        // make sure normalized file still has targetDir as its prefix,
        // else throws exception
        Path normalizePath = targetDirResolved.normalize();

        if (!normalizePath.startsWith(targetDir)) {
            throw new IOException("Bad entry: " + entry.getName());
        }

        return normalizePath;
    }
}
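The zipSlipProtect check above compares the normalized entry path against targetDir exactly as it was passed in. That works for an absolute, already normalized target such as the one in main, but a relative target like ./tartest would cause even legitimate entries to be rejected, because the normalized path no longer starts with the "." element. The sketch below is one possible variant, not the article's method: it normalizes the target directory first. The class ZipSlipCheckDemo and the method safeResolve are hypothetical, and it takes the entry name as a plain String so that it runs with the JDK alone.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ZipSlipCheckDemo {

    // resolves an entry name against targetDir and rejects names that escape it
    public static Path safeResolve(Path targetDir, String entryName) throws IOException {
        // normalize the base first, so "./tartest" and "tartest" behave the same
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();

        // an entry such as "../../etc/passwd" ends up outside of base
        if (!resolved.startsWith(base)) {
            throw new IOException("Bad entry: " + entryName);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get("./tartest");

        // accepted: stays inside the target directory
        System.out.println(safeResolve(target, "folder/222.txt"));

        try {
            safeResolve(target, "../../etc/passwd");
        } catch (IOException e) {
            System.out.println(e.getMessage()); // prints Bad entry: ../../etc/passwd
        }
    }
}
```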
Further Reading
Please check the official Apache Commons Compress examples.
Download Source Code
$ git clone https://github.com/favtuts/java-core-tutorials-examples
$ cd java-core-tutorials-examples/java-io/howto/compress