How to Untar / Extract a TAR file using Java

Posted by Nikhilesh, in Miscellaneous

Hi,

In this post we explain how to extract the contents of a TAR file using Java. To unpack the TAR file we use the Apache Commons Compress library, so make sure commons-compress-1.4.1.jar is on your classpath. The listing below writes each extracted file to disk with plain java.io streams; the Apache Commons IO library (commons-io-2.4.jar) is only needed on the classpath if you prefer to use its helpers, such as IOUtils.copy, for that step.


1) Use TarArchiveInputStream to read a TAR file as an InputStream. Once the TAR file is wrapped in an object of this type, you can step through its entries one by one and process each of them.
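How the stream has to be opened depends on whether the archive is a plain .tar or a gzip-compressed .tar.gz: GzipCompressorInputStream fails with "Input is not in the .gz format" when it is handed a plain TAR file. The following is a minimal sketch, using only the JDK, of checking for the two gzip magic bytes before deciding which wrapper to use. The class name GzipSniffer and the method looksLikeGzip are hypothetical names chosen for this illustration; they are not part of Commons Compress.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class GzipSniffer {

    // Hypothetical helper: returns true when the stream starts with the
    // gzip magic bytes 0x1f 0x8b. The stream must support mark/reset
    // (BufferedInputStream does); its read position is restored afterwards.
    static boolean looksLikeGzip(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(2);
        int first = in.read();
        int second = in.read();
        in.reset();
        return first == 0x1f && second == 0x8b;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the first bytes of a .tar.gz and a plain file
        InputStream gz = new ByteArrayInputStream(new byte[] {(byte) 0x1f, (byte) 0x8b, 8, 0});
        InputStream plain = new ByteArrayInputStream("plain data".getBytes());
        System.out.println(looksLikeGzip(gz));    // prints true
        System.out.println(looksLikeGzip(plain)); // prints false
    }
}
```

With a real archive you would pass the BufferedInputStream that wraps the FileInputStream, and add the GzipCompressorInputStream layer only when the check returns true.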

————————

import java.io.*;
import org.apache.commons.compress.archivers.tar.*;
// Only needed when the archive is gzip-compressed (.tar.gz):
import org.apache.commons.compress.compressors.gzip.*;

————————–

// Place these statements inside a method declared with "throws IOException".
File tarFile = new File("c:/test.tar");
File dest = new File("c:/temp/");

// Read the plain .tar file directly. For a gzip-compressed archive (.tar.gz),
// wrap the BufferedInputStream in a GzipCompressorInputStream first.
TarArchiveInputStream tarIn = new TarArchiveInputStream(
        new BufferedInputStream(
                new FileInputStream(tarFile)));

try {
    TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
    while (tarEntry != null) {
        // create a file with the same name as the tarEntry
        File destPath = new File(dest, tarEntry.getName());
        System.out.println("working: " + destPath.getCanonicalPath());
        if (tarEntry.isDirectory()) {
            destPath.mkdirs();
        } else {
            // make sure the parent directory exists before writing the file
            File parent = destPath.getParentFile();
            if (parent != null) {
                parent.mkdirs();
            }
            byte[] btoRead = new byte[2048];
            BufferedOutputStream bout =
                    new BufferedOutputStream(new FileOutputStream(destPath));
            try {
                int len;
                // read() returns -1 at the end of the current entry
                while ((len = tarIn.read(btoRead)) != -1) {
                    bout.write(btoRead, 0, len);
                }
            } finally {
                bout.close();
            }
        }
        tarEntry = tarIn.getNextTarEntry();
    }
} finally {
    tarIn.close();
}


——————————————–

Finally, close all the output streams and files you opened; that completes the program.
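One point worth considering: the extraction loop builds each destination with new File(dest, tarEntry.getName()), so an entry name such as "../outside.txt" would be written outside the target directory. The following is a minimal sketch, using only the JDK, of a check you could run on each entry name before writing. The class name SafeTarPaths and the method resolveEntry are hypothetical names for this illustration and are not part of Commons Compress.

```java
import java.io.File;
import java.io.IOException;

final class SafeTarPaths {

    // Hypothetical helper: resolves an entry name against the destination
    // directory and rejects names that would end up outside of it.
    static File resolveEntry(File destDir, String entryName) throws IOException {
        File target = new File(destDir, entryName);
        String destCanonical = destDir.getCanonicalPath();
        String targetCanonical = target.getCanonicalPath();
        // Compare against "dest/" so that a sibling like "dest-other" does not match
        String prefix = destCanonical.endsWith(File.separator)
                ? destCanonical
                : destCanonical + File.separator;
        if (!targetCanonical.equals(destCanonical) && !targetCanonical.startsWith(prefix)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        File dest = new File(System.getProperty("java.io.tmpdir"), "untar-demo");
        System.out.println(resolveEntry(dest, "docs/readme.txt"));
        try {
            resolveEntry(dest, "../outside.txt");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the loop, the result of resolveEntry(dest, tarEntry.getName()) would take the place of the directly constructed destPath.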

Thanks & Regards,

Vishwanth S

Senior ETL Developer.

 

1 Comment

Exception in thread "main" java.io.IOException: Input is not in the .gz format
    at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.init(GzipCompressorInputStream.java:149)
    at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.<init>(GzipCompressorInputStream.java:132)
    at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.<init>(GzipCompressorInputStream.java:97)
    at test.NewClass.main(NewClass.java:18)

This does not work with a tar file. Sample code there