ETL JOB FOR DOWNLOADING AND UNTARRING TAR FILE FROM FTP

Posted by Nikhilesh, in ETL


1. To download a file from FTP, we first have to create a connection by providing the credentials of the FTP server (host URL, username, password, port number, and the type of connection, FTP or SFTP). Component name: tFTPConnection
2. The next task is to provide the FTP file path, that is, the location of the files on the FTP server. Component name: tFTPFileList
3. Once that is done, we have to specify where to put the files, that is, the local system path where the FTP files should be saved. Component name: tFTPGet
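
For readers who want to see what these three settings amount to outside the job designer, here is a minimal plain-Java sketch. It is not what the tFTP components run internally. The class name FtpDownloadSketch and both method names are hypothetical, and the sketch assumes the Java runtime provides an ftp: URL handler (OpenJDK does). It covers plain FTP only, not SFTP.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
final class FtpDownloadSketch {

    // Steps 1 and 2: combine the connection details and the remote file path into one FTP URI.
    static URI buildFtpUri(String host, int port, String user, String password, String remotePath) {
        try {
            return new URI("ftp", user + ":" + password, host, port, remotePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid FTP settings", e);
        }
    }

    // Step 3: stream the remote file into the local folder and return the local path.
    static Path download(URI source, Path localDir) {
        String fileName = Paths.get(source.getPath()).getFileName().toString();
        Path target = localDir.resolve(fileName);
        try {
            Files.createDirectories(localDir);
            try (InputStream in = source.toURL().openStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

A call such as FtpDownloadSketch.download(FtpDownloadSketch.buildFtpUri(host, 21, user, password, "/incoming/data.tar.gz"), Paths.get("downloads")) would then save the file under the local downloads folder. The host, credentials, and paths here are placeholders.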

[Image: filedownload]

ETL JOB TO PROCESS TAR FILES AND ITERATE OVER THEM ONE BY ONE

1. To untar a tar file there is a component, tFileUnarchive, but instead of that I am using GzipCompressorInputStream (from Apache Commons Compress) through Java code in a tJava component.
2. Here we just need to drag and drop a tJava component and, in it, provide the location of the tar file and the path where you want to untar it:

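// Requires Apache Commons Compress on the job's classpath. In a tJava component,
// the imports go in the Import field under Advanced settings.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

File dest = new File(dirName);

// Open the gzip-compressed tar file as a stream of tar entries
TarArchiveInputStream tarIn = new TarArchiveInputStream(
    new GzipCompressorInputStream(
        new BufferedInputStream(new FileInputStream(TarName))));

TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
while (tarEntry != null) {
    // create a file with the same name as the tarEntry
    File destPath = new File(dest, tarEntry.getName());
    System.out.println("working: " + destPath.getCanonicalPath() + " -- Tar Entry: " + tarEntry.getName());

    // remember the folder of the extracted file for later components
    context.csvloc = destPath.getParent();
    System.out.println("\nCSV FILE Location ::::" + context.csvloc + "\n");

    if (!destPath.getParentFile().exists()) {
        System.out.println("Dest: " + dest);
        destPath.getParentFile().mkdirs();
    }

    if (tarEntry.isDirectory()) {
        System.out.println("Creating directory: " + tarEntry.getName());
        destPath.mkdirs();
    } else {
        byte[] btoRead = new byte[2048];
        // try-with-resources closes the output file even if a read fails
        try (BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(destPath))) {
            int len; // number of bytes read in the current chunk
            while ((len = tarIn.read(btoRead)) != -1) {
                bout.write(btoRead, 0, len);
            }
        }
    }
    tarEntry = tarIn.getNextTarEntry();
} // while loop ends here

tarIn.close();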

(This code opens the gzip-compressed tar file and untars it into the specified folder path.)

Here "dirName" denotes the folder into which the tar file is extracted and "TarName" denotes the path of the tar file.
3. For iteration, you can connect the tFTPGet component to this tJava component with an Iterate link. This way the tJava component gets one tar file at a time and processes it.
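
The one-file-at-a-time idea can also be sketched in plain Java, assuming the archives have already been downloaded into a local folder and end in .tar.gz. The class name TarIterationSketch and its methods are hypothetical. The handler stands in for the extraction code of the tJava component, and the Iterate link itself is not modelled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only.
final class TarIterationSketch {

    // Returns every *.tar.gz file directly inside downloadDir, sorted by path.
    static List<Path> findArchives(Path downloadDir) {
        List<Path> archives = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(downloadDir, "*.tar.gz")) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    archives.add(candidate);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(archives);
        return archives;
    }

    // Hands the archives to the handler one at a time and returns how many were processed.
    static int processEach(Path downloadDir, Consumer<Path> handler) {
        List<Path> archives = findArchives(downloadDir);
        for (Path archive : archives) {
            handler.accept(archive);
        }
        return archives.size();
    }
}
```

Sorting the list is only a design choice to make the processing order predictable. The order in which a directory stream returns files is otherwise unspecified.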

So, finally, the job flow looks similar to the picture below.

[Image: etlpic]
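
One caveat about the untar code: it builds each output path directly from the entry name stored in the archive, so an entry named like ../../something could be written outside the target folder. If the tar files come from a source you do not fully control, a small guard like the one below may help. This is a sketch, not part of the original job. The class name SafeEntryPathSketch and the method resolveInside are hypothetical.

```java
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class SafeEntryPathSketch {

    // Resolves entryName under destDir and rejects names that would land outside it.
    static Path resolveInside(Path destDir, String entryName) {
        Path base = destDir.toAbsolutePath().normalize();
        Path target = base.resolve(entryName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Entry escapes destination folder: " + entryName);
        }
        return target;
    }
}
```

In the extraction loop, the destination file could then be obtained with SafeEntryPathSketch.resolveInside(dest.toPath(), tarEntry.getName()).toFile() instead of new File(dest, tarEntry.getName()).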
