How to Get Multiple Files and Merge in Pentaho

Posted by admin, in Pentaho

Introduction: The Get file names step retrieves information about files on the file system. Each retrieved file name is added as a row to the stream.

Components used in Pentaho:

  1. Get file names
  2. CSV file input
  3. Text file output

Steps to get the files:

The “Get file names” component is used to get the files from a particular directory. It is available at the transformation level.

Drag and drop the Get file names component onto your canvas and double-click it. The dialog shows the File and Filter tabs.

File Tab:

The File tab has the following fields:

  1. File or directory.
  2. Regular expression.
  3. Exclude regular expression.
  4. Selected files.

File or directory path:

This field defines the location of the files whose names you want to retrieve. In the File tab, browse to the file or directory and click Add.

After you click Add, the path appears in the Selected files list. You can add multiple paths this way.

Enter the path to the folder in the File or directory box. The Regular expression box is beside it; there you can enter a regular expression that matches the names of the files you want. For the example above, I would enter “.+\.txt”.
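
To get a feel for what a pattern such as “.+\.txt” selects, the filtering can be approximated outside Pentaho. The sketch below is only an illustration in Python, not how the Get file names step is implemented: the helper name `list_matching` is made up for this example, and it assumes the pattern must match the whole file name.

```python
import re
import tempfile
from pathlib import Path


def list_matching(directory, pattern):
    """Return the sorted names of files in `directory` whose whole name matches `pattern`.

    Hypothetical helper for illustration; it assumes a full-name match.
    """
    regex = re.compile(pattern)
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and regex.fullmatch(entry.name)
    )


if __name__ == "__main__":
    # Build a throwaway folder with a few sample files and list the .txt ones.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt", "notes.csv"):
            (Path(tmp) / name).write_text("demo\n")
        print(list_matching(tmp, r".+\.txt"))
```

Under that assumption the pattern picks up every file ending in .txt and skips the rest, which makes it a quick way to test an expression before typing it into the Regular expression box.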

Click Preview rows to see the following fields in the output:

  1. filename – the complete filename, including the path (/tmp/kettle/somefile.txt)
  2. short_filename – only the filename, without the path (somefile.txt)
  3. path – only the path (/tmp/kettle/)
  4. type
  5. exists
  6. ishidden
  7. isreadable
  8. iswriteable
  9. lastmodifiedtime
  10. size
  11. extension
  12. uri
  13. rooturi
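
As a rough illustration of what these fields contain, the sketch below builds a similar row for one file in Python. It is an approximation under stated assumptions, not Pentaho's own logic: the function name `describe_file` is hypothetical, the hidden-file check uses the Unix leading-dot convention, the extension is assumed to be reported without the dot, and rooturi is left out.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def describe_file(file_path):
    """Build a dict that loosely mirrors the fields listed above (illustrative only)."""
    p = Path(file_path).resolve()
    exists = p.exists()
    stat = p.stat() if exists else None
    return {
        "filename": str(p),                    # full path including the file name
        "short_filename": p.name,              # file name without the path
        "path": str(p.parent),                 # containing folder
        "type": "folder" if p.is_dir() else "file",
        "exists": exists,
        "ishidden": p.name.startswith("."),    # assumption: Unix-style hidden files
        "isreadable": os.access(p, os.R_OK),
        "iswriteable": os.access(p, os.W_OK),
        "lastmodifiedtime": datetime.fromtimestamp(stat.st_mtime) if stat else None,
        "size": stat.st_size if stat else None,  # size in bytes
        "extension": p.suffix.lstrip("."),     # assumption: no leading dot
        "uri": p.as_uri(),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "somefile.txt"
        sample.write_text("hello")
        for key, value in describe_file(sample).items():
            print(f"{key}: {value}")
```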

CSV FILE INPUT:

Drag and drop the CSV file input component onto the canvas and connect the Get file names component to it. Double-click the CSV file input component, select the filename field in the “The file name field” box, and click Get fields.

TEXT FILE OUTPUT:

Connect the CSV file input component to the Text file output component, give the path where you want the output file written, and click Get fields.

Run the transformation and see your output in the path you gave in the Text file output component.
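
The overall effect of the transformation, reading every listed file and appending its rows to a single output, can be sketched in Python for comparison. This is a minimal sketch and not what Pentaho runs internally: the function name `merge_csv_files` is hypothetical, and it assumes every input file has a header row and the same column layout.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_files(input_paths, output_path):
    """Append the data rows of each input file to one output file, writing the header once.

    Returns the number of data rows written. Illustrative helper only.
    """
    header_written = False
    rows_written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in input_paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "first.csv"
        second = Path(tmp) / "second.csv"
        first.write_text("id,name\n1,a\n")
        second.write_text("id,name\n2,b\n")
        merged = Path(tmp) / "merged.txt"
        print(merge_csv_files([first, second], merged), "rows written")
        print(merged.read_text())
```

In this sketch, passing the same path twice writes that file's rows twice, since each entry in the list is read independently.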

Input file data:

I added the same path twice in the Selected files box, so the output should contain the merged data from both entries.

OUTPUT OF THE TRANSFORMATION:
Thank You
Bolle Vani
Helical IT Solutions Pvt Ltd
