Introduction: The Get File Names step retrieves information about files on the file system. The retrieved file names are added as rows to the stream.
Components used in Pentaho:
- Get file names
- CSV file input
- Text file output
Steps to get the files:
Here, the “Get file names” component is used to list the files in a particular directory. The component is available at the transformation level.
Drag and drop the Get file names component onto your canvas and double-click it. The dialog shows the File and Filter tabs.
The File tab has the following fields.
- File or directory.
- Regular expression.
- Exclude regular expression.
- Selected files.
File or directory path:
This field defines the location of the files whose names you want to retrieve. In the File tab, browse for the file and click Add.
After clicking Add, the file path appears in the Selected files box. You can add multiple paths.
To read a whole folder, enter the folder path in the File or directory box and enter a regular expression for the file names in the Regular expression box beside it. For example, to pick up all text files, I would enter “.+\.txt”.
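To make the effect of “.+\.txt” concrete, here is a minimal Python sketch of the filtering. It is not how Pentaho implements the step. It assumes the expression has to match the whole file name, and the file names in the list are hypothetical.

```python
import re

# The wildcard from the text: one or more characters followed by ".txt".
pattern = re.compile(r".+\.txt")

# Hypothetical file names, used only for illustration.
candidates = ["somefile.txt", "notes.txt", "data.csv", "report.txt.bak"]

# Assumption: the expression must match the entire name, so fullmatch is used
# rather than search.
matching = [name for name in candidates if pattern.fullmatch(name)]

print(matching)
```

Under that assumption, only the names that end in .txt are kept, and a name such as report.txt.bak is rejected because the match has to cover the full name.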
Click Preview rows to see the output, which includes the following fields.
- filename – the complete filename, including the path (/tmp/kettle/somefile.txt)
- short_filename – only the filename, without the path (somefile.txt)
- path – only the path (/tmp/kettle/)
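The relationship between these three fields can be sketched in Python. This only illustrates how the values relate to each other. The helper name `describe` is hypothetical and is not part of Pentaho.

```python
import posixpath


def describe(full_path):
    """Split a full path into the three values described above (hypothetical helper)."""
    directory, short_name = posixpath.split(full_path)
    # Keep the trailing slash so the value looks like the example in the list.
    if not directory.endswith("/"):
        directory += "/"
    return {
        "filename": full_path,         # complete name, including the path
        "short_filename": short_name,  # name only, without the path
        "path": directory,             # path only
    }


row = describe("/tmp/kettle/somefile.txt")
print(row)
```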
CSV FILE INPUT:
Drag and drop the CSV file input component onto the canvas and connect the Get file names component to it. Double-click the CSV file input component, select the filename field in the “The file name field” box, and click Get fields.
TEXT FILE OUTPUT:
Connect the CSV file input component to the Text file output component, give the path where you want the output written, and click Get fields.
Run the transformation and check the output at the path you gave in the Text file output component.
Input file data:
I added the same path twice in the Selected files box, so the output should merge the data read from both entries.
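The merge can be sketched outside Pentaho as follows. This is a rough Python equivalent under stated assumptions, not the transformation itself: each input file is assumed to start with a header row, the function name `merge_csv_files` is hypothetical, and the sample data is made up for the demonstration.

```python
import csv
import os
import tempfile


def merge_csv_files(paths, output_path):
    """Append the data rows of every file in `paths` to one output file."""
    header = None
    merged = []
    for path in paths:
        with open(path, newline="") as handle:
            reader = csv.reader(handle)
            file_header = next(reader, None)  # assumption: first row is a header
            if header is None:
                header = file_header
            merged.extend(reader)
    with open(output_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        if header is not None:
            writer.writerow(header)
        writer.writerows(merged)
    return merged


# Made-up sample file with a header and two data rows.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "input.csv")
with open(source, "w", newline="") as handle:
    csv.writer(handle).writerows([["id", "name"], ["1", "a"], ["2", "b"]])

# Listing the same path twice mirrors the Selected files setup described above.
rows = merge_csv_files([source, source], os.path.join(workdir, "merged.txt"))
print(len(rows))
```

Because the same file is listed twice, its data rows appear twice in the merged result.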
OUTPUT OF THE TRANSFORMATION:
Helical IT Solutions Pvt Ltd