Introduction: When working on ETL flows, it's sometimes useful to store information in temporary files, as long as you clean those files up when you are finished. Pentaho Data Integration (aka Kettle or PDI) has two steps for deleting files: one handles a single file, and one handles multiple files. Both are in the File Management section of the Design node in the Job designer.
Components used in job:
- Delete files
Here, the “Delete files” component deletes files from a specified directory.
Steps to delete files:
- If you have a single file that you need to delete as part of your workflow and you use the same name over and over every time your job runs, then deleting it is pretty straightforward. For a single file, drag and drop the Delete a file component onto your canvas.
- Open the step and enter a name for it. If the file is created or stored on the computer you are developing on, you can navigate to it by clicking the Browse button (file).
- Enable the Include subfolders option if you also want to delete files stored in the subfolders.
- In this example, the directory contains three files, and the one to delete is C:\test\Test1\file1.txt.
- Run the job, then look for the file in the directory (C:\test\Test1\file1.txt).
- The file no longer appears in the directory because it was deleted when the job ran.
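To make the expected result concrete, here is a minimal Python sketch of the same check done outside PDI. It is not how PDI works internally. It only mirrors the walkthrough: three files in a folder, one deleted, two left. The function names and the file names file2.txt and file3.txt are my own assumptions, and the sketch uses a throwaway temporary folder rather than C:\test\Test1.

```python
import tempfile
from pathlib import Path


def delete_single_file(path):
    """Delete one file; return True if it was removed, False if it was not there."""
    target = Path(path)
    if not target.is_file():
        return False
    target.unlink()
    return True


def demo_single_delete():
    """Create three files in a temporary folder, delete one, and list what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        # file2.txt and file3.txt are hypothetical names for the other two files.
        for name in ("file1.txt", "file2.txt", "file3.txt"):
            (folder / name).write_text("temporary data")
        removed = delete_single_file(folder / "file1.txt")
        remaining = sorted(p.name for p in folder.iterdir())
        return removed, remaining


if __name__ == "__main__":
    print(demo_single_delete())
```

Returning False for a missing file instead of raising an error is a design choice in this sketch, so a cleanup run does not fail just because the file was already gone.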
Delete multiple files with a pattern:
Some of the workflows I create generate temporary files with the same base name but append a date to the end of the file name. For example, there may be a file called “testfile20140403.txt” generated on one day and one called “testfile20140404.txt” generated on the next. To keep the workflow reusable, I need to use the Wildcard box in the Delete files step.
Enter the path to the folder in the File/Folder box. In the box underneath it, enter the regular expression for the file name. For the example above, I would enter “.+\.txt”, which matches every file name ending in .txt.
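Before pointing a delete step at a real folder, it can help to check which names a pattern would catch. The sketch below does that in Python. PDI evaluates the Wildcard box with Java regular expressions, but these simple patterns behave the same way in Python's re module. I am assuming the pattern has to match the whole file name. The sample names notes.txt and testfile20140403.csv, and the tighter pattern testfile\d{8}\.txt, are my own illustrations and not part of the job above.

```python
import re

# The pattern from the article: any file name ending in .txt.
BROAD_PATTERN = r".+\.txt"
# Hypothetical tighter alternative: only the dated testfile names.
DATED_PATTERN = r"testfile\d{8}\.txt"

# Hypothetical folder listing used to try the patterns out.
SAMPLE_NAMES = [
    "testfile20140403.txt",
    "testfile20140404.txt",
    "notes.txt",
    "testfile20140403.csv",
]


def matching_names(pattern, names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


if __name__ == "__main__":
    print(matching_names(BROAD_PATTERN, SAMPLE_NAMES))
    print(matching_names(DATED_PATTERN, SAMPLE_NAMES))
```

The broad pattern also catches notes.txt, so if the folder holds text files you want to keep, a narrower pattern like the dated one may be the safer choice.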
After running the job, you will see no text files in the selected directory.
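The same before-and-after check can be sketched in Python against a temporary folder. This is only an approximation of the behaviour described here, not PDI's implementation. The include_subfolders flag is my assumption about what the Include subfolders option does, and the names keep.csv and sub/nested.txt are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def delete_matching(folder, pattern, include_subfolders=False):
    """Delete files whose full name matches the regex; return the deleted names."""
    compiled = re.compile(pattern)
    root = Path(folder)
    candidates = root.rglob("*") if include_subfolders else root.iterdir()
    deleted = []
    # Materialize the listing first so deleting does not disturb the iteration.
    for path in list(candidates):
        if path.is_file() and compiled.fullmatch(path.name):
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)


def demo_pattern_delete():
    """Build a small folder tree, delete the .txt files at the top level, list the rest."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        for rel in ("testfile20140403.txt", "testfile20140404.txt",
                    "keep.csv", "sub/nested.txt"):
            (root / rel).write_text("temporary data")
        deleted = delete_matching(root, r".+\.txt")
        left = sorted(p.relative_to(root).as_posix()
                      for p in root.rglob("*") if p.is_file())
        return deleted, left


if __name__ == "__main__":
    print(demo_pattern_delete())
```

With include_subfolders left at False, sub/nested.txt survives. Passing True would remove it as well, which is why that option deserves a second look before a job goes into regular use.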
Checking for the files in the directory
Helical IT Solutions Pvt Ltd