How to Delete Files in Pentaho

Posted by admin, in Pentaho

Introduction: When working on ETL flows, it's sometimes useful to store information in temporary files, as long as you clean those files up when you are finished. Pentaho Data Integration (aka Kettle or PDI) has two steps for deleting files: one handles a single file, and one handles multiple files. Both are in the File Management section of the Design node in the Job designer.

Components used in job:

  1. Start
  2. Delete files
  3. Success

Here, the “Delete files” component deletes files from a given directory.

Steps to delete files:

  1. If you need to delete a single file as part of your workflow, and the file has the same name every time your job runs, deleting it is straightforward. Drag and drop the Delete a file component onto your canvas.
  2. Open the step and enter a name for it. If the file is stored on the computer you are developing on, you can navigate to it by clicking the Browse button (File).
  3. Enable the Include subfolders option if you also want to delete files that are stored in subfolders.
  4. In this example, the directory C:\test\Test1 contains three files, one of which is the file to delete (C:\test\Test1\file1.txt).
  5. Run the job, then look for the file in the directory (C:\test\Test1\file1.txt).
  6. The file is no longer in the directory, because the job deleted it.
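For readers who want to see the same idea outside the PDI designer, the behaviour described in the steps above can be sketched in Python. This is only an illustration, not how PDI implements the step: the `delete_target` helper and its `include_subfolders` parameter are hypothetical names chosen for this sketch, and the demo works in a throwaway temporary directory rather than in C:\test\Test1.

```python
import tempfile
from pathlib import Path


def delete_target(target, include_subfolders=False):
    """Delete one file, or the files inside a folder.

    Hypothetical helper for illustration only (not part of PDI).
    Returns the sorted names of the files that were deleted.
    """
    target = Path(target)
    if target.is_file():
        target.unlink()
        return [target.name]
    # For a folder, rglob also descends into subfolders; glob does not.
    walker = target.rglob("*") if include_subfolders else target.glob("*")
    deleted = []
    for path in list(walker):  # take a snapshot first, then delete
        if path.is_file():
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)


# Demo in a temporary directory so that no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "Test1"
    (folder / "sub").mkdir(parents=True)
    for name in ("file1.txt", "file2.txt", "sub/file3.txt"):
        (folder / name).write_text("temporary data")

    single = delete_target(folder / "file1.txt")            # one named file
    file1_exists = (folder / "file1.txt").exists()
    top_level = delete_target(folder)                        # folder, no subfolders
    nested = delete_target(folder, include_subfolders=True)  # folder plus subfolders

    print(single, file1_exists, top_level, nested)
```

In the sketch, the file in `sub` survives until the helper is called with `include_subfolders=True`, which mirrors the purpose of the Include subfolders option.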

Delete multiple files with a pattern:

Some of the workflows I create generate temporary files with the same basic name, but they append a date to the end of the file name. For example, there may be a file called “testfile20140403.txt” generated on one day and one called “testfile20140404.txt” generated on the next. In order to have a reusable workflow, I need to use the Wildcard box in the Delete files step.

Enter the path to the folder in the File\Folder box. In the Wildcard box underneath it, enter a regular expression (RegEx) for the file names. For the example above, I would enter “.+\.txt”, which matches every file whose name ends in .txt.
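Before running a delete job, it can help to check which file names a pattern will actually match. The sketch below does this in Python as a stand-in. It assumes that PDI compares the wildcard against the whole file name as a Java regular expression, and these simple patterns behave the same way in Python's `re` module. The `matching` helper, the extra sample names, and the stricter pattern `testfile\d{8}\.txt` are illustrative assumptions, not part of the original example.

```python
import re

# The two dated files from the example, plus two names added for contrast.
names = [
    "testfile20140403.txt",
    "testfile20140404.txt",
    "notes.txt",
    "testfile20140403.csv",
]


def matching(wildcard, candidates):
    """Return the names that the regular expression matches in full."""
    pattern = re.compile(wildcard)
    return [name for name in candidates if pattern.fullmatch(name)]


# The pattern from the article: any name ending in .txt.
broad = matching(r".+\.txt", names)

# A stricter, hypothetical alternative: "testfile" + 8 digits + ".txt".
narrow = matching(r"testfile\d{8}\.txt", names)

print(broad)
print(narrow)
```

The broad pattern also selects `notes.txt`, so if the folder holds other text files that must be kept, a narrower pattern of this kind may be the safer choice.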

After you run the job, check the selected directory: it no longer contains any text files.

Thank You
Vani
BI Developer
Helical IT Solutions Pvt Ltd
