Loops in Pentaho Data Integration 2.0

Posted by Sohail, in Pentaho

Hi guys, after my previous blog on Looping in PDI, I received some feedback on how my approach could be simplified. I decided to look into it and, to my surprise, I was able to achieve the looping with far fewer components and transformations. Please refer to the previous blog, Looping in PDI, for my use case; this post shows how to achieve the same result with a simpler approach.

Make data easy with Helical Insight.
Helical Insight is the world’s best open source business intelligence tool.

In summary, I have a query that returns unique rows, and for every row of that result I would like to write a separate output.

Hence without further delay let’s dig in.

 

I am using one Job, and within the Job I have two transformations:

one transformation to get my data via a query, and another transformation that runs once for each row of the query result.

[Screenshot: Loop in PDI 2.0]

 

Let's look at our first transformation, getData.

[Screenshot: Transformation getData]

It consists of a Table Input step that runs my query, shown below:

SELECT code, value,
       CASE WHEN grp ILIKE '%group%'
            THEN replace(grp, 'group', 'Group ') END AS grp
FROM (
    SELECT 1 AS code, 'One' AS value, 'group1' AS grp
    UNION
    SELECT 2 AS code, 'Two' AS value, 'group1' AS grp
    UNION
    SELECT 3 AS code, 'Three' AS value, 'group1' AS grp
    UNION
    SELECT 4 AS code, 'Four' AS value, 'group2' AS grp
    UNION
    SELECT 5 AS code, 'Five' AS value, 'group2' AS grp
    UNION
    SELECT 6 AS code, 'Six' AS value, 'group3' AS grp
) base
ORDER BY code, grp

The query returns 6 rows, so I expect 6 output text (or CSV) files, one per row.
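To see what the Table Input step hands to the next step, here is a minimal sketch that runs the same query from Python. It is only an illustration under assumptions: it uses an in-memory SQLite database instead of the database behind PDI, and SQLite's `LIKE` (case-insensitive for ASCII) stands in for `ILIKE`, which SQLite does not have. The names `QUERY` and `fetch_rows` are made up for this example.

```python
import sqlite3

# Same query as in the Table Input step. Assumption: LIKE replaces ILIKE,
# because SQLite has no ILIKE and its LIKE ignores ASCII case by default.
QUERY = """
SELECT code, value,
       CASE WHEN grp LIKE '%group%'
            THEN replace(grp, 'group', 'Group ') END AS grp
FROM (
    SELECT 1 AS code, 'One' AS value, 'group1' AS grp
    UNION SELECT 2, 'Two', 'group1'
    UNION SELECT 3, 'Three', 'group1'
    UNION SELECT 4, 'Four', 'group2'
    UNION SELECT 5, 'Five', 'group2'
    UNION SELECT 6, 'Six', 'group3'
) base
ORDER BY code, grp
"""


def fetch_rows():
    """Run the query on an in-memory SQLite database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


rows = fetch_rows()
for row in rows:
    print(row)          # e.g. (1, 'One', 'Group 1')
print(len(rows), "rows")
```

Each tuple corresponds to one row that will later get its own output file.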

Next, we send our result set to a Copy Rows to Result step,

and we are done with this transformation.

Next, we move on to the second transformation, Execute For Every Row/Loop.

[Screenshot: Execute For Every Row/Loop]

Here we read the copied rows with Get Rows from Result and send the row values straight to a Text file output step. There is also a Delay Row step, so that each execution of the transformation gets a unique timestamp.


[Screenshot: Unique Timestamps]

The unique timestamps let us tell the output files apart. (This is done only for this blog; a better way would be to take a unique column from your data and use it as the file name.)
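The idea behind this transformation can be sketched in Python. This is not how PDI implements its steps; it is a small model under assumptions: the file name pattern `output_<timestamp>.txt`, the `;` separator, and the function names `write_row` and `write_row_by_key` are all hypothetical. The sleep plays the role of the Delay Row step, and the second function shows the alternative of naming the file after a unique column.

```python
import tempfile
import time
from datetime import datetime
from pathlib import Path


def write_row(row, out_dir, delay_seconds=10.0):
    """Wait, then write one row to its own timestamped text file."""
    time.sleep(delay_seconds)  # stands in for the Delay Row step
    # Microseconds are included so that short delays still give unique names.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    path = Path(out_dir) / f"output_{stamp}.txt"  # hypothetical naming scheme
    path.write_text(";".join(str(v) for v in row) + "\n", encoding="utf-8")
    return path


def write_row_by_key(row, out_dir):
    """Alternative: name the file after a unique column (here the first one)."""
    path = Path(out_dir) / f"output_{row[0]}.txt"
    path.write_text(";".join(str(v) for v in row) + "\n", encoding="utf-8")
    return path


# Small demo with a short delay so it finishes quickly.
with tempfile.TemporaryDirectory() as demo_dir:
    first = write_row((1, "One", "Group 1"), demo_dir, delay_seconds=0.05)
    second = write_row((2, "Two", "Group 1"), demo_dir, delay_seconds=0.05)
    keyed = write_row_by_key((3, "Three", "Group 1"), demo_dir)
    print(first.name, second.name, keyed.name)
```

With a unique column in the file name, no delay is needed, because the names can no longer collide.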

Now, to achieve our goal, we go back to the main Job and double-click the Execute For Every Row/Loop transformation entry.

[Screenshot: Execute for Every Row/Loop Transformation]

Then we enable the Execute for every input row option.
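What this option changes can be modeled in a few lines of Python. This is a simplified sketch of the behavior as I understand it, not PDI code: `run_transformation` and `run_job_entry` are invented names, and each "run" simply returns the rows that Get Rows from Result would see in that execution.

```python
def run_transformation(result_rows):
    """Stand-in for one execution of the loop transformation.

    Returns the rows this single execution receives.
    """
    return list(result_rows)


def run_job_entry(rows, execute_for_every_input_row):
    """Return one entry per execution of the transformation."""
    if execute_for_every_input_row:
        # Option enabled: one execution per row, each seeing a single row.
        return [run_transformation([row]) for row in rows]
    # Option disabled: a single execution that sees all rows at once.
    return [run_transformation(rows)]


rows = [(1, "One"), (2, "Two"), (3, "Three"),
        (4, "Four"), (5, "Five"), (6, "Six")]

looped = run_job_entry(rows, execute_for_every_input_row=True)
single = run_job_entry(rows, execute_for_every_input_row=False)

print(len(looped), "executions with the option enabled")
print(len(single), "execution with the option disabled")
```

In this model, enabling the option is what turns one run over six rows into six runs over one row each, which is why six separate files appear.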

Here is my output folder before the run:

[Screenshot: output folder]

And here it is after running the job with a delay of 10 seconds for each row:

[Screenshot: delay of 10secs for each row]

Each file should now contain a different row of my query result:

[Screenshot: different lines of my query]
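If you prefer to check this without opening every file, a small script can do it. The sketch below is an assumption-laden helper, not part of PDI: the `*.txt` pattern and the function names `read_outputs` and `all_files_differ` are hypothetical, and the demo builds its own temporary folder rather than reading a real output folder.

```python
import tempfile
from pathlib import Path


def read_outputs(folder, pattern="*.txt"):
    """Map each output file name to its list of non-empty lines."""
    outputs = {}
    for path in sorted(Path(folder).glob(pattern)):
        lines = path.read_text(encoding="utf-8").splitlines()
        outputs[path.name] = [line for line in lines if line.strip()]
    return outputs


def all_files_differ(outputs):
    """True when no two files have exactly the same content."""
    contents = [tuple(lines) for lines in outputs.values()]
    return len(contents) == len(set(contents))


# Demo on a temporary folder with two made-up files (header plus one row).
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "a.txt").write_text("code;value;grp\n1;One;Group 1\n", encoding="utf-8")
    Path(demo_dir, "b.txt").write_text("code;value;grp\n2;Two;Group 1\n", encoding="utf-8")
    demo_outputs = read_outputs(demo_dir)
    print(len(demo_outputs), "files, all different:", all_files_differ(demo_outputs))
```

Point `read_outputs` at your own output folder to run the same check on the real files.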


Do comment below if you find this approach simpler, or if you would like another blog or example on any PDI-related topic.


If you have any queries, please reach us at support@helicaltech.com

Thank You

Sohail Ehizogie Izebhijie

Helical IT Solutions Pvt Ltd
