Hi guys, I got some feedback on my previous blog, Looping in PDI, about how my approach could be simplified. So I decided to look into it and, to my surprise, I was able to achieve the looping with very few components and transformations. Please do refer to the previous blog, Looping in PDI, to see my use case; this post shows how to achieve it with a simpler approach.
In summary, I have a query whose rows are unique, and for every row of that query I would like to write a separate output.
Hence without further delay let’s dig in.
So I'm using a Job, and within the Job I have two transformations:
one transformation to get my data via a query, and the other transformation to loop over each row of the query result.
Let's look at our first transformation, getData.
It consists of a Table Input step that runs my query, as shown below.
SELECT code,
       value,
       CASE WHEN grp ILIKE '%group%' THEN replace(grp, 'group', 'Group ') END AS grp
FROM (
    SELECT 1 AS code, 'One' AS value, 'group1' AS grp
    UNION
    SELECT 2 AS code, 'Two' AS value, 'group1' AS grp
    UNION
    SELECT 3 AS code, 'Three' AS value, 'group1' AS grp
    UNION
    SELECT 4 AS code, 'Four' AS value, 'group2' AS grp
    UNION
    SELECT 5 AS code, 'Five' AS value, 'group2' AS grp
    UNION
    SELECT 6 AS code, 'Six' AS value, 'group3' AS grp
) base
ORDER BY code, grp
The query returns 6 rows, so this means I should end up with 6 text (or CSV) files, one per row.
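As a quick sanity check on that count, here is a minimal Python sketch of what the query should return. It is not PDI code, and the names BASE_ROWS and expected_rows are made up for this illustration. It assumes replace() behaves as a plain case-sensitive substring replacement, so 'group1' becomes 'Group 1'.

```python
# Hypothetical stand-in for the Table Input query; not PDI code.
BASE_ROWS = [
    (1, "One", "group1"),
    (2, "Two", "group1"),
    (3, "Three", "group1"),
    (4, "Four", "group2"),
    (5, "Five", "group2"),
    (6, "Six", "group3"),
]


def expected_rows(base_rows):
    """Mimic the CASE/replace logic and the ORDER BY of the query."""
    rows = []
    for code, value, grp in base_rows:
        # ILIKE '%group%' is a case-insensitive match; rows that do not
        # match would get NULL (None here), as the CASE has no ELSE branch.
        matches = "group" in grp.lower()
        new_grp = grp.replace("group", "Group ") if matches else None
        rows.append((code, value, new_grp))
    # ORDER BY code, grp (None sorted as an empty string in this sketch).
    return sorted(rows, key=lambda r: (r[0], r[2] or ""))


rows = expected_rows(BASE_ROWS)
for row in rows:
    print(row)
```

Six tuples come out, which is why six output files are expected further down.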
Next, we send our result set to Copy Rows to Result
And we are done with this transformation
Next, we go to the second transformation, Execute For Every Row/Loop.
Here we pick up the copied row set with Get Rows from Result and send the row values straight to the Text file Output. There is also a Delay Row step because, even when we run the whole transformation, we want each file to get a unique timestamp so we can differentiate our outputs. (This is done just for this blog; a better way would be to take a unique column from your output and use it as the file name.)
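To see why the delay matters when the file name depends on a timestamp, and why a unique column is the better choice, here is a small Python sketch. It is only an analogy, not how PDI builds file names: the functions timestamp_name and code_name and the output_ prefix are assumptions made for illustration.

```python
from datetime import datetime


def timestamp_name(now):
    # Second-resolution timestamp in the file name (illustrative format only).
    return f"output_{now:%Y%m%d_%H%M%S}.txt"


def code_name(row):
    # Assumes 'code' is unique per row, as it is in the sample query.
    return f"output_{row['code']}.txt"


rows = [{"code": c} for c in range(1, 7)]

# Pretend all six rows are processed within the same second (no delay).
same_second = datetime(2024, 1, 1, 12, 0, 0)
by_timestamp = {timestamp_name(same_second) for _ in rows}
by_code = {code_name(row) for row in rows}

print(len(by_timestamp))  # distinct names when relying on the timestamp
print(len(by_code))       # distinct names when relying on the unique column
```

Without a delay, all six rows map to the same timestamped name, while naming by the unique column gives six distinct files with no waiting.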
Now, to achieve our goal, we need to go to our main Job, double-click the Execute For Every Row/Loop transformation,
and enable the option Execute for every input row.
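Conceptually, that option changes how many times the second transformation runs. The Python sketch below is a rough model under that assumption, not PDI internals: run_transformation is a hypothetical stand-in for the second transformation, and the file names are invented.

```python
import tempfile
from pathlib import Path


def run_transformation(result_rows, out_dir, name):
    """Stand-in for the second transformation: write the rows it receives to one file."""
    path = Path(out_dir) / f"{name}.txt"
    lines = [",".join(map(str, row)) for row in result_rows]
    path.write_text("\n".join(lines) + "\n")
    return path


rows = [
    (1, "One", "Group 1"), (2, "Two", "Group 1"), (3, "Three", "Group 1"),
    (4, "Four", "Group 2"), (5, "Five", "Group 2"), (6, "Six", "Group 3"),
]

with tempfile.TemporaryDirectory() as tmp:
    # Option off: a single run that receives all six rows.
    single = run_transformation(rows, tmp, "all_rows")
    single_lines = single.read_text().splitlines()

    # Option on: one run per row, each run receiving only that row.
    per_row = [run_transformation([row], tmp, f"row_{row[0]}") for row in rows]
    per_row_lines = [p.read_text().splitlines() for p in per_row]

print(len(single_lines))   # lines in the single file
print(len(per_row_lines))  # number of per-row files
```

With the option off you get one file holding every row; with it on, the same work produces one file per row, which is the behaviour this blog is after.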
Here is my output folder.
And here it is after I run the Job with a delay of 10 seconds for each row.
Each file should now contain a different row of my query.
Do comment below if you find this simpler, and if you need another blog or example on any PDI-related topic, do drop a comment.
Sohail Ehizogie Izebhijie