Get the Row Count in PDI Dynamically
Often people use the data input component in pentaho with count(*) select query to get the row counts. And pass the row count value from the source query to the variable and use it in further transformations.The more optimised way to do so can be through the built in number of options available in the pentaho. One of such component available in the penatho is ‘Memory Group By’.
In this blog, I’ll discuss how we can use the ‘Memory Group By’ component available in the pentaho to count the rows.
Here is an example, Suppose you have a ready query to pull records from the database then what you can do is pass the record value coming from the input component to the memory group by component as shown below:
Get the ‘Memory Group by’ component from: Design->Statistics-> Memory Group by
After that pass the count records from ‘Memory Group by’ component to the ‘Set Variables’.
Click on the ‘Memory Group by’ and in ‘Aggregates’, write the name as ‘CR_count’
In subject put the column name on which the records should be grouped and give the final count.
In type, from the drop down list select the option shown below as ‘Number of rows (without field argument)’ to get the records count.
Pass the output from memory group by option to the Get Variables, there configure the variable with the same name as you wrote in ‘Memory Group by’ i.e ‘CR_count’ overhere we have used.
Finally, we can use the records count via variable throughout the job by accessing the variable.
Best Open Source Business Intelligence Software Helical Insight is Here
A Business Intelligence Framework