Implementing Loops in Pentaho Data Integration

Posted on September 30, 2015 by Nikhilesh, in Business Intelligence, Open Source Business Intelligence, Pentaho

Issues while implementing loops in Pentaho Data Integration

Batch processing in Pentaho ETL jobs is generally implemented with the looping concept that Pentaho provides. Loops in PDI are supported only in jobs (kjb), not in transformations (ktr).


While implementing loops in PDI, we came across many blogs suggesting the “Wait For” step, with its output hop joined back to a previous step. See the screenshot below for clarification.

However, the limitation of this kind of looping is that it causes recursive stack allocation by the JVM during job execution, so the system may run out of memory and the job may crash after a high number of iterations (depending on the system's available memory). It is therefore not advisable to run a high number of iterations when implementing loops in PDI.
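The effect can be pictured with a small Python analogy. This is not PDI code and says nothing about Kettle's internals; it only assumes, as described above, that each pass of the loop is started from inside the previous one. The function names (run_iteration, loop_by_recursion, loop_flat, survives) are hypothetical, and Python reports the problem as a RecursionError rather than a JVM memory failure.

```python
def run_iteration(i):
    """Hypothetical stand-in for the work done in one pass of the loop."""
    return i + 1


def loop_by_recursion(i, limit):
    # Each pass starts the next pass before it returns, so every
    # iteration leaves one more frame on the call stack.
    if i >= limit:
        return i
    return loop_by_recursion(run_iteration(i), limit)


def loop_flat(limit):
    # A flat loop reuses a single frame, so its depth does not grow
    # with the number of iterations.
    i = 0
    while i < limit:
        i = run_iteration(i)
    return i


def survives(loop, *args):
    """Return True if the loop finishes, False if the stack is exhausted."""
    try:
        loop(*args)
        return True
    except RecursionError:
        return False
```

With a small iteration count both variants finish. With a large count such as 100,000 and the interpreter's default recursion limit, only the flat loop is expected to complete, which mirrors the way the hop-based loop degrades as iterations pile up.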

Possible Solutions:

1. The first step is to minimize the number of iterations. The looping works properly up to 500 iterations, so try to keep the count below 500.

2. Never use loops for scheduling. A loop used for scheduling purposes turns into an infinite loop, which crashes the whole program.

3. Increase your batch size so that the number of iterations is lower. Take this into consideration when implementing external batch processing.

4. For incrementing the value, it is advisable to use a separate transformation instead of JavaScript, because JavaScript consumes more memory than a separate transformation. Create a new transformation, use the Formula step to increment the values, and then set those variables.

5. Suggested approach for infinite looping: one possible way is to use the settings of the ‘Start’ step. Set the ‘Repeat’ flag and add an interval configuration. This causes the job to be re-initialized completely as a new instance and does not cause any memory issues.
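The relationship between batch size and iteration count in points 1 and 3 is simple arithmetic, sketched below in Python. The 500-iteration ceiling is the figure quoted in this post, not a PDI setting, and the function names (iterations_needed, min_batch_size) are hypothetical helpers for illustration only.

```python
MAX_ITERATIONS = 500  # ceiling quoted in this post, not a PDI constant


def iterations_needed(total_rows, batch_size):
    """Number of loop passes required to process total_rows in batches."""
    # Integer ceiling division: a partial last batch still costs one pass.
    return (total_rows + batch_size - 1) // batch_size


def min_batch_size(total_rows, max_iterations=MAX_ITERATIONS):
    """Smallest batch size that keeps the loop within max_iterations."""
    return max(1, (total_rows + max_iterations - 1) // max_iterations)
```

For example, one million rows in batches of 1,000 would need 1,000 iterations, which is over the ceiling, while a batch size of at least 2,000 brings the count down to 500.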

Thanks,

Nitish Kumar Mishra



