In this article we will see how to clone input data in Pentaho. Pentaho is a popular open-source platform for extracting, transforming, and loading (ETL) data.
What is the Clone Component?
The Clone component in Pentaho is a transformation step that allows us to create multiple copies, or clones, of input rows. Each clone represents an independent copy of the input data, which can be processed or modified separately.
Here are the steps:
Open the Pentaho Data Integration tool and create a new transformation (go to File, then New, and click Transformation).
Create sample data using the Data Grid component, as shown in the screenshot below.
In this example we will duplicate by 3 the rows whose salary is greater than 10000.
Use the Modified Java Script Value component to define the number passed to the clone step based on a condition. Connect the Data Grid and Java Script components, then open the Java Script component and write simple “if else” logic to define the number.
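The “if else” logic for this step can be sketched as follows. This is only a minimal illustration: the field names salary and nr_clones, the helper cloneCount, and the two constants are assumptions made for the example, not names taken from the screenshots. As far as we understand the Clone row step, the number is the count of copies added after the original row, so a value of 3 gives the original plus three clones; use 2 if three rows in total are wanted.

```javascript
// Sketch of the if/else logic for the Modified Java Script Value step.
// Assumptions: the incoming field is called "salary" and the new output
// field is called "nr_clones"; both names are illustrative only.
var SALARY_THRESHOLD = 10000;   // condition used in this article
var CLONES_FOR_HIGH_SALARY = 3; // clone count used in this article

// Hypothetical helper: returns how many clones a row should get.
function cloneCount(salary) {
  if (salary > SALARY_THRESHOLD) {
    return CLONES_FOR_HIGH_SALARY;
  } else {
    return 0; // no clones: only the original row passes through
  }
}

// Inside the PDI step the input field is already available as a variable,
// so the script body would reduce to something like:
//   var nr_clones = cloneCount(salary);
// with nr_clones then added as a new field in the step's Fields grid.
```

Keeping the threshold and the count in named variables makes it easy to change the condition later without touching the Clone row step itself.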
Search for the “Clone row” component in the Design tab, drag and drop it onto the canvas in the PDI workspace, and connect the Java Script component to it as shown in the screenshot below. Then add the Text file output component at the end to see the result.
Double-click the Clone row component and specify the number of clones to add based on the condition.
Check the “Nr clone in field” box and select the field, since we are supplying the number through a field, as shown in the screenshot above.
Check the 2 boxes under “Output fields” if we want to see the clone number of each row and a flag that identifies which row is the original.
Give the output filename in the Text file output component and click “Get Fields”.
Run the transformation and check the output file.
As expected, the rows whose salary is greater than 10000 are repeated 3 times.
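To make the behaviour easier to picture outside of PDI, here is a small plain JavaScript simulation of what we understand the Clone row step to do with a per-row clone count. It is not PDI code. The sample rows, the function cloneRows, and the field names nr_clones, is_clone and clone_num are all hypothetical, and the real step may represent the flag and clone number differently.

```javascript
// Plain JavaScript simulation of the Clone row behaviour (not PDI code).
// All field names and sample rows below are hypothetical.
function cloneRows(rows) {
  var out = [];
  for (var i = 0; i < rows.length; i++) {
    var row = rows[i];
    // The original row comes first, flagged as not being a clone.
    out.push(Object.assign({}, row, { is_clone: false, clone_num: 0 }));
    // Then one independent copy per requested clone.
    for (var n = 1; n <= row.nr_clones; n++) {
      out.push(Object.assign({}, row, { is_clone: true, clone_num: n }));
    }
  }
  return out;
}

// Hypothetical input: nr_clones was set by the if/else logic upstream.
var sample = [
  { name: 'A', salary: 15000, nr_clones: 3 },
  { name: 'B', salary: 8000, nr_clones: 0 }
];

var result = cloneRows(sample);
console.log(result.length); // 5: row A plus its 3 clones, then row B once
```

Each output row is a separate object, so changing one clone downstream does not affect the original row or the other clones.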
The Clone component in Pentaho is a versatile tool that allows users to duplicate data rows. Because the number of clones can be taken from a field, as in this example, a single step can duplicate only the rows that meet a condition, which helps keep data integration and transformation workflows simple.
Helical IT Solutions