Pentaho Data Integration:
- Pentaho Data Integration (PDI), also known as Kettle, is the Pentaho component responsible for Extract, Transform and Load (ETL) processes.
- ETL tools like this are most often used in data warehouse environments, but PDI can also be used for other purposes, such as migrating data between applications or databases.
- Let's create a simple transformation that converts a CSV file into an XML file.
- Our transformation has to do the following:
- Read the CSV file.
- Build the greetings message.
- Save the greetings in the XML file.
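Before building this in the PDI designer, it may help to see the three tasks above as plain code. The following is only a rough sketch in JavaScript run outside PDI, not how PDI works internally. The sample data, the function names (readCsv, addGreeting, toXml) and the XML element names are all assumptions made for illustration.

```javascript
// Hypothetical sample input; the tutorial's real input file is not shown.
const csvText = "last_name,name\nSuarez,Maria\nGuimaraes,Joao";

// Task 1: read the CSV (naive split; assumes no quoted or escaped values).
function readCsv(text) {
  const [header, ...lines] = text.trim().split("\n");
  const columns = header.split(",");
  return lines.map((line) => {
    const values = line.split(",");
    return Object.fromEntries(columns.map((col, i) => [col, values[i]]));
  });
}

// Task 2: build the greeting message for each row.
function addGreeting(row) {
  return { ...row, msg: "Hello, " + row.name + "!" };
}

// Task 3: serialize the greetings as XML.
// Element names are illustrative, and no XML escaping is done here.
function toXml(rows) {
  const body = rows
    .map((r) => `  <Row><msg>${r.msg}</msg></Row>`)
    .join("\n");
  return `<Rows>\n${body}\n</Rows>`;
}

const xml = toXml(readCsv(csvText).map(addGreeting));
console.log(xml);
```

Each function corresponds roughly to one step of the transformation, and chaining them plays the role of the hops that connect the steps.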
Create a Transformation:
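• Now start building the transformation:
- The panel to the left of the workspace is the Palette. Select the Input category.
- Drag the CSV file input icon onto the workspace on the right.
- Select the Scripting category and drag the Modified Java Script Value icon to the workspace.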
- Select the Output category.
- Drag the XML Output icon to the workspace.
• Now link the CSV file input to the Modified Java Script Value by creating a Hop between them, then link the Modified Java Script Value to the XML Output in the same way.
Configuring the CSV File Input:
- Double-click on the CSV file input. The configuration window for the step will appear. Here you indicate the file location, the file format (e.g. delimiters, enclosure characters, etc.) and the column metadata (e.g. column name, data type, etc.).
- Change the step name to one that better represents this step's function. In this case, type in name list.
- For the Filename field, click Browse and select the input file.
- Click Get Fields to add the list of column names of the input file to the grid. By default, the step assumes that the file has headers (the Header row present checkbox is checked).
- The grid now has the names of the columns of your file.
- Click Preview to ensure that the file will be read as expected. A window will appear showing sample data from the file. Click OK to finish.
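Configuring the Modified Java Script Value:
- Double-click on the Modified Java Script Value. The configuration window for the step will appear.
- Name this step Greetings.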
- The main area of the configuration window is for coding. To the left, there is a tree with a set of available functions that you can use in the code. In particular, the last two branches have the input and output fields, ready to use in the code. In this example there are two fields: last_name and name. Write the following code:
var msg = 'Hello, ' + name + "!";
- At the bottom of the window, you can list the variables created in the code. Here we have created a variable named msg. Since this message will be sent to the output file, write the variable name in the grid.
- Click OK to finish configuring the Modified Java Script Value step.
- Right-click the step to bring up a context menu.
- Select Show Input Fields. The input fields are last_name and name, which come from the CSV file input step.
- Select Show Output Fields. We see that not only do we have the existing fields, but also the new msg field.
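To make the difference between the input fields and the output fields concrete, here is a small sketch in plain JavaScript, run outside PDI. The function name greetingsStep and the sample rows are hypothetical. Inside PDI the script is simply evaluated for each row, with the input fields available as variables.

```javascript
// Plain-JavaScript sketch of what the Greetings step does to each row.
// greetingsStep and the sample rows are hypothetical, for illustration only.
function greetingsStep(row) {
  // In PDI the input fields are exposed to the script as variables;
  // here we read the field from the row object instead.
  const { name } = row;
  // Same expression as the script written in the step.
  const msg = "Hello, " + name + "!";
  // The output row keeps the existing fields and adds msg.
  return { ...row, msg };
}

const inputRows = [
  { last_name: "Suarez", name: "Maria" },
  { last_name: "Guimaraes", name: "Joao" },
];
const outputRows = inputRows.map(greetingsStep);

console.log(Object.keys(inputRows[0]));  // the input fields
console.log(Object.keys(outputRows[0])); // the input fields plus msg
```

The input rows are left untouched, and each output row carries one extra field, which matches what the two field listings in the context menu show.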
Configuring the XML Output File:
- Double-click the XML Output. The configuration window for this kind of step will appear. Here you set the name and location of the output file and establish which of the fields you want to include. You may include all or some of the fields.
- Name the Step File: Greetings.
- In the File box, type the name and location of the output file.
- Click Get Fields to fill the grid with the three input fields.
- Save the Transformation again.
- Click the Run button on the menu bar to launch the transformation.
- We can also create a Job, which may be used to schedule and run multiple transformations.
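The idea behind the XML Output configuration, writing all or only some of the fields of each row, can be sketched in plain JavaScript as follows. This is an illustration under assumptions, not the step's actual implementation. The function names rowsToXml and escapeXml, the sample rows, and the Rows and Row element names are chosen for the example and may differ from what your step is configured to write.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Write only the selected fields of each row.
// parentTag and rowTag are assumed element names for this sketch.
function rowsToXml(rows, fields, parentTag = "Rows", rowTag = "Row") {
  const lines = rows.map((row) => {
    const cells = fields
      .map((f) => `<${f}>${escapeXml(row[f])}</${f}>`)
      .join("");
    return `  <${rowTag}>${cells}</${rowTag}>`;
  });
  return [`<${parentTag}>`, ...lines, `</${parentTag}>`].join("\n");
}

// Hypothetical rows, shaped like the output of the Greetings step.
const sampleRows = [
  { last_name: "Suarez", name: "Maria", msg: "Hello, Maria!" },
  { last_name: "Smith & Sons", name: "Tom", msg: "Hello, Tom!" },
];

console.log(rowsToXml(sampleRows, ["last_name", "name", "msg"])); // all fields
console.log(rowsToXml(sampleRows, ["msg"]));                      // some fields
```

Passing a shorter list of fields corresponds to removing rows from the grid in the step's configuration window.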
Helical IT Solutions Pvt Ltd