Flat Files and Pentaho (PDI)

Posted on by By Sohail, in Pentaho | 0

What’s a Flat File?

Flat files are the universal mechanism for moving data from one database or system to another. There are two common types of flat files: CSV (comma separated values) and delimited files. Both are file formats that represent relational data in a text file.

Make data easy with Helical Insight.
Helical Insight is the world’s best open source business intelligence tool.

Click Here to Free Download

PDI enables us to create Flat Files, and PDI can have components that can make the process really Simple. How do we create a Flat File?

  1. Have a Table Input to retrieve data via SQL;
  2. Have a Text File Output (Assuming we need a csv) to give an output
  3. On Text File Output>Fields Tab>We can Select Get Fields to fetch all data columns we require

Now we can Make the Process more automated without opening the JOB or Transformation instead (Assuming our SQL Doesn’t Change Frequently) we can make the changes right from the Config File

  1. We need to have a SetVariable Transformation :
  2. https://www.helicaltech.com/set-properties-pentaho/

  3. Have a Table Input to retrieve data via SQL with all Columns;
  4. A Modified JavaScript with the Below Code:
  5. {code: JavaScript}
    		var record = "";
    		var d  = getVariable('Delimiter', "");
    			if(d == "TAB") {
    				d = "	";
    			} else {
    				d = d;
    			}
    		var selectColumns = getVariable('SelectedColumns', "");
    		var sc = selectColumns.split(",");
    		var columns = "";
    
    		for (var i = 0; i < sc.length; i++) {
    			column_name = sc[i];
    			var column_index = getInputRowMeta().indexOfValue(column_name);
    
    			if (column_index >= 0) {
    				var column_value = row[column_index];
    					if(i == sc.length-1)
    						columns += column_value;	
    					else
    						columns += column_value + d;		
    			} else {
    				var column_value = sc[i];
    					if(i == sc.length-1)
    						columns += column_value;
    					else
    						columns += column_value + d;
    				//Alert(column_value);
    			}
    		}
    		record += columns;
    {code: JavaScript}
    
  6. Have a Text File Output (Assuming we need a csv) to give an output
  7. on Text File Output>Fields Tab>get only the record Variable
  8. Config File should have the following:

#LIST OF COLUMNS:

#category, product_description, color, legacy_code, ship_date, fbe, inbound_freight_cost, total_units, total_box_sales, unit_cost, customer_code, duty, sales_type, type_sale, vendor,

#vendor_code, handling, ctdollar, market_begin_date, market_end_date, pre_market_begin_date, pre_market_end_date, pre_same_market_begin_date, pre_same_market_end_date
#delimiter = TAB , |

Delimiter = ,

SelectedColumns = 1,category,product_description,color,legacy_code,pre_market_begin_date

This will enable users to create Flat Files instead of Developers as you can see we have a column named 1, that happens to be a hard-coded field/column, instead of having to pass that hard-coded column in our SQL, we can pass it as a column in our

SelectedColumns variable

Adding Header to our in our Next Blog

Sample Outputs:










Now let me change the delimiter to | and add a new column (I added fbe column from the List Of Columns available in our Config File)

Let’s see our Output File

Make data easy with Helical Insight.
Helical Insight is the world’s best open source business intelligence tool.

Get your 30 Days Trail Version



From here we can do whatever we want based on your requirement.

In case if you have any queries please get us at support@helicaltech.com

Sohail Izebhijie
Helical IT Solutions Pvt Ltd

logo

Best Open Source Business Intelligence Software Helical Insight Here

logo

A Business Intelligence Framework


logo

Best Open Source Business Intelligence Software Helical Insight is Here

logo

A Business Intelligence Framework

0 0 votes
Article Rating
Subscribe
Notify of
0 Comments
Inline Feedbacks
View all comments