Kettle is an open source ETL tool that Pentaho acquired in 2005. Pentaho then launched an enterprise edition of the tool, called Pentaho Data Integration (PDI), while the community edition continues to exist. PDI offers more capabilities and features than the community version.

Pentaho sells PDI alongside its BI offering. PDI can be used for a range of data-related operations, such as data cleaning, data migration, data loading, data processing, and data governance. The Pentaho ETL tool is made up of several components:

  • Spoon – a data modeling and development tool for ETL developers. It allows the creation of transformations (elementary data flows) and jobs (execution sequences of transformations and other jobs)
  • Pan – executes transformations modeled in Spoon
  • Kitchen – executes jobs designed in Spoon
  • Carte – a simple web server used for running and monitoring data integration tasks
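
To illustrate how Pan and Kitchen divide the work, here is a minimal Python sketch that picks the right runner for a file and assembles its command line. It assumes a Unix install where the scripts are named pan.sh and kitchen.sh, and it uses the -file, -level and -param: options of those tools. The install directory, file paths and parameter name are hypothetical, and the helper build_command is our own illustration, not part of PDI.

```python
from pathlib import PurePosixPath

# Pan runs transformations (.ktr files); Kitchen runs jobs (.kjb files).
RUNNERS = {".ktr": "pan.sh", ".kjb": "kitchen.sh"}


def build_command(pdi_home, artifact, level="Basic", params=None):
    """Return the argument list for running a transformation or job.

    pdi_home -- PDI install directory (hypothetical path in the example)
    artifact -- path to a .ktr or .kjb file designed in Spoon
    level    -- logging level passed through -level
    params   -- optional dict of named parameters, passed as -param:NAME=value
    """
    suffix = PurePosixPath(artifact).suffix.lower()
    if suffix not in RUNNERS:
        raise ValueError(f"expected a .ktr or .kjb file, got: {artifact}")

    cmd = [
        str(PurePosixPath(pdi_home) / RUNNERS[suffix]),
        f"-file={artifact}",
        f"-level={level}",
    ]
    for name, value in (params or {}).items():
        cmd.append(f"-param:{name}={value}")
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and parameter, for illustration only.
    print(build_command("/opt/pdi", "/etl/load_sales.ktr",
                        params={"RUN_DATE": "2024-01-31"}))
    print(build_command("/opt/pdi", "/etl/nightly.kjb"))
```

The resulting list can be handed to subprocess.run on a machine where PDI is installed. Building the arguments as a list rather than a single string avoids shell quoting problems with parameter values.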

We have implemented a range of ETL projects using Pentaho for clients including University of Bridgeport, Canadian Bearings, SyncHR, New Healthcare Analytics, Numerify, Mozaic Limited, and others.