Implement Rollback in Pentaho

Posted on April 7, 2016 by Nikhilesh, in Business Intelligence, Open Source Business Intelligence, Pentaho

A Simple yet Effective way to implement Rollback in Pentaho

What is a Rollback mechanism?

In database technologies, a rollback is an operation that returns the database to some previous state. Rollbacks are important for database integrity because they allow the database to be restored to a clean state even after erroneous operations have been performed. They are also crucial for recovering from database server crashes: by rolling back any transaction that was active at the time of the crash, the database is restored to a consistent state.
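To make the definition concrete, here is a minimal sketch of a database-level rollback using Python's standard-library sqlite3 module and an in-memory database. The orders table and the load_with_rollback helper are hypothetical and only serve the illustration; they are not part of the Pentaho setup described in this post.

```python
import sqlite3

def load_with_rollback(conn, rows):
    """Insert all rows in one transaction; undo every insert if any of them fails."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.executemany(
                "INSERT INTO orders (order_id, amount) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL NOT NULL)")

ok = load_with_rollback(conn, [(1, 10.0), (2, 20.0)])
# The second load repeats key 2, so the whole load (including row 3) is undone.
failed = load_with_rollback(conn, [(3, 30.0), (2, 99.0)])

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the failed second load the table still holds only the two rows from the first load, which is the "previous consistent state" a rollback is meant to restore.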

Why do we need it?

Sometimes a job fails or a database crashes at a crucial point in a load. In those cases it helps to be able to go back to the previous stable state, and that is exactly what a rollback mechanism provides.

The Process:

In our case we have a parent job named Rollback Parent job and, under it, a sub-job named Main Job that holds the functional part of the process. Main Job loads data batch by batch into the target table Dim_OrderDetails. Inside it is the LoadDimOrder transformation, where the actual data loading happens.

To start, let's go to the main transformation, i.e. the Rollback transformation, where we use the Get System Info step to support the rollback mechanism. In that step we define a new field, id_job, with the type parent job batch ID. This is the same field that is used in the logging table provided by Pentaho. Here it always evaluates to 0: because we have not passed an id_job on to the transformation, the field falls back to its default value of 0.

Finally, we pass this field on to the Dim_OrderDetails table. From now on, every time a new record is pushed to the target table, its id_job column is set to 0.
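The effect of this step can be sketched outside Pentaho as follows. This is only an illustration in Python with sqlite3, not the transformation itself: the columns order_id and product, the load_batch helper, and the assumption that each batch is committed on its own are all hypothetical.

```python
import sqlite3

PENDING = 0  # marker for rows loaded by the run that is still in progress

def load_batch(conn, batch):
    """Append the constant id_job = 0 to every incoming row and insert the batch."""
    stamped = [(order_id, product, PENDING) for order_id, product in batch]
    conn.executemany(
        "INSERT INTO Dim_OrderDetails (order_id, product, id_job) VALUES (?, ?, ?)",
        stamped,
    )
    # Assumption: every batch is committed separately, so a later failure
    # cannot be undone by a plain database rollback.
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Dim_OrderDetails (order_id INTEGER, product TEXT, id_job INTEGER)"
)

load_batch(conn, [(1, "keyboard"), (2, "mouse")])
load_batch(conn, [(3, "monitor")])

flags = [row[0] for row in conn.execute(
    "SELECT id_job FROM Dim_OrderDetails ORDER BY order_id"
)]
```

Under that assumption, rows from earlier batches are already committed when a later batch fails, which is why the rows need a marker that identifies them as belonging to the current run.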

Back in the parent job, you can see that two hops lead out of the Main Job entry: one is the error-handling path, and the other is followed when the entry runs successfully. When Main Job succeeds, the flow continues to the next step, where every id_job value is set to the constant 1. This becomes our savepoint: we know that all successfully loaded records carry id_job = 1.
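The post does not show the statement behind the success step, but it could plausibly be a single UPDATE. The sketch below assumes that, again using Python with sqlite3; the helper name mark_run_as_good and the sample rows are hypothetical.

```python
import sqlite3

def mark_run_as_good(conn):
    """Success hop: promote the rows of the run that just finished from 0 to 1."""
    with conn:  # commit the promotion as one transaction
        cur = conn.execute(
            "UPDATE Dim_OrderDetails SET id_job = 1 WHERE id_job = 0"
        )
    return cur.rowcount  # number of rows promoted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dim_OrderDetails (order_id INTEGER, id_job INTEGER)")
# Row 1 comes from an earlier successful run; rows 2 and 3 from the current run.
conn.executemany(
    "INSERT INTO Dim_OrderDetails VALUES (?, ?)", [(1, 1), (2, 0), (3, 0)]
)
conn.commit()

promoted = mark_run_as_good(conn)
remaining_pending = conn.execute(
    "SELECT COUNT(*) FROM Dim_OrderDetails WHERE id_job = 0"
).fetchone()[0]
```

Restricting the UPDATE to rows with id_job = 0 leaves previously promoted rows untouched and makes the step safe to repeat.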

Similarly, if Main Job fails for some reason, the flow moves to the DeleteRecords step, which deletes all the latest records, i.e. those still carrying id_job = 0.
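A DeleteRecords step of this kind could be sketched as one DELETE on the marker column. The post does not show the actual SQL, so the statement, the helper name back_out_failed_run, and the sample rows below are assumptions for illustration.

```python
import sqlite3

def back_out_failed_run(conn):
    """Failure hop: remove every row loaded by the run that did not complete."""
    with conn:  # commit the clean-up as one transaction
        cur = conn.execute("DELETE FROM Dim_OrderDetails WHERE id_job = 0")
    return cur.rowcount  # number of rows removed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dim_OrderDetails (order_id INTEGER, id_job INTEGER)")
# Rows 1 and 2 were promoted by an earlier run; rows 3 and 4 belong to the failed run.
conn.executemany(
    "INSERT INTO Dim_OrderDetails VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 0), (4, 0)],
)
conn.commit()

deleted = back_out_failed_run(conn)
surviving = [row[0] for row in conn.execute(
    "SELECT order_id FROM Dim_OrderDetails ORDER BY order_id"
)]
```

Only the partially loaded rows are removed; everything that an earlier successful run promoted to id_job = 1 stays in the table.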

Hence, the next time our job runs, the previously loaded records carry id_job = 1 and the new records are initially set to 0, so any error can be handled by acting only on the latest records. This is one way to implement a rollback mechanism on the ETL side.

Sayed Shakeeb Jamal

1 Comment

No Thanks

7 years ago

That’s not actually a Rollback, but rather a back-out.
