How to Implement Security in Pentaho Using the Dynamic Schema Processor

In my earlier post, I explained what a Dynamic Schema Processor (DSP) is and why we need one. Suppose you have a Mondrian schema uploaded to the BA Server and you would like to modify it at runtime. For instance, in our project we modify it for every user authenticated to the platform in order to apply security restrictions. This modification limits the records that users can see in reports based on that Mondrian schema. In short, a DSP is used to implement security on an uploaded Mondrian schema.

1. How to implement a DSP?

The solution is to create a JAR containing a class that extends Mondrian's DynamicSchemaProcessor. I will explain the implementation of a DSP using an example,

  • Pentaho Server: Pentaho BA Server 6
  • Database Name: MySQL/sampledata
  • Schema Name: Sales

The aim is to show data based on the country to which the logged-in user belongs. Below is the result set of users mapped to various countries.

[Image: result set of users and their countries]

Now, on top of it, a Mondrian schema is created using Schema Workbench.

[Image: schema in Schema Workbench]

  • Original Schema:

[Image: original schema]

Let’s upload the schema to the BA Server. Click the Manage Data Sources button on the Home page, then choose Import Analysis from the drop-down menu next to the New Data Source button. Select the file containing our Mondrian schema and a data source that links to our database. Alternatively, we can publish it from Schema Workbench using the Publish option in the File menu.

Now we have the schema uploaded on the server. The Mondrian cubes can be accessed using the Pentaho Analyzer, JPivot, or Saiku plugins. At this point the schema will display the data for all dimensions and facts.

  • Modifying the schema:

Now let’s modify the schema slightly by adding a SQL filter element to the XML schema where the CUSTOMER_W_TER table is referenced. (You could also perform this modification from the DSP itself, but for now let’s make it in the schema.)

 

[Image: modified schema with the SQL filter element]

Note: The ‘%PLACE_HOLDER%’ value is replaced at runtime by a WHERE clause defined in the properties file. It filters on the column that identifies the country of the logged-in user. We have to make sure there is a user table with a relationship to the column on which we are filtering the data.
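As a sketch, the relevant fragment of the modified schema could look like this (Mondrian's <Table> element accepts a child <SQL> element whose contents are applied as a WHERE condition; the table and placeholder names follow this example):

```xml
<Table name="CUSTOMER_W_TER">
  <!-- %PLACE_HOLDER% is swapped for a real WHERE condition by the DSP -->
  <SQL dialect="generic">%PLACE_HOLDER%</SQL>
</Table>
```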

2. Creating the Dynamic Schema Processor:

Let’s create the Dynamic Schema Processor that will modify the Mondrian schema file so that when we use the Market dimension, only the records for the user's country are displayed. To implement this, we need to create a class that implements the DynamicSchemaProcessor interface (in practice, we extend FilterDynamicSchemaProcessor or LocalizingDynamicSchemaProcessor, which already implement the interface). Below is the code that extends it,

 

[Image: DSP code]

The variable filter fetches the WHERE clause from the properties file, which is also kept in the server folder. The function that fetches the details is defined in another class,

[Image: DSP code, continued]

[Image: DSP code, continued]
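As a sketch, the properties file might look like the following; every key name here is an assumption made for illustration, so match them to whatever your own code actually reads:

```properties
# Hypothetical keys -- adjust to what your DSP implementation reads.
db.driver=com.mysql.jdbc.Driver
db.url=jdbc:mysql://localhost:3306/sampledata
db.user=pentaho
db.password=secret

# WHERE clause template spliced in place of %PLACE_HOLDER%
filter=CUSTOMER_W_TER.COUNTRY = '{0}'

# Query that returns the country of the logged-in user
query=SELECT country FROM users WHERE username = ?
```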

 

3. How the code works:

  • Once a user logs into the server, the class “ChangeDynamicSchemaProcessor” is called; the schema is passed into the code and assigned to the variable “originalschema”.
  • All settings are loaded from a properties file.
  • The properties file configures the DB connection, the WHERE clause that will be added to the schema file, and the query that returns the country for that particular user.
  • Once the properties are loaded, the class calls a function “getdetails” (present in the class Connectiontest, which is initialized first), passing the properties along with the username and role of that particular user.
  • Inside the Connectiontest class, the user's role is checked. If the role is Admin, it returns “1=1”; otherwise it executes the query defined in the properties file, which returns the country name for that user.
  • The country name is returned to the main class, where the schema is modified and the WHERE clause is inserted in place of the “%PLACE_HOLDER%” variable.
  • Once the schema is modified, only the data for that particular country is displayed.
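Stripped of the Mondrian plumbing, the core logic in the steps above can be sketched as plain Java (class and method names here are illustrative, not the exact ones shown in the screenshots):

```java
// Self-contained sketch of the filtering logic: Admin bypass plus
// placeholder substitution. The real class would read the country via
// JDBC using the query from the properties file.
public class PlaceholderFilterSketch {

    /** Builds the WHERE clause to splice into the schema. */
    static String buildFilter(String role, String country) {
        if ("Admin".equals(role)) {
            return "1=1"; // admins see everything
        }
        return "CUSTOMER_W_TER.COUNTRY = '" + country + "'";
    }

    /** Replaces the placeholder in the schema with the computed filter. */
    static String applyFilter(String originalSchema, String role, String country) {
        return originalSchema.replace("%PLACE_HOLDER%", buildFilter(role, country));
    }

    public static void main(String[] args) {
        String schema = "<SQL dialect=\"generic\">%PLACE_HOLDER%</SQL>";
        System.out.println(applyFilter(schema, "User", "Germany"));
        System.out.println(applyFilter(schema, "Admin", "Germany"));
    }
}
```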

 

4. Configuring Dynamic Schema Processor on BA Server:

Once the code is complete, we need to package the class as a JAR and upload it to the server. To use the newly created Dynamic Schema Processor, add the JAR to the following location: /biserver-ee/tomcat/webapps/pentaho/WEB-INF/lib/. Then edit the configuration of the schema on the BA Server to specify that this particular DSP should be used for this schema: go again to Manage Data Sources, select the schema, click Edit in the drop-down menu, and add a new parameter called DynamicSchemaProcessor whose value is the fully qualified name of your class (see picture below). The properties file needs to be added to the following location: /biserver-ee/tomcat/webapps/pentaho/WEB-INF/classes. Once complete, restart the server.

[Image: data source location in Manage Data Sources]

Add the class name to the parameter list so that whenever the cube is queried, the class is called. Please see the image below.

[Image: data source configuration with the DynamicSchemaProcessor parameter]
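For example, if the class were packaged as com.example.dsp.ChangeDynamicSchemaProcessor (a hypothetical name; use your own fully qualified class name), the parameter added on the Edit Data Source screen would be:

```
Parameter name : DynamicSchemaProcessor
Value          : com.example.dsp.ChangeDynamicSchemaProcessor
```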

Now if we create the same report, we will get different results than before.

 

5. Configuring Saiku Analytics to work with DSP:

The above configuration works with Pentaho Analyzer. In the Community Edition, however, the Saiku plugin is required to run the analysis, and since Saiku ships its own Mondrian engine, it has to be configured separately.

To make it work, add the JAR file to Saiku's lib folder, pentaho-solutions/system/saiku/lib. Once the JAR is in place, one more step is needed to make the plugin work.

A file named plugin.spring.xml is present in the pentaho-solutions/system/saiku folder, where the configuration property for the dynamic schema processor is commented out. We just need to remove the comments so that Saiku activates the DSP in its engine. Uncomment the dynamic schema processor property as shown below,

[Image: Saiku plugin.spring.xml configuration]

Once that is done, restart the BA Server. Saiku Analytics will now work with the DSP.

6. Checking for errors and logging:

The problem with a DSP is that any error prevents the cube from being displayed on the BA Server. To verify and check for errors, look directly into the log files inside the log folder of the Pentaho Server. If there is any issue in the code, we won't be able to see the cube. If the code is working properly, the modified schema is written to the pentaho-log.log file.

7. Disadvantages and things to take care of:

The Dynamic Schema Processor is an additional capability that Pentaho provides for enhanced security on reports. However, it has some disadvantages,

  • DSP doesn’t support JPivot: in the Pentaho Server Community Edition, JPivot is used for analysis reports, but DSP doesn’t work with JPivot reports. We did all kinds of research but could not make it work. Instead, configure the Saiku plugin from the Marketplace, and DSP can then be used on a CE server.
  • Performance issues: if the data volume is large, reports may run slowly, since every query goes through the underlying class.
  • Spell the class name correctly in the data source parameter; even a minor mistake will prevent the cube from displaying on the BA Server.
  • It is advisable to extend the LocalizingDynamicSchemaProcessor class instead of FilterDynamicSchemaProcessor, because in some cases the latter doesn’t work.

 

8. Conclusion:

The DSP is a powerful feature that Pentaho provides for reports. It helps us maintain enhanced security, and it can be applied to cubes, PRPT reports, and interactive reports. The reports are modified dynamically when a user logs into the BA Server.

Introduction to Dynamic Schema Processor: Modifying a Mondrian Schema at Runtime

What is Mondrian?

Mondrian is an open-source OLAP server written in Java. It supports the MDX (MultiDimensional eXpressions) query language and the XML for Analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache.

What is Dynamic Schema Processor?

A Dynamic Schema Processor is used to change a Mondrian schema dynamically at runtime. If a Mondrian schema is uploaded to the BA Server and you’d like to modify it at runtime, you need to implement a Dynamic Schema Processor. For instance, if we have to show only the data related to the user logged into the server, we must modify the schema dynamically to achieve this. This modification limits the records that users can see in reports based on the Mondrian schema.

Why do we need a Dynamic Schema Processor (DSP)?

Mondrian allows us to define user roles in the schema itself: we can add various roles and restrict data based on them. But we cannot add data restrictions dynamically this way, and maintaining roles is a lengthy process that is not advisable when there are many users. Hence, Mondrian provides the Dynamic Schema Processor, which changes the schema dynamically at runtime. To use it, we create a class that implements the DynamicSchemaProcessor interface (in practice we extend FilterDynamicSchemaProcessor or LocalizingDynamicSchemaProcessor, which already implement the interface). Once the class is created, we upload the JAR to the server and reference the class in the configuration of the uploaded schema.
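The skeleton below sketches how such a class typically hooks into Mondrian. It assumes mondrian.jar is on the classpath; the package names and method signature are taken from Mondrian 3.x (FilterDynamicSchemaProcessor, which LocalizingDynamicSchemaProcessor extends) and may differ in your version, so treat this as a sketch rather than drop-in code.

```java
// Sketch only: requires mondrian.jar; signature from Mondrian 3.x and
// may differ by version.
import java.io.InputStream;

import mondrian.i18n.LocalizingDynamicSchemaProcessor;
import mondrian.olap.Util;

public class ChangeDynamicSchemaProcessor extends LocalizingDynamicSchemaProcessor {
    @Override
    public String filter(String schemaUrl,
                         Util.PropertyList connectInfo,
                         InputStream stream) throws Exception {
        // Let the parent read (and localize) the schema first.
        String originalSchema = super.filter(schemaUrl, connectInfo, stream);
        // Here you would look up the logged-in user's ID/country (e.g. via
        // a JDBC query configured in a properties file) and build the filter.
        String whereClause = "1=1"; // placeholder: no restriction
        return originalSchema.replace("%PLACE_HOLDER%", whereClause);
    }
}
```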

Architecture of Dynamic Schema Processor:

[Image: DSP architecture diagram]

 

Above is the basic flow of how the Dynamic Schema Processor works in the Pentaho Server:

  1. The user logs in to the Pentaho BA Server.
  2. The server internally invokes the JAR loaded inside it, which calls the DSP class.
  3. The DSP class is executed.
  4. A session variable provides the username, which is passed into the SQL query.
  5. The username is passed to the SQL query defined in the properties file (uploaded to the classes folder), which fetches the corresponding ID from the user table.
  6. The query runs and the ID of the corresponding user is fetched.
  7. The DSP returns the ID to the schema.
  8. The user then accesses the Analyzer report inside the BA Server.
  9. The cube calls the DSP class (the class name is defined in the data source configuration).
  10. The DSP modifies the cube by adding a WHERE clause to the schema to filter the data based on the ID of the corresponding user.
  11. The data for that particular ID is generated.
  12. The Analyzer report is generated for the logged-in user.

Exception: for the Admin user, the schema file does not add the filtering WHERE clause, and the user has access to all the data.

Conclusion

So this is how the Dynamic Schema Processor works in a Pentaho server. It is an added feature that Pentaho provides for enhanced security on the data shown in reports. Reports are modified dynamically based on the user who logs into the server.

Regards,

Nitish Kumar Mishra