Mondrian Cache Refresh Automation Mechanism in Pentaho BA Server

REFRESH MONDRIAN CACHE ON THE BA SERVER

One of the strengths of mondrian’s design is that you don’t need to do any processing to populate special data structures before you start running OLAP queries. More than a few people have observed that this makes mondrian an excellent choice for ‘real-time OLAP’ — running multi-dimensional queries on a database which is constantly changing.

Mondrian caches aggregated data in memory, which gives a big performance boost. However, when you have a cube with (near-)real-time data, the caching can get in your way. If, for instance, your data is updated every ten minutes but the data from an hour ago is still cached, your dashboard won’t show your new data.

There are two ways to clear the cache on the server.

Option 1: After you log in to the BA Server, go to Tools –> Refresh, where you can clear the Mondrian schema cache manually.

Option 2: The Mondrian cache can also be cleared with a Kettle job that runs after the insertion/updating of the new data is complete. First create a transformation. Inside the transformation, an HTTP Client step can clear the cache from the ETL (put a Generate Rows step in front of it, since the HTTP Client step only fires when another step precedes it) by calling the URL http://localhost:8080/pentaho/api/system/refresh/mondrianSchemaCache with the user credentials. This clears the Mondrian cache every time the ETL runs on the server, and it clears the cache for all the cubes present on the server, so they are ready for the fresh data and we avoid the cache-related issues that have always been a headache for real-time data. A minimal sketch of the same REST call is shown below.
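The same REST endpoint can also be called from any HTTP client outside of Kettle. Below is a minimal, illustrative Java sketch of the call; the host, port and admin credentials are placeholders for a default local install, not values taken from this setup.

import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class MondrianCacheRefresh {
    public static void main(String[] args) throws Exception {
        // Assumed host, port and credentials; adjust them to your environment.
        URL url = new URL("http://localhost:8080/pentaho/api/system/refresh/mondrianSchemaCache");
        String auth = Base64.getEncoder()
                .encodeToString("admin:password".getBytes(StandardCharsets.UTF_8));

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Basic " + auth);

        // A 200 response means the Mondrian schema cache was flushed for all cubes.
        System.out.println("Response code: " + conn.getResponseCode());
        conn.disconnect();
    }
}

This is the same call that the HTTP Client step performs inside the Kettle transformation.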

Similarly, we can use the same mechanism to clear the cache for reporting metadata and for the repository cache; we just have to follow the same procedure.

 

Regards,

Nitish Kumar Mishra

Difference between SQL and MDX


The MDX (Multi-Dimensional Expressions) syntax appears very similar to SQL syntax and, in places, it behaves in a similar way. There are, however, many differences between SQL and MDX, most of them conceptual.

The basic difference between MDX and SQL is the ability of MDX to reference multiple dimensions.

1. Data Representation:

SQL works with only two dimensions, rows and columns, when processing queries, since it was designed to handle tabular data in a two-dimensional format; rows and columns carry a specific meaning in SQL syntax. An MDX query, however, can refer to multiple dimensions distributed along different axes. Rows and columns in MDX differ from those of SQL: they are simply the horizontal (rows) and vertical (columns) axes of the result.

2. Difference in the SELECT and WHERE clauses:

In SQL, the SELECT clause defines the column layout for a query, while the WHERE clause filters the rows. In MDX, the SELECT clause defines the axes of the result (which can hold several dimensions), while the WHERE clause acts as a slicer that restricts the data to a specific dimension or member.

3. Visualization:

In SQL, data is visualized in a two-dimensional format consisting of rows and columns, much like a table. In MDX, data is visualized multi-dimensionally across the horizontal and vertical axes, much like a cross-tab.

4. JOINS:

MDX supports the equivalents of all the joins that SQL supports, although it has no explicit join syntax: the join condition between the fact table and the dimension tables is specified while designing the cube, and the NON EMPTY keyword on an axis controls whether members with no data appear. Following are the examples in which each join is expressed:

a. Inner Join:

In SQL,


SELECT table1.<column_name>, table2.<column_name>
FROM table1 INNER JOIN table2
ON table1.column1 = table2.column2;

 

 

In MDX,

SELECT NON EMPTY {[DimensionName].[LevelName].Members} ON ROWS,
NON EMPTY {[Measures].[MeasureName]} ON COLUMNS
FROM [CubeName];

 

b. Left Outer Join:

In SQL,

SELECT table1.<column_name>, table2.<column_name>
FROM table1 LEFT OUTER JOIN table2
ON table1.column1 = table2.column2;

 

In MDX,

SELECT {[DimensionName].[LevelName].Members} ON ROWS,
NON EMPTY {[Measures].[MeasureName]} ON COLUMNS
FROM [CubeName];

 

c. Right Outer Join:

In SQL,

SELECT table1.<column_name>, table2.<column_name>
FROM table1 RIGHT OUTER JOIN table2
ON table1.column1 = table2.column2;

 

In MDX,

SELECT NON EMPTY {[DimensionName].[LevelName].Members} ON ROWS,
{[Measures].[MeasureName]} ON COLUMNS
FROM [CubeName];

 

d. Full Outer Join:

In SQL,

SELECT table1.<column_name>, table2.<column_name>
FROM table1 FULL OUTER JOIN table2
ON table1.column1 = table2.column2;

 

In MDX,

SELECT {[DimensionName].[LevelName].Members} ON ROWS,
{[Measures].[MeasureName]} ON COLUMNS
FROM [CubeName];

 

5. FUNCTIONS:

 

MDX provides many built-in functions that make queries much simpler to write, whereas generating the same result in SQL can make the query very complex. For example, showing the Sales value for the current year and the previous year along with the year-to-date (YTD) value is very easy in MDX, because predefined functions such as ParallelPeriod and Ytd can be used in the WITH clause of the MDX statement. In SQL it becomes complicated: you need to create an alias for each value (current year, previous year and YTD) and then join all the aliases to generate the desired result, which makes the query very complex.

 

  • Example in MDX:

WITH member [Measures].[Last Period] as '([Time].[Years].CurrentMember.PrevMember, [Measures].[Sales])'
member [Measures].[Growth] as '[Measures].[Sales] - [Measures].[Last Period]'
SELECT CROSSJOIN({[Time].[Years].Members},
                 {[Measures].[Sales], [Measures].[Last Period], [Measures].[Growth]}) ON COLUMNS
FROM [SteelWheelsSales];

 

CONCLUSION:

MDX and SQL can perform the same work, but the results are portrayed in different ways: SQL shows data in a tabular, two-dimensional form, whereas MDX shows data in a multidimensional form.

 

Regards,

Nitish Kumar Mishra

Dynamic Schema Processor: How to implement security in Pentaho using Dynamic Schema Processor

Implementation of Security in Pentaho using Dynamic Schema Processor

In my earlier post, I explained what a Dynamic Schema Processor (DSP) is and why we need it. Suppose you have a Mondrian schema uploaded to the BA Server and you would like to modify it at run time. In our project, for instance, we modify it for every user authenticated to the platform in order to apply security restrictions; the modification limits the records that users are able to see in reports based on that Mondrian schema. In short, a DSP is used to implement security on the uploaded Mondrian schema.

1. How to implement DSP?

The solution is to create a JAR file containing a class that extends a Dynamic Schema Processor from the Mondrian library. I will explain the implementation of a DSP using an example:

  • Pentaho Server: Pentaho BA Server 6
  • Database Name: MySQL/sampledata
  • Schema Name: Sales

The aim is to show data based on the country to which the logged-in user belongs. Below is the result set of users and the countries they belong to.

Resultset

Now on top of it, a mondrian schema is created using Schema workbench.

schema

  • Original Schema:

Originalschema

Let’s upload the schema to the BA Server. Click the Manage Data Sources button on the Home page, then choose Import Analysis from the drop-down menu next to the New Data Source button. Select the file with our Mondrian schema and the data source that links to our database. Alternatively, we can publish it from Schema Workbench using the Publish option in the File menu.

Now we have the schema uploaded on the server. The Mondrian cubes can be accessed using Pentaho analyzer/jPivot/Saiku plug-in. The schema will display the data for all dimensions and facts.

  • Modifying the schema:

Now let’s modify the schema slightly by adding an SQL element (with its dialect) to the XML where the CUSTOMER_W_TER table is referenced. (You can also make this modification from the DSP itself, but for now let’s keep it in the schema.)

 

modifiedSchema

Note: The ‘%PLACE_HOLDER%’ token is replaced at run time by a WHERE-clause condition defined in the properties file. It names the column on which you want to filter the data; here that column is the country the user belongs to. We have to make sure there is a user table and that it has a relationship with the column on which we are filtering the data.

2. Creating Dynamic Schema Processor:

Let’s create the Dynamic Schema Processor that will modify the Mondrian schema file so that, when we use the Market dimension, only the records for the user’s country are displayed. To implement this, we need a class that implements the DynamicSchemaProcessor interface; in practice we extend FilterDynamicSchemaProcessor or LocalizingDynamicSchemaProcessor, which already implement the interface. The code below extends this behaviour:

 

DSPCode

The filter variable fetches the WHERE-clause condition from the properties file, which is also kept in the server folder. The function that fetches these details is defined in another class:

DSPCode1

DSPCode3

 

3. How the code works

  • Once a user logs into the server, the class “ChangeDynamicSchemaProcessor” is invoked; the schema is passed into the code and assigned to the variable “originalschema”.
  • All the properties are loaded from a properties file.
  • The properties file configures the database connection, the WHERE clause that will be added to the schema file, and the query that returns the country for that particular user.
  • Once the properties are loaded, the class calls the function “getdetails” (present in the class Connectiontest, which is initialized first), passing the properties along with the username and the role of that particular user.
  • Inside the Connectiontest class, the user’s role is checked. If the role is Admin, it returns “1=1”; otherwise it executes the query defined in the properties file, which returns the country name for that particular user.
  • Once the country name is returned, control goes back to the main class, where the schema is modified and the WHERE clause is substituted for the “%PLACE_HOLDER%” token.
  • The modified schema is then used, and only the data for that particular country is displayed, as illustrated in the sketch below.
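For reference, here is a minimal, simplified sketch of such a processor. It assumes Mondrian’s FilterDynamicSchemaProcessor and its filter() hook, and the Connectiontest.getdetails() signature shown here is only illustrative; the actual project code in the screenshots above may differ in detail.

import java.io.InputStream;

import mondrian.olap.Util;
import mondrian.spi.impl.FilterDynamicSchemaProcessor;
import org.pentaho.platform.engine.core.system.PentahoSessionHolder;

public class ChangeDynamicSchemaProcessor extends FilterDynamicSchemaProcessor {

    @Override
    public String filter(String schemaUrl, Util.PropertyList connectInfo,
                         InputStream stream) throws Exception {
        // Let the parent class read the uploaded schema into a string.
        String originalschema = super.filter(schemaUrl, connectInfo, stream);

        // Name of the user currently authenticated on the BA Server.
        String username = PentahoSessionHolder.getSession().getName();

        // Connectiontest is the helper class described above: it loads the
        // properties file, checks the user's role and returns "1=1" for Admin
        // or a condition such as COUNTRY = 'Japan' for everyone else.
        String whereClause = new Connectiontest().getdetails(username);

        // Substitute the placeholder with the user-specific WHERE clause
        // before Mondrian parses the schema.
        return originalschema.replaceAll("%PLACE_HOLDER%", whereClause);
    }
}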

 

4. Configuring Dynamic Schema Processor on BA Server:

Once the code is complete, we need to package the class as a JAR file so it can be deployed on the server. To use the newly created Dynamic Schema Processor, add the JAR to the following location: /biserver-ee/tomcat/webapps/pentaho/WEB-INF/lib/. Then edit the configuration of the schema on the BA Server to specify that this particular DSP should be used for it: go again to Manage Data Sources, select the schema, click Edit in the drop-down menu, and add a new parameter called DynamicSchemaProcessor whose value is the fully qualified name of your class (see picture below). The properties file also needs to be added to the following location: /biserver-ee/tomcat/webapps/pentaho/WEB-INF/classes. Once complete, restart the server.

DatasourceLocation

Add the class name to the parameter list so that the class is called whenever the cube is queried. Please see the image below.

datasourceConfig

Now if we create the same report we will get different results than in the beginning.

 

5. Configuring the Saiku analytics to work with DSP:

The above configuration works with Pentaho Analyzer. In the Community Edition, however, the Saiku plugin is required to run the analysis, and since Saiku has its own Mondrian engine, the Saiku configuration needs to be adjusted as well.

Because Saiku has its own Mondrian engine, we have to add the JAR file to Saiku’s lib folder, located at pentaho-solutions/system/saiku/lib. Once the JAR is in place, one more change is needed to make the plugin work.

A file named plugin.spring.xml in the pentaho-solutions/system/saiku folder contains the configuration property for the dynamic schema processor, which is commented out by default. We just need to remove the comments so that Saiku activates the DSP in its engine. Uncomment the dynamic schema processor property as shown below:

SaikuConfig

Once this is completed, restart the BA Server; Saiku Analytics will then work with the DSP.

6. Checking on the errors and logging:

The problem with a DSP is that any error in it will prevent the cube from being displayed on the BA Server. To verify and check the errors, we need to look directly into the log files inside the log folder of the Pentaho Server; if there is an issue in the code, we simply won’t be able to see the cube. If the code is working properly, the modified schema is written to the pentaho.log file.

7. Disadvantages and things to be taken care of:

The Dynamic Schema Processor is an extension point provided by Pentaho so that we can have enhanced security on reports. However, there are some disadvantages to a DSP:

  • DSP doesn’t support jPivot: in the Pentaho Server Community Edition, jPivot is used for analysis reports, but the DSP does not work with jPivot reports; we did all kinds of research, but it didn’t work. Instead, one can install the Saiku plugin from the Marketplace and then use the DSP on a CE server.
  • Performance issues: if the data is huge, the report might run slowly, since the underlying class is called for every connection.
  • Specify the class name exactly; even a minor mistake means the cube won’t display on the BA Server.
  • It is advisable to extend the LocalizingDynamicSchemaProcessor class instead of FilterDynamicSchemaProcessor, because in some cases the latter does not work.

 

8. Conclusion:

The DSP is a powerful facility that Pentaho provides for securing reports; it helps us maintain enhanced security. It can be applied to cubes, PRPT reports and interactive reports, and the reports are modified dynamically when a user logs into the BA Server.

What is a virtual cube and what is its significance in Mondrian

Virtual Cubes and their significance in Mondrian

Before going into virtual cubes, let’s first refresh our memory about what a cube is. A cube is a multidimensional view of data that exposes querying and analytical components to clients. It is a data-processing unit composed of dimensions and fact tables, and it acts as a business layer between the end user and the data warehouse. A cube works on the mechanism of a star schema: there is one fact table and multiple dimension tables, and all the dimensions are related to the fact.

Virtual Cubes:

We now know that each cube can have one fact table and that all the dimensions present in the cube have a relationship with that fact table. There are scenarios, however, in which we have two different cubes and need to establish a relationship between them; this is where the virtual cube concept comes in. A virtual cube is a cube in which we can use the dimensions and measures from multiple cubes and generate analysis reports on top of them. If we need to analyze data across multiple cubes, or need to combine information from two fact tables on the same subject but with different granularity, then we must create a virtual cube.

Significance of Virtual cubes:

1. A virtual cube can also be based on a single cube to expose only selected subsets of its measures and dimensions.

2. A virtual cube can include normal or linked cubes as component cubes.

3. Since virtual cubes store only their definitions and not the data of their component cubes, they require virtually no physical storage space. You can use virtual cubes to create combinations and variants of existing cubes without using significant additional storage.

4. A virtual cube can provide a valuable security function by limiting the access of some users when viewing the underlying cubes. If some of a cube’s information is sensitive and not suitable for all users, you can create a virtual cube from the existing cube and omit the sensitive information.

5. Virtual cubes can also be used to implement snowflake-style designs: we can combine information from fact tables on the same subject as well as information from dimensions belonging to different cubes. This improves performance and consumes less memory.

Implementation of a virtual cube:

1. Dimensions defined outside any cube (shared, conformed dimensions) can be used directly inside the virtual cube. Dimensions defined inside a particular cube, however, can only be accessed by adding that cube’s name to the VirtualCubeDimension component.

2. The VirtualCubeMeasure component imports a measure from one of the constituent cubes. It is imported with the same name. If you want to create a formula or rename a measure as you import it, use the CalculatedMember component instead.

However, virtual cubes have a disadvantage when the number of elements inside them is large: processing a virtual cube automatically triggers processing of all underlying cubes that need to be processed, which can add significant time. So it is advisable to use fewer components, and only those that are actually required, in the virtual cube.

Understanding and Usage of Mondrian

This blog talks about the different layers of the Mondrian engine, its components, and an introduction to ROLAP.

Mondrian Architecture

 

Layers of a Mondrian system

A Mondrian OLAP System consists of four layers; these are as follows:

1. Presentation layer

The presentation layer determines what the end user sees on his or her monitor, and how he or she can interact to ask new questions. There are many ways to present multidimensional datasets, including pivot tables, pie, line and bar charts, and advanced visualization tools such as clickable maps and dynamic graphics. These might be written in Swing or JSP, charts rendered in JPEG or GIF format, or transmitted to a remote application via XML. What all of these forms of presentation have in common is the multidimensional ‘grammar’ of dimensions, measures and cells in which the presentation layer asks the question, and the OLAP server returns the answer.

2. Dimensional layer

The second layer is the dimensional layer. The dimensional layer parses, validates and executes MDX queries. A query is evaluated in multiple phases. The axes are computed first, then the values of the cells within the axes. For efficiency, the dimensional layer sends cell-requests to the aggregation layer in batches. A query transformer allows the application to manipulate existing queries, rather than building an MDX statement from scratch for each request. And metadata describes the dimensional model, and how it maps onto the relational model.

3. Star layer

The third layer is the star layer, and is responsible for maintaining an aggregate cache. An aggregation is a set of measure values (‘cells’) in memory, qualified by a set of dimension column values. The dimensional layer sends requests for sets of cells. If the requested cells are not in the cache, or derivable by rolling up an aggregation in the cache, the aggregation manager sends a request to the storage layer.

4. Storage layer

The storage layer is an RDBMS. It is responsible for providing aggregated cell data, and members from dimension tables.

These components can all exist on the same machine, or can be distributed between machines. Layers 2 and 3, which comprise the Mondrian server, must be on the same machine. The storage layer could be on another machine, accessed via remote JDBC connection. In a multi-user system, the presentation layer would exist on each end-user’s machine (except in the case of JSP pages generated on the server).
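To make the interaction between these layers concrete, the sketch below shows how a client application might submit an MDX query to Mondrian through the olap4j API. The connection string, schema path, cube and member names are illustrative placeholders only.

import java.sql.DriverManager;
import org.olap4j.CellSet;
import org.olap4j.OlapConnection;
import org.olap4j.OlapStatement;

public class MondrianClientSketch {
    public static void main(String[] args) throws Exception {
        // Register Mondrian's olap4j driver; the dimensional and star layers run in-process.
        Class.forName("mondrian.olap4j.MondrianOlap4jDriver");

        // The Jdbc part points at the storage layer (the RDBMS),
        // Catalog points at the Mondrian schema file.
        OlapConnection connection = DriverManager.getConnection(
                "jdbc:mondrian:"
                + "Jdbc=jdbc:mysql://localhost/sampledata;"
                + "JdbcUser=user;JdbcPassword=secret;"
                + "Catalog=file:/path/to/Sales.mondrian.xml")
                .unwrap(OlapConnection.class);

        // The dimensional layer parses, validates and executes the MDX query.
        OlapStatement statement = connection.createStatement();
        CellSet result = statement.executeOlapQuery(
                "SELECT {[Measures].[Sales]} ON COLUMNS, "
              + "{[Markets].[Country].Members} ON ROWS "
              + "FROM [Sales]");

        // The CellSet is the multidimensional grid a presentation layer would render.
        System.out.println(result.getAxes().size() + " axes returned");
        connection.close();
    }
}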

ROLAP Defined

Pentaho Analysis is built on the Mondrian relational online analytical processing (ROLAP) engine. ROLAP relies on a multidimensional data model that, when queried, returns a dataset that resembles a grid. The rows and columns that describe and bring meaning to the data in that grid are dimensions, and the hard numerical values in each cell are the measures or facts. In Pentaho Analyzer, dimensions are shown in yellow and measures are in blue.

ROLAP requires a properly prepared data source in the form of a star or snowflake schema that defines a logical multidimensional database and maps it to a physical database model. Once you have your initial data structure in place, you must design a descriptive layer for it in the form of a Mondrian schema, which consists of one or more cubes, hierarchies, and members. Only when you have a tested and optimized Mondrian schema is your data prepared on a basic level for end-user tools like Pentaho Analyzer and JPivot.

 

Workflow Overview

To prepare data for use with the Pentaho Analysis (and Reporting, to a certain extent) client tools, you should follow this basic workflow:

Design a Star or Snowflake Schema

The entire process starts with a data warehouse. The end result should be a data model in the star or snowflake schema pattern. You don’t have to worry too much about getting the model exactly right on your first try. Just cover all of your anticipated business needs; part of the process is coming back to the data warehouse design step and making changes to your initial data model after you’ve discovered what your operational needs are.

Populate the Star/Snowflake Schema

Once your data model is designed, the next step is to populate it with actual data, thereby creating your data warehouse. The best tool for this job is Pentaho Data Integration, an enterprise-grade extract, transform, and load (ETL) application.

Build a Mondrian Schema

Now that your initial data warehouse project is complete, you must build a Mondrian schema to organize and describe it in terms that Pentaho Analysis can understand. This is also accomplished through Pentaho Data Integration by using the Agile BI plugin. Just connect to your data warehouse and auto-populate your schema with the Modeler graphical interface. Alternatively, you can use Pentaho Schema Workbench to create an analysis schema through a manual process.

Initial Testing

At this point you should have a multi-dimensional data structure with an appropriate metadata layer. You can now start using Pentaho Analyzer and JPivot to drill down into your data and see if your first attempt at data modeling was successful. In all likelihood, it will need some adjustment, so take note of all of the schema limitations that you’re unhappy with during this initial testing phase.

Do not be concerned with performance issues at this time — just concentrate on the completeness and comprehensiveness of the data model.

Adjust and Repeat Until Satisfied

Use the notes you took during the testing phase to redesign your data warehouse and Mondrian schema appropriately. Adjust hierarchies and relational measure aggregation methods. Create virtual cubes for analyzing multiple fact tables by conforming dimensions. Re-test the new implementation and continue to refine the data model until it matches your business needs perfectly.

Test for Performance

Once you’re satisfied with the design and implementation of your data model, you should try to find performance problems and address them by tuning your data warehouse database, and by creating aggregation tables. The testing can only be reasonably done by hand, using Pentaho Analyzer and/or JPivot. Take note of all of the measures that take an unreasonably long time to calculate. Also, enable SQL logging and locate slow-performing queries, and build indexes for optimizing query performance.

Create Aggregation Tables

Using your notes as a guide, create aggregation tables in Pentaho Aggregation Designer to store frequently computed analysis views. Re-test and create new aggregation tables as necessary.

If you are working with a relatively small data warehouse or a limited number of dimensions, you may not have a real need for aggregation tables. However, be aware of the possibility that performance issues may come up in the future.

Check in with your users occasionally to see if they have any particular concerns about the speed of their BI content.

Deploy to Production

Your data warehouse and Mondrian schema have been created, tested, and refined. You’re now ready to put it all into production. You may need to personally train or purchase Pentaho training for anyone in your organization who needs to create traditional reports, dashboards, or Analyzer reports with Pentaho’s client tools.

Conditional Formatting in jaspersoft’s OLAP view (mondrian jPivot)


  1. Open {js-install}\apache-tomcat\webapps\jasperserver-pro\jpivot\table\mdxtable.css and {js-install}\apache-tomcat\webapps\jasperserver-pro\j\mdxtable.css.
  2. In these files find the following section

/* data cells */
td.cell-even,
td.cell-odd,
td.cell-red,
td.cell-yellow,
td.cell-green,
td.cell-rot,
td.cell-gelb,
td.cell-gruen,
td.property-value,
td.property-span {
    font-family: Arial, Helvetica, Tahoma, sans-serif;
    font-size: 80%;
    /*color : Black;*/
    color: #033669;
    text-align: right;
    /*border-bottom: solid 1px #000033;
    border-right: solid 1px #000033;
    border-top: solid 1px #000033;
    border-left: solid 1px #000033;*/
}

3. Add 3 styles for 3 different colors as shown below:

/* data cells */
td.cell-even,
td.cell-odd,
td.cell-red,
td.cell-yellow,
td.cell-green,
td.cell-rot,
td.cell-gelb,
td.cell-gruen,
td.cell-below_avg_red,
td.cell-avg_orange,
td.cell-above_avg_green,
td.property-value,
td.property-span {
    font-family: Arial, Helvetica, Tahoma, sans-serif;
    font-size: 80%;
    /*color : Black;*/
    color: #033669;
    text-align: right;
    /*border-bottom: solid 1px #000033;
    border-right: solid 1px #000033;
    border-top: solid 1px #000033;
    border-left: solid 1px #000033;*/
}

4. Then a little down you will see some blocks that look like this. This one in particular is for green:

td.cell-gruen,
td.cell-green,
span.cell-gruen,
span.cell-green
{
    background-color: #66FF66;
}

So now just copy this block and create the 3 new styles added above, like this:

td.cell-below_avg_red,
span.cell-below_avg_red
{
    background-color: #FF0000;
}

td.cell-avg_orange,
span.cell-avg_orange
{
    background-color: #FFBF00;
}

td.cell-above_avg_green,
span.cell-above_avg_green
{
    background-color: #04B404;
}

5. Now, within the MDX query you can reference these styles as shown below:

a) with member [Measures].[Profit] as '([Measures].[Store Sales] - [Measures].[Store Cost])',
   FORMAT_STRING = CASE
     WHEN (([Measures].[Profit] <= 300000.0) AND ([Measures].[Profit] > 100000.0)) THEN "|#|style=above_avg_green"
     WHEN (([Measures].[Profit] <= 100000.0) AND ([Measures].[Profit] > 50000.0)) THEN "|#|style=avg_orange"
     ELSE "|#|style=below_avg_red"
   END
   select {[Measures].[Store Sales], [Measures].[Profit]} ON COLUMNS,
          {[Product].CurrentMember.Children} ON ROWS
   from [Sales]

2

b) with member [Measures].[Profit] as '([Measures].[Store Sales] - [Measures].[Store Cost])',
   FORMAT_STRING = CASE
     WHEN (([Measures].[Profit] <= 101.0) AND ([Measures].[Profit] > 75.0)) THEN "|#|style=above_avg_green"
     WHEN (([Measures].[Profit] <= 75.0) AND ([Measures].[Profit] > 50.0)) THEN "|#|style=avg_orange"
     ELSE "|#|style=below_avg_red"
   END

select {([Time].[2012].[Q1], [Measures].[Profit]), ([Time].[2012].[Q2], [Measures].[Profit]), ([Time].[2012].[Q3], [Measures].[Profit]), ([Time].[2012].[Q4], [Measures].[Profit])} ON COLUMNS,

{[Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Cola], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Cream Soda], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Diet Cola], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Diet Soda], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Apple Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Berry Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Cranberry Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Orange Juice]} ON ROWS

from [Sales]

1

Date Calculation using Joda Time Library In iReport

Dear readers, this blog talks about Date Calculation using Joda Time Library In iReport

Using the Joda-Time library, we can build the default value expression for any date-type parameter.

Steps for using the Joda-Time library in iReport:

1. Download the Joda-Time library.

2. Put its JAR file “joda-time.jar” on the iReport classpath: Tools/Options/Classpath.

joda ireport

3. In the main report section properties, set the import property to org.joda.time.DateTime (without double quotes).

joda ireport 2

4. With the above configuration we can now do any type of date calculation.

 

5. Example:

a. Start day of the current month:

Take two parameters: ‘now’ and ‘startday’ respectively.

For the ‘now’ parameter, set its class name to ‘java.util.Date’ and its default value expression to DateTime.now().dayOfMonth().withMinimumValue().withTimeAtStartOfDay().toDate()

 

b. Last day of the previous month:

Take two parameters: ‘now’ and ‘endday’ respectively.

For the ‘now’ parameter, set its class name to ‘java.util.Date’ and its default value expression to DateTime.now().minusMonths(1).dayOfMonth().withMaximumValue().toDate()

In a similar way, we can perform any kind of date calculation in a default value expression. A small standalone sketch of the same calculations is shown below.
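As a quick way to sanity-check such expressions outside iReport, here is a small, illustrative Java sketch that uses the same Joda-Time calls; it only assumes joda-time.jar on the classpath.

import java.util.Date;
import org.joda.time.DateTime;

public class JodaDateCalc {
    public static void main(String[] args) {
        // First day of the current month at midnight -- same expression as in example (a).
        Date startOfMonth = DateTime.now()
                .dayOfMonth().withMinimumValue()
                .withTimeAtStartOfDay()
                .toDate();

        // Last day of the previous month -- same expression as in example (b).
        Date endOfPrevMonth = DateTime.now()
                .minusMonths(1)
                .dayOfMonth().withMaximumValue()
                .toDate();

        System.out.println("Start of current month: " + startOfMonth);
        System.out.println("End of previous month : " + endOfPrevMonth);
    }
}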

Rupam Bhardwaj

Helical IT Solutions

Debug Mondrian Cubes

This blog will talk and educate the reader on how to debug Mondrian Cubes.

Question: How can we look at the queries that Mondrian generates while a user is navigating the OLAP cube? It’s really useful to look at the Mondrian log files because they give us a lot of useful information about how our system is behaving. We can:
  • look at SQL statements and MDX queries,
  • get some profiling information on the queries that are executed,
  • get other useful debugging information.

The following steps illustrate how to enable Mondrian debug logging by adding some properties to the Mondrian configuration file. After that, we’ll configure two new log4j appenders so that the desired log files are written properly to our file system.

Step 1: Enable the Mondrian debug log. Mondrian has a big set of configuration settings that can be modified. In our case, to enable Mondrian debug information, follow the steps detailed below: open the mondrian.properties file located in <bi-server_home>/pentaho-solutions/system/mondrian and add the following line.
mondrian.rolap.generate.formatted.sql=true

Example:
Mondrian.properties file location.
D:\Installation Softwares\Pentaho\biserver-ce-4.8.0-stable\biserver-ce\pentaho-solutions\system\mondrian

Debug Mondrian Cube

Step 2: Update the log4j configuration
At this point we’re going to modify the log4j configuration file, adding the required appenders and categories so that our logging information is written properly. Open the log4j.xml file located in <bi-server_home>/tomcat/webapps/pentaho/WEB-INF/classes. Based on what you want to log, add one or both of the following blocks to the file. They will create two new RollingFileAppenders; you’re free to use the kind of appender you prefer.

Example: Location of log4j.xml file

D:\Installation Softwares\Pentaho\biserver-ce-4.8.0-stable\biserver-ce\tomcat\webapps\pentaho\WEB-INF\classes. Add the following code.

NOTE: The code is already available within the file; we just need to un-comment it.

<appender name="MDXLOG" class="org.apache.log4j.RollingFileAppender">
  <param name="File" value="../logs/mondrian_mdx.log"/>
  <param name="Append" value="false"/>
  <param name="MaxFileSize" value="500KB"/>
  <param name="MaxBackupIndex" value="1"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
  </layout>
</appender>

<category name="mondrian.mdx">
  <priority value="DEBUG"/>
  <appender-ref ref="MDXLOG"/>
</category>

<!-- ========================================================= -->
<!-- Special Log File specifically for Mondrian SQL Statements -->
<!-- ========================================================= -->

<appender name="SQLLOG" class="org.apache.log4j.RollingFileAppender">
  <param name="File" value="../logs/mondrian_sql.log"/>
  <param name="Append" value="false"/>
  <param name="MaxFileSize" value="500KB"/>
  <param name="MaxBackupIndex" value="1"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
  </layout>
</appender>

<category name="mondrian.sql">
  <priority value="DEBUG"/>
  <appender-ref ref="SQLLOG"/>
</category>

 

Step 3: Enable the new log settings. To enable the new log settings, restart the Pentaho BI Server instance.

Log file location:
D:\Installation Softwares\Pentaho\biserver-ce-4.8.0-stable\biserver-ce\tomcat\logs
After restarting the server and running an OLAP cube, you will find the following two files in the above location:

Files are:
i) mondrian_mdx.log
ii) mondrian_sql.log

Debug Mondrian Cube 2

Now, enjoy analysing the SQL queries that are generated while performing run actions in various tools such as Pentaho Analyzer and Saiku Analytics.

Get in touch with us at Helical IT Solutions

HOW TO CREATE LOCK FILE IN C

 

File locking provides a very simple yet incredibly useful mechanism for coordinating file accesses. Before I begin to lay out the details, let me fill you in on some file locking secrets:

There are two types of locking mechanisms: mandatory and advisory.

– Mandatory locking will actually prevent read() and write() calls on the file.

– There are two types of (advisory!) locks: read locks and write locks (also referred to as shared locks and exclusive locks, respectively).

 

Here I am demonstrating a READ lock.

In my case, I wanted to reduce CPU usage, which was very high when two applications accessed SQLite3 simultaneously.

To solve this problem I applied the lock-file concept to synchronize the applications; I applied it to three different applications.

 

Steps followed while creating the lock file:

– Finalize the file name (xyz.lock).

– Finalize the lock file location.

– Before running the application, check whether the lock file is present:

if (getlock(lockfile) == 1)   /* lock file is not present */

– If the lock file is not present, then create the lock file:

int lockstatus = setlock(lockfile);

– After completion of the job, delete the lock file:

Removelock(lockfile);

– If the lock file is present, then sleep for some time and check again for the lock file.

– While creating the lock file, handle unexpected application closing so as to avoid trouble the next time the application needs to run.

 

Note: The extension of the lock file is .lock (e.g. abc.lock).

Note: Remove the lock file when the application is forcefully terminated (Ctrl + C).

 

(For an example of a lock file in C using semaphores, see the reference below.)

Reference: http://www.c-program-example.com/2012/04/c-program-to-lock-file-using-semaphores.html

Differences between Reporting and Analysis – Concepts


Definition

1. Reporting: the process of organizing data into informational summaries in order to monitor how different areas of a business are performing.
1. Analysis: the process of exploring data and reports in order to extract meaningful insights, which can be used to better understand and improve business performance.

2. Reporting translates raw data into information.
2. Analysis transforms data and information into insights.

3. Good reporting should raise questions about the business from its end users.
3. The goal of analysis is to answer questions by interpreting the data at a deeper level and providing actionable recommendations.

4. Reporting shows you what is happening.
4. Analysis focuses on explaining why it is happening and what you can do about it.

Tasks

5. Reporting: building, configuring, consolidating, organizing, formatting, and summarizing.
5. Analysis: focuses on different tasks such as questioning, examining, interpreting, comparing, and confirming.

Outputs/Approach

6. Reporting follows a push approach.
6. Analysis follows a pull approach.

7. Reports are pushed to users, who are then expected to extract meaningful insights and take appropriate actions for themselves (i.e., self-serve).
7. Particular data is pulled by an analyst in order to answer specific business questions.

Types

8. Reporting: dashboards. These custom-made reports combine different KPIs and reports to provide a comprehensive, high-level view of business performance for specific audiences. Dashboards may include data from various data sources and are usually fairly static.
8. Analysis: ad hoc responses and analysis presentations.

Approach of Reporting and Analysis

image