Beginners Guide in Understanding Different Types of Joins

INNER JOINS

When we decide that one table won’t be enough for getting our desired output then we try to attach one more table to the existing table is known as JOINS. Now Inner Joins in other words is doing the same thing as the definition of joins goes but with an additional advantage which is based on a particular key column from table 1 and similar key column in table 2 then if the condition exist then it clubs that table and solely ignores rows that does not satisfy the condition

Now practically [using PGAdmin]

Blog

LEFT JOINS & RIGHT JOINS

As we defined joins this is also similar but with Left Joins it clubs the 1st table and 2nd Table based on the condition and also rows that does not satisfy the condition in the 2nd Table

Similarly it we can say for Right Joins but in vice versa manner

 

Real Time

person_disability table has 25 Rows and a column with personid
and person_employment has 19558 rows and also column with personid

Now lets see when we say

LEFT

It returned 12663 after adding few filters why?
Because Based on the 1st Table(Left Table) the condition will execute.

I.e. the table person_employment total rows will execute
and only does rows in the person_disability table that satisfy the condition will show and also NULL rows will also show (because those rows does not satisfy the condition).

 

Now if we Right Join the Same table let’s see what happens?

RIGHT

Now it returned 25 rows after adding few filters why?

Because based on the 2nd Table (Right Table) the condition will execute.

I.e. the person_disability table total rows will execute
and only does rows in the person_employment table that satisfy the condition will show and also NULL rows will also show (because those rows does not satisfy the condition).

We can also attain a RIGHT JOIN Using a Left join by swapping the names of the table
example

LEFT1

In My next Blog we will discuss on CROSS JOIN

Jasper BI And Pentaho Reports Performance Optimization

In any BI solution, we create a lot of reports and dashboard. Often the developers come across the problem that the solution is not performing well, the reports are

taking a lot of time to load.
In this blog we will try to discuss the different approaches which could be used to optimize the performance.

a. Having good hardware. This ensures that the processing is not taking much time

b. Having good internet speed : This reduces latency

c. Reports with many pages: Let’s say if we are generating a report which is having 100 pages. Good part in viewing the same via BI (via web browser) (Jaspersoft and Pentaho) is that we don’t need to load all the pages before we can show it to the end user. As soon as the pages are loaded end customer can start viewing it. Hence he
can start seeing first couple of pages immediately and meanwhile the remaining pages will keep on loading and rendering to the user. But the case is not the same with other formats, if a client wants to download the report in excel/pdf/csv format, he can only download once the entire report has been loaded.

d. Caching at BI level : Jaspersoft & Pentaho, both are having caching facility. We can increase the size of the caching memory, thus this leads to improved performance.

e. Caching at database level : If we feel that some of the reports / db generally takes  a lot of time, what can be done is we can create temp tables inside db and we can put the values required for those reports in this tables. This way we can have multiple level of caching (application level and database level).

f. Optimization at query level: Using the best practices of writing sqlqueries (like avoid writing nested queries, including too many Joins etc).

g. Database indexing and controlling of users on database will also increase performance of report execution.

h. Remove unused variables / parameters / table columns with in the report.

i. Maximum Usage of built in functions which are available with in jasper and pentaho reports for calculating total / percentages / avg and for type conversions like

string to integer etc will also increase the performance of report.

j. Whenever a BI solution is developed directly hitting the transactional database, the performance is compromised. Hence its always suggested that there is separate

reporting database (or data warehouse).

k. If there are multiple input parameters (like the one present in the below snapshot), then on the submit button only the report should change.
BI Report Performance
Otherwise it will result in firing many queries to the database. Also for the report/dashboard, ideally there should not be multiple input parameters.

Navin Dasari
Helical IT Solutions