Performance Improvements in Apache Drill

Posted on December 10, 2019 by Satya Gopi, in Business Intelligence

Prerequisites: Apache Drill

A query we were running in Apache Drill was taking about 3 minutes to fetch just one column from a table. To overcome this, we used two performance improvements:

Partition Pruning
Parquet metadata caching

Partition Pruning:

Partition pruning lets a query engine determine and read only the smallest dataset needed to answer a given query. Reading less data means fewer I/O cycles and fewer CPU cycles spent processing it.

Example:

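-- Create a table partitioned by display date, airport code, and location.
-- The table alias t is used to reference the nested fields.
CREATE TABLE dfs.tmp.inputcontrolsinfo
PARTITION BY (`displayDate`, airport_code, location) AS
SELECT DISTINCT
  t.`displayDate`,
  t.fields[3].control.`modelvalue` AS airport_code,
  t.fields[4].control.`modelvalue` AS location
FROM `observation` t;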
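
PARTITION BY in a Drill CTAS statement applies to Parquet output, so it can be worth confirming the storage format before running a statement like the one above. A minimal sketch, assuming the session default may have been changed from Parquet:

```sql
-- Make sure CTAS writes Parquet files in this session,
-- since PARTITION BY is supported for Parquet output.
ALTER SESSION SET `store.format` = 'parquet';
```

If the format is already Parquet (the usual default), this statement changes nothing and can be skipped.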

The table above is partitioned on displayDate, airport_code, and location. We can now query it as follows:

SELECT * FROM dfs.tmp.inputcontrolsinfo;

Partitioning works much like an index: when a query filters on a partition column, Drill can skip the files that cannot contain matching rows.
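
Note that a bare SELECT * reads every partition; pruning only helps when the WHERE clause filters on a partition column. A minimal sketch, assuming a hypothetical filter value ('JFK' is not taken from the original data) and using EXPLAIN PLAN FOR to inspect what Drill intends to scan:

```sql
-- Filter on a partition column so Drill can skip files for other airports.
-- 'JFK' is a hypothetical value used only for illustration.
SELECT *
FROM dfs.tmp.inputcontrolsinfo
WHERE airport_code = 'JFK';

-- Show the query plan for the same query without executing it.
EXPLAIN PLAN FOR
SELECT *
FROM dfs.tmp.inputcontrolsinfo
WHERE airport_code = 'JFK';
```

If pruning applies, the Parquet scan in the plan should report fewer files (the numFiles value) than the table contains in total.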

Parquet metadata caching:

Drill can cache Parquet metadata. Once the metadata is cached, it can be refreshed as needed, depending on how frequently the datasets change in the environment.

Command to generate the metadata cache:

REFRESH TABLE METADATA dfs.tmp.inputcontrolsinfo ;

You only have to run the REFRESH TABLE METADATA command against a table once to generate the initial metadata cache file. Thereafter, Drill automatically refreshes stale cache files when you issue queries against the table. An automatic refresh is triggered when the data is modified; the query planner uses the timestamp of the cache file to decide whether it is stale.

If you have any questions, please reach us at support@helicaltech.com
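
One way to check whether the planner is using the cache is to look at the query plan after the refresh. A minimal sketch, assuming the table was written as Parquet and the metadata cache has already been generated:

```sql
-- Show the plan for a query against the cached table.
-- In the Parquet scan entry, look for usedMetadataFile=true.
EXPLAIN PLAN FOR
SELECT *
FROM dfs.tmp.inputcontrolsinfo;
```

If the scan reports usedMetadataFile=false, the cache file was probably not found or not readable; it is normally written into the table directory as a hidden file named .drill.parquet_metadata.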

Thanks,
SatyaGopi
BI Developer
Helical IT Solutions Pvt

