Processing Multi Schema Files in Talend

Posted by Nikhilesh in Business Intelligence, ETL, Talend


Sometimes an input file contains records in several different schema formats, all within a single file. Such a file can be split schema by schema using the tFileInputMSDelimited component.


Raw Data:

Here is a sample raw data file. It contains 3 different schema types.

[Screenshot: raw data]

Look at the first column: it identifies the type of each record.

ORD means Order details

CUS means Customer details

TRS means Transactions

All 3 types exist in the same file, but the requirement is to have each type of data in its own separate file.
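The routing logic behind this requirement can be sketched outside Talend. The minimal Python sketch below is not the code Talend generates; it only illustrates grouping rows by the code in their first column. The sample rows, the `split_by_record_type` helper, and the semicolon separator are all made up for illustration, since the real raw file is only shown as a screenshot.

```python
from collections import defaultdict

# Hypothetical sample rows; the real file's columns and separator may differ.
SAMPLE = """\
ORD;1001;2015-01-10;250.00
CUS;C01;John Smith;Hyderabad
TRS;T900;1001;CARD
ORD;1002;2015-01-11;99.50
"""

def split_by_record_type(text, separator=";"):
    """Group rows by the record-type code found in the first column."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        code, *fields = line.split(separator)
        groups[code].append(fields)
    return dict(groups)

groups = split_by_record_type(SAMPLE)
for code, rows in groups.items():
    print(code, rows)
```

Each key of `groups` corresponds to one of the outputs that the Talend job produces.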

Job:

Create the job below to get a separate file/output for each schema type.

In the Palette, find the tFileInputMSDelimited component and drag it onto the work area.

Now open the Component tab and configure it as shown below:

[Screenshot: component settings]

Click on Multi Schema Editor and configure it as shown in the screenshot below:

[Screenshot: Multi Schema Editor configuration]

If the column separators differ between schemas, you can enable the Use Multiple Separators option.
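One way to picture per-schema separators is a lookup from record-type code to separator. The sketch below is only an illustration under that assumption, not a description of how the component works internally; the `SEPARATORS` mapping, the `parse_line` helper, and the sample lines are hypothetical.

```python
# Hypothetical mapping: record-type code -> column separator for that schema.
SEPARATORS = {"ORD": ";", "CUS": ",", "TRS": "|"}

def parse_line(line, separators=SEPARATORS):
    """Return (code, fields) using the separator configured for the line's type."""
    for code, sep in separators.items():
        # A line belongs to a schema if it starts with that schema's code
        # immediately followed by that schema's separator.
        if line.startswith(code + sep):
            return code, line.split(sep)[1:]
    raise ValueError(f"unrecognized record type: {line!r}")

print(parse_line("CUS,C01,John Smith"))
print(parse_line("TRS|T900|1001"))
```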

Once the preview is displayed, click Fetch Codes.

[Screenshot: Fetch Codes]

Now add 3 output components of whichever type you need. I selected tLogRow components to display the result on the console.

Execute the job:

[Screenshot: job execution]

Now check the output: you will get three different tables, one per schema type.

Output:

[Screenshot: output tables]
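The job above prints each schema to the console. When separate files are wanted, as in the original requirement, the same idea can be sketched in Python as follows. This is an illustration only, not Talend output; the `write_groups` helper, the file naming, and the sample rows are assumptions.

```python
import csv
import tempfile
from pathlib import Path

def write_groups(groups, out_dir):
    """Write each record type's rows to <out_dir>/<code>.csv and return the paths."""
    out_dir = Path(out_dir)
    paths = {}
    for code, rows in groups.items():
        path = out_dir / f"{code}.csv"
        with path.open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        paths[code] = path
    return paths

# Hypothetical rows, already grouped by record type.
groups = {"ORD": [["1001", "250.00"]], "CUS": [["C01", "John Smith"]]}
out_dir = tempfile.mkdtemp()  # temporary directory so the sketch leaves no clutter
paths = write_groups(groups, out_dir)
print(sorted(p.name for p in paths.values()))
```

In the Talend job itself, the equivalent step would be to connect file output components in place of the tLogRow components.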

Thanks and Regards,

Lalitha
