Processing Multi-Schema Files in Talend
Sometimes an input file contains records that follow several different schemas, all in a single file. We can process such a file and extract the data schema by schema using the tFileInputMSDelimited component.
Raw Data:
Here is a sample raw data file, which contains 3 different schema types.
The first column identifies the type of each record:
ORD means Order details
CUS means Customer details
TRS means Transactions
All 3 record types exist in the same file, but our requirement is to write each type to its own separate file.
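To make the idea concrete, here is a minimal plain-Java sketch of the routing that the job will perform: read each line, take the first column as the record type, and group rows by that type. This is only an illustration, not the code Talend generates. The class name MultiSchemaSplitter, the ";" separator, and the sample rows and their column layouts are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: a plain-Java sketch of the routing idea, not Talend's generated code.
final class MultiSchemaSplitter {

    // Groups each line under its record type, which is read from the first column.
    static Map<String, List<String[]>> splitByType(List<String> lines, String separator) {
        Map<String, List<String[]>> byType = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Pattern.quote makes separators such as "|" literal instead of regex syntax
            String[] fields = line.split(Pattern.quote(separator), -1);
            String type = fields[0].trim();
            byType.computeIfAbsent(type, k -> new ArrayList<>()).add(fields);
        }
        return byType;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; the real file's columns may differ.
        List<String> lines = Arrays.asList(
            "ORD;1001;2016-01-15;250.00",
            "CUS;C01;John Smith",
            "TRS;T500;1001;CARD;250.00",
            "ORD;1002;2016-01-16;99.50");

        Map<String, List<String[]>> result = splitByType(lines, ";");
        result.forEach((type, rows) ->
            System.out.println(type + " -> " + rows.size() + " row(s)"));
    }
}
```

In the Talend job, each of these groups corresponds to one output flow of tFileInputMSDelimited, and each flow has its own schema.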
Job:
Create the job below to get a separate file/output for each schema type.
In the Palette, find the tFileInputMSDelimited component and drag it onto the work area.
Now open the Component tab to configure it as shown below:
Click on Multi Schema Editor and configure it as shown in the screenshot below:
If the column separators are different for each schema, you can enable the Use Multiple Separators option.
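As a rough illustration of what per-schema separators mean, the sketch below parses each line with the separator registered for its record type. It is a plain-Java sketch under stated assumptions, not how Talend implements the option: the class name PerSchemaSeparators, the separators ";", "," and "|", and the sample line are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: each record type is parsed with its own column separator.
final class PerSchemaSeparators {

    // Returns the fields of a line, split with the separator registered for its
    // record type, or null when the line matches no registered type.
    static String[] parse(String line, Map<String, String> separatorByType) {
        for (Map.Entry<String, String> entry : separatorByType.entrySet()) {
            String prefix = entry.getKey() + entry.getValue(); // e.g. "TRS|"
            if (line.startsWith(prefix)) {
                return line.split(Pattern.quote(entry.getValue()), -1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical separator per record type.
        Map<String, String> separators = new LinkedHashMap<>();
        separators.put("ORD", ";");
        separators.put("CUS", ",");
        separators.put("TRS", "|");

        System.out.println(Arrays.toString(parse("TRS|T500|1001", separators)));
    }
}
```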
After getting the preview, click on Fetch Codes.
Now add 3 output components of whichever type you need. I selected tLogRow components to display the result on the console.
Execute the job:
Now check the output. You will get three different tables, one for each schema type.
Output:
Thanks and Regards,
Lalitha