Processing Multi schema files in Talend

Sometimes we receive input files that contain records in several different schema formats within a single file. We can process such files and split the data by schema using the tFileInputMSDelimited component.

Raw Data:

Here is a sample raw data file. It contains 3 different schema types.

[Screenshot: sample raw data file]

The first column identifies the type of each record:

ORD means Order details

CUS means Customer details

TRS means Transactions

All 3 record types exist in the same file, but our requirement is to write each type of data to its own separate file.
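
To make the requirement concrete outside Talend, here is a minimal Python sketch of the same idea: read each row, look at the first column, and group the remaining columns by that record type. The sample rows, the semicolon separator, and the function name split_by_record_type are assumptions made for illustration; they are not taken from the raw data file used in this post.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the first column is the record type.
RAW = """ORD;1001;Laptop;2
CUS;501;Asha;Hyderabad
TRS;9001;1001;45000.00
ORD;1002;Mouse;5
"""


def split_by_record_type(text, delimiter=";"):
    """Group rows by the value found in their first column."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        # Key on the record type, keep the remaining columns as the record.
        groups[row[0]].append(row[1:])
    return dict(groups)


groups = split_by_record_type(RAW)
for record_type, rows in groups.items():
    print(record_type, rows)
```

Each key of groups plays the role of one output flow in the Talend job described next.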

Job:

Create the job below to get a separate file or output for each schema type.

In the palette you will find the tFileInputMSDelimited component. Drag it onto the work area.

Now open the Component tab and configure it as shown below:

[Screenshot: Component tab]

Click on Multi Schema Editor and configure it as shown in the screenshot below:

[Screenshot: Multi Schema Editor configuration]

If the column separators are different for each schema, you can enable the Use Multiple Separators option.

After getting the preview, click on Fetch Codes.

[Screenshot: Fetch Codes]

Now add 3 output components of whichever type you need. I selected tLogRow components to display the result on the console.

Execute the job:

[Screenshot: job execution]

Now check the output. You will get three different tables, one for each schema type.

Output:

[Screenshot: job output]

Thanks and Regards,

Lalitha

JSON Schema

What is JSON?
JSON stands for JavaScript Object Notation. It is a format that stores data as key-value pairs, with each key separated from its value by a colon (:). JSON can be used to transfer data across a network. Because it is simple plain text, it can be consumed by any programming language.

For example, address information can be stored in JSON format:

{
  "street": "Station Road",
  "city": "Hyderabad",
  "state": "Telangana",
  "pincode": "500016"
}

The value may be simple text, an array, or another JSON object.
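
As a small illustration, the sketch below parses an address document with Python's standard json module and shows the three kinds of values mentioned: text, an array, and a nested object. The keys "landmarks" and "geo" and their values are hypothetical additions used only to demonstrate arrays and nested objects.

```python
import json

# Address JSON with a text value, an array value and a nested object value.
# "landmarks" and "geo" are hypothetical keys added for illustration.
address_text = """
{
  "street": "Station Road",
  "city": "Hyderabad",
  "state": "Telangana",
  "pincode": "500016",
  "landmarks": ["Railway Station", "Post Office"],
  "geo": {"lat": 17.44, "lon": 78.46}
}
"""

address = json.loads(address_text)  # JSON object -> Python dict

print(type(address["city"]).__name__)       # text becomes str
print(type(address["landmarks"]).__name__)  # array becomes list
print(type(address["geo"]).__name__)        # nested object becomes dict
```

Note that "pincode" is stored in quotes, so it is read back as text rather than as a number.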

What is JSON Schema?
To validate the values in a given JSON document, we can use a schema. The schema is a template that can be used to check whether a given JSON document is valid or invalid.

For example, in the address JSON above, the key “city” is supposed to hold a string value. If “city” holds an arbitrary number such as 3445545 or a decimal value such as 34.34344, the application consuming this JSON will receive wrong data. JSON Schema helps us define rules that can be used to validate a JSON document.

How to define a JSON Schema?
Consider the following schema for the JSON structure of a person:

{
  "title": "Example Schema",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "age": {
      "description": "Age in years",
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["firstName", "lastName"]
}

The key “title” is the title of the schema.
The key “type” defines the type of the JSON being described, here the person JSON. It may be object, array, etc.
The key “properties” holds the actual definition of the person JSON.
The firstName of the person should be of type string.
The lastName of the person should be of type string.
The age of the person should be of type integer, with a minimum value of 0.
The key “required” lists the keys that are mandatory and must be present in the person JSON.
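
To show how these rules can be applied, here is a deliberately small validator written in plain Python. It handles only the keywords used in the person schema ("type", "properties", "required" and "minimum") and is a sketch for illustration, not a replacement for a real JSON Schema library. The function name validate and the sample people are assumptions.

```python
person_schema = {
    "title": "Example Schema",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}

# Only the type names needed for this example are mapped.
TYPE_MAP = {"string": str, "integer": int, "object": dict, "array": list}


def validate(instance, schema):
    """Return a list of error messages; an empty list means valid."""
    expected = schema.get("type")
    py_type = TYPE_MAP.get(expected)
    if py_type is not None:
        # bool is a subclass of int in Python, so reject it for "integer".
        is_bool_as_int = expected == "integer" and isinstance(instance, bool)
        if not isinstance(instance, py_type) or is_bool_as_int:
            return [f"expected {expected}, got {type(instance).__name__}"]

    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required key: {key}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(f"{key}: {e}" for e in validate(instance[key], subschema))
    if "minimum" in schema and instance < schema["minimum"]:
        errors.append(f"value {instance} is below minimum {schema['minimum']}")
    return errors


# Hypothetical sample data.
good = {"firstName": "Asha", "lastName": "Rao", "age": 30}
bad = {"firstName": "Asha", "age": -1}

print(validate(good, person_schema))
print(validate(bad, person_schema))
```

Returning a list of messages instead of raising on the first problem makes it easy to report every violation at once.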

JSON Schema does not restrict us to the native types; we can also create custom definitions.
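
One common way to express a custom definition is to declare it once under a "definitions" key and point to it with "$ref". The sketch below, written as a Python dict, assumes that convention (used by JSON Schema drafts 4 through 7; newer drafts prefer "$defs"). The address definition, the property names, and the helper resolve_ref are hypothetical, and the helper only follows simple local references.

```python
schema = {
    "definitions": {
        # Custom, reusable "address" type.
        "address": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "pincode": {"type": "string"},
            },
            "required": ["city"],
        }
    },
    "type": "object",
    "properties": {
        # Both properties reuse the same custom definition.
        "homeAddress": {"$ref": "#/definitions/address"},
        "officeAddress": {"$ref": "#/definitions/address"},
    },
}


def resolve_ref(root, ref):
    """Follow a local reference such as '#/definitions/address'."""
    if not ref.startswith("#/"):
        raise ValueError("only local references are handled in this sketch")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


home = resolve_ref(schema, schema["properties"]["homeAddress"]["$ref"])
print(home["required"])
```

Because both properties point at the same definition, a change to the address rules needs to be made in only one place.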

How to validate JSON against the Schema?
There are libraries for many programming languages that validate JSON against a schema.
Popular JSON Schema libraries by programming language:

{
"C": "WJElement (LGPLv3).",
"Java": "json-schema-validator (LGPLv3).",
"Ruby": "autoparse (ASL 2.0); ruby-jsonschema (MIT).",
"PHP": "php-json-schema (MIT). json-schema (Berkeley).",
"JavaScript" : "Orderly (BSD); JSV; json-schema; Matic (MIT); Dojo; Persevere (modified BSD or AFL 2.0); schema.js"
}

More information can be obtained at http://json-schema.org/

Somen

Sources: internet