Data Integrator Transforms List
Hi Everyone,
SAP Data Services is one of the finest ETL (Extract, Transform, Load) tools. It delivers a single enterprise-class solution for data integration, data quality, data profiling, and text data processing that allows you to integrate, transform, improve, and deliver trusted data to critical business processes.
SAP Data Services transforms are built-in system objects stored in the repository, which are used whenever we want to transform data from source(s) to target(s).
The transforms can be found under the Transforms tab of our Local Object Library, which provides access to all repository objects (built-in or user-built). The transforms are broadly classified into the four categories below. Expanding each category shows the list of transforms it contains.
· Data Integrator
· Data Quality
· Platform
· Text Data Processing
Data Integrator Transforms:
The Data Integrator transforms are:
1. Data_Transfer
2. Date_Generation
3. Effective_Date
4. Hierarchy_Flattening
5. History_Preserving
6. Key_Generation
7. Map_CDC_Operation
8. Pivot (Columns to Rows)
9. Reverse Pivot (Rows to Columns)
10. Table_Comparison
11. XML_Pipeline
Let’s look into the detailed description of each of the transforms present under the Data Integrator category.
1) Data Transfer:
This transform writes the data from a source or the output from another transform into a transfer object and subsequently reads data from the transfer object. The transfer type can be a relational database table or a file.
We can use the Data_Transfer transform to push down operations to the database server when the transfer type is a database table. We can also push down resource-consuming operations such as joins, GROUP BY, and sorts using this transform.
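As a rough illustration of the push-down idea (outside Data Services, using Python’s built-in sqlite3 as a stand-in for the database server), the sketch below stages rows into a transfer table and lets the database execute the GROUP BY; all table and column names here are made up:

import sqlite3

rows = [("East", 100), ("West", 250), ("East", 75)]  # output of an upstream step

con = sqlite3.connect(":memory:")  # stand-in for the database server
con.execute("CREATE TABLE transfer_sales (region TEXT, amount INTEGER)")
con.executemany("INSERT INTO transfer_sales VALUES (?, ?)", rows)

# The resource-consuming aggregation now runs in the database, not the ETL engine.
for region, total in con.execute(
        "SELECT region, SUM(amount) FROM transfer_sales GROUP BY region"):
    print(region, total)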
Please go through the article ‘Data Transfer transform in SAP Data Services’ to know more details about this transform.
2) Date Generation:
This transform produces a series of dates incremented as we specify in the transform settings. We use this transform to produce the key values for a time dimension target. From this generated sequence we can also populate other fields in the time dimension (such as day_of_week) using functions in a Query transform.
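To picture what the transform does, here is a minimal Python sketch (not Data Services code) that emits one row per day between a start and end date, the way a time-dimension load would; the function name and dates are illustrative:

from datetime import date, timedelta

def generate_dates(start, end, step_days=1):
    # Yield one date per step from start to end, inclusive.
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)

for d in generate_dates(date(2024, 1, 1), date(2024, 1, 5)):
    # Derived attributes such as day_of_week would come from a Query transform.
    print(d.isoformat(), d.strftime("%A"))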
Please go through the article ‘Date Generation transform in SAP Data Services’ to know more details about this transform.
3) Effective Date:
This transform is used to calculate an “effective-to” value for data that contains an effective date. The calculated effective-to date and an existing effective date produce a date range that allows queries based on effective dates to produce meaningful results.
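Conceptually, the calculation works like the following Python sketch (with illustrative column names, not the actual transform): rows are sorted per key by effective date, and each row’s effective-to comes from the next row’s effective date, with a far-future default for the latest row:

from datetime import date

rows = [
    {"emp_id": 1, "rate": 50, "eff_date": date(2023, 1, 1)},
    {"emp_id": 1, "rate": 55, "eff_date": date(2023, 7, 1)},
]

rows.sort(key=lambda r: (r["emp_id"], r["eff_date"]))
for current, nxt in zip(rows, rows[1:] + [None]):
    same_key = nxt is not None and nxt["emp_id"] == current["emp_id"]
    # Latest row per key stays open-ended with a far-future date.
    current["eff_to"] = nxt["eff_date"] if same_key else date(9999, 12, 31)
    print(current)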
Please go through the article ‘Effective Date transform in SAP Data Services’ to know more details about this transform.
4) Hierarchy Flattening:
This transform constructs a complete hierarchy from parent/child relationships, then produces a description of the hierarchy in a vertically or horizontally flattened format, which can be used to build star schemas in a data warehouse environment.
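The following Python sketch gives a rough idea of horizontal flattening (the node names and dictionary layout are made up): each node’s ancestor chain is derived from parent/child pairs, producing one full path per node:

# Child -> parent pairs; the root has no parent.
parent_child = {"EMEA": "World", "DE": "EMEA", "Berlin": "DE", "World": None}

def ancestor_path(node):
    # Walk up to the root, then reverse so the path reads root -> leaf.
    path = []
    while node is not None:
        path.append(node)
        node = parent_child[node]
    return list(reversed(path))

for node in parent_child:
    print(node, "->", ancestor_path(node))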
Please go through the article ‘Hierarchy Flattening transform in SAP Data Services’ to know more details about this transform.
5) History Preserving:
The History_Preserving transform allows us to produce a new row in our target rather than updating an existing row. We can indicate in which columns the transform identifies changes to be preserved. If the values of those columns change, this transform creates a new row for each row flagged as UPDATE in the input data set.
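The logic resembles the slowly changing dimension Type 2 pattern, roughly sketched in Python below (the column names and the "current" flag are illustrative assumptions, not the actual transform settings):

target = [{"emp_id": 1, "dept": "Sales", "current": "Y"}]
incoming_update = {"emp_id": 1, "dept": "Marketing"}  # row flagged as UPDATE

for row in target:
    if row["emp_id"] == incoming_update["emp_id"] and row["current"] == "Y":
        if row["dept"] != incoming_update["dept"]:  # change in a monitored column
            row["current"] = "N"  # close out the existing version
            target.append({**incoming_update, "current": "Y"})  # insert new version
        break

print(target)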
Please go through the article ‘History Preserving transform in SAP Data Services’ to know more details about this transform.
6) Key Generation:
This transform is used to generate new keys for new rows/records in a data set. When it is necessary to generate artificial keys in a table, the Key_Generation transform looks up the maximum existing key value from a table and uses it as the starting value to generate new keys. The transform expects the generated key column to be part of the input schema.
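In essence the transform continues an existing key sequence, as in this small Python sketch (names are illustrative, not the Data Services API):

existing_keys = [101, 102, 103]          # keys already present in the target table
new_rows = [{"name": "A"}, {"name": "B"}]

next_key = max(existing_keys, default=0) + 1  # start after the current maximum
for row in new_rows:
    row["surrogate_key"] = next_key      # the key column must exist in the schema
    next_key += 1

print(new_rows)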
Please go through the article ‘Key Generation transform in SAP Data Services’ to know more details about this transform.
7) Map CDC Operation:
Using its input requirements (values for the Sequencing column and a Row operation column), the Map CDC Operation transform performs the following functions:
· Sorts input data based on values in the Sequencing column drop-down list and (optionally) the Additional grouping columns box.
· Maps output data based on values in the Row operation column drop-down list. Source table rows are mapped to INSERT, UPDATE, or DELETE operations before passing them on to the target.
· Resolves missing, separated, or multiple before- and after-images for UPDATE rows.
· Allows you to filter columns and view UPDATE rows prior to running the job.
While commonly used to support relational or mainframe changed-data capture (CDC), this transform supports any data stream as long as its input requirements are met. Relational CDC sources include Oracle and SQL Server.
This transform is typically the last object before the target in a data flow because it produces INSERT, UPDATE, and DELETE operation codes. Data Services produces a warning if other objects are used.
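A rough Python sketch of the sort-then-map behaviour is shown below; the operation codes and column names are assumptions for illustration only:

changes = [
    {"seq": 2, "op": "U", "id": 7, "name": "new"},
    {"seq": 1, "op": "I", "id": 7, "name": "old"},
    {"seq": 3, "op": "D", "id": 7, "name": "new"},
]
op_map = {"I": "INSERT", "U": "UPDATE", "D": "DELETE"}

# Order rows by the sequencing column, then map the row-operation code
# each row will carry when it reaches the target.
for row in sorted(changes, key=lambda r: r["seq"]):
    print(op_map[row["op"]], row)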
Please go through the article ‘Map CDC Operation transform in SAP Data Services’ to know more details about this transform.
8) Pivot (Columns to Rows):
This transform creates a new row for each value in a column that we identify as a pivot column. The Pivot transform allows us to change how the relationship between rows is displayed. For each value in each pivot column, Data Services produces a row in the output data set. We can also create pivot sets to specify more than one pivot column.
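The effect is easy to see in a small Python sketch (illustrative column names): each pivot-column value becomes its own output row:

row = {"emp_id": 1, "q1_sales": 100, "q2_sales": 200, "q3_sales": 150}
pivot_columns = ["q1_sales", "q2_sales", "q3_sales"]

# One output row per pivot column value.
for col in pivot_columns:
    print({"emp_id": row["emp_id"], "quarter": col, "sales": row[col]})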
Please go through the article ‘Pivot (Columns to Rows) transform in SAP Data Services’ to know more details about this transform.
9) Reverse Pivot (Rows to Columns):
This transform creates one row of data from several existing rows. The Reverse Pivot transform allows us to combine data from several rows into one row by creating new columns. For each unique value in a pivot axis column and each selected pivot column, Data Services produces a column in the output data set.
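A minimal Python sketch of the idea (with made-up column names): rows sharing a key are merged into one row, gaining a new column per unique pivot-axis value:

rows = [
    {"emp_id": 1, "quarter": "q1", "sales": 100},
    {"emp_id": 1, "quarter": "q2", "sales": 200},
]

combined = {}
for r in rows:
    out = combined.setdefault(r["emp_id"], {"emp_id": r["emp_id"]})
    out[r["quarter"] + "_sales"] = r["sales"]  # one column per axis value

print(list(combined.values()))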
Please go through the article ‘Reverse Pivot (Rows to Columns) transform in SAP Data Services’ to know more details about this transform.
10) Table Comparison:
This transform compares two data sets and produces the difference between them as a data set with rows flagged as INSERT, UPDATE, or DELETE. The Table_Comparison transform allows us to detect and forward changes that have occurred since the last time a target was updated.
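Conceptually the comparison works like this Python sketch (keys and columns are illustrative): each incoming row is checked against the comparison table and flagged accordingly:

target = {1: {"name": "Ann"}, 2: {"name": "Bob"}}       # comparison table, by key
incoming = {2: {"name": "Robert"}, 3: {"name": "Eve"}}  # latest source data

for key, row in incoming.items():
    if key not in target:
        print("INSERT", key, row)      # new key -> insert
    elif target[key] != row:
        print("UPDATE", key, row)      # existing key, changed values -> update
for key in target.keys() - incoming.keys():
    print("DELETE", key, target[key])  # key vanished from the source -> delete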
Note that in order to use the Table_Comparison transform with Teradata 13 and later tables as the comparison table and target table, we must do the following:
· On the Teradata server, set the General parameter DBSControl to TRUE to allow uncommitted data to be read.
· In the Data Services Teradata datastore, add the following statement in the “Additional session parameters” field:
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Please go through the article ‘Table Comparison transform in SAP Data Services’ to know more details about this transform.
11) XML Pipeline:
This transform is used to process large XML files one instance of a repeatable structure at a time. With this transform, Data Services does not need to read the entire XML input into memory and then build an internal data structure before performing the transformation.
An NRDM structure is not required to represent the entire XML data input. Instead, the XML_Pipeline transform uses a portion of memory to process each instance of a repeatable structure, then continually releases and reuses memory to steadily flow XML data through the transform. During execution, Data Services pushes operations of the XML_Pipeline transform to the XML source.
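The streaming idea can be approximated in plain Python with the standard library’s iterparse, as below; the element name “order” is an assumption for illustration:

import io
import xml.etree.ElementTree as ET

xml_data = io.BytesIO(b"<orders><order id='1'/><order id='2'/></orders>")

# Process one repeatable instance at a time instead of loading the whole
# document into memory first.
for event, elem in ET.iterparse(xml_data, events=("end",)):
    if elem.tag == "order":
        print(elem.attrib["id"])  # handle one instance
        elem.clear()              # release memory before the next instance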
Please go through the article ‘XML Pipeline transform in SAP Data Services’ to know more details about this transform.
The below image represents all the transforms available as part of the Data Integrator category.