Sunday, June 29, 2008

Advantages and Disadvantages of using DataStage ETL tool

Major business and technical advantages and disadvantages of using the DataStage ETL tool


Business advantages of using DataStage as an ETL tool:
# Significant ROI (return on investment) over hand-coding
# Short learning curve - quick development and reduced maintenance with a GUI tool
# Development Partnerships - easy integration with the leading market products that interface with the data warehouse, such as SAP, Cognos, Oracle, Teradata, SAS
# Single vendor solution for bulk data transfer and complex transformations (DataStage versus DataStage TX)
# Transparent and wide range of licensing options


Technical advantages of using the DataStage tool to implement ETL processes:
# Single interface to integrate heterogeneous applications
# Flexible development environment - it lets developers work in their preferred style, reduces training needs and enhances reuse. ETL developers can build data integrations quickly with a graphical work-as-you-think solution, which comes by default with a wide range of extensible objects and functions.
# Team communication and job documentation are supported by a self-documenting engine that renders data flows and transformations in HTML format.
# Ability to join data both at the source and at the integration server, and to apply any business rule from within a single interface without having to write any procedural code.
# Common data infrastructure for data movement and data quality (metadata repository, parallel processing framework, development environment)
# With DataStage Enterprise Edition, users can use the parallel processing engine, which is designed to scale performance with the available hardware. It helps get the most out of hardware investment and resources.
# The DataStage server performs very well on both Windows and Unix servers.


Major DataStage weaknesses and disadvantages:
# Big architectural differences between the Server and Enterprise editions, which means that migrating from Server to Enterprise Edition may require considerable time and resources.
# There is no automated error handling and recovery mechanism - for example, there is no way to automatically time out zombie jobs or kill locking processes. However, at the operator level, these errors can be easily resolved.
# No Unix DataStage client - the client software is available only under Windows, and there are different clients for different DataStage versions. The good thing is that they can still be installed on the same Windows PC and switched with the Multi-Client Manager program.
# Might be expensive as a solution for a small or mid-sized company.
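
As a rough illustration of the operator-level workaround for the missing job timeout mentioned above, here is a minimal Python sketch of an external watchdog. It is not a DataStage feature or API: the function name run_with_timeout and the placeholder command are assumptions made up for this example. Whether stopping the launcher process also stops the job on the server depends on the setup and is not guaranteed by this sketch.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, exit_code).

    status is "ok", "failed" or "timed_out"; exit_code is None on timeout.
    Hypothetical helper for illustration, not part of any DataStage tooling.
    """
    try:
        # subprocess.run kills the child process if the timeout expires,
        # then raises TimeoutExpired.
        done = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return ("timed_out", None)
    return ("ok" if done.returncode == 0 else "failed", done.returncode)


if __name__ == "__main__":
    # Placeholder command standing in for a real job launcher (assumption).
    slow_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow_job, timeout_s=1))
```

In practice the placeholder command would be replaced by whatever script starts the job, and the returned status could be passed on to the scheduler or to an alerting script.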

1 comment:

srinu said...

hi..,
this is Srinivas,
the information is very useful,
you are doing a great job.

can you give information about the following stages and where they are used:

1) change data capture
2) change apply
and date validations
