DataStage Information: DS FAQs

25.give one real time situation where link partitioner stage used?
· If we want to send more data from the source to the targets quickly we will be using the link partioner stage in the server jobs we can make a maximum of 64 partitions. And this will be in active stage. We can't connect two active stages but it is accpeted only for this stage to connect to the transformer or aggregator stage. The data sent from the link partioner will be collected by the link collector at a max of 64 partition. This is also an active stage so in order to aviod the connection of active stage from the transformer to teh link collector we will be using inter process communication. As this is a passive stage by using this data can be collected by the link collector. But we can use inter process communication only when the target is in passive stage
26.How i create datastage Engine stop start script.Actually my idea is as below.!#bin/bashdsadm - usersu - rootpassword (encript)DSHOMEBIN=/Ascential/DataStage/home/dsadm/Ascential/DataStage/DSEngine/binif check ps -ef grep DataStage (client connection is there) { kill -9 PID (client connection) }uv -admin - stop > dev/nulluv -admin - start > dev/nullverify processcheck the connectionecho "Started properly"run it as dsadm
· go to the path /DATASTAGE/PROJECTS/DSENGINE/BIN/uv -admin -stopuv -admin -start
27.What does separation option in static hash-file mean?
· The different hashing algorithms are designed to distribute records evenly among the groups of the file based on charecters and their position in the record ids.
When a hashed file is created, Separation and Modulo respectively specifies the group buffer size and the number of buffers allocated for a file. When a Static Hashfile is created, DATASTAGE creates a file that contains the number of groups specified by modulo.
Size of Hashfile = modulus(no. groups) * Separations (buffer size)
28.What is the difference between sequential file and a dataset? When to use the copy stage?
· Sequential file stores small amount of the data with any extension .txt where as DataSet stores Huge amount of the data and opens the file only with an extension .ds.
· Sequentiial Stage stores small amount of the data with any extension in order to acces the file where as DataSet is used to store Huge amount of the data and it opens only with an extension (.ds ) .The Copy stage copies a single input data set to a number of output datasets. Each record of the input data set is copied to every output data set.Records can be copied without modification or you can drop or change theorder of columns.
29.how to find errors in job sequence?
· using DataStage Director we can find the errors in job sequence
30.what is job control?how can it used explain with steps?
· JCL defines Job Control Language it is ued to run more number of jobs at a time with or without using loops. steps:click on edit in the menu bar and select 'job properties' and enter the parameters asparamete prompt typeSTEP_ID STEP_ID stringSource SRC stringDSN DSN stringUsername unm stringPassword pwd stringafter editing the above steps then set JCL button and select the jobs from the listbox and run the job
31.how to implement type2 slowly changing dimenstion in datastage? give me with example?
· Slow changing dimension is a common problem in Dataware housing. For example: There exists a customer called lisa in a company ABC and she lives in New York. Later she she moved to Florida. The company must modify her address now. In general 3 ways to solve this problem
Type 1: The new record replaces the original record, no trace of the old record at all, Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two different people. Type 3: The original record is modified to reflect the changes.
In Type1 the new one will over write the existing one that means no history is maintained, History of the person where she stayed last is lost, simple to use.
In Type2 New record is added, therefore both the original and the new record Will be present, the new record will get its own primary key, Advantage of using this type2 is, Historical information is maintained But size of the dimension table grows, storage and performance can become a concern.
Type2 should only be used if it is necessary for the data warehouse to track the historical changes.
In Type3 there will be 2 columns one to indicate the original value and the other to indicate the current value. example a new column will be added which shows the original address as New york and the current address as Florida. Helps in keeping some part of the history and table size is not increased. But one problem is when the customer moves from Florida to Texas the new york information is lost. so Type 3 should only be used if the changes will only occur for a finite number of time.

DataStage Information

Saturday, October 18, 2008

DS FAQs

1 comment:

Search 4 DataStage

Blog Archive