Sunday, June 29, 2008

DataStage PX

1. How do you delete the header and footer of a source sequential file in DataStage?
Ans: In the Designer palette, under the Development/Debug category, you will find the Head and Tail stages; with these you can isolate or drop the first and last rows of the file.
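Alternatively, the parallel Sequential File stage has a Filter property that pipes the file through a UNIX command before the rows are read. A minimal sketch, assuming the header is exactly the first line and the footer exactly the last line of the file:

    # Filter command for the Sequential File stage:
    # drop line 1 (the header) and the last line (the footer)
    sed '1d;$d'

The same command works standalone, e.g. sed '1d;$d' input.txt > clean.txt.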
2. How do you connect two stages which do not have any common columns between them?
Ans: If the two stages do not share column names, place a Transformer stage between them and map the required columns, renaming them as needed.
3. What is the difference between the Sequential File stage and the Data Set stage? When do you use them?
Ans: The Sequential File stage reads and writes plain flat files. A Data Set stores data in the parallel engine's internal format and preserves the partitioning, so it is the better choice for staging data between parallel jobs.
4. What is an Invocation ID?
Ans: This field only appears if the job identified by Job Name has 'Allow Multiple Instance' enabled. Enter a name for the invocation, or a job parameter that allows the instance name to be supplied at run time.

An 'invocation id' is what makes a 'multi-instance' job unique at runtime. With normal jobs, you can only have one instance of it running at any given time. Multi-instance jobs extend that and allow you to have multiple instances of that job running (hence the name). They are still a 'normal' job under the covers, so still have the restriction of one at a time - it's just that now that 'one' includes the invocation id. So, you can run multiple 'copies' of the same job as long as the currently running invocation ids are unique.
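From the command line, the invocation ID is appended to the job name with a dot. A sketch using dsjob, with a hypothetical project myproj and multi-instance job LoadCustomers:

    # Two instances of the same job, distinguished by invocation ID
    dsjob -run -mode NORMAL myproj LoadCustomers.FRANCE
    dsjob -run -mode NORMAL myproj LoadCustomers.GERMANY

Both commands can run at the same time because the invocation IDs differ.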
5. How do you extract job parameters from a file?
Ans: Through a User Variables activity in a sequence job, or by calling a routine that reads the file.
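A wrapper script is another common approach: read the values from a file and pass them to the job with dsjob. A minimal sketch, assuming a name=value parameter file and hypothetical project and job names:

    #!/bin/sh
    # params.txt contains lines such as: DBUser=etl_user
    DBUSER=`grep '^DBUser=' params.txt | cut -d= -f2`
    DBPASS=`grep '^DBPass=' params.txt | cut -d= -f2`
    dsjob -run -param DBUser=$DBUSER -param DBPass=$DBPASS myproj LoadCustomers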
6. What are the important considerations when using a Join stage instead of a Lookup?
Ans: If the volume of reference data is high, use a Join stage instead of a Lookup: the Lookup stage holds the reference data in memory, while the Join stage sorts and merges its inputs, so it scales to much larger reference volumes.
7. How do you implement a type 2 slowly changing dimension in DataStage?
Ans: You can use the Change Capture and Change Apply stages for this.
8. Is the Hashed File stage an active or passive stage? When is it useful?
Ans: The Hashed File stage is a passive stage; it is typically useful as a lookup reference in server jobs.
Stages that process the data passing through them are called active stages, e.g. Transformer, Sort, Aggregator.
9. How do we use job parameters in DataStage jobs?
Ans: There is a toolbar icon for job properties, or you can press Ctrl+J to open the Job Properties dialog. On the Parameters tab, give each parameter a name and a default value. The value can then be supplied when you run the job; it is not necessary to open the job just to change a parameter value. When the job is run from a script, it is enough to pass the parameter value on the command line.
Otherwise you would have to change the value in the job, recompile, and then run it from the script. Parameters therefore make jobs much easier for users to operate.
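Inside the job, a parameter is referenced by wrapping its name in hash signs, e.g. a Sequential File stage might read #SourceDir#/customers.txt. The matching value is then supplied on the command line. A sketch with a hypothetical parameter, project and job name:

    dsjob -run -param SourceDir=/data/in myproj LoadCustomers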
10. What is the use of the $PROJDEF variable and how can we use this variable to pass parameters from parameter files?
Ans: $PROJDEF is used with environment-variable job parameters: when a parameter's default value is set to $PROJDEF, the value defined at project level in the Administrator is picked up at run time. Such parameters are referenced in stages as #$VARNAME#.
You can also create user-defined environment variables at project level and use them in the same way.
11. How do you remove duplicate records from a file and capture the rejected records in a separate file?
Ans: In parallel DataStage, we can use the Remove Duplicates stage to drop the duplicate records,
whereas in server DataStage we have to use a lookup (typically against a hashed file).
or
Other than the Remove Duplicates stage, we can also use an Aggregator stage to count the number of records that exist for the key columns. If more than one record exists for a key, those records are duplicates; using a Transformer, hold the count in a stage variable COUNT and check whether COUNT > 1. If so, use a constraint to send the duplicate records to a reject file.
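The same split can be sketched for a flat file outside DataStage, assuming a comma-delimited file whose first field is the key (a hypothetical layout):

    # First occurrence of each key goes to clean.txt,
    # every repeat of a key goes to rejects.txt
    awk -F',' 'seen[$1]++ { print > "rejects.txt"; next } { print }' input.txt > clean.txt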
12. What is the difference between a Filter stage and a Switch stage?
Ans: A Filter stage filters the incoming data with a where clause. For example, if you want the details of customer 20, give customer = 20 as the constraint: only customer 20's records pass down that output link, and you can also add a reject link so that all remaining records go to the reject link.
A Switch stage, by contrast, routes records according to cases, e.g. case 1 = 10 and case 2 = 20: it checks the selector column against each case and sends customer 10's records down one output link and customer 20's records down another.
The key difference is that a Filter stage can send the same row down more than one output link, whereas a Switch stage works like a C switch statement and sends each row down exactly one link.
13. How can I handle the after-job subroutine in the Transformer stage?
Ans: In the Transformer stage, click the stage properties button on the toolbar (left corner). It displays a stage properties page like all the other properties pages; there you can specify your before-stage and after-stage subroutines.
14. How do you enable runtime column propagation in DataStage? Where does this option exist?
Ans: There is an option in DataStage Administrator -> Projects tab -> Properties button -> General tab -> 'Enable runtime column propagation for parallel jobs'.
If you enable it, you can then select runtime column propagation in individual stages to specify that columns encountered by a stage in a parallel job can be used even if they are not defined explicitly in the metadata.
You can see the 'Runtime column propagation' option in most active stages on the Columns sub-tab of the Output tab (where one exists).
15. How do you set variables in DataStage?
Ans: Use stage variables in the Transformer stage properties;
that is where you set user-defined variables.
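Stage variables are evaluated top to bottom for every input row, so they can carry state from one row to the next. A sketch of two derivations that flag repeated keys on sorted input (the names are hypothetical):

    svIsDup  : If in.cust_id = svPrevId Then 1 Else 0
    svPrevId : in.cust_id

Because svIsDup is evaluated before svPrevId is updated, it compares the current row's key with the previous row's key.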

16. NEED HELP
1) I cannot get into DataStage Administrator even after many attempts. I give the host name of my computer, and when I do not give the user name and password I get this error:
Failed to connect to host: ishaq, project: UV
(The host name specified is not valid, or the host is not responding (81011))

2) When I give the host name, user name and password, I get the same error:
Failed to connect to host: ishaq, project: UV
(The host name specified is not valid, or the host is not responding (81011))

To solve this problem I installed DataStage 7.5.1 at least seven times. When I still could not get into DataStage Administrator, I formatted the system, reinstalled everything from the beginning, and installed DataStage again, but I still cannot get into DataStage Administrator.
I am sure I have Windows XP SP2 installed on a P4 system.
Ans: To connect to DataStage Administrator, your system must have a valid username and password, and the DataStage services must be running. Check Control Panel -> DataStage and ensure all three services, including the telnet service, are in the running state. If you do not have a username and password defined on the system, you will see this error.
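A quick sanity check is to test basic connectivity to the engine from a command prompt. A sketch, assuming the host name from the error message and the default DataStage RPC port 31538:

    ping ishaq
    telnet ishaq 31538

If the telnet connection is refused, the DataStage server services are not running or a firewall is blocking the port.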
17. What do we use hashed files for in a DataStage job? Give an example.
Ans: Hashed files are mainly used for lookup purposes. For example, you look up whether an entry is already there or not: if it is, do one thing; otherwise do something else.
or
I agree with naveen patil: a hashed file is used for lookups. Besides that, a hashed file can also be used to get distinct data from a source by using the relevant column as the key. For example, if I want the list of countries from customer records, I can create a hashed file with Country as its key; because a hashed file keeps only one record per key value, I get the distinct values of Country.

18. 1) What is the use of parameters in a DataStage job?
2) What is the difference between DataStage 6 and DataStage 7.0?
3) What is unit testing, system testing and integration testing?
4) How do you schedule jobs in DataStage?
5) What is the difference between DataStage 7.1 and DataStage 7.5?
6) How do you do error handling in DataStage?
7) How do you remove duplicates in a flat file?
Ans: Parameters help you enter values at run time, and make jobs more secure by avoiding hard-coded values.
Unit testing mainly deals with a single job.
System testing deals with system compatibility.
Integration testing deals with the entire set of jobs that meets the business requirements.
Scheduling a job sets a timer so that the job runs at a particular time.
Error handling is done through the Exception Handler activity in job sequences.
We can remove duplicates using the Remove Duplicates stage.
19. What is the difference between importing table definitions by Orchestrate Schema Definition and by Plug-in Meta Data Definitions?
Ans: The Import Orchestrate Schema wizard allows you to import metadata into DataStage from an Orchestrate schema file, from a file set, or from a data set, while a plug-in import such as ODBC brings in the metadata in the native format of the database schema.
or
When designing jobs that use plug-ins (like Oracle), the parallel engine will use its own interpretation of the Oracle metadata (e.g. exact data types), based on interrogation of Oracle, overriding what you may have specified in the Columns tab. For this reason it is best to import your Oracle table definitions using Orchestrate Schema Definitions.
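For reference, an Orchestrate schema file is just a plain-text record definition. A minimal sketch of what one might contain (the field names are hypothetical):

    record (
      cust_id: int32;
      name: string[max=50];
      balance: decimal[10,2];
    )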
20. What is the difference between a routine, a transform and a function? Give some examples of each.
Ans: Routines and functions sound much the same; functions are a subset of routines.
Routine / Function: takes input data, applies some business logic, and returns output data.
Examples: StringDecode, KeyMgtGetNextValue
Transform: takes input data, transforms it to another form, and returns it.
Examples: TIMESTAMP, NullToEmpty
21. How do you use an Excel file as input in DataStage?
Ans: You can use an Excel file as input by importing the .xls file.
Step 1 -> Go to Administrative Tools -> Data Sources (ODBC) -> System DSN. Click the Add button and configure the corresponding .xls file as a System DSN. Make sure the workbook contains the name of your Excel sheet.
Step 2 -> Import the Excel file into DataStage as an ODBC table definition.
Step 3 -> Use an ODBC stage as the input stage.
You should then be able to use the Excel file very effectively.
22. When using an Oracle database as target or source, why do we use the OCI stage rather than the ODBC stage?
Ans: Using OCI we can transfer the data faster than through ODBC.
OCI supports OraBulk and plug-ins; using these, bulk loads can be done quickly.
or
When using an Oracle database as a source we use the Oracle Enterprise stage, because it retrieves bulk data fast, whereas the ODBC stage goes through drivers and is slow when retrieving bulk data.
