DataStage Information: Parallel Xtender

21. What is the default size of a hash file?
Ans: The maximum is 2 GB for a static hash file.
I am not sure about dynamic hash files.
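22. What types of files are created when we create a hash file?
Ans: Three files are created: .dat, .type and .over (data, type and overflow). For a dynamic hash file these typically appear as DATA.30, .Type30 and OVER.30 inside the hash file's directory.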
23. How do we remove locked jobs in DataStage?
Ans: Go to Director Tools, then click the Clear the Job Resources option and note the PID number shown there. Then:
1. Go to Administrator.
2. Open Properties.
3. Click Command.
4. Type ds.tools.
5. You will get a list of options.
6. Select the 4th option.
7. Select the 6th option.
8. Enter the PID number.
or
Alternatively, just restart the DataStage services from the "Administrative Tools" section of the Control Panel.
24. What do we use to export jobs from the client to the server if the server runs on Unix and the client runs on Windows?
Ans: You have to export the jobs as .dsx files.
25. Is there any way to find the usage analysis of a particular column in a table using DataStage?
Ans: It is very difficult to get a good usage analysis in Oracle and see exactly who is touching what, and how, at the table and column level.
However, I know of a good tool called DB*Classify that provides this information and much more about database usage.
It belongs to a company called Zetapoint.
26. How do I call a stored procedure in a SQL Server database from a DataStage job? I want to pass job parameters as one of the SP parameters and use default values for the remaining SP parameters. I also want to perform a transformation on the result set of the SP.
Ans: In the ODBC properties:

Click on OUTPUT.
Under General, select Stored Procedure.
Browse to the stored procedure; if it has not been imported yet, you need to import it first.

Try this; if you still have a problem, let me know.

27. How can I identify duplicate rows in a sequential or comma-delimited file?
The case is this: the source has 4 values such as agent id, agent name, etc. Our requirement is that the ID must not be repeated. How can I identify the duplicate rows, set a flag and send the rejects to a specified reject file? The source system's data is given to us directly, which is why we are getting these duplicates. If a primary key had already been set up, this would have been very easy. Thanks in advance.
Ans: Sort the sequential file on the key AGENT_ID and set the option "Create Key Change Column" to TRUE in the Sort stage. The first record of each key value gets 1 in the KeyChange field, and every repeated (duplicate) record gets 0 (zero). Now reject the records that have the value 0.
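The key-change logic in the answer above can be sketched outside DataStage. This is only an illustration in Python, not how the Sort stage is implemented; the function names and the sample agent rows are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def flag_key_changes(rows, key):
    """Sort rows by key and add a KeyChange flag.

    The first row of each key group gets 1, every later row gets 0,
    mirroring the "Create Key Change Column" behaviour described above.
    """
    ordered = sorted(rows, key=itemgetter(key))  # stable sort keeps input order within a key
    flagged = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        for position, row in enumerate(group):
            flagged.append({**row, "KeyChange": 1 if position == 0 else 0})
    return flagged


def split_rejects(rows, key):
    """Return (accepted, rejected): rejected holds the rows flagged 0."""
    flagged = flag_key_changes(rows, key)
    accepted = [r for r in flagged if r["KeyChange"] == 1]
    rejected = [r for r in flagged if r["KeyChange"] == 0]
    return accepted, rejected


# Hypothetical sample data: AGENT_ID "A2" appears twice.
agents = [
    {"AGENT_ID": "A2", "AGENT_NAME": "Ravi"},
    {"AGENT_ID": "A1", "AGENT_NAME": "Meena"},
    {"AGENT_ID": "A2", "AGENT_NAME": "Ravi K"},
]
accepted, rejected = split_rejects(agents, "AGENT_ID")
```

In a job, the rejected rows would go down the link that writes the reject file, and the accepted rows would continue to the target.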
28. We can use a Unix shell program to pass parameters to a job, but I do not know exactly how to do that. Please let me know the detailed steps to pass parameters through a shell script.
Ans: Create a shell script such as stg.sh and declare the parameters needed for the source, schema and target in the run_dsjob program. Remember that the parameters you use in the script should also be declared at the source and target, and pass all the parameters and the job name as arguments to run_dsjob.
or
Hi, use command-line options: ./shellscript. In the script, use $1, $2 ... etc. as arguments. Regards, Subbu Malepati, Blore.
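As a rough sketch of the idea above, the wrapper can simply turn its arguments into a job invocation. The example below is in Python rather than shell and assumes the standard dsjob command-line client with its -run and -param options, not the run_dsjob program mentioned in the first answer. The project name, job name and parameter names are hypothetical.

```python
import shlex


def build_dsjob_command(project, job, params):
    """Build a dsjob command line that passes each parameter as -param NAME=VALUE."""
    command = ["dsjob", "-run"]
    for name, value in params.items():
        command += ["-param", f"{name}={value}"]
    command += [project, job]  # project and job name come last
    return command


# Hypothetical project, job and parameter names, for illustration only.
cmd = build_dsjob_command(
    "MyProject",
    "LoadAgents",
    {"SRC_SCHEMA": "stg", "TGT_SCHEMA": "dw"},
)
print(shlex.join(cmd))
```

A shell script would do the same thing with $1, $2 and so on in place of the dictionary values. To actually start the job from Python, the list could be handed to subprocess.run on a machine where dsjob is installed.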
29. What is MetaStage? Explain.
Ans: MetaStage holds data about the data (metadata): it contains the description of the data and the location of the data.
30. What is the difference between static hash files and dynamic hash files?
Ans: Static hash files do not change their number of groups (modulus) except through manual resizing.
Dynamic hash files automatically change their number of groups (modulus) in response to the amount of data stored in the file.
31. How can we create environment variables in DataStage?
Ans: This mostly comes under the Administrator part. As a Designer, we can only add them directly via Designer > View > Job Properties > Parameters > Add Environment Variable > under User Defined, then add.
32. How do we create an index in DataStage?
Ans: What type of index are you looking for? If it is only based on rows, use @INROWNUM or @OUTROWNUM.
33. How can we load a source into an ODS?
Ans: What is your source? Depending on the type of source, you have to use the respective stage. For example, Oracle Enterprise can be used for an Oracle source and target, and similarly for other sources.
34. How do we eliminate duplicate rows in DataStage?
Ans: You can remove duplicate rows in more than one way.
1. DataStage has a stage called "Remove Duplicates" where you can specify the key.
2. Alternatively, you can specify the key in the stage you are already using, so that the stage itself removes the duplicate rows based on the key during processing.
35. What is the difference between a reference link and a straight link?
Ans: A straight link is one where data is passed directly to the next stage, and a reference link is one that shows it has a reference (reference key) to the main table.
For example, in Oracle the EMP table has a reference to the DEPT table.
In DataStage:
Two table stages as sources (one on a straight link and the other on a reference link) feed one Transformer stage as the process.
Or two file stages as sources (one on a straight link and the other, a hash file, on a reference link) feed one Transformer stage.
36. What is the Pivot stage? Why do you use it, and for what purpose?
Ans: First of all, thanks to srilaxmi for your skill in the Pivot stage.
The Pivot stage supports only horizontal pivoting: columns into rows.
The Pivot stage does not support vertical pivoting: rows into columns.
Example: The source table below has two columns holding the quarterly sales of a product, but the business requirement is that the target should contain a single column representing quarterly sales. We can solve this with the Pivot stage, i.e. horizontal pivoting.
Source Table
ProdID  Q1_Sales  Q2_Sales
1010    123450    234550
Target Table
ProdID  Quarter_Sales  Quarter
1010    123450         Q1
1010    234550         Q2
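
The same horizontal pivot can be sketched in a few lines of Python. This only illustrates the transformation, not the Pivot stage itself; the function name and its arguments are invented for the example, and the data is the row from the tables above.

```python
def pivot_horizontal(rows, key, value_columns, value_name, label_name):
    """Turn several value columns of each row into one output row per column.

    value_columns maps each source column name to the label written
    into the new label column (for example "Q1_Sales" -> "Q1").
    """
    pivoted = []
    for row in rows:
        for column, label in value_columns.items():
            pivoted.append(
                {key: row[key], value_name: row[column], label_name: label}
            )
    return pivoted


source = [{"ProdID": 1010, "Q1_Sales": 123450, "Q2_Sales": 234550}]
target = pivot_horizontal(
    source,
    key="ProdID",
    value_columns={"Q1_Sales": "Q1", "Q2_Sales": "Q2"},
    value_name="Quarter_Sales",
    label_name="Quarter",
)
for row in target:
    print(row["ProdID"], row["Quarter_Sales"], row["Quarter"])
```

Each source row produces one target row per pivoted column, so one product row with two quarter columns becomes two rows.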

37. What are QualityStage and ProfileStage?
Ans: QualityStage is used for data cleansing; ProfileStage is used for data profiling.
38. How do we find the process ID? Explain with steps.
Ans: From the DS Director, follow the path:
Job > Cleanup Resources.
There you can see the PID. It also displays all the currently running processes.
or
Depending on your environment, you may have lots of process IDs. From one of the DataStage docs, you can try this on any given node:
$ ps -ef | grep dsuser
where dsuser is the account for DataStage. If the ps command above doesn't make sense, you'll need some background theory about how processes work in Unix (or the MKS environment when running on Windows).
Also from the DataStage docs (I haven't tried this one yet, but it looks interesting): APT_PM_SHOW_PIDS - if this variable is set, players will output an informational message upon startup, displaying their process ID. Good luck.
39. How do we distinguish the surrogate key in different dimension tables? How can we generate it for different dimension tables?
Ans: Use a database sequence to make it easier to generate the surrogate key.
40. What is Runtime Column Propagation and how do we use it?
Ans: If your job has extra columns that are not defined in the metadata and runtime column propagation is enabled, those extra columns are propagated to the rest of the job.
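
A toy model may make the answer above more concrete. It is a Python sketch of the idea only, not of DataStage internals; the function name, the rcp_enabled flag and the sample columns are all hypothetical.

```python
def run_stage(row, defined_columns, rcp_enabled):
    """Pass a row through a stage that only knows about defined_columns.

    With rcp_enabled, columns missing from the stage's metadata are
    carried through unchanged; without it, they are dropped.
    """
    output = {name: row[name] for name in defined_columns}
    if rcp_enabled:
        for name, value in row.items():
            output.setdefault(name, value)  # keep extra columns as they arrived
    return output


# Hypothetical input row: REGION is not part of the stage's metadata.
incoming = {"AGENT_ID": "A1", "AGENT_NAME": "Meena", "REGION": "South"}
with_rcp = run_stage(incoming, ["AGENT_ID", "AGENT_NAME"], rcp_enabled=True)
without_rcp = run_stage(incoming, ["AGENT_ID", "AGENT_NAME"], rcp_enabled=False)
```

Here with_rcp still carries REGION to the next stage, while without_rcp contains only the two defined columns.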

Sunday, June 29, 2008