Tuesday, December 21, 2010

Data Transformations

Posted by Venkat ♥ Duvvuri 7:49 AM, under | No comments

Data transformation and movement is the process by which source data is selected, converted, and mapped to the format required by targeted systems. The process manipulates data to bring it into compliance with business, domain, and integrity rules and with other data in the target environment. Transformation can take some of the following forms:

Aggregation: Consolidating or summarizing data values into a single value. Collecting daily sales data to be aggregated to the weekly level is a common example of aggregation.

Basic conversion: Ensuring...
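The aggregation example mentioned above (daily sales rolled up to the weekly level) can be sketched in a few lines of Python. This is only an illustration of the idea, not how DataStage implements it; the function name `aggregate_weekly`, the sample figures, and the choice of ISO week numbers as the grouping key are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def aggregate_weekly(daily_sales):
    """Sum (date, amount) pairs into totals keyed by (ISO year, ISO week).

    Hypothetical helper for illustration only.
    """
    totals = defaultdict(float)
    for day, amount in daily_sales:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        totals[(iso[0], iso[1])] += amount
    return dict(totals)


# Made-up sample data: three days in one ISO week, one day in the next.
daily_sales = [
    (date(2010, 12, 13), 120.0),  # Monday
    (date(2010, 12, 14), 80.0),   # Tuesday
    (date(2010, 12, 19), 50.0),   # Sunday, still the same ISO week
    (date(2010, 12, 20), 200.0),  # Monday of the following week
]

weekly = aggregate_weekly(daily_sales)
for (year, week), total in sorted(weekly.items()):
    print(year, week, total)
```

Grouping by ISO year and week keeps weeks that straddle a year boundary together; a business calendar with different week definitions would need a different key.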

Tuesday, December 7, 2010

DataStage Stages and Jobs

Posted by Venkat ♥ Duvvuri 9:48 PM, under | 1 comment

An IBM InfoSphere DataStage job consists of individual stages linked together which describe the flow of data from a data source to a data target. A stage usually has at least one data input and/or one data output. However, some stages can accept more than one data input, and output to more than one stage. Each stage has a set of predefined and editable properties that tell it how to perform or process data. Properties might include the file name...

Tuesday, November 23, 2010

DataStage Functions

Posted by Venkat ♥ Duvvuri 8:45 AM, under | No comments

In its simplest form, IBM InfoSphere DataStage performs data transformation and movement from source systems to target systems in batch and in real time. The data sources might include indexed files, sequential files, relational databases, archives, external data sources, enterprise applications, and message queues. DataStage manages data that arrives in real time as well as data that is received on a periodic or scheduled basis. It enables companies to solve...

Thursday, November 11, 2010

DataStage FAQs and Best Practices

Posted by Venkat ♥ Duvvuri 8:39 PM, under | 5 comments

1. General DataStage issues

1.1. What are the ways to execute DataStage jobs?
A job can be run using a few different methods:
* from DataStage Director (menu Job -> Run now...)
* from the command line using the dsjob command
* from a DataStage routine (DSRunJob function)
* by a job sequencer

1.2. How to invoke a DataStage shell command?
DataStage shell commands can be invoked from:
* DataStage Administrator (Projects tab -> Command)
* a Telnet client connected to the DataStage server

1.3. How to stop a job when its status is running? To...

Wednesday, October 27, 2010

Interview Questions

Posted by Venkat ♥ Duvvuri 8:38 AM, under | 8 comments

1. How to remove duplicate rows?

There are several ways to do this in server jobs:
- Using a hash file.
- Using stage variables.

The stage-variable approach in detail: define two stage variables, for example stgV1 and stgV2.

Derivation               Stage Variable
-----------------------------------------
stgV1                    stgV2
IputKeyColm(Id)          stgV1

Then write constraints as follows:
if stgV1<>stgV2 -------------> this stream is to produce unique records
if stgV1=stgV2 -------------> this stream is to produce duplicate...
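The stage-variable logic above can be mimicked in Python to show why it works. This is a minimal sketch under assumptions, not DataStage code: the function name `split_unique_and_duplicates`, the sample rows, and the `Id` key are made up for illustration. It also assumes the input is already sorted by the key, because comparing a row only with the previous row detects adjacent duplicates.

```python
_NO_PREVIOUS = object()  # sentinel: no row has been seen yet


def split_unique_and_duplicates(rows, key="Id"):
    """Route each row to 'unique' or 'duplicates' by comparing its key
    with the previous row's key, mirroring stgV1/stgV2.

    Hypothetical helper; rows must be sorted by `key` beforehand.
    """
    unique, duplicates = [], []
    stg_v1 = _NO_PREVIOUS
    for row in rows:
        stg_v2 = stg_v1    # stgV2 is derived first: it holds the previous key
        stg_v1 = row[key]  # stgV1 is then set to the current row's key
        if stg_v1 != stg_v2:
            unique.append(row)      # constraint stgV1 <> stgV2
        else:
            duplicates.append(row)  # constraint stgV1 = stgV2
    return unique, duplicates


# Made-up sample input, already sorted by Id.
rows = [
    {"Id": 1, "name": "a"},
    {"Id": 1, "name": "a-again"},
    {"Id": 2, "name": "b"},
    {"Id": 3, "name": "c"},
    {"Id": 3, "name": "c-again"},
]

unique, duplicates = split_unique_and_duplicates(rows)
print([r["name"] for r in unique])
print([r["name"] for r in duplicates])
```

The order of the two assignments matters: stgV2 has to capture the old value of stgV1 before stgV1 is overwritten, which is why stgV2 is listed first in the derivation table.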
