Pages

Wednesday, May 26, 2010

Dataware Housing Informatica Faq's

Informatica >> Interview Questions
Informatica Repository Manager
Q. What type of repositories can be created using Informatica Repository Manager?
Q. What is a code page?
Q. Which all databases PowerCenter Server on Windows can connect to?
Q. Which all databases PowerCenter Server on UNIX can connect to?
Informatica Designer
Q. How to execute PL/SQL script from Informatica mapping?
Q. How can you define a transformation? What are different types of transformations available in Informatica?
Q. What is a source qualifier? What is meant by Query Override?
Q. What is aggregator transformation?
Q. What is Incremental Aggregation?
Q. How Union Transformation is used?
Q. Can two flat files be joined with Joiner Transformation?
Q. What is a look up transformation?
Q. Can a lookup be done on Flat Files?
Q. What are Connected and Unconnected Lookups?
Q. What is a mapplet?
Q. What does reusable transformation mean?
Q. What is update strategy and what are the options for update strategy?

Informatica Repository Manager
Q. What type of repositories can be created using Informatica Repository Manager?
A. Informatica PowerCenter includeds following type of repositories :
• Standalone Repository : A repository that functions individually and this is unrelated to any other repositories.
• Global Repository : This is a centralized repository in a domain. This repository can contain shared objects across the repositories in a domain. The objects are shared through global shortcuts.
• Local Repository : Local repository is within a domain and it’s not a global repository. Local repository can connect to a global repository using global shortcuts and can use objects in it’s shared folders.
• Versioned Repository : This can either be local or global repository but it allows version control for the repository. A versioned repository can store multiple copies, or versions of an object. This features allows to efficiently develop, test and deploy metadata in the production environment.
Q. What is a code page?
A. A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Japanese text then the code page should be selected to support Japanese text.
When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.
Q. Which all databases PowerCenter Server on Windows can connect to?
A. PowerCenter Server on Windows can connect to following databases:
• IBM DB2
• Informix
• Microsoft Access
• Microsoft Excel
• Microsoft SQL Server
• Oracle
• Sybase
• Teradata
Q. Which all databases PowerCenter Server on UNIX can connect to?
A. PowerCenter Server on UNIX can connect to following databases:
• IBM DB2
• Informix
• Oracle
• Sybase
• Teradata
Infomratica Mapping Designer
Q. How to execute PL/SQL script from Informatica mapping?
A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.
Q. How can you define a transformation? What are different types of transformations available in Informatica?
A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:
• Aggregator
• Application Source Qualifier
• Custom
• Expression
• External Procedure
• Filter
• Input
• Joiner
• Lookup
• Normalizer
• Output
• Rank
• Router
• Sequence Generator
• Sorter
• Source Qualifier
• Stored Procedure
• Transaction Control
• Union
• Update Strategy
• XML Generator
• XML Parser
• XML Source Qualifier
Q. What is a source qualifier? What is meant by Query Override?
A. Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or a flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.
PowerCenter Server generates a query for each Source Qualifier Transformation whenever it runs the session. The default query is SELET statement containing all the source columns. Source Qualifier has capability to override this default query by changing the default settings of the transformation properties. The list of selected ports or the order they appear in the default query should not be changed in overridden query.
Q. What is aggregator transformation?
A. The Aggregator transformation allows performing aggregate calculations, such as averages and sums. Unlike Expression Transformation, the Aggregator transformation can only be used to perform calculations on groups. The Expression transformation permits calculations on a row-by-row basis only.
Aggregator Transformation contains group by ports that indicate how to group the data. While grouping the data, the aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties.
Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.
Q. What is Incremental Aggregation?
A. Whenever a session is created for a mapping Aggregate Transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.
Q. How Union Transformation is used?
A. The union transformation is a multiple input group transformation that can be used to merge data from various sources (or pipelines). This transformation works just like UNION ALL statement in SQL, that is used to combine result set of two SELECT statements.
Q. Can two flat files be joined with Joiner Transformation?
A. Yes, joiner transformation can be used to join data from two flat file sources.
Q. What is a look up transformation?
A. This transformation is used to lookup data in a flat file or a relational table, view or synonym. It compares lookup transformation ports (input ports) to the source column values based on the lookup condition. Later returned values can be passed to other transformations.
Q. Can a lookup be done on Flat Files?
A. Yes.
Q. What is the difference between a connected look up and unconnected look up?
A. Connected lookup takes input values directly from other transformations in the pipleline.
Unconnected lookup doesn’t take inputs directly from any other transformation, but it can be used in any transformation (like expression) and can be invoked as a function using :LKP expression. So, an unconnected lookup can be called multiple times in a mapping.
Q. What is a mapplet?
A. A mapplet is a reusable object that is created using mapplet designer. The mapplet contains set of transformations and it allows us to reuse that transformation logic in multiple mappings.
Q. What does reusable transformation mean?
A. Reusable transformations can be used multiple times in a mapping. The reusable transformation is stored as a metadata separate from any other mapping that uses the transformation. Whenever any changes to a reusable transformation are made, all the mappings where the transformation is used will be invalidated.
Q. What is update strategy and what are the options for update strategy?
A. Informatica processes the source data row-by-row. By default every row is marked to be inserted in the target table. If the row has to be updated/inserted based on some logic Update Strategy transformation is used. The condition can be specified in Update Strategy to mark the processed row for update or insert.
Following options are available for update strategy :
• DD_INSERT : If this is used the Update Strategy flags the row for insertion. Equivalent numeric value of DD_INSERT is 0.
• DD_UPDATE : If this is used the Update Strategy flags the row for update. Equivalent numeric value of DD_UPDATE is 1.
• DD_DELETE : If this is used the Update Strategy flags the row for deletion. Equivalent numeric value of DD_DELETE is 2.
• DD_REJECT : If this is used the Update Strategy flags the row for rejection. Equivalent numeric value of DD_REJECT is 3.

Here Are The Answers

informatica interview questions


DW Questions
-------------
1.What are Semi-additive and factless facts?And in which scenario will you use such kinds of fact tables?

There are three kind of facts,
Additive:-- That is the facts which can be aggregated or segregated on the change of the dimension level and always remains meaningful: example: sales value

Now a days almost all data warehouses contain Additrivev facts

Semiadditive fact : That is the facts which can be aggregated or segregated on the change of the dimension level but becomes meaningless on aggregation or segregation
Ex : Gross profit percentage.

These can be represented while u use Customized OLAP

Nonadditive facts: That is the facts which Can Not be aggregated or segregated on the change of the dimension level and always remains meaningful:
Example: at the time of accident near C.N .Tower weather was HOT
Hot is non additive dimension.
(you can call me if you are serious to implement semiadditive and nonadditive fact)

2.what are conformed dimensions?
conformed dimensions are the dimensions which can be used across multiple Data Marts in combination with multiple facts tables accordingly
e.g : Date Dimension

3.Differences between star and snowflake schemas?

in star schemas all dimensions point to a centralized fact table
in snowflake dimensions pointing to fact tables itself are pointed by sub dimension tables.

Question Bank on Datawarehousing:

General Questions:

1. What is a data-warehouse?
A datawarehouse is a repository(centralized as well as distributed) of Data ,able to answer any adhoc,analytical,historical or complex queries.
It contains 3 types of data


2. What are Data Marts?

Data mart is a subset of a datawarehouse.

4. What is ER Diagram?
It represents entities and relationshiup between then in a domain

5. What is a Star Schema?
It has a centralized fact table ,reprensenting the facts on the intersection of the dimensions directly pointing to the fact table.

6. What is Dimensional Modelling?

Breakup of standard Normilised data between ever changing fact(Numeric informal data) and slowly changing or fixed non numeric data for fast and effective OLAP

7. What Snow Flake Schema?
in snowflake schema dimensions pointing to fact tables itself are pointed by sub dimension tables.


8. What are the Different methods of loading Dimension tables?
Depends on Dimension type
For static dimensions one time or periodical loading can be done
For slolwy changing dimensions
1.Overwrite : but u will lost the historical aspect of dw


9. What are Aggregate tables?
These are the fact tables which do note represent the lowest granujlarity level
People do it when minut info is not needed and fast processing is required

10. What is the Difference between OLTP and OLAP?


Online Transaction Processing : Lowest Granularity,continuous update,transaction data,slow reporting
Made for fast insert,up;date,delete



OnLine Analytical Processing: Varying Granularity,Batch Update,Historical data ,Fast reporting..
Made for fast reporting



11. What is ETL?

Extraction,Transformation,Loading

12. What are the vaious ETL tools in the Market?
DTS,Informatica,Decision Stream…

13. What are the various Reporting tools in the Market?
Hyperion Essbase(earlier Brio),Analysis service,Cognos,Beacon….

14. What is Fact table?
Represents the numeric data of a data mart ,representing data on the intersection of related dimensions..

15. What is a dimension table?
Non numeric data of the domain on the hierarchy of which aggregation or segregation of fact is needed

16. What is a lookup table?
Represents metadata


17. What is a general purpose scheduling tool? Name some of them?
use in scheduling yaar,..say window scheduler/cognos scheu or upfront schedular
18. What are modeling tools available in the Market? Name some of them?
Analysis services,Architect(cognos)

19. What is real time data-warehousing?
it’s a concept remains unimplemented, fast update of cube or DW


20. What is data mining?
Looking for a information which was previously unknown or hidden


21. What is Normalization? First Normal Form, Second Normal Form , Third Normal Form?

22. What is ODS?
Operational Data Store

23. What type of Indexing mechanism do we need to use for a typical datawarehouse?

24. Which columns go to the fact table and which columns go the dimension table?
(My user needs to see broken by

All elements before broken = Fact Measures
All elements after broken = Dimension Elements


Changing numeric fields..fact table
Texual-dimension table

25. What is a level of Granularity of a fact table? What does this signify?

Grass root level Transaction data is

(Weekly level summarization there is no need to have Invoice Number in the fact table anymore)

26. How are the Dimension tables designed?
De-Normalized , Wide, Short , Use Surrogate Keys, Contain Additional date fields and flags.

27. What are slowly changing dimensions?

Which change over time,like employees department

28. What are non-additive facts?

Nonadditive facts: That is the facts which Can Not be aggregated or segregated on the change of the dimension level and always remains meaningful:
Example: at the time of accident near C.N .Tower weather was HOT


29. What are conformed dimensions?
conformed dimensions are the dimensions which can be used across multiple Data Marts in combination with multiple facts tables accordingly

30. What is VLDB? (Data base is too large to back up in a time frame then it's a VLDB)
Very Large Data Base
31. What is SCD1 , SCD2 , SCD3 ?

Kya hai,Data collection sattelites

ETL Questions:

1. What is a staging area? Do we need it? What is the purpose of a staging area?

It’s the area where most of the ETL is done…

2. What is a three tier data warehouse?
DW server,Olap Server ,olap clinet
3. What are the various methods of getting incremental records or delta records from the source systems?

1.Extract source data from all operational records, regardless of whether any data values have changed since the last
ETL load or not

2. Extract source data only from those operational records in which some data values have changed since the last ETL load (“net change”)


4. What are the various tools? - Name a few

informatica,decision stream,DTS

5. What is latest version of Power Center / Power Mart?

PC7

6. What is the difference between Power Center & Power Mart?

main diff is in support of source of data

7. What are the various transformation available?

Structured,semistructured,complex structured,unstructured

enough for now..

Shanker Jha
SE(DW & BI)
CGI SEEPZ Mumbai

No comments:

Receive All Free Updates Via Facebook.