Q.What is Surrgate key?
Ans.Surrogate key is a substitution for the natural primary key.
It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the dimension tables primary keys. They can use Infa sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users) but ,not only can these change, indexing on a numerical value is probably better and you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system and as far as the client is concerned you may display only the AIRPORT_NAME.
Q.What are the steps to build Data Warehouse?
Ans. The steps to build Datawarehouse are
Gathering bussiness requiremnts
Identifying Sources
Identifying Facts
Defining Dimensions
Define Attribues
Redefine Dimensions & Attributes
Organise Attribute Hierarchy & Define Relationship
Assign Unique Identifiers
Additional convetions:Cardinality/Adding ratios
Q.What is incremental loading, batch processing, cross reference table, aggregate fact table?
Ans.Incremental loading means loading the ongoing changes in the OLTP.Aggregate table contains the [measure] values ,aggregated /grouped/summed up to some level of hierarchy.
Batch Processing means executing more than one session in single run at the same time. We can execute these session in 2 ways :
linear: exececuting one after another.
parallel: executing more than one session at at time.
Q.What is Hybrid slowly changing dimension?
Ans. Hybrid SCDs are combination of both SCD 1 and SCD 2.It may happen that in a table, some columns are important and we need to track changes for them i.e capture the historical data for them whereas in some columns even if the data changes, we don't care.For such tables we implement Hybrid SCDs, where in some columns are Type 1 and some are Type 2.
Q.What is the main difference between Inmon and Kimball philosophies of data warehousing?
Ans. Both differed in the concept of building teh datawarehosue..According to Kimball ...Kimball views data warehousing as a constituency of data marts. Data marts are focused on delivering business objectives for departments in the organization. And the data warehouse is a conformed dimension of the data marts. Hence a unified view of the enterprise can be obtain from the dimension modeling on a local departmental level.Inmon beliefs in creating a data warehouse on a subject-by-subject area basis. Hence the development of the data warehouse can start with data from the online store. Other subject areas can be added to the data warehouse as their needs arise. Point-of-sale (POS) data can be added later if management decides it is necessary.
Inmon---First Datawarehouse--Later----Datamarts
Kimball--First DataMarts--Combined way ---Datawarehouse.
Monday, August 17, 2009
Datawarehousing Interview Questions 2
Subscribe to:
Post Comments (Atom)
First of all. Thanks very much for your useful post.
ReplyDeleteI just came across your blog and wanted to drop you a note telling you how impressed I was with the information you have posted here.
Please let me introduce you some info related to this post and I hope that it is useful for community.
Source: administrative assistant interview questions
Thanks again
Ngo