Data warehousing has grown tremendously over the years, and organizations are looking for qualified individuals to fill these positions and keep the business competitive.
This article covers Informatica interview questions on cleaning and modifying data from both structured and unstructured systems. The questions relate to Informatica PowerCenter connected to a SQL Server database, data quality, cloud, ETL tools, data integration and product development processes.
Frequently asked Informatica questions you should know
- What is Informatica? State its application in business.
Informatica is a company that provides software solutions for data integration, data virtualization and data quality, among other services. Its tools provide real-time data integration, business-to-business data integration and web services integration.
Its products include PowerCenter, PowerConnect, PowerMart, PowerExchange, PowerAnalyzer and Data Quality.
Applications of Informatica
- When an organization wants to migrate from an existing system, such as a mainframe, to a new database system.
- When an organization is setting up a data warehouse. In such a case, an ETL tool is used to move data from the current production system to the warehouse.
- When data needs to be cleansed before it is used.
- When data from multiple sources, e.g. multiple databases or file systems, needs to be integrated.
- What is the difference between Informatica and DataStage?
Informatica uses a step-by-step data integration process, while DataStage uses a project-based integration process.
In Informatica, client tools such as the Designer, Workflow Manager, Workflow Monitor and Repository Manager are used for development and monitoring. In DataStage, the job sequence designer and the Director are used for the same purposes.
- What is Data Warehousing?
A data warehouse brings together different types of data from across an organization. Data warehousing is the process of consolidating that data at a single central point of access, so that it can be queried from one source and shared across different platforms.
- What is the difference between a database and a data mart?
A database is a collection of related data records, and it is usually smaller than a data warehouse. A data mart is a subset of a data warehouse that supports a particular business function. The data warehouse itself holds different kinds of data from different domains within an organization.
- Define the term OLAP
Online Analytical Processing (OLAP) is a category of technology, often built as an extension of a relational database system, that analyzes data along multiple dimensions.
- Highlight the different types of OLAP
OLAP is commonly divided into three types: ROLAP (relational), MOLAP (multidimensional) and HOLAP (hybrid, which combines the two).
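To make the idea of multidimensional analysis concrete, the following is a minimal Python sketch of a roll-up, one of the basic OLAP operations. The fact rows, the dimension names and the `roll_up` helper are all invented for illustration; a real OLAP engine would precompute or index these aggregates rather than scan a list.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales_amount).
FACTS = [
    ("East", "Laptop", "Q1", 100),
    ("East", "Laptop", "Q2", 150),
    ("East", "Phone", "Q1", 80),
    ("West", "Laptop", "Q1", 120),
    ("West", "Phone", "Q2", 60),
]

# Order of the dimension columns in each fact row (assumed layout).
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Sum the sales measure, grouping only by the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


# The same facts viewed at two levels of detail.
by_region = roll_up(FACTS, ["region"])
by_region_product = roll_up(FACTS, ["region", "product"])
print(by_region)
print(by_region_product)
```

Dropping a dimension from `keep` moves the view to a coarser level of detail, which is the essence of rolling up a cube.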
- What is Workflow in the context of data communication?
A workflow is a set of instructions that tells the server how to carry out the tasks it contains.
- State the tools of the Workflow Manager
- Task Developer
- Workflow Designer and
- Worklet Designer
- Explain the different types of schedules.
Scheduling is the process of automating tasks so that they run at a specific date and time. There are two types of schedulers:
- Reusable scheduler: This type of scheduler can be assigned to multiple workflows at the same time.
- Non-reusable scheduler: This type of scheduler is created for a single workflow, although it can later be converted to a reusable one.
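- What is the PowerCenter repository?
This is the central repository, a relational database that stores the metadata (such as source and target definitions, mappings, sessions and workflows) that PowerCenter uses to connect to multiple sources and move data between them.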
- Differentiate between a powerhouse server and the repository server.
The powerhouse server governs the execution of processes among the components of the server's database repository, whereas the repository server ensures the integrity and consistency of the repository.
- What is your understanding of the term domain?
A domain is the primary organizational unit in which related nodes and services are grouped and administered from a single point.
- In the context of Informatica Workflow manager, how many repositories can you create?
You can create as many repositories as you need. There is no fixed limit, although in practice the number depends on how many ports are available.
- Why do you have to partition a session?
Partitioning a session improves the server's performance and efficiency. Each partition runs as its own independent sequence within the session, so the partitions can be processed in parallel.
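The idea behind partitioning can be sketched in a few lines of Python. This is only an illustration of the concept and not how Informatica implements it: the round-robin split, the `transform_partition` work and the partition count of three are all assumptions, and threads are used here simply to show the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, partition_count):
    """Distribute rows round-robin across `partition_count` partitions."""
    partitions = [[] for _ in range(partition_count)]
    for index, row in enumerate(rows):
        partitions[index % partition_count].append(row)
    return partitions


def transform_partition(rows):
    """Hypothetical per-partition work: convert each amount to cents."""
    return [amount * 100 for amount in rows]


def run_partitioned(rows, partition_count=3):
    """Process every partition as its own independent sequence."""
    partitions = split_into_partitions(rows, partition_count)
    with ThreadPoolExecutor(max_workers=partition_count) as pool:
        # map() returns results in the same order as the partitions.
        results = list(pool.map(transform_partition, partitions))
    # Flatten the per-partition outputs into one list for loading.
    return [value for part in results for value in part]


source_rows = [1, 2, 3, 4, 5, 6, 7]
loaded = run_partitioned(source_rows)
print(sorted(loaded))
```

Because rows are spread across partitions, the output order differs from the input order, which is why the result is sorted before printing.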
- What are sessions in ETL context?
In Informatica ETL, a session is a set of instructions that tells the server how to move and transform data from the source to the target.
- State two uses of aggregator cache.
The aggregator cache provides temporary memory for holding values while an aggregate calculation is in progress.
It stores these transitional values locally in buffer memory until the calculation is complete.
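As a rough illustration of why an aggregation needs a cache, the Python sketch below streams rows one at a time and keeps only a running sum and row count per group. The function name `aggregate_with_cache` and the sample rows are hypothetical, and an in-memory dictionary stands in for the cache.

```python
def aggregate_with_cache(rows):
    """Stream (group, amount) rows and return {group: (total, average)}."""
    cache = {}  # group key -> [running_sum, row_count]
    for group, amount in rows:
        entry = cache.setdefault(group, [0, 0])
        entry[0] += amount  # intermediate sum held in the cache
        entry[1] += 1       # intermediate count held in the cache
    # Final values are produced only after every row has been seen.
    return {group: (total, total / count) for group, (total, count) in cache.items()}


rows = [("A", 10), ("B", 5), ("A", 30), ("B", 15), ("A", 20)]
result = aggregate_with_cache(rows)
print(result)
```

The cache grows with the number of distinct groups rather than the number of rows, which is why sorted input or fewer groups reduces the memory an aggregation needs.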
- Briefly describe role-playing dimension.
A role-playing dimension is a single dimension that plays several different roles within the same database domain. For example, one date dimension can serve as both the order date and the ship date of a fact table.
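A common way to picture this is a single date dimension joined to a fact table more than once under different aliases. The sketch below uses Python's built-in `sqlite3` module; the table names `dim_date` and `fact_orders`, their columns and the sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
    CREATE TABLE fact_orders (
        order_id INTEGER PRIMARY KEY,
        order_date_key INTEGER REFERENCES dim_date(date_key),
        ship_date_key INTEGER REFERENCES dim_date(date_key)
    );
    INSERT INTO dim_date VALUES (1, '2024-01-05'), (2, '2024-01-09');
    INSERT INTO fact_orders VALUES (100, 1, 2);
    """
)

# The one physical dimension plays two roles: order date (od) and ship date (sd).
rows = conn.execute(
    """
    SELECT f.order_id,
           od.calendar_date AS order_date,
           sd.calendar_date AS ship_date
    FROM fact_orders AS f
    JOIN dim_date AS od ON od.date_key = f.order_date_key
    JOIN dim_date AS sd ON sd.date_key = f.ship_date_key
    """
).fetchall()
print(rows)
conn.close()
```

Only one date table is stored and maintained; each alias in the query gives it a different role.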
- How can you access the repository reports?
To access the repository reports, you can use the Metadata Reporter. Because it is a web-based app, you don't need SQL or transformations to access the repository.
- What are the advantages of Informatica ETL tool as compared to Teradata?
- Informatica is a data integration tool, whereas Teradata is an MPP database with fast data movement and scripting capabilities.
- ETL logic is organized into workflows and worklets, which are grouped into folders.
- Jobs can easily be monitored using the Informatica Workflow Monitor.
- Informatica can publish processes as web services.
- ETL logic can be coded so that processing is balanced between the ETL server and the database box.
- Describe the following terms as used in transformation.
- Aggregator transformation: This is used to carry out aggregate calculations like sum and average among others.
- Expression transformation: It is used to calculate values in a single row. It can also test conditional statements before the results are output or moved to the target table.
- Filter transformation: It filters rows in a mapping. All of its ports are input/output ports, and only the rows that match the filter condition pass through.
- Joiner transformation: It joins two related sources that come from different locations, whereas a Source Qualifier transformation can join data that comes from a common source.
- Lookup transformation: When you want to look up data in a relational table from within a mapping, a Lookup transformation is used. Several Lookup transformations can be used in a single mapping.
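The behaviour of these transformations can be approximated in plain Python to show how they differ. This is a conceptual sketch only, not Informatica code: the sample rows, the column names and every function below are hypothetical stand-ins for the corresponding transformation.

```python
from collections import defaultdict

# Hypothetical source rows.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 50},
    {"order_id": 2, "customer_id": 11, "amount": 5},
    {"order_id": 3, "customer_id": 10, "amount": 70},
]
customers = [{"customer_id": 10, "name": "Acme"}, {"customer_id": 11, "name": "Bolt"}]
region_lookup = {10: "East", 11: "West"}


def filter_rows(rows, condition):
    """Filter: pass through only the rows that meet the condition."""
    return [row for row in rows if condition(row)]


def expression(rows):
    """Expression: derive a row-level value from a conditional test."""
    return [{**row, "size": "large" if row["amount"] >= 50 else "small"} for row in rows]


def joiner(left_rows, right_rows, key):
    """Joiner: inner join of two separate sources on a shared key."""
    index = {row[key]: row for row in right_rows}
    return [{**left, **index[left[key]]} for left in left_rows if left[key] in index]


def lookup(rows, table, key, target, default=None):
    """Lookup: fetch a related value from a reference table for each row."""
    return [{**row, target: table.get(row[key], default)} for row in rows]


def aggregator(rows, group_key, value_key):
    """Aggregator: sum a value across all rows in each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


filtered = filter_rows(orders, lambda row: row["amount"] >= 10)
labelled = expression(filtered)
joined = joiner(labelled, customers, "customer_id")
enriched = lookup(joined, region_lookup, "customer_id", "region")
totals = aggregator(orders, "customer_id", "amount")
print(enriched)
print(totals)
```

Note that the filter and the aggregator change the number of rows that come out, while the expression and lookup steps return one row for every row they receive.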