SQL (Structured Query Language) is the backbone of database management, and mastering it is essential for anyone pursuing a career in data science, software engineering, or IT. Whether you’re a seasoned professional or a recent graduate, being prepared for SQL-related interview questions can significantly boost your chances of landing your dream job. In this blog, we’ll explore the top 20 SQL interview questions you must know in 2024. These questions cover a wide range of topics, from basic concepts to advanced SQL features, and provide detailed answers to help you succeed in your next interview.
1. What is SQL?
Answer:
SQL, or Structured Query Language, is a standardized programming language used to manage and manipulate relational databases. It allows users to create, read, update, and delete data within a database, as well as manage the database schema and control access to the data.
Key Points:
- SQL is used by database management systems like MySQL, PostgreSQL, Oracle, SQL Server, and SQLite.
- It is a declarative language, meaning that users specify what they want to achieve without necessarily describing how to do it.
- SQL is essential for querying, inserting, updating, and deleting database records.
Example:
SELECT * FROM Employees WHERE Department = 'Sales';
This SQL statement retrieves all records from the “Employees” table where the department is “Sales.”
2. What are the different types of SQL statements?
Answer:
SQL statements can be categorized into five main types:
- Data Query Language (DQL):
- Used to query data from the database.
- Example:
SELECT
- Data Definition Language (DDL):
- Used to define and manage the structure of database objects.
- Examples:
CREATE
,ALTER
,DROP
- Data Manipulation Language (DML):
- Used to manipulate data in the database.
- Examples:
INSERT
,UPDATE
,DELETE
- Data Control Language (DCL):
- Used to control access to data in the database.
- Examples:
GRANT
,REVOKE
- Transaction Control Language (TCL):
- Used to manage transactions in the database.
- Examples:
COMMIT
,ROLLBACK
,SAVEPOINT
Explanation:
- DQL is primarily for retrieving data.
- DDL defines and modifies the database structure.
- DML is for performing operations on the data itself.
- DCL controls access to the data.
- TCL manages transaction boundaries and ensures data integrity.
3. What is a primary key in SQL?
Answer:
A primary key is a column (or a set of columns) in a table that uniquely identifies each row in that table. The primary key ensures that no two rows have the same value in the primary key column(s), and it cannot contain NULL
values.
Key Points:
- Each table can have only one primary key.
- The primary key ensures the uniqueness of data and is often used as a reference by foreign keys in other tables.
Example:
CREATE TABLE Employees ( EmployeeID INT PRIMARY KEY, FirstName VARCHAR(50), LastName VARCHAR(50) );
In this example, EmployeeID
is the primary key for the Employees
table.
4. What is a foreign key in SQL?
Answer:
A foreign key is a column or a set of columns in a table that establishes a link between the data in two tables. The foreign key in one table points to the primary key in another table, enforcing referential integrity between the two tables.
Key Points:
- A foreign key ensures that the value in a column must match a value in the referenced primary key column.
- It helps maintain consistency in the data by ensuring that relationships between tables are valid.
Example:
CREATE TABLE Orders ( OrderID INT PRIMARY KEY, OrderDate DATE, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID) );
Here, CustomerID
in the Orders
table is a foreign key that references the CustomerID
in the Customers
table.
5. What is the difference between JOIN
and UNION
in SQL?
Answer:
Both JOIN
and UNION
are used to combine data from multiple tables, but they do so in different ways.
JOIN:
- Combines columns from two or more tables based on a related column between them.
- Different types include
INNER JOIN
,LEFT JOIN
,RIGHT JOIN
, andFULL OUTER JOIN
.
UNION:
- Combines the result sets of two or more
SELECT
queries into a single result set. - The
UNION
operator removes duplicate records, whileUNION ALL
includes duplicates.
Example of JOIN
:
SELECT Orders.OrderID, Customers.CustomerName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;
Example of UNION
:
SELECT City FROM Customers UNION SELECT City FROM Suppliers;
Explanation:
JOIN
combines columns from related tables.UNION
combines rows from different result sets.
6. What is an INDEX
in SQL, and why is it used?
Answer:
An INDEX
in SQL is a database object that improves the speed of data retrieval operations on a table. It is similar to the index in a book, which helps you find information quickly without scanning every page.
Key Points:
- Indexes are created on columns that are frequently searched, filtered, or used in
JOIN
operations. - While indexes improve read performance, they can slow down
INSERT
,UPDATE
, andDELETE
operations because the index must be updated every time the data changes.
Example:
CREATE INDEX idx_customer_name ON Customers(CustomerName);
Explanation:
- In the example, an index is created on the
CustomerName
column in theCustomers
table to speed up queries that filter or sort byCustomerName
.
7. What is normalization in SQL?
Answer:
Normalization is the process of organizing data in a database to minimize redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and linking them using relationships.
Key Points:
- Normalization involves several normal forms (1NF, 2NF, 3NF, BCNF, etc.), each with its own rules and guidelines.
- The goal of normalization is to eliminate redundant data and ensure that each piece of data is stored in only one place.
Example:
Consider a table storing customer orders that includes customer information. To normalize it:
- Split the table into two:
Customers
andOrders
. - Link them using a foreign key.
Explanation:
Normalization reduces data redundancy, ensuring consistency and making the database easier to maintain.
8. What is denormalization in SQL?
Answer:
Denormalization is the process of intentionally introducing redundancy into a database by combining tables or adding redundant data to improve read performance. It is often used in data warehousing or when performance is prioritized over strict normalization.
Key Points:
- Denormalization may involve combining normalized tables into a single table or adding redundant columns to avoid
JOIN
operations. - It can lead to faster query performance but may result in increased storage requirements and potential data anomalies.
Example:
Combining the Customers
and Orders
tables into a single table to avoid JOIN
operations during query execution.
Explanation:
Denormalization is used when query performance is critical, even at the expense of data redundancy.
9. What is the difference between HAVING
and WHERE
in SQL?
Answer:
Both HAVING
and WHERE
clauses are used to filter data in SQL queries, but they are used in different contexts.
WHERE:
- Filters rows before any groupings are made.
- Cannot be used with aggregate functions like
SUM()
,COUNT()
, etc.
HAVING:
- Filters groups created by the
GROUP BY
clause. - Can be used with aggregate functions.
Example:
SELECT Department, COUNT(EmployeeID) FROM Employees GROUP BY Department HAVING COUNT(EmployeeID) > 10;
This query returns departments with more than 10 employees.
Explanation:
- Use
WHERE
to filter rows before grouping, andHAVING
to filter groups after grouping.
10. What is a subquery in SQL?
Answer:
A subquery, also known as an inner query or nested query, is a query within another SQL query. Subqueries are used to perform operations that depend on the result of another query.
Key Points:
- Subqueries can be used in
SELECT
,INSERT
,UPDATE
, andDELETE
statements. - They can return a single value, a single row, or multiple rows.
Example:
SELECT EmployeeName FROM Employees WHERE Salary > (SELECT AVG(Salary) FROM Employees);
This query retrieves the names of employees whose salary is above the average salary.
Explanation:
Subqueries allow you to dynamically retrieve and use data from other queries within a main query.
11. What is the difference between EXISTS
and IN
in SQL?
Answer:EXISTS
and IN
are both used to filter records based on whether certain
conditions are met, but they work differently.
IN:
- Checks if a value is present in a list or the result of a subquery.
- The subquery returns a list of values, and the
IN
clause filters records based on that list.
EXISTS:
- Checks whether a subquery returns any rows.
- The subquery returns
TRUE
orFALSE
based on whether it finds any matching records.
Example of IN
:
SELECT EmployeeName FROM Employees WHERE DepartmentID IN (SELECT DepartmentID FROM Departments WHERE Location = 'New York');
Example of EXISTS
:
SELECT EmployeeName FROM Employees WHERE EXISTS (SELECT 1 FROM Departments WHERE Employees.DepartmentID = Departments.DepartmentID AND Location = 'New York');
Explanation:
IN
is typically used for lists or sets of values, whileEXISTS
checks the existence of rows based on a condition.
12. What are aggregate functions in SQL?
Answer:
Aggregate functions in SQL perform calculations on a set of values and return a single value. They are often used with the GROUP BY
clause to group rows that share a common value and perform calculations on each group.
Key Aggregate Functions:
SUM()
: Calculates the total sum of a numeric column.AVG()
: Calculates the average value of a numeric column.COUNT()
: Counts the number of rows in a result set.MAX()
: Returns the maximum value in a column.MIN()
: Returns the minimum value in a column.
Example:
SELECT Department, AVG(Salary) FROM Employees GROUP BY Department;
This query calculates the average salary for each department.
Explanation:
Aggregate functions are essential for performing calculations on large datasets, especially in combination with GROUP BY
.
13. What is a VIEW
in SQL?
Answer:
A VIEW
in SQL is a virtual table that is based on the result of a SELECT
query. Views do not store data themselves but provide a way to simplify complex queries, encapsulate query logic, and present data in a specific format.
Key Points:
- Views can be used to hide complex queries, provide data security, and simplify data access.
- They can be queried like regular tables but are dynamically generated from the underlying tables when accessed.
Example:
CREATE VIEW EmployeeView AS SELECT EmployeeID, FirstName, LastName, Department FROM Employees WHERE Department = 'Sales';
This view provides a simplified way to access employee data from the Sales department.
Explanation:
Views are useful for presenting data in a specific way without modifying the underlying tables.
14. What is a stored procedure in SQL?
Answer:
A stored procedure is a precompiled collection of SQL statements that can be executed as a single unit. Stored procedures are used to encapsulate logic, improve performance, and promote code reuse.
Key Points:
- Stored procedures can accept parameters, perform operations, and return results.
- They are stored in the database and can be called by applications or other SQL statements.
Example:
CREATE PROCEDURE GetEmployeeDetails @EmployeeID INT AS BEGIN SELECT FirstName, LastName, Department, Salary FROM Employees WHERE EmployeeID = @EmployeeID; END;
This stored procedure retrieves the details of an employee based on their EmployeeID
.
Explanation:
Stored procedures are useful for encapsulating business logic and improving performance by reducing the need for repeated query parsing.
15. What is a trigger in SQL?
Answer:
A trigger in SQL is a special kind of stored procedure that automatically executes in response to specific events on a table or view, such as INSERT
, UPDATE
, or DELETE
operations.
Key Points:
- Triggers are used to enforce business rules, maintain audit trails, and ensure data integrity.
- They can be defined to execute before or after the specified event.
Example:
CREATE TRIGGER trg_UpdateEmployeeSalary ON Employees AFTER UPDATE AS BEGIN INSERT INTO SalaryAudit (EmployeeID, OldSalary, NewSalary, ChangeDate) SELECT EmployeeID, OldSalary, Salary, GETDATE() FROM Inserted; END;
This trigger logs changes to employee salaries in a SalaryAudit
table whenever an update occurs.
Explanation:
Triggers are powerful tools for automating tasks and enforcing rules in a database.
16. What is the difference between DELETE
, TRUNCATE
, and DROP
in SQL?
Answer:DELETE
, TRUNCATE
, and DROP
are SQL commands used to remove data or database objects, but they differ in their functionality and impact.
DELETE
:
- Removes specific rows from a table based on a condition.
- Can be rolled back (transactional).
- Triggers are fired.
Example:
DELETE FROM Employees WHERE Department = 'Sales';
TRUNCATE
:
- Removes all rows from a table, but the table structure remains.
- Cannot be rolled back (non-transactional).
- Triggers are not fired.
Example:
TRUNCATE TABLE Employees;
DROP
:
- Completely removes a table or database from the system, including the data and structure.
- Cannot be rolled back.
Example:
DROP TABLE Employees;
Explanation:
- Use
DELETE
for removing specific rows. - Use
TRUNCATE
to quickly remove all rows from a table. - Use
DROP
to remove an entire table or database.
17. What is a CURSOR
in SQL?
Answer:
A CURSOR
in SQL is a database object used to retrieve, manipulate, and iterate over the rows of a result set one row at a time. Cursors are useful when you need to perform row-by-row processing that cannot be easily achieved with set-based operations.
Key Points:
- Cursors are typically used in stored procedures, triggers, and scripts where complex row-by-row logic is needed.
- They can be used to update or process rows individually.
Example:
DECLARE @EmployeeID INT; DECLARE employee_cursor CURSOR FOR SELECT EmployeeID FROM Employees WHERE Department = 'Sales'; OPEN employee_cursor; FETCH NEXT FROM employee_cursor INTO @EmployeeID; WHILE @@FETCH_STATUS = 0 BEGIN -- Perform operations on each employee FETCH NEXT FROM employee_cursor INTO @EmployeeID; END; CLOSE employee_cursor; DEALLOCATE employee_cursor;
Explanation:
Cursors allow for fine-grained control over row-by-row processing, but they can be slow and resource-intensive, so they should be used sparingly.
18. What is the COALESCE()
function in SQL?
Answer:
The COALESCE()
function in SQL returns the first non-NULL
value from a list of expressions. It is often used to handle NULL
values in queries.
Key Points:
COALESCE()
can take two or more arguments.- It returns the first non-
NULL
value encountered in the list.
Example:
SELECT EmployeeName, COALESCE(Phone, 'N/A') AS ContactNumber FROM Employees;
This query returns the phone number for each employee, but if the phone number is NULL
, it returns ‘N/A’.
Explanation:COALESCE()
is useful for handling NULL
values and ensuring that queries return meaningful results.
19. What is the CASE
statement in SQL?
Answer:
The CASE
statement in SQL is used to perform conditional logic within SQL queries. It allows you to return different values based on conditions, similar to an IF-THEN-ELSE
structure in programming languages.
Key Points:
- The
CASE
statement can be used inSELECT
,INSERT
,UPDATE
, andDELETE
statements. - It evaluates conditions sequentially and returns the corresponding result.
Example:
SELECT EmployeeName, CASE WHEN Salary > 100000 THEN 'High' WHEN Salary BETWEEN 50000 AND 100000 THEN 'Medium' ELSE 'Low' END AS SalaryCategory FROM Employees;
This query categorizes employees’ salaries into ‘High’, ‘Medium’, or ‘Low’ based on their salary.
Explanation:
The CASE
statement is a powerful tool for performing complex conditional logic within SQL queries.
20. What is the difference between RANK()
, DENSE_RANK()
, and ROW_NUMBER()
in SQL?
Answer:RANK()
, DENSE_RANK()
, and ROW_NUMBER()
are window functions in SQL that assign a unique rank or number to each row within a partition of a result set, but they behave differently in the presence of ties.
RANK()
:
- Assigns a unique rank to each row, with gaps in the rank values if there are ties.
- Example: If two rows tie for rank 1, the next rank assigned will be 3.
DENSE_RANK()
:
- Assigns a unique rank to each row without gaps in the rank values, even in the case of ties.
- Example: If two rows tie for rank 1, the next rank assigned will be 2.
ROW_NUMBER()
:
- Assigns a unique sequential integer to each row within a partition,