Interviewing for a Data Architect role requires deep knowledge of SQL, Python, Snowflake, and real-world data modeling. Recently, I appeared for a Data Architect interview at Colgate-Palmolive, and I wanted to share a detailed breakdown of everything that happened — the questions, scenarios, queries, and how I approached them.
This blog is especially useful for Data Engineers, Senior Data Engineers, BI Developers, and anyone preparing for cloud analytics architecture roles.
🔹 Introduction: Setting the Stage
The interview panel consisted of two members:
- A Manager
- A Technical Lead
We began with introductions where I spoke briefly about my:
- Experience building data pipelines
- Work with Snowflake, SQL, and Python
- Projects involving ETL/ELT, data modeling, and analytics
- Approach to handling data quality and maintaining reliable workflows
The tone was conversational and focused on how I think through end-to-end architecture.
🔹 Deep Dive Into My Projects
The panel asked detailed questions about:
✔ Business problems I solved
✔ Data ingestion & transformation layers
✔ Streaming vs batch decisions
✔ Ensuring data quality & validation
✔ Dimensional modeling & SCD handling
✔ Pipeline design choices and optimizations
They wanted clarity on why I made each decision — not just what I did.
🔹 SQL Challenges (Real Interview Questions)
This was the core of the technical round.
Here are the actual SQL problems I was given.
1️⃣ Percentage of Sales for Each Employee (Window Function Question)
Problem Statement
You are given a table where:
- Each employee sells 3 products
- You have quantities per product
- Calculate the percentage contribution of each product to the total sales of that employee
Sample Table: employee_sales
| employee_id | employee_name | product | quantity |
|---|---|---|---|
| 1 | A | P1 | 10 |
| 1 | A | P2 | 20 |
| 1 | A | P3 | 30 |
| 2 | B | P1 | 15 |
| 2 | B | P2 | 25 |
| 2 | B | P3 | 10 |
SQL Solution
SELECT
employee_id,
employee_name,
product,
quantity,
ROUND(
quantity * 100.0
/ SUM(quantity) OVER (PARTITION BY employee_id),
2
) AS pct_of_employee_sales
FROM employee_sales
ORDER BY employee_id, product;
Why this works
- Window function computes total per employee
- Percentage = (product_qty / employee_total) * 100
2️⃣ Employee–Manager Mapping (SAME TABLE, NULL Manager Case)
This was the second SQL question given.
Problem Statement
You are given one table that contains both employees and their managers.
Some employees have NULL manager_id.
Write a query to display each employee along with their manager’s name.
If manager_id is NULL, output should show NULL.
Table: employees
| emp_id | emp_name | manager_id |
|---|---|---|
| 1 | Alice | NULL |
| 2 | Bob | 1 |
| 3 | Charlie | 1 |
| 4 | David | 2 |
| 5 | Emma | NULL |
Expected Output
| emp_id | emp_name | manager_name |
|---|---|---|
| 1 | Alice | NULL |
| 2 | Bob | Alice |
| 3 | Charlie | Alice |
| 4 | David | Bob |
| 5 | Emma | NULL |
SQL Solution (Self-Join)
SELECT
e.emp_id,
e.emp_name,
m.emp_name AS manager_name
FROM employees e
LEFT JOIN employees m
ON e.manager_id = m.emp_id
ORDER BY e.emp_id;
Why this works:
- Self-join matches employee → manager
LEFT JOINensures employees without managers still appearm.emp_namebecomes NULL automatically for top-level employees
This is a very common question to test hierarchical SQL thinking.
🔹 Python & OOP Questions
Next, they asked Python questions covering:
- Class & instance attributes
- Inheritance
- Method overriding
- Output prediction
- Basic OOP architecture logic
They were checking conceptual clarity — not advanced algorithms.
🔹 Snowflake Questions & Scenarios
This was a major component.
Topics included:
- Streams & Tasks
- Auto-suspend and auto-resume
- Virtual warehouse cost model
- Time Travel & Fail-safe
- Micro-partitions
- Data loading (COPY, Snowpipe, stages)
Example scenario they asked:
If a Snowflake warehouse runs for only 3 minutes and then auto-suspends, how much will it cost?
They wanted understanding of:
- Snowflake’s per-second billing
- Minimum credit charge
- How suspend/resume optimizes cost
🔹 Managerial & Business Questions
The Manager asked:
- Why I’m considering a change
- What I know about Colgate-Palmolive’s business
- How I interact with cross-functional teams
- Ownership mindset and delivery approach
These questions test cultural fit and communication clarity.
🔹 Final Thoughts
This Colgate interview was well-balanced — covering:
- SQL fundamentals
- Python OOP
- Snowflake architecture
- Data modeling
- Business reasoning
If you’re preparing for a Data Architect / Senior Data Engineer interview, make sure you are confident in:
- Window functions
- Self-joins & hierarchical SQL
- Data modeling principles
- Cloud data warehouses (Snowflake/Azure/AWS)
- Pipeline architecture decisions
- Communicating clearly
Leave a Reply