Summary of the questions and answers from an Airtel SDE-2 interview experience:
Interview Overview
Role: SDE-2
Company: Airtel
Mode: Virtual, Google Meet
Duration: Around 90 minutes
Panel: 1 Senior Engineer + 1 Tech Lead
Main focus: Java, Spring Boot, Kafka, System Design, Monitoring, SLA, backend performance
- Kafka Fundamentals
Question: How does Kafka ensure message delivery guarantees?
Answer summary:
Kafka supports three delivery semantics; which one you get depends on producer and consumer configuration:
| Guarantee | Meaning |
|---|---|
| At-most-once | Message may be lost, but not delivered again |
| At-least-once | Message will not be lost, but may be delivered more than once |
| Exactly-once | Message is processed exactly once within Kafka using idempotent producers and transactions; side effects outside Kafka still need idempotent handling |
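Important Kafka configs mentioned:
acks=all (producer setting: wait until all in-sync replicas acknowledge the write)
min.insync.replicas=2 (topic/broker setting: minimum in-sync replicas required for an acks=all write to succeed)
enable.idempotence=true (producer setting: the broker discards duplicates caused by producer retries)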
Key learning: Mention real production configs, not only theory.
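The configs above could be assembled for a Java producer roughly as follows. This is a minimal sketch, not the candidate's actual setup: it uses only `java.util.Properties` with the standard Kafka config key strings, so it runs without the Kafka client library. The class name `ProducerConfigSketch`, the method `durableProducerProps`, and the broker address are hypothetical.

```java
import java.util.Properties;

class ProducerConfigSketch {
    // Builds producer settings that favor durability over latency.
    static Properties durableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for every in-sync replica before treating a send as successful.
        props.setProperty("acks", "all");
        // Let the broker drop duplicates created by producer retries.
        props.setProperty("enable.idempotence", "true");
        // min.insync.replicas=2 is configured on the topic or broker, not on the producer.
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address.
        Properties props = durableProducerProps("localhost:9092");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real service these properties would typically be passed to the Kafka producer constructor or set through Spring Boot configuration.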
- Monitoring and Observability
Question: How do you track production system performance and uptime?
Answer summary:
Use monitoring and observability tools like:
| Tool | Purpose |
|---|---|
| Grafana + Prometheus | Metrics and dashboards |
| AWS CloudWatch | Infrastructure alerts |
| ELK Stack | Centralized logging |
The candidate configured alerts for:
Latency spikes
Downtime anomalies
Error rate increases
SLA breaches
Key learning: Always connect tools with metrics like latency, throughput, uptime, and error rate.
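To make the link between alerts and metrics concrete, here is a small sketch of the kind of check an alert rule encodes: compute p95 latency and error rate over a window of samples, then compare them to limits. It is plain Java for illustration only, not Prometheus or CloudWatch rule syntax. The class `AlertCheckSketch`, the sample values, and the thresholds are hypothetical.

```java
import java.util.Arrays;

class AlertCheckSketch {
    // Nearest-rank percentile: the smallest sample with at least p% of samples <= it.
    static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Fraction of failed requests; 0 when there was no traffic.
    static double errorRate(long errors, long total) {
        return total == 0 ? 0.0 : (double) errors / total;
    }

    // Fires when either the latency limit or the error-rate limit is exceeded.
    static boolean shouldAlert(long[] latenciesMs, long errors, long total,
                               long p95LimitMs, double errorRateLimit) {
        return percentile(latenciesMs, 95) > p95LimitMs
                || errorRate(errors, total) > errorRateLimit;
    }

    public static void main(String[] args) {
        long[] window = {120, 95, 110, 105, 900, 100, 98, 102, 115, 101};
        System.out.println("p95 = " + percentile(window, 95) + " ms");
        System.out.println("alert = " + shouldAlert(window, 3, 1000, 500, 0.01));
    }
}
```

A percentile is used instead of an average because a single slow request (900 ms here) barely moves the mean but is exactly what a latency-spike alert should catch.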
- API SLA and Performance
Question: What is an SLA for an API, and how do you ensure it is met?
Answer summary:
An SLA (Service Level Agreement) defines the guarantees a service commits to, such as:
Availability (uptime)
Response time
Error rate tolerance
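An availability target translates directly into a downtime budget, which is often the easiest way to reason about it in an interview. The sketch below does that arithmetic; the class `SlaBudgetSketch` and the 99.9% / 30-day figures are illustrative assumptions, not numbers from the interview.

```java
import java.time.Duration;

class SlaBudgetSketch {
    // Maximum downtime that still meets the availability target over the period.
    static Duration allowedDowntime(double availabilityPercent, Duration period) {
        if (availabilityPercent < 0 || availabilityPercent > 100) {
            throw new IllegalArgumentException("availability must be within 0..100");
        }
        double downFraction = 1.0 - availabilityPercent / 100.0;
        return Duration.ofMillis(Math.round(period.toMillis() * downFraction));
    }

    public static void main(String[] args) {
        Duration month = Duration.ofDays(30);
        Duration budget = allowedDowntime(99.9, month);
        System.out.println("99.9% over 30 days allows " + budget.getSeconds() + " s of downtime");
    }
}
```

For example, 99.9% over a 30-day month leaves about 43 minutes of downtime, while 99.99% leaves a little over 4 minutes.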
Ways to ensure SLA:
| Technique | Purpose |
|---|---|
| Redis caching | Faster response |
| Circuit breaker | Prevent cascading failure |
| Retry pattern | Handle temporary failures |
| Load balancing | Distribute traffic |
| Async processing | Move slow work off the request path |
The candidate mentioned using Resilience4j for the circuit breaker and retry patterns.
Key learning: Don’t only define SLA. Explain how you maintain it in production.
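To illustrate what a circuit breaker does, here is a deliberately small hand-rolled version. It is not the Resilience4j API (that library is annotation- and config-driven, and far more complete); the class `SimpleCircuitBreaker`, its parameters, and the fallback value are all hypothetical. The clock is injected so the behavior can be tested without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so tests can control time
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    // Runs the action if allowed; otherwise returns the fallback without calling the dependency.
    // Holding the lock during the call keeps the sketch simple but serializes callers.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openMillis) {
                return fallback.get(); // still cooling down: fail fast
            }
            state = State.HALF_OPEN; // cool-down elapsed: allow one trial call
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(3, 5_000, System::currentTimeMillis);
        for (int i = 0; i < 5; i++) {
            String reply = breaker.call(
                    () -> { throw new IllegalStateException("downstream unavailable"); },
                    () -> "cached response");
            System.out.println(reply + " (state=" + breaker.state() + ")");
        }
    }
}
```

Once the breaker is open, callers get the fallback immediately instead of waiting on a failing dependency, which is what protects the latency part of the SLA.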
- Coding Question — Rotated Sorted Array
Question: Find an element in a rotated sorted array.
Example problem:
Given a rotated sorted array and a target, return the index of the target, otherwise return -1.
Brute Force
Check every element one by one.

    int search(int[] nums, int target) {
        for (int i = 0; i < nums.length; i++) {
            if (nums[i] == target) return i;
        }
        return -1;
    }

Time complexity: O(N)
Optimized Approach
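Use a modified binary search. At every step at least one half of the current range is sorted, so check whether the target lies inside the sorted half and discard the other half accordingly. This version assumes the array contains distinct values.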

    int search(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            // Overflow-safe midpoint; (left + right) / 2 can overflow for very large arrays.
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) return mid;
            if (nums[left] <= nums[mid]) {
                // Left half [left, mid] is sorted.
                if (target >= nums[left] && target < nums[mid])
                    right = mid - 1;
                else
                    left = mid + 1;
            } else {
                // Right half [mid, right] is sorted.
                if (target > nums[mid] && target <= nums[right])
                    left = mid + 1;
                else
                    right = mid - 1;
            }
        }
        return -1;
    }

Time complexity: O(log N)
Key learning: First explain brute force, then optimize, then explain why binary search works.
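A quick way to back up the explanation in an interview is to run the optimized search on a few inputs, including edge cases. The harness below repeats the binary search so that it is self-contained; the class name `RotatedSearchDemo` and the sample arrays are illustrative.

```java
class RotatedSearchDemo {
    // Modified binary search over a rotated sorted array of distinct values.
    static int search(int[] nums, int target) {
        int left = 0, right = nums.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) return mid;
            if (nums[left] <= nums[mid]) {
                // Left half is sorted: keep it only if the target falls inside it.
                if (target >= nums[left] && target < nums[mid]) right = mid - 1;
                else left = mid + 1;
            } else {
                // Right half is sorted: keep it only if the target falls inside it.
                if (target > nums[mid] && target <= nums[right]) left = mid + 1;
                else right = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] rotated = {4, 5, 6, 7, 0, 1, 2};
        System.out.println(search(rotated, 0));       // target in the right part
        System.out.println(search(rotated, 3));       // target absent
        System.out.println(search(new int[] {}, 1));  // empty array
    }
}
```

For `{4, 5, 6, 7, 0, 1, 2}` and target 0, the first midpoint is 7: the left half `4..7` is sorted and does not contain 0, so the search moves right and finds 0 at index 4 two steps later.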
- System Design Question
Question: Suppose you are tracking the speed of a production line in real time. How would you design the system?
Answer summary:
Proposed architecture:
Devices/Sensors
↓
Kafka Topics
↓
Spring Boot Consumer Service
↓
Processing / Aggregation
↓
Time-Series Database
↓
Grafana Dashboard
↓
Alerting Service
Components mentioned:
| Component | Purpose |
|---|---|
| Devices | Send speed data |
| Kafka | Handle real-time event streaming |
| Spring Boot service | Consume and process data |
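| InfluxDB / DynamoDB | Store time-series data (InfluxDB is a purpose-built time-series database; DynamoDB is a key-value store and needs a time-based sort key for this use) |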
| Grafana | Visualize real-time metrics |
| Kubernetes | Scaling and deployment |
| Alerting service | Trigger alerts on threshold breach |
Important design points:
Scalability
Fault tolerance
Low latency
Real-time freshness
Monitoring and alerting
Key learning: For SDE-2, system design should be practical and production-oriented.
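The "Processing / Aggregation" step in the architecture above could look roughly like this: group readings into fixed (tumbling) time windows per production line, compute the average speed, and flag windows that fall below a threshold. This is a minimal in-memory sketch under stated assumptions, not the candidate's implementation. The names `SpeedAggregatorSketch`, `Reading`, `lineId`, and the window size are hypothetical; in the described design a Kafka consumer would call `add` for each message.

```java
import java.util.HashMap;
import java.util.Map;

class SpeedAggregatorSketch {
    // One sensor reading; field names are assumptions for illustration.
    static final class Reading {
        final String lineId;
        final long timestampMs;
        final double speed;

        Reading(String lineId, long timestampMs, double speed) {
            this.lineId = lineId;
            this.timestampMs = timestampMs;
            this.speed = speed;
        }
    }

    private static final class Acc {
        double sum;
        long count;
    }

    private final long windowMs;
    // Windows are never evicted here; a real service would expire old ones.
    private final Map<String, Acc> windows = new HashMap<>();

    SpeedAggregatorSketch(long windowMs) {
        this.windowMs = windowMs;
    }

    // Maps a timestamp to the start of its tumbling window (non-negative timestamps assumed).
    private String key(String lineId, long timestampMs) {
        long windowStart = timestampMs - (timestampMs % windowMs);
        return lineId + "@" + windowStart;
    }

    void add(Reading r) {
        Acc acc = windows.computeIfAbsent(key(r.lineId, r.timestampMs), k -> new Acc());
        acc.sum += r.speed;
        acc.count++;
    }

    // Average speed for the window containing timestampMs, or NaN if no readings arrived.
    double average(String lineId, long timestampMs) {
        Acc acc = windows.get(key(lineId, timestampMs));
        return acc == null ? Double.NaN : acc.sum / acc.count;
    }

    // True when the window has data and its average is under the minimum speed.
    boolean belowThreshold(String lineId, long timestampMs, double minSpeed) {
        double avg = average(lineId, timestampMs);
        return !Double.isNaN(avg) && avg < minSpeed;
    }

    public static void main(String[] args) {
        SpeedAggregatorSketch agg = new SpeedAggregatorSketch(60_000);
        agg.add(new Reading("line-1", 1_000, 10.0));
        agg.add(new Reading("line-1", 59_000, 20.0));
        System.out.println("avg = " + agg.average("line-1", 30_000));
        System.out.println("alert = " + agg.belowThreshold("line-1", 30_000, 18.0));
    }
}
```

Aggregating before writing to the time-series database keeps write volume proportional to the number of windows rather than the number of raw sensor events.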
- Project Challenge / Incident Question
Question: Tell me about a major challenge you faced in your project.
Answer summary:
The candidate discussed a payment duplication issue caused by retry loops in a distributed system.
Solution used:
| Solution | Purpose |
|---|---|
| Idempotency keys | Avoid duplicate processing |
| Deduplication table | Track transaction IDs |
| Circuit breaker | Prevent repeated failure calls |
| Retry pattern | Handle temporary failures safely |
Key learning: Interviewers want to see debugging mindset, ownership, and production experience.
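The idempotency-key idea from the table above can be sketched in a few lines: remember the result for each key, and return the stored result when the same key is retried. This is an in-memory stand-in only; the interview summary mentions a deduplication table, which in production would typically live in a database with a unique constraint on the key. The class `IdempotentPaymentSketch`, the key format, and the `txn-` IDs are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class IdempotentPaymentSketch {
    // Stand-in for the deduplication table: idempotency key -> transaction ID.
    private final Map<String, String> processed = new ConcurrentHashMap<>();
    private final AtomicInteger charges = new AtomicInteger();

    // Retries with the same key get the original result; the charge runs at most once per key.
    String pay(String idempotencyKey, long amountMinorUnits) {
        return processed.computeIfAbsent(idempotencyKey, k -> charge(amountMinorUnits));
    }

    // Placeholder for the real payment call; only counts invocations here.
    private String charge(long amountMinorUnits) {
        return "txn-" + charges.incrementAndGet();
    }

    int chargeCount() {
        return charges.get();
    }

    public static void main(String[] args) {
        IdempotentPaymentSketch service = new IdempotentPaymentSketch();
        String first = service.pay("order-42", 49_900);
        String retry = service.pay("order-42", 49_900); // simulated retry of the same request
        System.out.println(first + " / " + retry + " / charges=" + service.chargeCount());
    }
}
```

`ConcurrentHashMap.computeIfAbsent` applies the function at most once per key, which is what makes concurrent retries safe in this sketch.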
Overall Key Learnings
The interview was more scenario-based than theoretical.
Important areas to prepare:
Kafka delivery guarantees
Spring Boot internals
API SLA and performance
Monitoring tools like Grafana, Prometheus, ELK
Resilience4j
Circuit breaker and retry patterns
Rotated sorted array / binary search
Real-time system design
One strong production incident story
Final Preparation Checklist
For Airtel SDE-2, prepare these:
| Area | What to revise |
|---|---|
| Java | Collections, multithreading, concurrency basics |
| Spring Boot | REST APIs, filters, interceptors, dependency injection |
| Kafka | Partitions, consumer groups, offsets, delivery guarantees |
| System Design | Real-time data processing, Kafka-based architecture |
| Monitoring | Grafana, Prometheus, CloudWatch, ELK |
| Resilience | Retry, circuit breaker, timeout, idempotency |
| DSA | Binary search, arrays, sliding window |
| Project Story | One production bug or outage story |
Best takeaway:
This Airtel SDE-2 interview checks whether you can build, debug, monitor, and scale real backend systems, not just answer textbook definitions.