Load balancing is a fundamental system design technique that distributes incoming network traffic or computational workloads across multiple servers or resources. Its primary goals are to optimize resource utilization, ensure high availability, prevent server overload, and maintain a seamless user experience. In modern applications, where millions of users may access a service simultaneously, load balancing helps keep any single server from becoming a bottleneck, which reduces latency and lowers the risk of downtime.
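To make the idea of spreading requests across servers concrete, here is a minimal sketch of round-robin selection, one simple distribution strategy among several. It is an illustration only: the class name `RoundRobinBalancer` and the server names `app-1`, `app-2`, and `app-3` are hypothetical, and a production balancer would also need health checks, weighting, and handling for concurrent requests.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical helper that hands out servers in a fixed rotation."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list endlessly, so each server gets
        # every len(servers)-th request.
        self._rotation = cycle(servers)

    def next_server(self):
        """Return the server that should handle the next request."""
        return next(self._rotation)


# Assumed server names, used only for illustration.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])

# Route six incoming requests; each server receives two of them.
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# → ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Round-robin treats every server as equally capable and every request as equally expensive. When that assumption does not hold, strategies that account for server capacity or current load are usually a better fit.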