Saturday, March 1, 2014

Broadcast Traffic in Data Centers

This post discusses problems and trade-offs specific to data center networks: the general requirements of Data Centers and the extent to which L2 and L3 networks meet them.

After going through a few papers and some interesting Coursera videos on SDN, we were able to outline our problem statement better.

Data Centers and their requirements - (A high level perspective)

To understand our problem statement better, it helps to first understand how Data Centers work and what they require.

It is common for user applications to span thousands of servers; a single user search request might access an inverted index spread over 1k+ servers. In addition, analytics workloads constantly query the data stored on these servers. These use cases generate a large amount of traffic inside Data Centers, which accounts for a fair percentage of total Internet communication. Hence, any mechanism that reduces unnecessary internal traffic, such as broadcast traffic, can noticeably improve DC performance.

Because of the scale that DCs deal with, some of the requirements/constraints include:
1. VM migration - the VM's IP address shouldn't change, because existing TCP connections would break and the application-level state would be lost. VM migration is a basic requirement for flexible and efficient resource usage.
2. Minimal configuration of the forwarding devices deployed in the DC network.
3. No forwarding loops - to prevent broadcast storms and uncontrolled flooding.
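
To see why the third requirement matters, here is a minimal sketch of flooding in a toy topology. It assumes a simplified model that is ours, not a real switch implementation: every switch re-sends a broadcast frame out of all ports except the one it arrived on, and the switch names `s1`..`s3` are hypothetical. Since an Ethernet frame has no TTL, nothing removes the copies once the topology contains a loop.

```python
def flood_rounds(links, source, max_rounds):
    """Return the number of frame copies in flight after each round.

    Simplified model (an assumption for illustration): a switch floods a
    broadcast frame to every neighbor except the one it came from.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Each in-flight copy is (switch holding the frame, switch it came from).
    in_flight = [(source, None)]
    counts = []
    for _ in range(max_rounds):
        next_flight = []
        for switch, came_from in in_flight:
            for peer in sorted(neighbors.get(switch, ())):
                if peer != came_from:
                    next_flight.append((peer, switch))
        in_flight = next_flight
        counts.append(len(in_flight))
        if not in_flight:
            break  # every copy has reached a dead end; flooding is over
    return counts


# Three switches wired in a triangle (a physical loop).
triangle = [("s1", "s2"), ("s2", "s3"), ("s3", "s1")]
# The same switches with one link removed, so the topology is a tree.
tree = [("s1", "s2"), ("s2", "s3")]

print(flood_rounds(triangle, "s1", 6))  # copies never die out
print(flood_rounds(tree, "s1", 6))      # flooding stops by itself
```

In the triangle the copies keep circulating for as many rounds as we simulate, while in the loop-free tree the flood ends after a few hops.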

Let's see how L2 and L3 match up to these requirements:

- Layer 2 networks use MAC addressing - a flat addressing scheme.
- Layer 2 switches have no configuration overhead - a switch is self-learning and plug and play.
- In data centers, it's usually good to have physical loops so that failover is easy and load can be distributed among the links. In that case, the spanning tree protocol has to be run on the L2 network. But doing so blocks the redundant links, so their capacity goes unused.
- To address the 1st requirement, VM migration should take place within the same Layer 2 network, so that the VM can keep the same IP address.
- Also, L2 networks are more efficient than L3 networks, for several reasons:
  - Switches don't have to modify packets. They just look up the destination and forward.
  - Routers, in contrast, have to rewrite the Layer 2 frame header, putting in their own MAC address as the source.
  - Routers also have to process one extra layer, Layer 3, and modify it, mainly the TTL, which switches never touch. This adds computation, because the IP header checksum has to be recomputed whenever the header is modified.
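
The per-hop work of a router described above can be sketched as follows. This is only an illustration, assuming a plain 20-byte IPv4 header with no options; the function names and the sample addresses are ours.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words (IPv4, RFC 791).

    Called on a header whose checksum field is zero it returns the checksum
    to store; called on a header with a correct checksum it returns 0.
    """
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def route_hop(header: bytes) -> bytes:
    """What a router changes in the IP header: TTL (byte 8), checksum (bytes 10-11)."""
    ttl = header[8]
    if ttl <= 1:
        raise ValueError("TTL expired; a router would drop the packet")
    h = bytearray(header)
    h[8] = ttl - 1
    h[10:12] = b"\x00\x00"  # zero the checksum field before recomputing
    struct.pack_into("!H", h, 10, ipv4_checksum(bytes(h)))
    return bytes(h)


# Hypothetical header: version/IHL 0x45, total length 20, TTL 64, protocol 6 (TCP),
# source 10.0.0.1, destination 10.0.1.2.
raw = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                  bytes([10, 0, 0, 1]), bytes([10, 0, 1, 2]))
packet = raw[:10] + struct.pack("!H", ipv4_checksum(raw)) + raw[12:]

forwarded = route_hop(packet)
print(forwarded[8], ipv4_checksum(forwarded) == 0)
```

A switch forwarding the same frame does none of this; it only reads the destination MAC address.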

Also, a switch's logic is usually implemented in specialized hardware, which is much faster than the general-purpose microprocessors that routers often rely on.
This efficiency is desirable in a DC, where low latency is a very high priority.
But broadcast traffic remains one of the biggest downsides of using an L2 network.
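
The self-learning behaviour of an L2 switch, and where its broadcast traffic comes from, can be sketched like this. It is a minimal model under our own assumptions: the class name, the port numbers and the shortened MAC strings are hypothetical, and real switches also age out table entries.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy self-learning switch: no configuration, only a MAC table it fills itself."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: the sender is reachable through the port the frame came in on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the same port; nothing to forward
        return [out_port]


sw = LearningSwitch([1, 2, 3, 4])
print(sw.receive(1, "aa", BROADCAST))  # e.g. an ARP request: flooded
print(sw.receive(2, "bb", "aa"))       # "aa" was learned on port 1
print(sw.receive(1, "aa", "bb"))       # "bb" was learned on port 2
print(sw.receive(3, "cc", "dd"))       # unknown destination: flooded
```

With a flat address space, every broadcast and every frame to a not-yet-learned address reaches all ports, and that cost grows with the number of hosts in the L2 network.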

- Layer 3 networks use IP - a hierarchical addressing scheme.
- VM migration can only happen within the same subnet if the IP address has to be retained. But this limits the choice of physical hosts the VM can migrate to.
- Layer 3 network switches, or Routers as they are called, require:
1. the hosts to be configured with the address of the gateway Router for the subnet they belong to
2. each newly added Router to be configured with its subnet information
  Also, DHCP servers have to be kept synchronized so that they distribute IP addresses based on each host's subnet.
  These tasks are handled by network administrators, which makes management difficult.
- At the same time, using Routers eliminates switch-level broadcast: because the addressing scheme is structured, routers know where to forward incoming packets based on their destination IP address.
- Layer 3 topologies can contain loops: the TTL field ensures that a packet caught in a loop is eventually dropped, and redundant routes can be kept for easy failover.
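
To make the L3 side concrete, here is a small sketch of how hierarchical addressing lets a router pick a next hop without flooding, and why a VM can only keep its IP address inside its own subnet. The prefixes and next-hop names (`core`, `agg-1`, `tor-7`, `border`) are made up for illustration, and the linear scan stands in for the much faster lookup structures real routers use.

```python
import ipaddress


def lookup(routes, destination):
    """Longest-prefix match: return the next hop of the most specific matching route."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


def can_keep_ip(vm_ip, target_subnet):
    """A migrated VM can keep its address only if the target subnet contains it."""
    return ipaddress.ip_address(vm_ip) in ipaddress.ip_network(target_subnet)


# Hypothetical routing table of (prefix, next hop) pairs.
routes = [
    ("0.0.0.0/0", "border"),
    ("10.0.0.0/8", "core"),
    ("10.1.0.0/16", "agg-1"),
    ("10.1.2.0/24", "tor-7"),
]

print(lookup(routes, "10.1.2.9"))    # most specific match is the /24
print(lookup(routes, "10.200.0.1"))  # only the /8 and the default route match
print(can_keep_ip("10.1.2.9", "10.1.2.0/24"), can_keep_ip("10.1.2.9", "10.1.3.0/24"))
```

The table has to be configured or learned per subnet, which is exactly the management overhead listed above.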

Note - An ideal system would combine the benefits of both: structured Layer 3 addressing, which doesn't rely on broadcast, together with the simple Layer 2 forwarding mechanism, which has no configuration overhead and allows easy VM migration.

As we can see, L2 networks meet many of the ideal requirements of a Data Center, but their broadcast traffic does not scale, which makes them unusable at Data Center scale. Our aim is to make L2 networks scalable.