From: Richard Whitehouse Date: Fri, 13 May 2011 09:24:34 +0000 (+0100) Subject: Modifications based on mas90 comments X-Git-Url: https://git.richardwhiuk.com/?a=commitdiff_plain;h=6064469944474bde4a7df103fb8cf9b3600f9481;p=ii-diss.git Modifications based on mas90 comments --- diff --git a/introduction.tex b/introduction.tex index e9dbea2..5e434c9 100644 --- a/introduction.tex +++ b/introduction.tex @@ -1,91 +1,100 @@ \chapter{Introduction} -% TODO: Identify additional link layer protocols to incorporate +Data Link Layer protocols are responsible for data transfer between adjacent nodes, routing of data within a local network segment and providing and regulating access to the physical network layer and allowing services to be run on top. Ethernet, the dominant protocol in this area, has been around since the early 1970s \cite{spurgeon}. Research has indicated \cite{myers} that Ethernet has a number of issues as the size of the network scales up. A number of different proposals have been made to improve Ethernet, notably SEATTLE \cite{seattle-1} and MOOSE \cite{moose1}. -Data Link Layer protocols are responsible for data transfer between adjacent nodes, routing of data within a local network segment and providing and regulating access to the physical network layer and allowing services to be run on top. Ethernet, the dominant protocol in this area, has been around since the early 1970s \cite{ethernet}. Research has indicated that Ethernet has a number of issues as the size of networks scale up. A number of different proposals have been made replace Ethernet, notably SEATTLE \cite{seattle} and MOOSE \cite{moose1}. Due to the large numbers of hosts and data involved in performing such a simulation, and the variety of different network topologies I shall analyse, the only suitable mechanism in order to perform this evaluation is under simulation. 
+Since the number of hosts will need to be very large in order to demonstrate the issues associated with Ethernet, and they will be arranged in a number of different topologies, the most suitable mechanism in order to perform this evaluation is under simulation, where the topology of the network can easily be altered. -In this paper, I aim to compare one of these proposals to Ethernet in order to provide qualatative evidence to show the differences between the networks, and how MOOSE identifies the deficiencies identifed. In order to do this, I set out to create a representative implementation of Ethernet and MOOSE for a modern network simulator. +In this paper, I aim to make a quantitative evaluation of Ethernet and one of the proposed improvements, MOOSE, comparing them under a variety of different situations. I have chosen to focus on MOOSE as other protocols, like SEATTLE, have already been evaluated \cite{seattle-2}. In order to do this, I set out to create a representative implementation of the protocols on a modern network simulator, and run tests to explore their differences. \section{Background} -% TODO: insert intro to background here - \subsection{Protocol Stack} -When designing for use in a computer network architecture, it is common to divide the functions into layers, with each layer providing services to the layer above, and requiring services from the layer below. By doing this we can provide separation of concerns in forms of an abstraction of what the layer above and below require and provides us. This reduces the complexity of the network stack. Splitting up the different layers also allows the system to be modular, allowing us to specify different sets of protocols for different architecture and applications. +When implementing a computer network architecture, it is advantageous to separate the different functions into different protocol layers, with each layer providing services to the layer above, and requiring services from the layers below. 
By performing this separation, we can divide up the complexity required to be implemented. This is done by each layer providing an abstraction for the layer above and below. This allows each layer to concentrate on only providing a small amount of functionality. This technique also allows the system to be modular, and allows us to select different protocols to perform different tasks for different applications on different architectures. -Several different proposals have been made for the different layers of the network stack, the seven layer ISO OSI Model \cite{osi} and the four layers in the Internet Protocol Suite \cite{rfc1122}. In practice the model used by the Internet Protocol is largely represenative, along with a addition physical layer. +Several different proposals have been made for the different layers of the network stack, the seven layer ISO OSI Model \cite{osi} and the three layers in the Internet Protocol Suite \cite{rfc1122}. In practice the model used by the Internet Protocol is largely representative, along with an additional physical layer and link layer. -The layers described below operate on packets, also known as frames and datagrams. Each layer may add headers to the front of the packet and trailers to the end in order to add routing information and checksums as necessary. A protocol may choose to split up a packet, in which case it is required to rejoin it at the other end. +Each layer describes the protocol data unit (PDU) used, as well as the services it provides, and the terminology used. \subsubsection{Physical Layer} -The stack starts at the bottom with the physical layer. This is primarily concerned with the physical interconnection of the two computers and how the basic signalling works in terms of cabling or radio. Standards on this layer include 100BASE-T \cite{ieee802-3u} and 1000BASE-T \cite{ieee802-3ab} - commonly known as Fast Ethernet amd Gigabit Ethernet respectively. +The stack starts at the bottom with the physical layer. 
This is primarily concerned with the physical interconnection of the two computers and how the basic signalling works in terms of cabling or radio. Standards on this layer include 100BASE-TX \cite{ieee802-3u} and 1000BASE-T \cite{ieee802-3ab} - implementations of Fast Ethernet and Gigabit Ethernet respectively. The data on this layer is in the form of a raw bitstream. \subsubsection{Data Link Layer} -The data link layer layer providees basic addressing and controlling access to the physical layer. The dominant protocol in this area is Ethernet as part of IEEE 802.3. +The data link layer provides basic addressing and controls access to the physical layer, as well as subnet routing. A subnet is a small part of the complete network; often this may be a Local Area Network (LAN), though it may be as wide as an entire city, or just a small component of a LAN. This layer provides easy mobility of hosts among the subnet, without requiring reconfiguration. The dominant protocol in this area is Ethernet as part of IEEE 802.3. Data sent to this layer is encapsulated in an Ethernet frame, which then has additional data placed at the header and trailer of the frame. \subsubsection{Network Layer} -The network layer provides large scale routing, control over network traversal and identification of hosts. The primary protocol in this area is IP - Internet Protocol, in particular IPv4 \cite{rfc791} although IPv6 \cite{rfc1883} is starting to gain traction due to impending address exhaustion \cite{icann-030211}. +The network layer provides routing across the wider network, and identification of hosts on a global scale, with current networks estimated as having a billion hosts. The primary protocol in this area is IP - Internet Protocol, in particular IPv4 \cite{rfc791} although IPv6 \cite{rfc1883} is starting to gain traction due to impending address exhaustion \cite{icann-030211}. 
This provides IP Packets as the primary abstraction. \subsubsection{Transport Layer} -The transport layer provides multiplexing of data between end hosts. On this level also exists the capability for connection setup, reliability and ordering. Examples of protocols on this layer are UDP \cite{rfc768} and TCP \cite{cerf74}. +The transport layer provides multiplexing of data between applications on end hosts. On this level, protocols can also specify the capability for connection setup, reliability, encryption and ordering. Examples of protocols on this layer are UDP \cite{rfc768} and TCP \cite{cerf74}. TCP provides TCP segments which are reliably transmitted, ordered and provided at the other end, whereas UDP provides UDP datagrams which are simply sent to the opposite host, with no reliability or ordering. \subsubsection{Application Layer} -The top layer of the protocol stack in the Internet Protcol Suite model and the top three layers in the OSI Reference Model, Application, Presentation and Session, can be considered as a singular discrete layer - the application layer. These provide user level services situated on end hosts. Examples include HTTP \cite{rfc2616} and DNS \cite{rfc1035}. These provide for the transmission of material in the form of hypertext documents and other media and a distributed directory of names and addresses. +The top layer of the protocol stack in the Internet Protocol Suite model and the top three layers in the OSI Reference Model, Application, Presentation and Session, can be considered as a singular discrete layer, the application layer. These provide user level services situated on end hosts. Examples include HTTP \cite{rfc2616} and DNS \cite{rfc1035}. These provide for the transmission of material in the form of hypertext documents and other media and a distributed directory of names and addresses. 
\subsection{Ethernet} % TODO: queueing needs mentioning -Ethernet is the most widely deployed and used Local Area Networking - LAN - technology currently in the market. While the physical layer of the standard has undergone a number of iterations, the data link layer protocol has remained largely stagnant due to the lack of motivation and a desire for interopability with current equipment. +Ethernet is the most widely deployed and used Local Area Networking - LAN - technology currently in the market. While the physical layer of the standard has undergone a number of iterations, the data link layer protocol has remained largely stagnant due to the lack of motivation and a desire for interoperability with current equipment. -Ethernet was first described in a memo on May 22, 1973 at Xerox PARC by Bob Metcalfe who designed it to interconnect workstations and printers in a modern computer network. \cite[p.~125]{spurgeon} It was to be a shared medium network based on the previous work by the University of Hawaii in the late 1960's with the Aloha network. \cite{abramson}. Ethernet pioneered the technology by incorporating CSMA/CD (Carrier Sense Multiple Access with Collision Detection). This allowed the original Ethernet networks to be a single shared medium network where all the computers were connected together using a single cable. Every computer on the network would recieve all the packets and, based on the header, the Ethernet controller would inform the system when a packet arrived which was for the host in question. +Ethernet was first described in a memo on May 22, 1973 at Xerox PARC by Bob Metcalfe \cite{ethernet} who designed it to interconnect workstations and printers in a modern computer network \cite[p.~125]{spurgeon}. It was to be a shared medium network based on the previous work by the University of Hawaii in the late 1960's with the Aloha protocol \cite{abramson}. 
Ethernet used the CSMA - Carrier Sense Multiple Access - designed for ALOHA and added Collision Detection, to create a cable connected shared medium network. This allowed the original Ethernet networks to be a single shared medium network where all the computers were connected together using a series of cables. Every computer on the network would receive all the packets and, based on the header, the Ethernet interface would inform the system when a packet arrived which was for the host in question. -This relied on each computer having a unique address to which it could be sent packets, which the controller would listen out for. This was also the only required property - the address did not need to be transferable, it did not need to describe any information about the owner and it did not need to contain routing information. In order to provide this guarantee, Ethernet controllers were allocated a 48 bit long address. The first three bytes - 24 bits, were assigned by the IEEE and identify the type of address and the manufacturer. The last three bytes are assigned by the manufacturer, with the manufacturer guaranteeing that the address was unique, with an additional block of of addresses allocated when a manufacturer had run out. +This relied on each computer having a unique address to which it could be sent packets, which the controller would listen out for. This was also the only required property - the address did not need to be transferable, it did not need to describe any information about the owner and it did not need to contain routing information. In order to provide this guarantee, Ethernet controllers were allocated a 48 bit long address. 
The first three bytes, 24 bits, consist of one bit which determines whether the packet is multicast / broadcast or unicast, one bit which determines whether the address is locally administered or universally administered, and then the remaining 22 bits, which in the case of unicast, universally administered addresses, are used to designate a manufacturer, and are assigned by the IEEE. The last three bytes are assigned by the manufacturer, with the manufacturer guaranteeing that the address is unique, with an additional block of addresses allocated when a manufacturer had run out. Each manufacturer gives each interface a unique address by incrementally assigning the addresses within each block. -As network traffic increased, and more computers were added to local area networks, the reality of all computers sharing a single shared media with which any packet would collide with other traffic became impractical. This was exacerbated by distant computers being far away from each other, descreasing the utility of Carrier Sense, increasing the likelihood of collision, which is followed by a process where each host backed off before trying again, thus decreasing utilisation of the network. +As network traffic increased, and more computers were added to local area networks, the reality of all computers sharing a single shared medium with which any packet would collide with other traffic became impractical. This was exacerbated by computers being located far away from each other, decreasing the utility of Carrier Sense, increasing the likelihood of collision, which is followed by a process where each host backed off before trying again, thus decreasing utilisation of the network. -The solution to this was to split the single colision domain into a set of domains which would be joined together using a network bridge to form a single collision domain. 
\cite{ieee802-1d} The bridge, or switch as it is also known due to the manner in which it operates, filters traffic between the different collision domain, only sending traffic where it is required. As the number of hosts on the network, and the speed of network traffic increased, more and more collision domains were created. Modern Ethernet networks are entirely switched networks with each host having a full duplex link to the local switch. +The solution to this was to split the single collision domain into a set of different domains which would be joined together using network bridges to form a single broadcast domain \cite{ieee802-1d}. The bridge, or switch as it is also known due to the manner in which it operates, filters traffic between the different collision domains, only sending traffic where it is required. As the number of hosts on the network, and the speed of network traffic increased, more and more collision domains were created. Modern Ethernet networks are entirely switched networks with each host having a full duplex point to point link to the local switch, and each switch being connected directly to other switches, to form a network topology. -The next problem was that as networks became more vital to operation, it became desirable to have redundant links. In order to allow this the switches were often aranged in a way that formed loops. However, the Ethernet protocol relies on traffic being sent to a tree, and not a graph with cycles in it, in order to prevent broadcast traffic from cycling the network endlessly using up all the available bandwidth. In order to solve this Ethernet uses a spanning tree protocol in order to convert the graph of switches into a tree. This tree is not guranteed to a minimum spanning tree, instead it is guaranteed that all switches have the shortest possible path to a root bridge which may be specified by the network administrator in top end switches. 
The current iteration of this protocol is the Rapid Spanning Tree Protocol - RSTP and this disables links until that condition is met. In practical terms this means that redundant links which could be used to increase the available network bandwidth are instead disabled. Should the state of the network change, they will be re-enabled to allow the network to continue to operate. +The next problem was that as networks became more vital to the operation of a business, it became desirable to have redundant links, so that when a link failed, traffic could be diverted. In order to allow this the switches were often arranged in a way that formed topological loops. However, the Ethernet protocol relies on traffic being sent to a tree, and not a graph with cycles in it, in order to prevent broadcast traffic from cycling the network endlessly using up all the available bandwidth. In order to solve this Ethernet uses a spanning tree protocol in order to convert the graph of switches into a tree. This tree is not guaranteed to be a minimum spanning tree, instead it is guaranteed that all switches have the shortest possible path to a root bridge, which may be specified by the network administrator. The current iteration of this protocol is the Rapid Spanning Tree Protocol, RSTP, which changes the state of links until that condition is met. In practical terms this means that redundant links which could be used to increase the available network bandwidth are instead disabled. Should the state of the network change, they will be re-enabled to allow the network to continue to operate. \subsubsection{Limitations} % TODO: Additional limitations -As Ethernet addresses contain no routing information, unlike IP addresses in which parts of the network space are delegated to segments of the network, it is impossible in a practical scenario to perform any form of aggregation. 
This means in order for network switches to switch traffic between different network segments it has to contain a lookup between each address and the port to send the data. Due to the high speed at which modern network links operate, this has to be done quickly. This is done using Content Addressable Memory (CAM) which provides O(1) lookup time for unsorted data. CAM is very expensive in large blocks and consumes large amounts of power. This means that modern Ethernet switches have a practical state table size maximum of between 8000 and 32000 hosts. \cite{cam}. +As Ethernet addresses contain no routing information, unlike IP addresses in which parts of the network space are delegated to segments of the network, it is impossible in a practical scenario to perform any form of address aggregation. This means that, in order for a network switch to switch traffic between different network segments, it has to contain a lookup between each address and the port on which to send the data. Due to the high speed at which modern network links operate, this has to be done quickly. Depending on the switch, this might be done using Content Addressable Memory (CAM) \cite{cam}, which provides $O(1)$ lookup time for unsorted data but is very expensive in large blocks and consumes considerable amounts of power. Alternatively, it can be done using a lookup table which, while cheaper, provides worse performance as the number of entries grows. Since the time spent processing a frame is critical, this limits the feasible size of the state table. This means that modern Ethernet switches have a practical state table size maximum of between 8000 and 32000 hosts \cite{switchds}. -Current trends in data centers, with large numbers of machines in a dense configuration \cite{facebook}, each machine containing multiple hosts through the utilisation of Xen VM \cite{xen} or VMware ESX hypervisors, giving each host a virtual Ethernet controller are causing this limit to be exceeded \cite{moose1}. 
When the switch's state table is full it can not store any more entries. When a packet arrives to a host not in the state table, it must revert to broadcast, increasing the amount of traffic on the network, decreasing utilisation and increasing the delay per packet. +Current trends in data centres, with large numbers of machines in a dense configuration \cite{facebook} and each machine able to contain multiple hosts via virtualisation software such as Xen \cite{xen}, giving each host a virtual Ethernet controller, cause this limit to be exceeded \cite{moose1}. As such, the number of hosts on the network will exceed the size of the state table. Frames destined for any host whose state cannot be stored in the switch will be flooded on every port. As such, the overall traffic on the network will increase, decreasing the utilisation and increasing the delay per packet. -Another limitation is that when the spanning tree protocol converts the network from a graph to a tree it disables a large number of links, reducing the available network utilisation. A more intelligent protocol could allow these links to be enabled in order to increase utilisation. This could also use multiple links when available to increase the available bandwidth between switches. +Another limitation is that when the spanning tree protocol converts the network to a tree from the initial graph topology, it disables a number of the available links. By doing this, it reduces the available network bandwidth. With a better routing protocol, these links could be put to use to decrease the shortest path between two hosts, as well as being used for multipath routing. \subsection{MOOSE} % TODO: Make this section clearer. Add graphic -MOOSE - Multi-level Origin-Organised Scalable Ethernet is a proposed improvement on the Ethernet protocol design to backwards compatible with the large number of ethernet controllers currently in the market place. MOOSE requires only the replacement of switches in the target network, or in some cases simply installing the MOOSE code on the existing switches. +MOOSE - Multi-level Origin-Organised Scalable Ethernet - is a proposed improvement on the Ethernet protocol designed to be backwards compatible with the large number of Ethernet controllers currently in the market place. 
MOOSE requires only the replacement of switches in the target network, or in some cases simply installing the MOOSE code on the existing switches. +MOOSE - Multi-level Origin-Organised Scalable Ethernet - is a proposed improvement on the Ethernet protocol design to be backwards compatible with the large number of ethernet controllers currently in the market place. MOOSE requires only the replacement of switches in the target network, or in some cases simply installing the MOOSE code on the existing switches. MOOSE performs in place rewriting of Ethernet datagrams on the switch in order to provide the aggregation of addresses on a per switch basis. By providing for routing information within the address, the MOOSE switch allows for traffic to be routed between MOOSE switches, allowing for better utilisation of network links. -This alleviates the problems associated with the deployment of STP, and also allows smaller state tables as each switch only needs to know about it's local hosts and the other switches, resulting in a significantly lower size of state table. +This alleviates the problems associated with the deployment of RSTP, and also allows smaller state tables as each switch only needs to know about the hosts directly attached to it and the other switches, resulting in a significantly lower size of state table. -The address a host will be rewritten to is calculated by looking at the switch it is assigned to. Each switch has a unique identifier. This is the concatenated with a unique identifier for the host. For example, the switch with a MOOSE address of 02:00:01:00:00:00, with the first three bytes being the unique host identifier, 02:00:01, will have hosts with addresses such as 02:00:01:00:00:01 and 02:00:01:00:00:02, where 00:00:01 and 00:00:002 are the unique host identifiers. +MOOSE addresses are assigned by using the locally administered portion of the Ethernet address space. 
By using this section of the address space, none of the addresses will conflict with any manufacturer-assigned addresses, which are set as universally administered. Each address is also marked as a unicast address. -In the state table, a MOOSE switch contains a mapping between host identifiers and MAC addresses and ports, and a separate state table mapping between switch identifiers and ports. By aggregating all of a switch's hosts under a single entry in the state table, a vast reduction in size is achieved. +The MOOSE address is then constructed in two separate parts. The first is the switch identifier, which identifies the switch to which the host is directly attached. The second part is the host identifier. This gives MOOSE addresses a hierarchical nature, allowing routing based on the identifier, as well as allowing aggregation of host addresses under one switch. -When a packet enters the switch, if it contains a normal Ethernet address, it is rewritten to a MOOSE address and this mapping is stored in the table. It is then directed using the host identifier for switch local traffic and the switch identifier for non local traffic. +Each host is assigned a single host identifier, and when the switch receives traffic from the host, it rewrites the source address to contain the MOOSE address of the host. For example, when a host is attached to a switch with a MOOSE address of 02:00:01:00:00:00, it will be assigned a MOOSE address such as 02:00:01:00:00:01 and any traffic it sends will be rewritten to this address. When traffic is received by the switch at this address, it will be rewritten back to the host's MAC address and sent to the host. If another host is attached, it will be assigned 02:00:01:00:00:02 and so forth. A host attached to a different switch on the same subnet might be assigned 02:00:02:00:00:00. -MOOSE addresses are identified by looking at the second bit in the most significant byte. 
This is set to universally administered for almost all Ethernet devices (as assigned by the IEEE). MOOSE only uses addresses with this set to locally administered allowing them to differentiated. +In the state table, a MOOSE switch contains a mapping between host identifiers and MAC addresses and ports, and a separate state table mapping between switch identifiers and ports. By aggregating all of a switch's hosts under a single entry in the state table, a vast reduction in size is achieved. + +When a packet enters the switch, if it contains a normal Ethernet address, it is rewritten to a MOOSE address and this mapping is stored in the table. It is then directed using the host identifier for local traffic, that is hosts directly attached to the switch, and the switch identifier for remote traffic, i.e. traffic where the host is attached to a different MOOSE switch. \subsection{Simulation} +Network simulation is an evolving area of research, with many different simulators and simulation techniques being used. It is rapidly becoming the most popular way of performing large scale network research for both local area networks, and the wider internet due to the low cost and speed of iteration of different tests that it allows. When testing networks of significant scale, the approach commonly taken is to verify it works on a small testbed, before scaling up under simulation. + +There are a number of different approaches to simulating a computer network architecture. The first is discrete event simulation. Here a simulator is primed with a topology and a number of event sources. These event sources fire events, which are executed in turn, which may then fire other events. The events are stored in a priority queue. Each event is executed according to its time, with a global variable containing the current time of the simulation being advanced upon the execution of each event. 
This is continued until either there are no more events left in the queue, or some predefined time has passed. This simulation technique has the advantage that periods in which nothing of interest happens are passed over quickly, while the majority of the computation is spent on notable events. + +Other types of network simulation include Markov chain simulations, which are useful for modelling queueing + + Network simulation is a hot area of research activity with many different simulation techniques identified. I have chosen to use network simulation instead of using a different evaluation technique like building a real life model of the networks under simulation due to the large number of hosts required which would be impractical in terms of cost. Network simulation poses a viable alternative, especially if a well known and tested network simulator is used. +The two main types of network simulators are discrete event simulators and + Since no implementation of MOOSE exists for a current network simulator, I will need to program this module and make checks to ensure it produces valid data. In terms of simulation, I shall be using a discrete event simulator, in which events are added to a priority queue, with the lowest item being taken off the queue and processed. Each event may generate future events which will be processed until a finish condition is reached at which point the simulation will terminate. @@ -94,7 +103,7 @@ Simulation has a number of drawbacks. Primarily it is not a perfectly accurate r \section{Context} -The most important work in this area is set out by Scott et al. in \cite{moose1} which sets out the ideas behind MOOSE which this paper aims to test under simulation. A prototype NetFPGA implementation also exists by Wagner-Hall et al. in \cite{moose2} in which a practical implementation of MOOSE is described. This gives a reference implementation from which this works is based on. 
A number of different network simulators have been identified and one of these will be selected to provide a foundation for the work. +The most important work in this area is set out by Scott et al. \cite{moose1}, which sets out the ideas behind MOOSE that this paper aims to test under simulation. A prototype NetFPGA implementation also exists by Wagner-Hall et al. \cite{moose2}, which outlines a practical implementation of MOOSE. \section{Aims} @@ -114,5 +123,5 @@ I aim to show how MOOSE and Ethernet compare as the number of hosts under simula \section{Relevant Courses} -I have used the knowledge from the following courses: Digital Communications 1 and Principles of Communication for network concepts, Algorithms 1 and 2 for general algorithm design, Programming in C and C++, as most the majority of the code will be written in C or C++, Software Design and Software Engineering for general software engineering and best practice, Computer Systems Modelling for ideas about simulation, Concurrent and Distributed for looking at how the switches co-operate, Object Orientated Programming, which is the main software design paradigm in use in designing simulators. +I have used the knowledge from the following courses: Digital Communications 1 and Principles of Communication for network concepts, Algorithms 1 and 2 for general algorithm design, Programming in C and C++, as the majority of the code will be written in C or C++, Software Design and Software Engineering for general software engineering and best practice, Computer Systems Modelling for ideas about simulation, Concurrent and Distributed Systems for looking at how the switches co-operate and Object-Orientated Programming, which is the main software design paradigm used in designing simulators.