From: Richard Whitehouse Date: Fri, 6 May 2011 09:54:26 +0000 (+0100) Subject: Rewritten implementation section X-Git-Url: https://git.richardwhiuk.com/?a=commitdiff_plain;h=ceeb115f1ded78a2f8afc80000180761f99b4be4;p=ii-diss.git Rewritten implementation section --- diff --git a/implementation.tex b/implementation.tex index e144095..d79fa97 100644 --- a/implementation.tex +++ b/implementation.tex @@ -1,38 +1,145 @@ \chapter{Implementation} -The starting point for the implementation was the ns3 project. This was branched into a new git repository and built on my desktop computer. Once I had ensured that ns3 was working, by using the test routines provided, the next step was to enhance the existing Ethernet implementation. +\section{System Design} -The inital implementation of a CSMA switch was split into two different parts in order to separate the different functionality, namely into the bridge and a separate class for representing a port on the switch, as this would be needed to hold state in the RSTP implementation. An additional deficency in the CSMA implementation was identified, namely that the state table did not limit the number of entries which could be stored in it. This was fixed, allowing only 8000 host entries to remain in the table at any time. If more than that were in the table, new entries would be dropped. +The system is structured into three different phases, each of which performs a different task. These are generation, simulation and analysis. These are connected together using a series of file formats. This section describes the three phases and how they are structured and the file formats used. -After this had been completed the structure of the MOOSE implementation was implemented. This primarily involved coding support structures to hold the MOOSE address identifiers, comprising of both a host and switch section. The state of the MOOSE switch was also coded. This involved two separate state table structures. 
+\subsection{Generation} -The first, indexed by switch identifier, which is mapped to port. This maps other MOOSE switches to allow forwarding of packets. +Due to the large number of hosts under simulation and the heterogeneous nature of local area network topologies, it was decided that the best way to proceed would be to dynamically generate different network topologies in order to send the data over. The main output of the Generation phase is a topology file which details the number of hosts, number of bridges and the connections between them. -The second, is indexed by both host identifier and Ethernet addresss, is again mapped to a port on the switch. This holds hosts attached to the switch. The first index is used to forward and translate packets arriving at the switch and being forwarded to the host. The second allows the packets from the host arriving at the switch to be translated into MOOSE packets. +The core component of the generation phase is the Topology Helpers. These are based on the Factory design pattern, and each helper creates a different kind of network topology. The topology created is represented in the Topology class, which contains the number of hosts in the topology, the number of bridges, a map between hosts and bridges (since each host may only connect to one bridge) and a set data structure containing pairs of bridge to bridge links. This is a totally ordered set of bridge to bridge links where order in the pair is unimportant as the links are bidirectional. -Once this had been completed, the next stage was implementing the translation of packets by the MOOSE switch. This was achieved by using the state tables described above, and sequentially generating a new host identifier, and thus MOOSE address based on the switch's MAC address. +The following types of topology are currently implemented: -The next stage was implementing ARP rewriting. 
As ARP packets contain a hardware address inside the packet, they require rewriting by the switch in order for the protocol to work correctly. +* Tree +* Mesh +* Cube +* Torus -At this stage, both Ethernet and MOOSE correctly worked on a acyclic network. As such I implemented a testing framework a simulated a number of packets to verify the work so far was completed correctly. +Each topology type takes a number of parameters and outputs a topology as a result. This topology is then stored in a file. -For the next stage, I implemented static routing. This was done by working out the routing for each switch before hand. For MOOSE I used the Boost Graph Library dykstra implementation. I also used the Boost Graph Library for Ethernet, in that case Kruskall's minimum spanning tree algorithm. However, it soon became obvious that this provides different results than the RSTP algorithm as instead of optimising for total spanning tree length, it optimises distance from the root bridge to each switch. +\subsection{Simulation} -I replaced the minimum spanning tree algorithm with one which arbitarily selects a root bridge, and then iteratively adds each neighbour to a priority queue, ordered by distance to the root, and then retrieves the first from the queue which isn't already a member of the tree. It terminates when either the queue is empty, or the tree includes all bridges. In the case of all links having the same weight, I optimise to use a normal queue instead. +Once the topology has been determined, either via generation from the Generation phase, or from manually creating a topology file, the simulation phase can begin. This is the core phase of the project and responsible for most of the work. The simulation phase uses the ns3 simulator as a core component, with the necessary components to perform Ethernet and MOOSE simulation built on top of it. +\subsubsection{Network} +The first stage is to set up the network. 
This is done by reading in a topology file and converting it into a series of ns3 objects, which are then stored in the Network class. The topology file is loaded into a Topology object, which is then passed to the Link Layer Helper. The helper also takes whether the simulation should be done using MOOSE or Ethernet and whether dynamic or static routing should take place. -% things to write about: +First it creates a node to represent each of the hosts and bridges in the topology. It then creates the Ethernet links between the hosts and bridges, and the links between the bridges as specified in the topology file. Each Ethernet link is Gigabit, with a 2 millisecond delay. -% refactor +Once it has done this, it configures the routing. Early in the project, I decided to allow two different forms of routing, static and dynamic. -% moose implementation +Static routing is done at setup time by running an algorithm across all the nodes to determine the network map. In the case of Ethernet it runs a static spanning tree algorithm, whereas in MOOSE it runs an all-pairs shortest path algorithm by running Dijkstra's algorithm on each node. It can do this more efficiently than running Johnson's algorithm as all the path costs are known to be positive. This is implemented using the Boost Graph Library implementation of Dijkstra's algorithm \cite{boostgraph}. -% topology thing +Dynamic routing on the other hand relies on the interchange of BPDUs - Bridge Protocol Data Units - for Ethernet, as outlined in \cite{ieee802-1d} for the Rapid Spanning Tree protocol. MOOSE, however, can use a routing protocol: OSPFM (Open Shortest Path First for MOOSE), an implementation of OSPF \cite{rfc2328}, is suggested in \cite{dwh}. This happens during the running of the simulation. -% routing implementation +Once the routing has been configured, the Link Layer Helper initialises a BridgeNetDevice on each bridge node in the case of Ethernet, or a MooseBridgeNetDevice in the case of MOOSE. 
This represents a switch and contains the implementation of the Link Layer protocol on the switch. The BridgeNetDevice and MooseBridgeNetDevice objects are created through the use of further factory classes, BridgeNetDeviceHelper and MooseBridgeNetDeviceHelper respectively. These are responsible for creating the bridge and ports on the bridge. -% dynamic routing +A pre-existing BridgeNetDevice implementation existed for Ethernet, but it was incomplete, with no support for RSTP - Rapid Spanning Tree Protocol. It also had no limit on the size of the state table, and was implemented in a single class. I decided to split it up into several different classes to separate the different concerns. It now uses a BridgeNetDevice, with the ports on the Bridge represented as a number of BridgePortNetDevice objects, each of which controls the reception and transmission of packets on its port, and a BridgeState class which holds the state table for the Bridge. By performing this separation of concerns, the implementation ensures that each concern is handled efficiently. -% .... +A similar approach was taken to implementing the MooseBridgeNetDevice, which again has a number of MooseBridgePortNetDevice objects to represent each port on the bridge, and a MooseBridgeState object to hold the bridge's state. + +Alongside these core classes, a number of different types were created to hold the MOOSE data required. These were MooseAddress, MoosePrefixAddress and MooseSuffixAddress, which hold the MOOSE Address components required to implement MOOSE. The first of these can readily be converted to and from a Mac48Address, ns3's implementation of a MAC Address, which allows it to be used to interrogate MAC addresses. 
MooseAddress distinguishes three types of MAC Address: HOST, a globally administered MAC Address as used by normal Ethernet devices; MOOSE, an address assigned by a MOOSE switch; and MULTICAST, a MAC address used for multicast/broadcast, and thus not translated to a MOOSE address. + +The final stage of network setup is to initialise the hosts. Each host is given a standard IPv4 stack, complete with IP, ARP, ICMP, UDP and TCP. This is the base from which we simulate packets across the network. + +\subsubsection{Data} + +The next stage is to set up the data streams across the network. This is set up based on an input file which details which hosts send what data to which other hosts. The data file lists the time to start sending data, the host to send from, the host to send to, and the number of packets to send. This flexible format allows the simulation to simulate a large variety of different traffic scenarios. + +Each UDP packet is sent from a UdpClient installed on the sender for each set of packets, to a UdpServer which is installed once per recipient, on port 9, the standard port for discarding packets \cite{rfc863}. + +Once the file has been read in, the UdpClient and UdpServers are installed using the ns3 Helper classes designed for this purpose. + +\subsubsection{Tracing} + +Once this is set up we enable tracing on each of the nodes. Two types of tracing are done. The first is an ASCII trace of all the Ethernet links. This notes enqueue and dequeue operations on each of the Ethernet queues, dropped packets and reception of packets by a device. This is human readable, and can also be parsed by programs to generate statistics on the network. + +The second is a PCAP - Packet Capture - file format trace, where one file is generated per link. Each file details the packets transmitted across that link. PCAP files are useful as they are widely used by packet trace programs such as Wireshark. 
They can be loaded into Wireshark to see what would have been seen by packet capture software running on that host, verifying that the other traces are correct. + +\subsubsection{Run} + +Finally, once the setup is complete and the trace files are connected to the correct sources, the simulation is run to gather the data. When the simulation is finished, the program captures the state data in each bridge and outputs it in a common file format for later analysis. + +\subsection{Analysis} + +The analysis section + +\subsection{File Formats} + +In order to pass data between the different sections of the project, a number of different file formats have been devised. These are all designed to be editable using a text editor in order to manually verify the contents. The exact layout of the files is detailed in the appendix. + +Each file begins with a magic marker, ``ns-moose'', for identification, followed by the file type, and file type version. This allows multiple file types to be defined and allows the system to check the correct file has been input. It also allows the file types to be changed in a later version of the code. + +\subsubsection{Topology} + +The topology file details the graph of the network. Specifically it lists the number of hosts in the network, the number of bridges and the links between them. When describing the links between switches, the switches are zero indexed, with the hosts following afterwards. When the file is interpreted by the Topology class it ensures that each host can only link to one bridge. + +\subsubsection{Data} + +The data file details the data sent across the network. After the file header, it lists, in order, the sets of packets to be transferred across the network. The first field contains the time to start, interpreted as a double. After this follow descriptors containing the hosts (zero indexed) to send to and from. Finally each section has the number of packets to send across the link. 
+ +\subsubsection{State} + +\subsubsection{Trace} + +The two trace file formats used are an ASCII format included with ns3, as described in the ns3 manual, and the PCAP format used notably by tcpdump and Wireshark. An ASCII trace and PCAP traces for each link can be created for each run of the simulator. + +\section{Core Algorithms and Data Structures} + +\subsection{Topology Representation} + +The topology is represented internally with a std::map between hosts and bridges, since each host may only appear on one bridge, and a std::set of pairs which contains bridge to bridge links with a custom comparator. The custom comparator is designed to give a total order to links without caring about the order within a pair, as links are bidirectional. Thus $\langle 1,2 \rangle$ is treated as equal to $\langle 2,1 \rangle$. The order defined is that $\langle 1,2 \rangle < \langle 1,3 \rangle < \langle 4,1 \rangle < \langle 3,4 \rangle$; in other words, the lower number is used for the primary sort, and the larger one as a secondary sort. These provided adequate performance for the purpose. + +\subsection{MOOSE State Tables} + +In order to hold the data required for the MOOSE implementation a series of tables were designed. While in reality these would be implemented in CAM (Content Addressable Memory) in order to provide $O(1)$ performance, in this implementation they are implemented using std::map. + +The first is the Prefix table. This handles MOOSE traffic which is remote to this bridge. This provides a Prefix to Port and Expiration Time mapping. This is updated when traffic from a remote bridge arrives at this bridge: the switch stores a bridge to port mapping. This will also be updated by the routing protocol. When data is to be sent to the specified prefix, the port is looked up, and after making sure it is still valid, the packet is sent on that port. + +The second is the Suffix table. This maps MOOSE Suffix Addresses (i.e. the host part) to an Ethernet address. This is indexed by both Ethernet address (48 bit MAC Address) and the suffix. 
The suffix index is used when translating packets arriving at the switch for delivery to the host; the Ethernet address index is used when translating packets arriving from the host to the correct MOOSE address. + +The third table is the Port table. This maps MOOSE Suffix Addresses to the correct port. This is updated when a packet comes from the host, and used when a packet is sent to the host in question. + +\subsection{Reverse Path Forwarding} + +To prevent flooding causing network starvation in Ethernet, the system converts the graph to a spanning tree. While this approach could be used in MOOSE, a different approach is preferred, which also allows routing to take place. This is a technique derived from implementing multicasting in IP networks without resorting to TTL - Time To Live - expiry. + +Reverse path forwarding works by only forwarding packets which arrive on the port that the switch would itself use to send packets back to their source. For example, if a packet from switch 6 is received by switch 4, it is only transmitted onwards if switch 4 would send packets to switch 6 on that interface. This creates a natural broadcast prevention by creating an implicit tree from the root switch of the broadcast. + +In my implementation this is used for all broadcast packets to prevent flooding causing network degradation. Broadcast packets are important as they are used to implement ARP, the Address Resolution Protocol, which converts IP addresses into hardware addresses. + +\subsection{Static Spanning Tree} + +The static spanning tree algorithm is used to compute, in the setup phase, the spanning tree that RSTP would produce. By doing it here, I was able to check the network was performing correctly before I had implemented RSTP. + +The spanning tree algorithm is implemented using a priority queue and two maps. One map takes pairs of bridges and states whether they are part of the spanning tree. The second looks at bridges and checks whether they have been included in the tree. The queue contains a list of links to check. + +The algorithm executes in the following way. 
+ +* The bridge with the smallest MAC address (in my implementation, this is always numbered 0, as switch MAC addresses are assigned increasing from 00:00:00:00:00:00) is chosen as the root. + +* The bridge is marked as part of the tree, and each of its links is added to the queue with the cost of traversing the link as the weight. + +* The first link is then removed from the queue. If the bridge it points to is already in the tree, then it is discarded. + +* Otherwise, the node it points to is added to the tree, and all that bridge's links are added to the queue, with the weight being the cost of traversal from the root bridge. + +* The next link is removed from the queue and the algorithm repeats. The algorithm terminates when either there are no more links on the queue, or all the bridges are part of the tree. + +The algorithm assumes that all the nodes are connected, and that all links are bidirectional, which is the case in the networks I am testing. In the cases I am testing, the priority queue can also be replaced by a normal queue, since all the links have equal traversal cost. + +\subsection{Rapid Spanning Tree} + +\subsection{Static Routing} + +The MOOSE implementation has a static routing algorithm. This simply consists of inserting all the bridges and links into a graph, assigning a weight to each link (since all the links are the same, they are all given a weight of 1), and then running Dijkstra's algorithm \cite{dijkstra} from every node to obtain all-pairs shortest paths. This is an optimisation of Johnson's algorithm, as it is known that there are no negative weights, and as such there is no need to run the Bellman-Ford algorithm. As such, it has complexity $O(n^{3})$. + +%\subsection{Dynamic Link State Routing} + +%\section{Ethernet Implementation} + +%\section{MOOSE Implementation}
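The steps listed in the Static Spanning Tree subsection can be sketched in C++17 as follows. This is only an illustrative sketch, not the code from the repository: the names Link, TreeLink and BuildSpanningTree are hypothetical, bridges are assumed to be numbered from 0 with bridge 0 as the root, and a std::set of normalised pairs stands in for the two maps described in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <queue>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not the dissertation's own classes.
struct Link {
    int a;     // one endpoint (bridge index)
    int b;     // other endpoint (bridge index)
    int cost;  // cost of traversing the link
};

using TreeLink = std::pair<int, int>;  // stored as (lower index, higher index)

// Builds a spanning tree rooted at bridge 0, always preferring the link that
// reaches a new bridge with the smallest total distance from the root.
std::set<TreeLink> BuildSpanningTree(int numBridges, const std::vector<Link>& links) {
    // Adjacency list; every link is bidirectional.
    std::vector<std::vector<std::pair<int, int>>> adj(numBridges);
    for (const Link& l : links) {
        assert(l.a >= 0 && l.a < numBridges && l.b >= 0 && l.b < numBridges);
        adj[l.a].emplace_back(l.b, l.cost);
        adj[l.b].emplace_back(l.a, l.cost);
    }

    // Queue entries are (distance from root, from bridge, to bridge),
    // smallest distance first.
    using Entry = std::tuple<int, int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;

    std::vector<bool> inTree(numBridges, false);
    std::set<TreeLink> tree;
    if (numBridges == 0) return tree;

    const int root = 0;  // assumption: the smallest MAC address is bridge 0
    inTree[root] = true;
    int included = 1;
    for (const auto& [next, cost] : adj[root]) queue.emplace(cost, root, next);

    // Stop when the queue is empty or every bridge is in the tree.
    while (!queue.empty() && included < numBridges) {
        const auto [dist, from, to] = queue.top();
        queue.pop();
        if (inTree[to]) continue;  // bridge already reached: discard the link
        inTree[to] = true;
        ++included;
        tree.emplace(std::min(from, to), std::max(from, to));
        for (const auto& [next, cost] : adj[to]) {
            if (!inTree[next]) queue.emplace(dist + cost, to, next);
        }
    }
    return tree;
}
```

On a triangle with link costs 0-1 = 2, 1-2 = 2 and 0-2 = 3, this sketch keeps links 0-1 and 0-2, because bridge 2 is closer to the root directly (3) than via bridge 1 (4). A minimum spanning tree would instead keep 0-1 and 1-2. When all links have equal cost, as in the networks tested here, the priority queue behaves like a plain FIFO queue.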