Dong Zhou Data Structure Engineering for High Performance Software Packet Processing Degree Type: Ph.D. in Computer Science Advisor(s): David G. Andersen Graduated: August 2019 Abstract: From software routers to virtual switches to the more general concept of Network Function Virtualization (NFV), the applications of software based packet processing have been booming in recent years. Unlike specialized hardware, general-purpose CPUs are not optimized for processing network packets and thus demand software systems to be carefully designed and optimized to meet ever-increasing performance requirements. This thesis strives to address the performance challenges of software packet processing by designing and implementing high performance, memory efficient data structures for each of the applications we examined. We make three contributions. First, based on concurrent cuckoo hashing, we propose a set of algorithmic and architecture-aware optimizations to craft an x86-optimized, high-performance hash table, which we then use to build scalable and resource-efficient software-based Ethernet switches. Second, we propose an extremely compact set separation data structure that omits keys and uses a variant of perfect hashing to store values. This data structure is the backbone of our new architecture for building scalable, clustered network appliances. Third, we propose a new software cache design and a cache eviction algorithm that balances between cache hit rate and lookup latency. Replacing the microflow cache in the Open vSwitch with our cache design improves the throughput by up to 15%. Thesis Committee: David G. Andersen (Chair) Michael Kaminsky (Intel Labs) Justine Sherry Sylvia Ranasamy (University of California at Berkeley) Srinivasan Seshan, Head, Computer Science Department Martial Hebert, Dean, School of Computer Science Keywords: Software Switch, Cuckoo Hashing, Scalability, Network Function Virtualization, Hashing Algorithms