The word “reimagine” is one of those words loved by marketing people and often loathed by engineers. But, in the context of this column, I think it is appropriate. The word “reimagine” should be close to every engineer’s heart, as it is at the essence of what we all love: solving problems in a creative and innovative way.
Over the last decade or two, we have witnessed a great deal of creativity and innovation in how we build networks and deliver communication services. We have witnessed the rise of Ethernet and IP and how these two protocols laid the foundation for a common networking paradigm that we take for granted today. We have witnessed the rise of the IP-based internet and how every imaginable service has been dramatically affected. We have witnessed the rise of cloud computing and how this has, in a sense, completed the disruption that the introduction of the internet first promised.
Now, in the latest wave of creativity and innovation, we are witnessing how the workhorse that we have taken for granted, the standard computing platform, is being reimagined. This promises not only to reimagine computing, but also to reimagine how networking and communication services are delivered in the future.
The new wave I am talking about is reconfigurable computing.
The standard computing platform is the generic CPU-based server. The rise of the internet and of cloud computing has been propelled by the versatility and scalability of this platform. What the last decade has proven is that the standard computing platform can handle almost every imaginable application or workload and is the basis for hyper-scale computing.
One of the reasons for the success of the standard computing platform is Moore's Law: the observation that the number of transistors on a chip doubles roughly every 18 to 24 months. This performance-improvement cadence set the rhythm for the entire IT industry and ecosystem, as well as for customer investment cycles, because it assured everyone that the growth in data usage and service consumption could be managed cost-effectively.
However, as time has progressed and the laws of physics have begun to impose themselves, new approaches have been needed to sustain this progress, such as multi-core processors. Now that this path, too, is reaching the end of its viability, the industry has been searching for an answer. That answer is reconfigurable computing.
Large cloud service providers, also called hyper-scale cloud companies, were the first to feel the pressure and therefore the need for another approach. Facing unprecedented growth in customer uptake, service consumption and, with it, data load, the hyper-scale companies needed to find a platform that could keep up with data growth in an efficient and cost-effective manner.
Several technologies were considered, ranging from custom ASICs to NPUs to MIPS, but the consensus has been that the perfect complement to generic CPUs in a standard computing platform is the reconfigurable field-programmable gate array (FPGA). The combination of standard computing platforms with FPGA technology is the basis for reconfigurable computing: it pairs much of the flexibility of software with the high performance of hardware by offloading processing to flexible, high-speed computing fabrics.
The first hyper-scale cloud company to go public with its use of reconfigurable computing technologies like FPGAs was Microsoft Azure. At the Open Networking Summit in June 2015, Azure CTO Mark Russinovich gave the first public presentation of Azure's FPGA-based SmartNICs. These were being used to accelerate workloads in Microsoft Azure data centers, and the results were proving so positive that the plan was to extend the use of the SmartNICs throughout Azure's data centers to accelerate all workloads. That is exactly what Microsoft Azure has since done.
Why this platform? According to Microsoft: “We’ve achieved an order of magnitude performance gain relative to CPUs with less than 30 percent cost increase, and no more than 10 percent power increase. The net results deliver substantial savings and an industry-leading 40 gigaops/W energy efficiency for deployed at-scale accelerators.”
The rest of the cloud industry was taking note, and many were already on the same research path. Amazon Web Services, for instance, now offers “FPGA-as-a-Service,” where developers of FPGA-based solutions can try out their designs quickly without having to invest in or wait for the hardware. We expect that this is just the beginning and that a number of FPGA-based “micro-services” will emerge that everyone can exploit to their advantage.
This adoption of FPGA technology and the proof that this technology is well suited to accelerating a broad range of applications prompted Intel, the largest producer of CPUs for standard computing platforms, to acquire the second-largest vendor of FPGAs, Altera, for $16.7 billion in 2015. This was one of the largest technology acquisitions of all time, and the fact that it was Intel, the company that sets the cadence for the rest of the industry, is significant.
Intel intends to combine CPU and FPGA technology in standard computing platforms, which can be considered reconfigurable computing platforms. The FPGA technology can be on a SmartNIC but can equally be on the motherboard or integrated with or in the CPU chip itself.
In my next column, I’ll look at why FPGAs are so powerful in this context, the major challenge of working with FPGAs, and how vendors and companies are addressing the challenge.