Platform shifts kill companies. Mainframe companies missed the shift to minicomputers, minicomputers to desktops, and desktops to mobile. Machine learning is the next platform shift. Hardware platforms for machine learning have yet to be defined.
Creating the machine learning models that power the next generation of self-driving cars, home assistants and language translation will require new hardware architectures that are significantly faster and address much larger memory spaces.
To ensure it doesn’t miss out on this platform shift, Intel spent $15.3 billion to acquire Mobileye. Its goal is to gain early entry into the self-driving car market, where vehicles will consume even more silicon than cars do today for self-driving, safety and convenience features.
Machine learning hardware that controls self-driving cars and powers data centers is yet to be fully defined, presenting a large opportunity to build special processors—where NVIDIA currently leads. Intel does not want to be late to market with chips that power the next generation of cars.
If there is one thing to be learned from Intel’s dominance in the Windows desktop and server markets and ARM’s dominance of the mobile platform market, it is that machine learning hardware is a winner-takes-all market opportunity. A state-of-the-art chip is not enough to win the next platform. A large enabling ecosystem is needed for a chip to become a platform. A look at Intel’s late entry into the mobile system on a chip (SoC) market and ARM’s challenge to Intel’s processor dominance in the data center explains why early entry into the self-driving car and other machine learning businesses is crucial.
Intel doesn’t want to be late to market again
Intel failed to build a competitive ecosystem for its Atom mobile SoC because it was late to market. From an architectural perspective, Atom had all the features a phone designer wanted except one: the ARM instruction set with its large ecosystem.
Almost every mobile SoC maker licenses the ARM instruction set, including market leaders Qualcomm and MediaTek, because they depend on its ecosystem of compilers, debug tools and development kits. Intel’s mobile SoC ecosystem was not fully compatible with ARM, forcing the company to exit the business last year. Many applications downloaded onto Intel Atom-powered smartphones had to emulate the ARM instruction set in software, reducing performance. Convincing developers to build a native Intel Atom version of their app was a non-starter because of Atom’s tiny market share.
Even though 98 percent of the cloud runs on Intel-powered servers, the internet echo chamber erupted with Intel’s server business obituary when Microsoft announced last week the company’s commitment to port Windows Server to ARM-powered designs by Qualcomm and Cavium. But most large platform cloud companies, such as Amazon, Facebook, Google, IBM and even parts of Microsoft, do not run their clouds on Windows Server. These clouds run on Linux on Intel-powered servers because of the ecosystem, toolchain and Intel’s adaptation of its architecture with features such as flashable controllers that accommodate the evolution of open source standards.
Five years ago, ARM and Ubuntu, one of the larger Linux distributions, announced their server partnership, which did not attract an ecosystem and did not dent Intel’s share of the server market. ARM’s late arrival to a market five years ago was a huge obstacle. Unless a chip is surrounded by an ecosystem for software developers and designers to build products, it doesn’t matter if it performs twice as fast, costs less or consumes less power.
Windows Desktop on ARM also makes clear the importance of ecosystems and the challenges of arriving late to market. Last December, another partnership between Microsoft and Qualcomm was announced with a demonstration of Windows 10 running on an emulator on Qualcomm’s ARM-compatible Snapdragon 835. Running x86 Win32 apps on an emulator makes a good demo, but it repeats the problem Intel had getting its chips into phones: emulators are a shortcut, a corner case with limited appeal.
Apps need to be designed and compiled to run as native binaries to have good stability and run fast. Microsoft tried this once before with the original Surface RT, powered by the ARM-based NVIDIA Tegra 3, which forced its internal and external developers to recompile for the Surface. It was not until the Surface Pro 2, with Wintel x86 binary compatibility, that the product started to gain traction, largely due to the software ecosystem. Microsoft has not looked back since the Surface Pro 2. The Surface Pro 3, 4 and 5 all followed with x86 processors. The processor ecosystem, like karma, is a bitch.
A constant theme overshadows the history of machine learning: Hardware performance lags the new machine learning breakthroughs produced by theoretical researchers. As soon as new machine learning research is confirmed, researchers go to work optimizing it to run on the hardware available.
Implementing machine learning and AI
Much of machine learning and AI was proven during the last 20 years, but up until recently, no one tried to implement it because the hardware available was not viable. In 2011, Google needed a whole data center to train a machine to learn to find cats in YouTube videos. Machine learning was not accessible without a Google-scale data center, so while an interesting development, it did not spark much interest in following Google.
In 2012, a Ph.D. candidate in Yoshua Bengio’s lab at the University of Montreal took advantage of the parallelism of NVIDIA GPUs to program high-performance machine learning models. He proved that a small rack of systems filled with NVIDIA’s GPUs could perform like a Google data center, and the cost was within reach of every data science lab. Suddenly there was a renaissance of interest by developers and researchers in AI and machine learning.
NVIDIA’s market share in the model training segment of the market has reached 90 percent, but its market dominance is not yet guaranteed because there are many unresolved architectural issues. Today’s architecture prevents machine learning model developers from building the large models needed to create the next generation of artificial intelligence. It is a problem similar to supercomputing, where software developers also need to create parts of the operating system software to divide the application and distribute it to hundreds or even thousands of machines, as well as to coordinate and synchronize computational tasks. Tasks such as these are better performed by low-level operating system interfaces to the hardware, analogous in many ways to the way an operating system abstracts virtual memory management away from the developer.
Machine learning is moving too fast for hardware makers to commit to a processor architecture because it is likely to change before the year is out. Right now, application developers are also solving problems such as distributing the application workload across multiple systems, accommodating the slower-than-bus-speed fiber-optic interconnects between racks of GPUs, and generally developing features that the operating system and hardware would be expected to provide.
Intel’s acquisition of Mobileye, as well as its purchase of deep learning chip startup Nervana in August of last year, wasn’t to gain immediate revenues. It bought them to do what made Intel successful: define customer needs, convert the requirements to silicon that fits specific use cases, and create an ecosystem around it that enables hardware and software engineers to build products.
Intel is investing large sums to be early because of the dire consequences of missing the machine learning platform shift. When machine learning hardware requirements converge, the winner will sell a lot of machine learning processors into data centers and into the auto industry to control self-driving cars. Intel does not want to be left out like it was in mobile.