Researchers Design New Neuromorphic Chip for AI on the Edge

Aug. 19, 2022 — An international team of researchers has designed and built a chip that runs computations directly in memory and can run a wide variety of AI applications, all at a fraction of the energy consumed by general-purpose AI computing platforms.

The NeuRRAM neuromorphic chip brings AI a step closer to running on a broad range of edge devices, disconnected from the cloud, where they can perform sophisticated cognitive tasks anywhere and anytime without relying on a network connection to a centralized server. Applications abound in every corner of the world and every facet of our lives, ranging from smartwatches and VR headsets to smart earbuds, smart sensors in factories, and rovers for space exploration.

The NeuRRAM chip is not only twice as energy efficient as state-of-the-art “compute-in-memory” chips, an innovative class of hybrid chips that run computations in memory, but it also delivers results that are just as accurate as conventional digital chips. Conventional AI platforms are far bulkier and are typically constrained to large data servers operating in the cloud.

In addition, the NeuRRAM chip is highly versatile and supports many different neural network models and architectures. As a result, the chip can be used for many different applications, including image recognition and reconstruction as well as voice recognition.

“The conventional wisdom is that the higher efficiency of compute-in-memory is at the cost of versatility, but our NeuRRAM chip obtains efficiency while not sacrificing versatility,” said Weier Wan, the paper’s first corresponding author and a recent Ph.D. graduate of Stanford University who worked on the chip while at UC San Diego, where he was co-advised by Gert Cauwenberghs in the Department of Bioengineering.

The research team, co-led by bioengineers at the University of California San Diego, presents their results in the Aug. 17 issue of Nature.

Currently, AI computing is both power hungry and computationally expensive. Most AI applications on edge devices involve moving data from the devices to the cloud, where the AI processes and analyzes it. Then the results are moved back to the device. That’s because most edge devices are battery-powered and as a result only have a limited amount of power that can be dedicated to computing.

By reducing the power consumption needed for AI inference at the edge, the NeuRRAM chip could lead to more robust, smarter, and more accessible edge devices and smarter manufacturing. It could also lead to better data privacy, as the transfer of data from devices to the cloud comes with increased security risks.

On AI chips, moving data from memory to computing units is one major bottleneck.

“It’s the equivalent of doing an eight-hour commute for a two-hour work day,” Wan said.

To solve this data transfer issue, researchers used what is known as resistive random-access memory (RRAM), a type of non-volatile memory that allows for computation directly within memory rather than in separate computing units. RRAM and other emerging memory technologies used as synapse arrays for neuromorphic computing were pioneered in the lab of Philip Wong, Wan’s advisor at Stanford and a main contributor to this work. Computation with RRAM chips is not necessarily new, but generally it leads to a decrease in the accuracy of the computations performed on the chip and a lack of flexibility in the chip’s architecture.
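
The appeal of RRAM for compute-in-memory is that a matrix-vector multiplication, the workhorse of neural network inference, falls out of basic circuit laws: stored conductances act as weights, applied voltages as inputs, and the currents summed on each output line as results. A minimal Python sketch of that idealized picture (a textbook model, not the NeuRRAM circuit itself):

```python
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Idealized in-memory multiply: Ohm's law gives a current G * V at
    each cell, and Kirchhoff's current law sums the currents along each
    output line, yielding a matrix-vector product in place."""
    return conductances @ voltages  # one output current per line

G = np.random.uniform(1e-6, 1e-4, size=(4, 3))  # stored weights (siemens)
v = np.array([0.2, 0.0, 0.1])                   # applied inputs (volts)
print(crossbar_mvm(G, v))                       # weighted sums, computed in memory
```

Because the weights never leave the array, the eight-hour commute in Wan’s analogy simply disappears.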

“Compute-in-memory has been common practice in neuromorphic engineering since it was introduced more than 30 years ago,” Cauwenberghs said. “What is new with NeuRRAM is that the extreme efficiency now goes together with great flexibility for diverse AI applications with almost no loss in accuracy over standard digital general-purpose compute platforms.”

A carefully crafted methodology was key to the work: multiple levels of “co-optimization” across the hardware and software abstraction layers, from the design of the chip to its configuration for running various AI tasks. In addition, the team made sure to account for constraints that span from memory device physics to circuits and network architecture.

“This chip now provides us with a platform to address these problems across the stack, from devices and circuits to algorithms,” said Siddharth Joshi, an assistant professor of computer science and engineering at the University of Notre Dame, who started working on the project as a Ph.D. student and postdoctoral researcher in Cauwenberghs’ lab at UC San Diego.

Researchers measured the chip’s energy efficiency using a metric known as energy-delay product, or EDP. EDP combines both the amount of energy consumed for every operation and the amount of time it takes to complete the operation. By this measure, the NeuRRAM chip achieves 1.6 to 2.3 times lower EDP (lower is better) and 7 to 13 times higher computational density than state-of-the-art chips.
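
For illustration (with made-up numbers, not figures from the paper), EDP is simply energy per operation multiplied by time per operation, so a chip that halves both cuts its EDP by a factor of four:

```python
def energy_delay_product(energy_per_op_joules, delay_per_op_seconds):
    """EDP rewards chips that are both frugal and fast; lower is better."""
    return energy_per_op_joules * delay_per_op_seconds

# Hypothetical numbers, for illustration only
baseline = energy_delay_product(2e-12, 4e-9)   # 2 pJ/op, 4 ns/op
improved = energy_delay_product(1e-12, 2e-9)   # halve both
print(baseline / improved)                     # -> 4.0
```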

Researchers ran various AI tasks on the chip. It achieved 99% accuracy on a handwritten digit recognition task, 85.7% on an image classification task, and 84.7% on a Google speech command recognition task. The chip also achieved a 70% reduction in image-reconstruction error on an image-recovery task. These results are comparable to existing digital chips that perform computation at the same bit precision, but with drastic savings in energy.

Researchers point out that one key contribution of the paper is that all the results featured were obtained directly on the hardware. In much previous work on compute-in-memory chips, AI benchmark results were often obtained partially through software simulation.

Next steps include improving architectures and circuits and scaling the design to more advanced technology nodes. Researchers also plan to tackle other applications, such as spiking neural networks.

“We can do better at the device level, improve circuit design to implement additional features and address diverse applications with our dynamic NeuRRAM platform,” said Rajkumar Kubendran, an assistant professor at the University of Pittsburgh, who started work on the project while a Ph.D. student in Cauwenberghs’ research group at UC San Diego.

In addition, Wan is a founding member of a startup that works on productizing the compute-in-memory technology. “As a researcher and an engineer, my ambition is to bring research innovations from labs into practical use,” Wan said.

The key to NeuRRAM’s energy efficiency is an innovative method of sensing output in memory. Conventional approaches use voltage as input and measure current as the result, but this requires more complex and more power-hungry circuits. In NeuRRAM, the team engineered a neuron circuit that senses voltage and performs analog-to-digital conversion in an energy-efficient manner. This voltage-mode sensing can activate all the rows and all the columns of an RRAM array in a single computing cycle, allowing higher parallelism.
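
One simple way to model voltage-mode readout, offered here as an assumption-laden sketch rather than the actual NeuRRAM neuron circuit, is a resistive divider: an undriven output line settles to the conductance-weighted average of the driven input voltages, so every row and column can contribute in a single cycle:

```python
import numpy as np

def voltage_mode_readout(conductances, input_voltages):
    """Assumed resistive-divider model: an undriven output line settles
    to the conductance-weighted average of its driven inputs, letting
    all rows and columns participate in one computing cycle."""
    weighted_sum = conductances @ input_voltages   # sum_i G_i * V_i per line
    total_g = conductances.sum(axis=1)             # sum_i G_i per line
    return weighted_sum / total_g                  # settled output voltages

G = np.random.uniform(1e-6, 1e-4, size=(4, 3))    # conductances in siemens
v = np.array([0.3, -0.1, 0.2])                    # input voltages in volts
print(voltage_mode_readout(G, v))
```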

In the NeuRRAM architecture, CMOS neuron circuits are physically interleaved with the RRAM weights. This differs from conventional designs, where CMOS circuits are typically on the periphery of the RRAM array. The neuron’s connections with the RRAM array can be configured to serve as either the input or the output of the neuron. This allows neural network inference in various dataflow directions without incurring overheads in area or power consumption, which in turn makes the architecture easier to reconfigure.
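
In matrix terms, this reconfigurability means one physical weight array can serve inference in either direction, forward using the weights or reverse using their transpose. A toy Python model of the idea (class and method names are illustrative, not from the paper):

```python
import numpy as np

class ReconfigurableCrossbar:
    """Toy model of a crossbar whose neuron connections can serve as
    either input or output: one physical weight array supports both
    forward (rows -> columns) and reverse (columns -> rows) dataflow."""

    def __init__(self, weights):
        self.weights = weights  # stored once; never duplicated

    def infer(self, x, direction="forward"):
        if direction == "forward":
            return self.weights @ x    # drive rows, read columns
        return self.weights.T @ x      # drive columns, read rows

xbar = ReconfigurableCrossbar(np.random.randn(4, 3))
print(xbar.infer(np.ones(3)))                        # forward pass
print(xbar.infer(np.ones(4), direction="reverse"))   # reverse pass
```

Bidirectional dataflow of this kind is useful for models such as restricted Boltzmann machines, which pass signals in both directions.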

To make sure that the accuracy of the AI computations can be preserved across various neural network architectures, researchers developed a set of hardware-algorithm co-optimization techniques. The techniques were verified on various neural networks, including convolutional neural networks, long short-term memory networks, and restricted Boltzmann machines.
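
One standard ingredient of such hardware-algorithm co-optimization, sketched below as plain noise-injection training (the paper’s actual techniques may differ in detail), is to perturb the weights during training the way RRAM non-idealities would at inference time, so the trained network learns to tolerate them:

```python
import numpy as np

def noisy_forward(weights, x, noise_std=0.02, seed=None):
    """One noise-injection forward pass: perturb the weights the way RRAM
    conductance variation would at inference time, so training learns
    parameters that tolerate the analog hardware's imperfections."""
    rng = np.random.default_rng(seed)
    perturbed = weights * (1.0 + rng.normal(0.0, noise_std, weights.shape))
    return perturbed @ x

W = 0.1 * np.random.randn(8, 16)   # hypothetical layer weights
x = np.random.randn(16)
print(noisy_forward(W, x, seed=0))
```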

As a neuromorphic AI chip, NeuRRAM performs parallel distributed processing across 48 neurosynaptic cores. To simultaneously achieve high versatility and high efficiency, NeuRRAM supports data parallelism by mapping a layer of the neural network model onto multiple cores for parallel inference on multiple data. It also offers model parallelism by mapping different layers of a model onto different cores and performing inference in a pipelined fashion.
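
In software terms, the two mappings might look like the following schematic sketch, where the function names and the simulated cores are illustrative: data parallelism replicates one layer across cores and splits the inputs, while model parallelism assigns successive layers to successive cores and streams data through as a pipeline:

```python
import numpy as np

def data_parallel(layer_weights, batch, cores=4):
    """Data parallelism: replicate one layer's weights on several cores
    (NeuRRAM has 48) and give each core a slice of the input batch.
    The per-core work is simulated sequentially here."""
    slices = np.array_split(batch, cores)
    return np.concatenate([chunk @ layer_weights.T for chunk in slices])

def model_parallel(layers, x):
    """Model parallelism: map each layer to its own core and pipeline
    inputs through, core k feeding core k + 1."""
    for W in layers:
        x = np.maximum(W @ x, 0.0)  # ReLU, chosen only for illustration
    return x

W = np.random.randn(8, 32)                                # hypothetical layer
print(data_parallel(W, np.random.randn(10, 32)).shape)    # (10, 8)

layers = [np.random.randn(16, 32), np.random.randn(8, 16)]
print(model_parallel(layers, np.random.randn(32)).shape)  # (8,)
```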

The work is the result of an international team of researchers.

The UC San Diego team designed the CMOS circuits that implement the neural functions interfacing with the RRAM arrays to support the synaptic functions in the chip’s architecture, for high efficiency and versatility. Wan, working closely with the entire team, implemented the design; characterized the chip; trained the AI models; and executed the experiments. Wan also developed a software toolchain that maps AI applications onto the chip.

The RRAM synapse array and its operating conditions were extensively characterized and optimized at Stanford University.

The RRAM array was fabricated and integrated onto CMOS at Tsinghua University.

The team at Notre Dame contributed to both the design and architecture of the chip and the subsequent machine learning model design and training.

The research started as part of the National Science Foundation funded Expeditions in Computing project on Visual Cortex on Silicon at Penn State University, with continued funding support from the Office of Naval Research Science of AI program, the Semiconductor Research Corporation and DARPA JUMP program, and Western Digital Corporation.

A compute-in-memory chip based on resistive random-access memory

Published open-access in Nature, August 17, 2022, https://www.nature.com/articles/s41586-022-04992-8.

Weier Wan, Rajkumar Kubendran, Stephen Deiss, Siddharth Joshi, Gert Cauwenberghs, University of California San Diego

Weier Wan, S. Burc Eryilmaz, Priyanka Raina, H-S Philip Wong, Stanford University

Clemens Schaefer, Siddharth Joshi, University of Notre Dame

Rajkumar Kubendran, University of Pittsburgh

Wenqiang Zhang, Dabin Wu, He Qian, Bin Gao, Huaqiang Wu, Tsinghua University

Corresponding authors: Wan, Gao, Joshi, Wu, Wong and Cauwenberghs

Source: Ioana Patringenaru, UC San Diego
