MIT’s Light-Based System to Transform Language Models

By Srikanth

ChatGPT has captured global attention for its ability to write essays, emails, and computer code from a few user prompts. Now an MIT-led team reports a system that could lead to machine-learning programs far more powerful than the one behind ChatGPT. The system could also use far less energy than the state-of-the-art supercomputers that run today’s machine-learning models.

In the issue of Nature Photonics published on July 17, the researchers report an experimental demonstration of the new system. Instead of relying on electrons, it performs its computations using the movement of light, generated by hundreds of micron-scale lasers. The team reports a greater than 100-fold improvement in energy efficiency and a 25-fold improvement in compute density, a measure of a system’s power, over state-of-the-art digital computers for machine learning.

In the paper, the team also cites the potential for several more orders of magnitude of future improvement. As a result, the authors write, the technique “paves the way for expansive optoelectronic processors that can accelerate machine-learning operations, extending from data centers to decentralized edge devices.” In other words, cell phones and other small devices could become capable of running programs that today can run only in large data centers.

Further, because the components of the system can be created using fabrication processes already in use today, “we expect that it could be scaled for commercial use in a few years. For example, the laser arrays involved are widely used in cell phone face ID and data communication,” says Zaijun Chen, first author, who conducted the work while a postdoc at MIT in the Research Laboratory of Electronics (RLE) and is now an assistant professor at the University of Southern California.

Says Dirk Englund, an associate professor in MIT’s Department of Electrical Engineering and Computer Science and leader of the work, “ChatGPT is limited in size by the power of today’s supercomputers. It’s just not economically viable to train much bigger models. Our new technology could make it possible to leapfrog to machine-learning models that otherwise would not be reachable in the near future.”

He continues, “We don’t know what capabilities the next-generation ChatGPT will have if it is 100 times more powerful, but that’s the regime of discovery that this kind of technology can allow.” Englund is also the leader of MIT’s Quantum Photonics Laboratory and is affiliated with the RLE and the Materials Research Laboratory.
