In a new study in the journal Proceedings of the National Academy of Sciences (PNAS), researchers at IBM, Harvard, and MIT use transformers to explore how networks of astrocytes and neurons process information and contribute to learning and memory in the brain. Their model is the first to show theoretically how neurons and astrocytes may communicate while processing language and images.
“Neurons in the brain helped inspire artificial neural networks in modern AI,” said Dmitry Krotov, an AI researcher at IBM Research. “We wanted to flip that around and see what recent advances in AI could teach us about the biological computation of neurons and astrocytes.”
Transformers were originally designed to handle language but are now widely used to process images, speech, and audio. Before transformers, neural networks had to be trained on labeled datasets that were costly to compile. Transformers eliminated that bottleneck: they can ingest massive amounts of raw, unlabeled data and extract its underlying structure. Because they build a compressed representation of large-scale data, transformer-based AI models known as foundation models can be fine-tuned and applied to a wide range of tasks.
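The operation at the heart of a transformer is self-attention, in which every token in a sequence weighs its relevance to every other token. For readers unfamiliar with it, here is a minimal sketch of single-head scaled dot-product self-attention in NumPy; the dimensions and random weights are illustrative only and are not taken from the study:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token embeddings X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens into queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity between tokens
    weights = softmax(scores, axis=-1)        # each token attends to all tokens
    return weights @ V                        # weighted mix of values = new representation

# Toy example: 5 tokens with 8-dimensional embeddings (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (5, 8): one updated vector per token
```

The PNAS paper's contribution is an argument that a network of neurons and astrocytes could, in principle, implement a computation of this kind; the sketch above shows only the standard attention operation, not the authors' biological model.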