LLMs forget. Everyone knows that. The primary culprit is the models' finite context length. Some even call it the biggest bottleneck on the path to AGI.
Soon, the debate over which model boasts the largest context length may become irrelevant: Microsoft, Google, and Meta have all been taking strides in this direction, working to make context length infinite.
While all of today's LLMs run on the Transformer, that architecture may soon become a thing of the past. Meta, for example, has introduced MEGALODON, a neural architecture designed for efficient sequence modelling with unlimited context length.
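To give a sense of why such architectures can handle arbitrarily long inputs, here is a minimal Python sketch of chunk-wise processing with a fixed-size running summary (an exponential moving average). This is only an illustration of the general idea under assumed parameters; the function name `process_stream`, the chunk size, and the EMA update are hypothetical and do not reflect MEGALODON's actual design.

```python
import numpy as np

def process_stream(tokens, chunk_size=4, decay=0.9):
    """Process a token stream chunk by chunk, carrying only a fixed-size EMA state.

    Toy sketch: because mixing is confined to each chunk and the carry-over
    state never grows, memory per step stays constant no matter how long the
    sequence is. Not Meta's implementation -- an assumed, simplified analogue.
    """
    state = np.zeros(tokens.shape[-1])   # fixed-size carry-over state
    outputs = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        for x in chunk:
            state = decay * state + (1.0 - decay) * x   # update running summary
            outputs.append(state + x)                    # combine summary with input
    return np.stack(outputs)

if __name__ == "__main__":
    seq = np.random.randn(1000, 8)   # a "long" sequence of 1,000 embeddings
    out = process_stream(seq)
    print(out.shape)                 # (1000, 8): memory use did not grow with length
```

The point of the sketch is the design choice, not the maths: once per-chunk cost and state size are constant, the sequence length is bounded only by how long you are willing to keep streaming tokens in.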