(Editor’s note: This post on AI is part of Dispatches’ Tech Tuesday series. We cover tech because so many of our highly skilled internationals are scientists, researchers and entrepreneurs.)
Humanity has always been captivated by giants. Throughout history, cultures have shared tales of these towering figures, though they often find themselves on the losing side. David defeats Goliath, the Olympians overthrow the Titans, Odysseus blinds the Cyclops, and Vishnu outsmarts Bali. Giants are not inherently malevolent, but despite their immense advantages, they often succumb to ingenuity and cooperation.
Growing up in Ireland, I was enchanted by the story of Finn MacCool (a fantastic name), his wife Oonagh (pronounced Una), and the Scottish giant Benandonner (whose name, I suspect, means “the brown mountain”). According to legend, Finn built a massive bridge called the Giant’s Causeway, which remains visible today.
Stretching through the sea between Ireland and Scotland, this impressive feat of engineering had unintended consequences: it allowed the much larger Benandonner to cross over and challenge Finn. Faced with an adversary of overwhelming size, Finn relied on his wife’s cleverness. Disguising Finn as a giant baby, Oonagh tricked Benandonner into believing that the baby’s father must be truly colossal. Intimidated, Benandonner fled back to Scotland, destroying the bridge behind him.
This legend offers a timeless lesson: wit and teamwork can triumph over sheer power.
It’s a principle that resonates with modern technological trends, particularly in artificial intelligence (AI). Just as the giants in myths are formidable but ultimately outmaneuvered, today’s massive AI models are impressive yet face growing limitations.
The future may belong to smaller, more collaborative systems.
The era of AI Giants
For several years, AI research has embraced a “bigger is better” philosophy, producing colossal models with billions of parameters. Well-known examples like GPT-2, GPT-3 and GPT-4 showcase the extraordinary capabilities of these behemoths in tasks such as language translation and text generation.
However, this approach comes at a cost.
The computational demands of these models are enormous, their environmental impact is significant, and their reliance on vast amounts of internet data raises sustainability concerns. Some experts predict that AI’s demand for training data may outstrip the available supply of high-quality text as early as 2026.
Additionally, there are indications that the next generation of unreleased AI models is not delivering the same dramatic leaps in performance as its predecessors. Simply adding more data is no longer producing better models.
Better data is likely to be a more successful strategy.
The rise of modularity in AI
Fortunately, the field is shifting toward a more refined approach: modularity. Inspired by the teamwork seen in the Finn MacCool legend, modular AI emphasizes collaboration between smaller, specialized systems. This trend has already produced innovations such as Low-Rank Adaptation (LoRA), which fine-tunes pre-trained models for specific tasks efficiently by training small, low-rank weight updates alongside the frozen original parameters, preserving the model’s core capabilities. Similarly, techniques such as Direct Preference Optimization (DPO) steer models toward human preferences so they generate more tailored responses.
However, these methods are not without challenges: they can cause issues such as catastrophic forgetting, where fine-tuning on new data overwrites and “forgets” aspects of the model’s original training.
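For readers who want to see what LoRA looks like in practice, here is a minimal sketch using the open-source Hugging Face PEFT library. The base model (“gpt2”) and the hyperparameters are illustrative choices, not a recommendation.

```python
# A minimal sketch of LoRA fine-tuning with the Hugging Face PEFT library.
# Model choice and hyperparameters here are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # any pre-trained causal LM

lora_config = LoraConfig(
    r=8,                         # rank of the low-rank update matrices
    lora_alpha=16,               # scaling factor applied to the update
    target_modules=["c_attn"],   # GPT-2's attention projection layer
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
# Reports that only a small fraction of parameters are trainable:
# the low-rank adapter matrices learn; the original weights stay frozen.
```

Only the tiny adapter matrices are updated during training, which is what makes the approach so much cheaper than retraining the whole giant.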
Another promising approach pairs AI models with external tools and local memory systems, reducing reliance on the model’s limited contextual memory and improving efficiency. These systems, known as AI agents, collaborate to perform tasks cohesively. In advanced internet searches, for example, one agent can locate information, another can verify its relevance, and a third can summarize it into a coherent overview, providing a more practical and user-friendly experience than traditional search methods.
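The locate–verify–summarize pattern is easier to grasp in code. The sketch below is a toy illustration: the function names and their contents are hypothetical placeholders, not a real agent framework or API.

```python
# A toy sketch of the search-agent pattern: three small, specialized steps
# cooperating instead of one monolithic model call.
# All helper functions are hypothetical placeholders for illustration.
from dataclasses import dataclass

@dataclass
class Document:
    url: str
    text: str

def locate(query: str) -> list[Document]:
    """Fetch candidate documents (stubbed here with a canned result)."""
    return [Document(url="https://example.org/causeway",
                     text="Basalt columns formed by ancient volcanic activity.")]

def verify(query: str, docs: list[Document]) -> list[Document]:
    """Keep only documents that actually mention the query terms."""
    terms = query.lower().split()
    return [d for d in docs if any(t in d.text.lower() for t in terms)]

def summarize(docs: list[Document]) -> str:
    """Condense the surviving documents into a short overview."""
    return " ".join(d.text for d in docs) or "No relevant sources found."

def search_agent(query: str) -> str:
    # Each step is small and replaceable -- the modularity is the point.
    return summarize(verify(query, locate(query)))

print(search_agent("basalt columns"))
```

Each step can be swapped out or improved independently, which is exactly the kind of teamwork the modular approach is after.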
Combining strengths: modular systems in action
The concept of modularity extends beyond fine-tuning and AI agents. Recent advancements combine different types of AI models into unified systems. OpenAI’s integration of ChatGPT with the DALL-E model, for instance, allows users to generate images directly within a text-based conversation. Similarly, NVIDIA has demonstrated the potential of modularity with Hymba, a family of small language models whose layers run transformer-style attention (the mechanism behind GPT models) in parallel with post-transformer state-space methods (like Mamba). By integrating diverse processing mechanisms, these parallel architectures enhance adaptability and resource efficiency.
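To make the “parallel mechanisms” idea concrete, here is a toy PyTorch sketch of a block that runs two different processing paths side by side and fuses their outputs. It is only a conceptual stand-in, not NVIDIA’s Hymba implementation; a GRU substitutes for the Mamba-style state-space component.

```python
# A toy illustration of running two different processing mechanisms in
# parallel and fusing their outputs. Conceptual stand-in only: a GRU takes
# the place of a Mamba-style state-space layer; this is not Hymba's code.
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.rnn = nn.GRU(dim, dim, batch_first=True)  # stand-in for an SSM path
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)   # global, content-based mixing
        rnn_out, _ = self.rnn(x)           # sequential, constant-memory mixing
        return self.norm(x + attn_out + rnn_out)  # fuse both paths residually

x = torch.randn(2, 16, 64)      # (batch, sequence length, embedding dim)
print(HybridBlock()(x).shape)   # torch.Size([2, 16, 64])
```

The design choice being illustrated is simple: each path has different strengths, and combining them lets a smaller model cover more ground than either mechanism alone.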
The future of AI: Smarter, not larger
While large-scale AI models have showcased the potential of the field, modular approaches are proving to be more practical, scalable, and efficient. Just as Oonagh’s clever strategy outsmarted brute force, modularity in AI represents a triumph of thoughtful design. By emphasizing collaboration, specialization, and adaptability, these systems promise to address current limitations and unlock new possibilities.
The tale of Finn MacCool reminds us that the greatest successes often come not from overwhelming power but from ingenuity and teamwork. As we look ahead, the future of AI lies in building systems that work together seamlessly: smaller giants, united by purpose, achieving feats far greater than any could accomplish alone.
Shane Ó Seasnáin
Shane Ó Seasnáin is an AI expert, originally from Ireland, who works with various organizations in the Netherlands and across Europe. He has three simple steps for successful AI use: be curious about the world, ask whether what we have is the best we can do, and work out how AI can deliver a better world. He believes there are many ways we can radically transform our environment, healthcare, and industry using AI, always with the goal of making people more capable and happy.