Aidan Gomez can take credit for the ‘T’ at the end of ChatGPT. He was part of the group of Google engineers who first introduced a new artificial intelligence architecture called the transformer.
That work helped lay the groundwork for today’s generative AI boom fueled by OpenAI, the maker of ChatGPT, and others. Gomez, one of eight co-authors of the 2017 Google paper, was a 20-year-old intern at the time.
He is now the CEO and co-founder of Cohere, a Toronto-based startup that competes with other major AI companies in providing large language models and the chatbots they power to big businesses and organizations.
Gomez talked about the future of generative AI with The Associated Press. The interview has been edited for length and clarity.
Q: What is a transformer?
A: A transformer is the architecture of a neural network – the structure for the computation that takes place within the model. The reason transformers are special relative to their peers — other competing architectures, other ways of structuring neural networks — is fundamentally because they scale very well. They can be trained over not just thousands, but tens of thousands of chips. They can be trained very quickly. They use many different operations that these GPUs (graphics chips) are tailored for. Compared to what was there before the transformer, they make that processing faster and more efficient.
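For a concrete picture of the computation he is describing, here is a minimal sketch of scaled dot-product attention, the core operation inside a transformer layer. It is illustrative only, written with NumPy rather than any of Cohere’s code, and it shows how the work boils down to the dense matrix multiplications that GPUs are tailored for.

```python
# Minimal sketch of scaled dot-product attention, the core transformer
# operation. Illustrative only -- not Cohere's implementation.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """q, k, v: (sequence_length, d_model) arrays."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)       # how strongly each token attends to every other token
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v                  # weighted mix of value vectors

# Toy example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8)
```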
Q: How important are they to what you’re doing at Cohere?
A: Very important. We use the transformer architecture like everyone else when building large language models. For Cohere, there is a huge focus on scalability and production readiness for enterprises. Some of the other models we compete against are huge and super inefficient. You can’t really put that into production, because once you’re facing real users, costs blow up and the economics break down.
Q: What is a specific example of how a customer is using the Cohere model?
A: My favorite example is in the healthcare space. It stems from the surprising fact that 40% of a doctor’s working day is spent writing patient notes. So what if we could give doctors a small, passive listening device to carry with them throughout the day, between their patient visits, to listen to those conversations and pre-populate the notes, so that instead of writing them from scratch, a first draft is already there. They can read through it and just make changes. Suddenly, the capacity of doctors is greatly increased.
Q: How do you address customer concerns about AI language models being prone to ‘hallucinations’ (errors) and bias?
A: Customers are always concerned about hallucinations. They make for a bad product experience, so it’s something we really focus on. For hallucinations, we have a central focus on RAG, which is retrieval-augmented generation. We just released a new model called Command R that is specifically aimed at RAG. It lets you connect the model to private sources of trustworthy information. That could be an organization’s internal documents or a particular employee’s emails. You’re giving the model access to information it wouldn’t otherwise have seen on the web when it was being trained. What’s important is that it also lets you check the model, because now, instead of just text in, text out, the model is referencing documents. It can point back to where it got that information. You can check its work and have much more confidence working with the tool. It dramatically reduces hallucinations.
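As a rough illustration of what retrieval-augmented generation looks like from the developer’s side, here is a minimal sketch. The document store, the keyword retriever, and the commented-out llm_generate() call are hypothetical placeholders, not Cohere’s API; Cohere’s own documentation describes the real Command R interface.

```python
# Minimal retrieval-augmented generation (RAG) sketch. The documents,
# retriever, and llm_generate() call are hypothetical placeholders,
# not Cohere's API.

def retrieve(query, documents, top_k=2):
    """Naive keyword-overlap retriever standing in for a real search index."""
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc["text"].lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_k]

def build_prompt(query, sources):
    """Ground the model: ask it to answer only from numbered sources it can cite."""
    cited = "\n".join(f"[{i}] {d['title']}: {d['text']}" for i, d in enumerate(sources))
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{cited}\n\nQuestion: {query}\nAnswer:"
    )

internal_docs = [
    {"title": "Vacation policy", "text": "Employees accrue 20 vacation days per year."},
    {"title": "Expense policy", "text": "Meals over 75 dollars require manager approval."},
]

query = "How many vacation days do employees get?"
sources = retrieve(query, internal_docs)
prompt = build_prompt(query, sources)
# response = llm_generate(prompt)  # hypothetical model call; its answer can be
#                                  # checked against the numbered sources above
print(prompt)
```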
Q: What are the biggest public misconceptions about generative AI?
A: The fear that certain individuals and organizations have about this technology being the Terminator, an existential risk. Those are stories humanity has been telling for many years: technology coming to take over and displace us, to make us subservient. They are deeply embedded in the public’s cultural brain stem, and it’s a very compelling narrative. It’s easy to capture people’s imaginations and fears when you tell them that, so the idea gets a lot of attention because it’s such an exciting story. But the truth is, I think this technology is going to be really good. As for the arguments about how it could go wrong: those of us developing the technology are very aware of those risks and are working to mitigate them. We all want this to go well. We all want the technology to be a benefit to humanity, not a threat to it.
Q: Not just OpenAI but some other big tech companies are now explicitly saying they want to build artificial general intelligence (a term for AI that is broadly better than humans). Is AGI part of your mission?
A: No, I don’t see it as part of my mission. For me, AGI is not the ultimate goal. The ultimate goal is a profound positive impact on the world with this technology. It is a very general technology. It’s reasoning, it’s intelligence. So it applies all over the place. And we want to make sure it’s the most efficient form of technology possible, as soon as possible. It’s not some pseudo-religious pursuit of AGI, which we don’t even know the definition of.
Q: What’s next?
A: I think everyone should keep their eyes on models using tools and behaving more like agents. These are models you can present, for the first time, with a tool you’ve built. It might be a software program or an API (application programming interface). And you can say, ‘Hey model, I just built this. This is what it does. This is how you interact with it. This is part of your toolkit of things you can do.’ That general principle of being able to hand the model a tool it has never seen before and have it use that tool effectively, I think, is going to be very powerful. To do many things, you need access to external tools. The current status quo is that models can only write characters (text) back to you. If you give them access to tools, they can take actions in the world on your behalf.
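To make the idea concrete, here is a minimal sketch of the tool-use loop he is describing: the host program describes a tool to the model, the model replies with a structured call instead of plain text, and the host executes it. The tool description format, the model_call() stub, and the JSON call format are illustrative assumptions, not any specific vendor’s API.

```python
# Hypothetical sketch of a tool-use ("agent") loop. The tool spec format and
# model_call() stub are illustrative assumptions, not a real vendor API.
import json

def get_order_status(order_id: str) -> str:
    """Example tool the host application exposes to the model (toy data)."""
    return f"Order {order_id} shipped on 2024-03-01."

TOOLS = {
    "get_order_status": {
        "description": "Look up the shipping status of an order.",
        "parameters": {"order_id": "string"},
        "fn": get_order_status,
    }
}

def model_call(messages, tools):
    """Stand-in for a real LLM call. A real model would decide whether to
    answer in plain text or emit a structured tool call like this one."""
    return json.dumps({"tool": "get_order_status", "arguments": {"order_id": "A-123"}})

def run_agent(user_message):
    messages = [{"role": "user", "content": user_message}]
    reply = model_call(messages, TOOLS)
    call = json.loads(reply)
    if call.get("tool") in TOOLS:  # the model chose to act, not just write text back
        result = TOOLS[call["tool"]]["fn"](**call["arguments"])
        messages.append({"role": "tool", "content": result})
        return result              # a fuller loop would hand this back to the model
    return reply

print(run_agent("Where is my order A-123?"))
```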