The emergence of ChatGPT, OpenAI’s large language model, has sparked immense excitement about the potential of generative AI to revolutionize knowledge, research, and content creation. Among the many potential applications of generative AI, search stands out as an area with enormous promise. With the ability to fundamentally alter users’ expectations of search, generative AI could very well be the next frontier in this space. For years, Google has dominated search, but it now faces a formidable new challenger in Microsoft, which recently invested a reported $10 billion in OpenAI and intends to incorporate the technology into various Microsoft products, including Bing.
However, while there is a great deal of hype surrounding ChatGPT, there are several practical, technical, and legal hurdles that must be overcome before these tools can achieve the scale, robustness, and reliability of an established search engine like Google. The path to realizing the full potential of generative AI in search is paved with challenges, but with the right investments in research and development, we may see a future in which chatbots like ChatGPT play a crucial role in shaping how we search for information.
Search 1.0 and 2.0
The emergence of search engines in the 1990s brought a fundamental change to how we access and process information. However, their basic principles have remained constant since their inception. Search engines aim to rank indexed websites in a way that best matches the user’s query. In the first iteration of search engines, dubbed “Search 1.0”, users had to enter specific keywords to query the engine. With the introduction of “Search 2.0” in the late 2000s, semantic search emerged, allowing users to use natural phrases as if they were conversing with a human. Despite these developments, Google has remained the king of search engines thanks to its simple and user-friendly interface, its PageRank algorithm, which delivers relevant results, and its ability to scale efficiently as search volume grows.
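The difference between keyword search and what followed can be sketched in a few lines. The snippet below is a toy illustration of the Search 1.0 approach, ranking documents by how many query keywords they contain; the documents and query are invented for the example.

```python
# Toy Search 1.0: rank documents by how many distinct query keywords
# appear in each one. Documents and query are invented examples.
documents = {
    "doc_pagerank": "pagerank algorithm ranks web pages by link structure",
    "doc_cooking": "recipes for cooking pasta at home",
    "doc_crawler": "a web crawler indexes pages for a search engine",
}

def keyword_score(query: str, text: str) -> int:
    """Count how many distinct query keywords appear in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(text.lower().split())
    return len(query_terms & doc_terms)

query = "web search engine"
ranked = sorted(documents, key=lambda d: keyword_score(query, documents[d]),
                reverse=True)
# "doc_crawler" matches all three terms and ranks first.
```

The weakness of this scheme is exactly what semantic search addresses: a query phrased as "find websites" would score zero against a document about "web pages", even though the meaning overlaps.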
The Rise of Search 3.0
In its announcement of Bard, Google acknowledged a shift in user behavior: when people make a query, they now want more than just a list of relevant websites; they are seeking a greater level of comprehension and insight. Search 3.0 satisfies this demand by providing direct answers to questions instead of website links. Google has been compared to a colleague who helps us find a book in the library that can answer our question, while ChatGPT is like a colleague who has already read every book and can answer our question directly – at least hypothetically.
The Limitations of ChatGPT
Currently, ChatGPT cannot function as a search engine because, unlike a web-crawling search engine, it cannot access real-time information. ChatGPT was trained on an extensive dataset that ends in 2021, so its considerable knowledge and language-processing capabilities are static. It lacks the capacity to “know” anything beyond what it has been trained on, including events such as the Omicron wave of Covid-19, the collapse of FTX, and the death of Queen Elizabeth II. In December 2022, OpenAI CEO Sam Altman himself cautioned against relying on ChatGPT for anything important.
The Challenges of Continuous Retraining
At present, training an LLM to keep up with the constantly evolving information on the internet is highly challenging. One major obstacle is the enormous amount of processing power that continuous retraining requires, along with the associated financial costs. Google covers the cost of search by selling ads, which allows it to offer the service for free; because each LLM query consumes far more compute and energy than a conventional search, that model is much harder to sustain, especially at Google’s query volume, estimated in the tens of thousands per second, or billions per day.
Various technical solutions are being developed to overcome the challenges generative AI faces in the search industry. One approach is to combine generative models with complementary machine learning techniques to produce more precise results. Google, for instance, developed BERT (Bidirectional Encoder Representations from Transformers), which examines the context of the words in a search query to deliver more relevant results. BERT combines deep learning with natural language processing to understand the intent behind the words in a query.
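The ranking step that sits on top of a model like BERT can be sketched as a similarity comparison between vectors. In the toy example below, tiny hand-made vectors stand in for the embeddings a real model would produce, so the scoring logic is self-contained; the documents and query are invented.

```python
import math

# Toy semantic ranking: score documents by cosine similarity between
# embedding vectors. In a real system the vectors would come from a
# model such as BERT; these 3-d vectors are invented stand-ins.
doc_embeddings = {
    "bank_river": [0.9, 0.1, 0.0],   # "grassy bank of the river"
    "bank_money": [0.1, 0.9, 0.2],   # "open a bank account"
}
query_embedding = [0.2, 0.8, 0.3]    # pretend encoding of "deposit my paycheck"

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

best = max(doc_embeddings, key=lambda d: cosine(query_embedding, doc_embeddings[d]))
# The financial sense of "bank" scores highest for this query.
```

The point of a contextual encoder is visible in the example: the same surface word “bank” maps to very different vectors depending on the surrounding words, which is what lets the query retrieve the right sense.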
Another solution is to use federated learning, a method that allows multiple parties to work together to train a machine learning model without having to share their data with each other. This would enable generative AI models to be trained on a diverse range of data sources while preserving the privacy and security of the data.
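A minimal sketch of the federated idea, assuming the standard federated averaging scheme: each party fits a small model on its own private data, and only the model parameters, never the raw data, leave the party to be averaged. The datasets and learning rate below are invented for illustration.

```python
import random

# Federated averaging sketch: three parties each hold private (x, y) data
# drawn from the same underlying relationship y = 2x plus noise. Each
# trains a one-parameter model locally; a central server averages the
# parameters without ever seeing the data.
random.seed(0)
private_datasets = [
    [(x, 2.0 * x + random.uniform(-0.1, 0.1)) for x in range(1, 6)]
    for _ in range(3)
]

def train_locally(data, w=0.0, lr=0.01, epochs=200):
    """Plain gradient descent on mean squared error for y ≈ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# Each client trains on its own data; only the learned weight is shared.
local_weights = [train_locally(data) for data in private_datasets]
global_w = sum(local_weights) / len(local_weights)
# global_w lands close to the true slope of 2.0.
```

The privacy property comes from what crosses the network: each round transmits a single weight per client, while the raw (x, y) pairs stay local.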
Legal challenges pose an additional obstacle to the widespread adoption of generative AI in the search industry. One major concern is the risk of biased or inaccurate results, which could harm individuals or groups and expose the companies that develop and deploy these tools to legal liability. Mitigating this risk will require ethical guidelines and best practices for the use of generative AI in search, along with transparent, accountable processes for reviewing and auditing the results these tools produce.

Protecting intellectual property rights is another legal challenge that must be addressed. Generative AI can produce vast amounts of content, some of which may be protected by copyright, trademark, or other forms of intellectual property. Companies must establish mechanisms for identifying and respecting these rights while preserving the free flow of information that is crucial to the functioning of the search industry.
The use of generative AI in search has the potential to transform the industry by providing users with a greater understanding and depth of information. However, there are several significant obstacles that need to be overcome before these tools can rival established search engines like Google in terms of scale, reliability, and functionality. To overcome these challenges, companies will need to invest resources in research and development, work together with other players in the industry, and engage with policymakers and regulators to ensure that generative AI is implemented ethically and responsibly. The success of generative AI in search will depend on its ability to deliver accurate and unbiased results to users, while maintaining the privacy and security of their data. It is only by addressing these challenges that we can fully realize the potential of generative AI and its transformative impact on the search industry.