Researchers and postgraduate students at Samara University were the first in the world (according to Google Scholar) to conduct an experimental study of the age of the information used by popular artificial intelligence (AI) systems, the so-called Large Language Models (LLMs). These AI systems are widely used for various operations with text, such as writing, editing, error correction and translation. In addition, Large Language Models can write program code, search for and collect information, and communicate with users by answering their questions almost exactly like a human.
During the experiments, the researchers discovered which type of questions makes it possible to determine, with high accuracy, whether you are communicating on the Internet with a human or with an artificial intelligence. This can help improve the classical Turing test, which is no longer able to cope with modern AI systems. The results of the study have been published in the authoritative Russian scientific journal Artificial Intelligence and Decision-Making.
“In our research, we examined the limitations on the use of Large Language Models caused by the obsolescence of the information on which the models were trained. As far as I know, such research had not previously been conducted anywhere in the world, so we can say that we are the first here. According to Google Scholar, our work is the first, and it is a peer-reviewed conference paper, unlike competitors’ preprints, which were published later. The works are also easy to tell apart by the terminology they use. The fact is that, unfortunately, a traditional LLM has no mechanism for continued training on most topics and areas of human knowledge, so over time the information acquired by these language models becomes outdated, and the answers of their chatbots become inaccurate and lose relevance in light of new events, news, emerging technologies and so on. This effect has already been observed many times in various models, which is why studying the limitations on the scope of LLM applications is now a very important task for scientists working in the field of artificial intelligence,” said Andrey Sukhov, Doctor of Technical Sciences, Professor at Samara University’s Department of Software Systems.
According to the scientist, the problem of verifying responses received from LLM chatbots remains quite acute. So far, it is impossible to say for sure whether a chatbot’s response is accurate and based on real facts, or whether the chatbot relied on unverified claims and speculation posted online. However, while studying problems related to the age of information, the Samara researchers identified a pattern that can easily expose bots masquerading as humans on the Internet.
“The format of a chatbot’s responses to requests about information from different time periods, before and after the LLM was trained, differs greatly. The standard output of a chatbot is usually just a text answer in which the result is explained. If the user asks about events and phenomena that occurred after the LLM was trained, the chatbot turns to a search engine (which one varies from model to model) and gives the user a list of text excerpts with links to the websites from which these excerpts were taken. This change in the response format makes it possible to determine accurately when the model was trained, that is, to determine the age of its information, and can also help distinguish a bot from a human during online communication,” said Murad Jeribi, one of the authors of the study, a postgraduate student at Samara University’s Department of Cyberphotonics specializing in artificial intelligence and machine learning. Murad came to study in Samara from Algeria.
According to the research results, to determine when a model was trained and the age of the information it was trained on, one simply needs to compose and ask a list of questions that call for a simple numerical answer whose value differs across time periods. Moreover, the answers to these questions should be easy to verify using Internet search engines. As such a control query, one can ask a chatbot, for example, about the population of a country, or the number of marriages and divorces there over certain periods of time, provided that the statistics are publicly available on the websites of the relevant agencies. As soon as the chatbot changes its response format and starts producing excerpts of statistical data with links to websites, one can tell when the underlying model was trained.
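This procedure can be illustrated with a minimal sketch in Python. It assumes a hypothetical ask() function that sends a prompt to the chatbot and returns its reply as text; the probe questions, the range of years and the link-counting rule are illustrative assumptions for the example, not the exact protocol used in the study.

    # Minimal sketch of the probe-question heuristic described above.
    # ask() is a hypothetical callable: prompt in, chatbot reply (string) out.
    import re

    # Probe questions whose numerical answers change from year to year
    # (hypothetical examples; any publicly verifiable statistic would do).
    PROBE_QUESTIONS = [
        "What was the population of Russia in {year}?",
        "How many marriages were registered in Russia in {year}?",
    ]

    LINK_PATTERN = re.compile(r"https?://\S+")

    def looks_like_search_output(reply: str) -> bool:
        """Heuristic: a reply quoting several excerpts with URLs suggests
        the chatbot fell back to a search engine."""
        return len(LINK_PATTERN.findall(reply)) >= 2

    def estimate_training_cutoff(ask, years=range(2015, 2025)):
        """Return the first year for which the reply switches to the
        search-excerpt format, i.e. an estimate of when the model's
        information ends; None if no format change is observed."""
        for year in years:
            for template in PROBE_QUESTIONS:
                reply = ask(template.format(year=year))
                if looks_like_search_output(reply):
                    return year  # format changed: data lies after the training cutoff
        return None

The same looks_like_search_output() check can serve as the qualifying sign mentioned below: if an interlocutor’s answers repeatedly match it, the interlocutor is most likely an LLM-based bot rather than a human.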
“Such questions can also be asked, for example, to find out whether you are communicating online with a human or with a computer. If your interlocutor’s answers contain links produced by an Internet search engine, or the response contains a list of websites with brief information on the subject of the query, it is very likely that you are communicating with an AI system. We therefore suggest using this change in the response format as a special qualifying sign for identifying an LLM. We believe that the algorithm of actions we propose should also be used in the future to compile an updated list of questions for the Turing test,” said Andrey Sukhov.
This material was prepared with the support of Russia’s Ministry of Education and Science within the framework of the Decade of Science and Technology.
For reference:
The Turing test is a method of studying artificial intelligence proposed by the British mathematician Alan Turing in 1950. The test aims to find out whether a computer is capable of behaving so convincingly in a dialogue with a human that the person neither notices the substitution nor realizes that he or she is communicating with a computer. Modern AI systems successfully pass the classic Turing test.
Google Scholar is a global search engine for scientific publications. Using web crawlers, the portal indexes metadata and performs full-text searches of scientific literature, including journal articles, preprints, dissertations, books, and technical reports.