One of the great fears haunting part of humanity is the idea that machines will surpass human beings. The concern is understandable: no one wants to find themselves in a science-fiction soap opera whose plot is about supremacist machines crushing the souls of the beings who invented them. It looks like that won't happen anytime soon. Apple researchers have just shown that Artificial Intelligence bots cannot think, and possibly never will.
According to Michael Hiltzik, a columnist for the Los Angeles Times, the Apple team found "catastrophic performance drops" in AI models when they were asked to work through simple math problems written out in prose. The systems tasked with solving them often failed to understand the data presented to them, unable to handle a simple math problem taken from a primary-school textbook.
Curiously, it is reassuring to learn that human schoolchildren, according to the Apple researchers, are much better at telling relevant information from inconsequential detail. Artificial Intelligence, it turns out, can handle the complex but does not know how to interpret the simple.
The Apple team's findings were published last October in a publicly available technical paper and have attracted attention in AI laboratories, academia, and the specialized press, not only because the results are well documented, but also because the researchers work for one of the leading high-tech consumer companies.
What they say is relevant because it affects them firsthand, and no one shoots themselves in the foot for the sake of it. It turns out that AI systems marketed as reliable and "smart" are not bulletproof and require careful handling.
In fact, Apple's conclusion agrees with previous studies finding that large language models, or LLMs, do not actually "think" but rather match language patterns in the materials supplied to them as part of their "training." When it comes to abstract reasoning, a key attribute of human intelligence, the models fall short, in the words of Melanie Mitchell, an expert in cognition and intelligence at the Santa Fe Institute.
Academic work has corroborated what the Apple research group found. Whether current LLMs are actually capable of genuine logical reasoning remains an important research question: while some studies highlight impressive capabilities, closer examination reveals substantial limitations.
Teachers at every academic level have learned to spot the repetitive patterns that give away assignments written by artificial intelligence bots. And researchers who have put GPT bots through series of analogy puzzles conclude that a huge gap in basic abstract reasoning still separates humans from state-of-the-art AI systems.
This is important because LLMs like GPT are the foundation of the AI products that have captured the public's attention. Yet the LLMs tested by the Apple team were consistently misled by the language patterns they were trained on. According to Michael Hiltzik, the Apple researchers set out to answer the question: "Do these models really understand mathematical concepts?" The answer is no. Farajtabar, one of the researchers, also wonders whether the deficiencies they identified can easily be fixed, and his answer is also no.
Apple's research, along with other findings about the cognitive limitations of AI bots, is a much-needed corrective to the sales pitches coming from the companies selling AI models and systems, including OpenAI and Google DeepMind. Promoters generally describe their products as reliable.
In fact, their outputs are consistently suspect, which poses a clear danger when they are used in contexts where rigorous precision is non-negotiable, such as healthcare applications.
Of course, precision is not always essential. Some problems can generate profits without a perfect solution: AI-powered recommendation engines, for instance, like those that point Amazon shoppers toward products they might also like. If such a system gets a recommendation wrong, it's no big deal; a customer wastes a few dollars on a book they won't enjoy, and that's that. But a calculator that is right only 85% of the time is garbage. You wouldn't use it.
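A rough back-of-the-envelope sketch (an illustration of my own, not a calculation from the Apple paper) shows why an 85% hit rate is worthless for calculation: if every step of a multi-step problem independently has an 85% chance of being right, the odds of getting the whole problem right collapse quickly.

```python
# Illustrative sketch, not from the Apple paper: how a per-step
# accuracy of 85% compounds across a chain of reasoning steps,
# assuming each step succeeds or fails independently.

per_step_accuracy = 0.85

for steps in (1, 2, 5, 10):
    whole_problem_accuracy = per_step_accuracy ** steps
    print(f"{steps:2d} step(s): {whole_problem_accuracy:.0%} chance the final answer is right")
```

One step still looks respectable at 85%, but five chained steps drop below 45% and ten fall to roughly 20%, which is one way to understand why performance degrades as problems get longer and more complex.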
AI researchers often describe these errors as "hallucinations." The term makes them sound almost innocuous, but in some applications even a minuscule error rate has serious ramifications. What gets passed off as a minor defect, an error incorporated into an official record such as a transcript of court testimony or of a telephone call from prison, can lead to imprecise decisions that expose a person to a risk greater than any benefit the AI brings.
None of this is meant to speak ill of artificial intelligence, or to join those who believe it will usher in the Apocalypse. No. It is worth noting that the Apple researchers are not critical of AI as such; they simply believe its limitations must be understood. The decline in these models' performance as problems become more complex marks a frontier, both for the models and for those of us who use them. We need to understand when it is prudent and necessary to use them and when it is not. For now, the question of whether or not they know how to think has a clear answer: no.
Contact:
Twitter: @CecyDuranMena
The opinions expressed are solely the responsibility of their authors and are completely independent of the position and editorial line of Forbes Mexico.