Tools such as OpenAI's ChatGPT can occasionally give the impression that they are able to prove theorems and even generalize them. Whether this is a sign of real (artificial) intelligence or simply the result of combining facts retrieved from technical papers without advanced logic is irrelevant. A correct proof is a correct proof, no matter how the author, human or bot, arrived at it.
My experience is that these tools make subtle mistakes that are sometimes hard to detect. Some call them hallucinations. The real explanation is that the tool usually blends different results together and adds transition words between sentences so that the answer reads like continuous text. Sometimes this creates artificial connections between items that are loosely related, if at all, without precise references to the sources or the exact location within each one. That makes the output hard to double-check and correct. However, the new generation of LLMs (see https://mltblog.com/4g2sKTv) offers that capability: deep, precise references.
Likewise, mathematicians usually make mistakes in the first proof of a new, challenging problem. Sometimes these are glitches that can be fixed; sometimes the proof is fundamentally wrong and cannot be salvaged. It usually takes a few iterations to get everything right.
➡️ Read the full article and learn how I proved a difficult result with the help of AI, at https://mltblog.com/4jqUiUD