A few months ago, a very interesting paper came out that tested GPT-3's performance on tests of analogical reasoning. (There is an accessible article about it here.) Surprisingly for a system built entirely on correlation, GPT-3 performed quite well, outscoring the average human test-taker.
The study used both abstract (letter- and number-based) and verbal analogies. While I can't comment on what might be going on with the abstract problems, an experience of mine may shed some light on the AI's verbal performance. I scored 800 on the verbal portion of both the SAT in 2000 and the GRE in 2005, when both tests included analogies. For the record, my prep both times consisted of a Kaplan CD-ROM, which I used mainly for math, and the official practice test.
Taking the tests was an interesting experience. The easy and medium questions worked pretty much as intended. I recalled the meanings of the words, figured out the relationship between the words in the original pair, and marked the answer choice that would give a similar relationship in the target pair. (This is where explicit vocabulary training can help student performance.)
But after a certain level of difficulty, that strategy failed. A number of the harder questions contained at least one word I could not have explicitly defined. On those questions, I went by feel. Many of the words I couldn't define were still words I had encountered in reading. I had some sense of how they were used and how they related to other words, so I went by that, throwing in other cues like etymology. Apparently, it worked. (At that level, the SAT and GRE pretty much become tests of how much you read.)
I relate this experience because association is fundamentally what GPT does. Although its detailed workings are a secret, we do know that this kind of AI uses linguistic correlation structures derived from enormous amounts of text. And somehow, that kind of correlational knowledge enabled a human test-taker to solve analogy problems without relying on explicit deductive reasoning. It may be worth investigating whether generative AI is doing something similar, and how far such a strategy can take it.