LLMs (I refuse to call them AI, as there's no intelligence to be found) are simply random word sequence generators based on a trained probability model. Of course they're going to suck at math, because they're not actually calculating anything, they're just dumping what their algorithm "thinks" is the most likely response to user input.
"The ability to speak does not make you intelligent" - Qui-Gon Jinn
I've heard the ChatGPT math problem was fixed in the newer version by having it write Python code to solve the math problem, then giving you the answer the code produces when it's run.
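The actual plumbing behind that feature isn't public, but the general idea ("code interpreter" style tool use) can be sketched in a few lines: instead of trusting the model's arithmetic, you pull the Python block out of its reply and run it, so the number comes from the interpreter, not the probability model. Everything here is a toy stand-in; the `reply` string just simulates what a real API call would return.

```python
import re
import io
import contextlib

FENCE = "`" * 3  # a markdown code fence: ```

def run_generated_code(model_reply: str) -> str:
    """Extract a ```python block from a model reply and execute it,
    capturing whatever the code prints as the answer."""
    match = re.search(FENCE + r"python\n(.*?)" + FENCE, model_reply, re.DOTALL)
    if not match:
        return model_reply  # no code block found: fall back to the raw reply
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        # NOTE: exec() of untrusted model output is wildly unsafe outside a
        # sandbox; real systems run this in an isolated environment.
        exec(match.group(1), {})
    return buf.getvalue().strip()

# Simulated model reply (a real system would get this from the API):
reply = f"""Sure, let's compute that:
{FENCE}python
print(12345 * 6789)
{FENCE}"""

print(run_generated_code(reply))  # → 83810205
```

The point of the trick is exactly the complaint in the first comment: the model is bad at *calculating*, but it's decent at *writing code that calculates*, so you offload the arithmetic to something deterministic.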
I kind of want to see something on this, because my past experience with ChatGPT was that it was horrible at math, but I used it again the other week for some insight on a math problem and it worked everything out completely correctly. I don't know anymore: is it bad at math, good at math, or somewhere in between?