A new test of AI capabilities consists of puzzles that humans are able to solve without too much trouble, but which all leading AI models struggle with. To improve and pass the test, AI companies will need to balance problem-solving abilities with cost.
AI companies will just train on these specific puzzles. Then they will claim their AI is AGI, and the quality of the models will be the same or worse than before. They'll just have one more checkmark in their marketing.
The better an AI is at logic, the less creative it often becomes. The more creative an AI gets, the worse it gets at accurately recalling knowledge. And the better an AI gets at knowledge, the more it flounders at thinking critically and logically, instead just lazily reciting its knowledge back at you.
Until AI researchers find a way to solve this rock-paper-scissors of constant self-sabotage, AI can't advance to the next phase.
This is because AI is not aware of context; it is not actually intelligent.
What is called creativity is really just randomization within the constraints of the design. That randomization reduces accuracy. If the "creativity" is turned down, the output becomes more accurate because the model is no longer injecting random changes.
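A minimal sketch of that tradeoff, assuming the model picks each token by sampling from a temperature-scaled softmax over logits (the logits and temperature values below are made up purely for illustration): a low temperature almost always picks the highest-scoring token, while a high temperature flattens the distribution and picks "wrong" tokens far more often.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    # Low temperature sharpens the distribution toward the argmax
    # (accurate, repetitive); high temperature flattens it ("creative",
    # but more likely to pick a low-scoring token).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

rng = random.Random(0)
logits = [2.0, 1.0, 0.1]  # pretend token 0 is the "correct" recall
low_t = [sample_with_temperature(logits, 0.1, rng) for _ in range(1000)]
high_t = [sample_with_temperature(logits, 5.0, rng) for _ in range(1000)]
print(low_t.count(0) / 1000)   # nearly always token 0
print(high_t.count(0) / 1000)  # much closer to a uniform pick
```

The same logits produce very different behavior depending on how much randomness is injected, which is the sense in which "reducing creativity" mechanically increases accuracy.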
Using words like creativity, self-sabotage, hallucinations, etc. makes it seem like AI is far more advanced than it actually is.
I know I am anthropomorphizing it too much, but the fact that the current design can't even increase this super basic creativity without messing itself up in the process is a massive problem in the design. The AI can't seem to understand when to be "creative" and when not to, or when to attempt to solve a problem by recalling data and when not to, showing it's far less aware than a person is at a very basic level.