ChatGPT reveals its “Achilles’ heel”: It can write code, write a thesis, and pass an MBA exam, but can’t do math?

Tram Ho

According to CNBC, ChatGPT, the chatbot that has taken the world by storm in recent months, is said to be able to “answer questions like a real person”, presenting ideas logically, in depth, and fluently even for complex questions. However, the chatbot seems to be quite poor at math.


Since its launch in November 2022, OpenAI’s ChatGPT chatbot has made waves in many fields, including education. Many schools in New York (USA) have banned access to the site. Professors have also revised their curricula and teaching methods to keep students from using the tool to “do homework” and cheat on exams.

However, ChatGPT appears to have revealed its “Achilles’ heel”: it turns out to be quite bad at math. “I don’t see math instructors expressing concerns about this chatbot,” said Paul von Hippel, a research professor of data science and statistics at the University of Texas. “I’m not sure this tool is useful for math, which is quite strange, because math is often the first testing ground for computing devices.”

ChatGPT can do basic math but has difficulty solving word problems. For example, given the question “If a banana weighs 0.5 lbs. I have 7 lbs of bananas and 9 oranges, how many total do I have?”, ChatGPT answered 16, counting 7 bananas and 9 oranges. The correct answer is 23: 7 lbs of bananas at 0.5 lbs each is 14 bananas, plus 9 oranges.
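The arithmetic behind that word problem can be checked with a minimal Python sketch. The function name `count_fruit` and its parameters are made up here for illustration and do not come from the original reporting.

```python
from fractions import Fraction

def count_fruit(banana_total_lbs, lbs_per_banana, orange_count):
    """Return the total number of fruits.

    Hypothetical helper for illustration: the banana count is derived
    by dividing total banana weight by the weight of one banana.
    """
    # Fraction accepts decimal strings, so "0.5" becomes exactly 1/2
    bananas = Fraction(banana_total_lbs) / Fraction(lbs_per_banana)
    return bananas + orange_count

total = count_fruit("7", "0.5", 9)
print(total)  # 23 (14 bananas + 9 oranges)
```

The error described in the article amounts to treating “7 lbs of bananas” as 7 bananas and skipping the division step.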


ChatGPT has difficulty solving word problems. Photo: Wall Street Journal

Or ask ChatGPT who is taller, Shaquille O’Neal or Yao Ming, and the chatbot will correctly report that Yao Ming is 7’6″ tall and Shaquille O’Neal is 7’1″, yet conclude that Shaquille is the taller of the two. The chatbot also miscalculates the square roots of large numbers.

According to The Wall Street Journal, ChatGPT’s limitations with math are to be expected. The chatbot works like sentence autocomplete, only more sophisticated. A supercomputer that is proficient at Mad Libs can be extremely effective at writing a grammatically correct answer to an essay question, but not at solving a math problem. That is ChatGPT’s “Achilles’ heel”.

Professor von Hippel added: “This chatbot acts like an expert, and sometimes it can convincingly pretend to be one. But this tool often gives answers that convincingly mix true, false, and possibly fabricated information.”

According to Debarghya Das, a search engine engineer, ChatGPT getting other questions right but math wrong is like putting a question to a group of people who know nothing about math but are good at gathering information. “If asked, ‘What is 2 + 2,’ they might say, ‘We usually see 4.’ That’s how ChatGPT works,” Das said.

OpenAI CEO Sam Altman once wrote on Twitter: “ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It’s a mistake to be relying on it for anything important right now.”

When initiating a conversation with ChatGPT, the tool warns in advance: “While we have safeguards in place, the system can sometimes generate inaccurate or misleading information.”


ChatGPT incorrectly solves for x in 3x + 4 = 11. Photo: Wall Street Journal
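For reference, the equation in the photo has the exact solution x = 7/3. A minimal Python sketch of the check follows; the helper `solve_linear` is a hypothetical name introduced for illustration, not something from the original reporting.

```python
from fractions import Fraction

def solve_linear(a, b, c):
    """Solve a*x + b = c for x with integer coefficients, assuming a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    # Subtract b from both sides, then divide by a, keeping an exact fraction
    return Fraction(c - b, a)

x = solve_linear(3, 4, 11)
print(x)  # 7/3

# Substitute back into the original equation to verify the solution
assert 3 * x + 4 == 11
```

Using `Fraction` instead of floating-point division keeps the result exact, so the substitution check holds without rounding error.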

“Mathematics is the field most revolutionized by machines that I know of,” said Conrad Wolfram, chief strategy officer of Wolfram Research, the company that developed the problem-solving website Wolfram Alpha. While English teachers are only now worrying about students using computers to “do their homework,” math teachers have long had to make sure that students really understand the math rather than simply using calculators to compute the answers.

“Since the advent of computers, have the concepts of math, science and engineering become simpler? The answer is no, quite the opposite. We constantly have more difficult and complicated questions,” Mr. Wolfram said.

According to the Wall Street Journal, artificial intelligence will eventually be able to answer math questions correctly and with confidence. A pure large language model may not be suited to the job, but the technology will improve. The next generation of AI could combine ChatGPT’s language skills with Wolfram Alpha’s math skills.

In general, however, AI, like computers, will ultimately be most useful to people who already have a clear understanding of a certain field. They know the questions to ask, how to identify gaps, and what to do with the answers provided.

Reference: Wall Street Journal


Source: Genk