AI vs humans: Accounting students trump ChatGPT in latest study

The study was conducted by researchers from Brigham Young University in the United States. 
Picture Courtesy: Unsplash/TNIE

Accounting students outperformed OpenAI's chatbot, ChatGPT, on exams, according to a new study. However, the researchers noted that ChatGPT's performance was remarkable, describing it as a "game changer" that will revolutionize the way people teach and learn, yielding positive outcomes, PTI reported.

Where was the study conducted?

The researchers from Brigham Young University (BYU), United States, and 186 other universities wanted to know how OpenAI's technology would fare on accounting exams. Their findings were published in the journal Issues in Accounting Education.

How was the bot's performance?

On the researchers' accounting exams, students scored an overall average of 76.7 per cent, compared to ChatGPT's 47.4 per cent. ChatGPT scored higher than the student average on 11.3 per cent of the questions, doing particularly well on accounting information systems (AIS) and auditing, but it performed worse on tax, financial, and managerial assessments.

Researchers think this could be because ChatGPT struggled with the mathematical processes those assessments require.

The AI bot, which uses machine learning to generate natural language text, also did better on true/false questions, answering 68.7 per cent correctly, and on multiple-choice questions, answering 59.5 per cent correctly. It struggled with short-answer questions, however, scoring only between 28.7 and 39.1 per cent.

Was ChatGPT factually correct?

In general, the researchers said that higher-order questions were harder for ChatGPT to answer. In fact, it sometimes provided authoritative written descriptions for incorrect answers, or answered the same question in different ways.

They also found that ChatGPT often provided explanations for its answers even when they were incorrect. At other times, it selected the wrong multiple-choice answer despite providing an accurate description.

Importantly, the researchers noted that ChatGPT sometimes made up facts. For example, when asked for a reference, it generated a real-looking citation that was completely fabricated; the work, and sometimes the authors, did not exist. The bot also made nonsensical mathematical errors, such as adding two numbers in a subtraction problem or dividing numbers incorrectly.

Wanting to add to the intense ongoing debate about how models like ChatGPT should factor into education, lead study author David Wood, a BYU professor of accounting, decided to recruit as many professors as possible to see how the AI fared against actual university accounting students.

His co-author recruiting pitch on social media exploded: 327 co-authors from 186 educational institutions in 14 countries participated in the research, contributing 25,181 classroom accounting exam questions.

They also recruited undergraduate BYU students to feed another 2,268 textbook test bank questions to ChatGPT. The questions covered AIS, auditing, financial accounting, managerial accounting and tax, and varied in difficulty and type (true/false, multiple choice, short answer), reported the PTI.  

EdexLive
www.edexlive.com