Generative AI Still Not as Good as Humans at Summarizing Document Content



The boom in generative artificial intelligence (AI) has left many worried that their careers are in jeopardy. AI can summarize documents hundreds of pages long in just a few minutes, whereas a human needs several hours. To assess how well AI could serve government agencies, the Australian Securities and Investments Commission (ASIC) conducted a study comparing its document summaries against those produced by humans.


Several AI models were tried in early testing through Amazon AWS, and the Llama2-70B model was chosen for the final stage because it performed best. The AI was then asked to summarize five documents submitted to ASIC, focusing on recommendations and references to legislation, citing the page numbers the information was taken from, and providing context.
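The article does not say how the model was invoked, so the short Python sketch below is illustrative only: it assumes access to Llama2-70B through Amazon Bedrock's invoke_model API (the article only mentions "Amazon AWS"), and the prompt simply restates the requirements described above; the model ID, region, and parameters are assumptions.

import json
import boto3

# Assumption: the model is reached via Amazon Bedrock's runtime API.
bedrock = boto3.client("bedrock-runtime", region_name="ap-southeast-2")

def summarize(document_text: str) -> str:
    # Prompt reconstructed from the requirements listed in the article:
    # recommendations, references to legislation, page numbers, and context.
    prompt = (
        "Summarize the following submission. Focus on any recommendations and "
        "references to legislation, cite the page number each point is taken "
        "from, and briefly explain the surrounding context.\n\n"
        f"{document_text}"
    )
    response = bedrock.invoke_model(
        modelId="meta.llama2-70b-chat-v1",  # hypothetical choice of Llama 2 70B Chat
        body=json.dumps({
            "prompt": prompt,
            "max_gen_len": 1024,
            "temperature": 0.2,
        }),
    )
    # Bedrock returns the generated text in the "generation" field for Llama 2 models.
    return json.loads(response["body"].read())["generation"]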



The same instructions were given to 10 ASIC staff. The summaries produced were then judged by a group of reviewers, who were not told that some of them were AI-generated. Each summary was scored, and on average the human-written summaries scored 81% while those produced by the AI scored 47%.


The AI summaries often failed to pick up important context and nuance in the reports, and frequently gave emphasis to topics of little importance. The instruction to cite the page numbers of the information used was also sometimes ignored, even though it was part of the prompt. Three out of five reviewers said they could tell which summaries had been generated by the AI.


Although the study shows that AI cannot yet match humans at summarizing documents, the researchers emphasize that Llama2-70B is not the latest AI model. Newer models and more detailed prompts may improve AI summarization capabilities in the future.
