Guwahati: Researchers at the Indian Institute of Technology (IIT) Guwahati have developed a multilingual, scalable method to identify and correct Surface Name Errors (SNEs) in Wikipedia, helping improve the reliability of information for both human users and artificial intelligence systems.

Wikipedia is a free, multilingual online encyclopaedia created and maintained by a global community of volunteers through open collaboration.



In a press statement, IIT Guwahati stated that a surface name refers to the text used in Wikipedia articles to mention or link to another entity.

"A Surface Name Error (SNE) occurs when this text is incorrect, for example, using a misspelt word like 'Parise' to link to the page for Paris. A study conducted by the IIT Guwahati research team found that about 3% to 6% of all entity mentions in Wikipedia contain Surface Name Errors. While these errors may appear minor, they have significant implications," said the press statement.
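
To make the definition concrete, the sketch below shows how a surface name and its link target appear together in Wikipedia's wikitext, where a piped link is written as `[[Target|surface name]]`. It is only an illustration and not the IIT Guwahati method, which the statement does not describe. The function name `suspicious_surface_names`, the sample sentence, and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher

# Matches piped wikitext links of the form [[Target|surface name]].
PIPED_LINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")


def suspicious_surface_names(wikitext, threshold=0.8):
    """Return (target, surface) pairs where the surface text is very
    similar to the target title but not identical to it.

    This is a crude spelling heuristic (hypothetical, for illustration):
    a near-miss such as 'Parise' for 'Paris' is flagged, while a clearly
    different description such as 'the country' for 'France' is not.
    """
    flagged = []
    for target, surface in PIPED_LINK.findall(wikitext):
        target, surface = target.strip(), surface.strip()
        if target.lower() == surface.lower():
            continue  # surface name matches the title exactly
        ratio = SequenceMatcher(None, target.lower(), surface.lower()).ratio()
        if ratio >= threshold:
            flagged.append((target, surface))
    return flagged


sample = "She moved to [[Paris|Parise]] and later visited [[France|the country]]."
print(suspicious_surface_names(sample))  # [('Paris', 'Parise')]
```

A character-similarity check like this would also flag some legitimate spelling variants and would miss errors that are not simple misspellings, so it should be read as a picture of the problem rather than a working detector.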



For human users, an incorrect surface name can reduce the perceived credibility and reliability of the information provided.

Similarly, many machine learning and deep learning models use Wikipedia as a core dataset, so errors in surface names can degrade model performance on AI tasks.