Title Comparing String Similarity Algorithms for Recognizing Task Names Found in Construction Documents
Authors Jeong, Sangwon ; Jeong, Kichang
DOI http://dx.doi.org/10.6106/KJCEM.2020.21.6.125
Page pp.125-134
ISSN 2005-6095
Keywords Natural Language Processing; String Approximation; String Matching; String Similarity
Abstract The natural language found in construction documents often deviates substantially from the terminology recommended by the authorities. Such inconsistent practice discourages integrated research on automation and will hurt the industry's productivity in the long run. This research compares multiple string similarity (string matching) algorithms on how well each recognizes the same task name written in multiple different ways. We also aim to start a debate on how prevalent this deviation is. Finally, we composed a small dataset that associates construction task names found in practice with corresponding task names whose formatting is less cluttered. We expect that this dataset can be used to validate future natural language processing approaches.
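To make the comparison described in the abstract concrete, the following is a minimal sketch of how two string similarity measures could be used to map a task name as written in a document onto a cleaner reference name. The abstract does not state which algorithms the paper evaluates, so the two measures here (normalized Levenshtein similarity and the ratio from Python's `difflib.SequenceMatcher`) are assumptions chosen for illustration. The function names and the sample task names are likewise hypothetical and are not taken from the paper's dataset.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def sequence_similarity(a: str, b: str) -> float:
    """Similarity ratio computed by the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(raw_name: str, reference_names: list[str], similarity) -> str:
    """Return the reference name most similar to the raw name (case-insensitive)."""
    return max(
        reference_names,
        key=lambda ref: similarity(raw_name.lower(), ref.lower()),
    )


if __name__ == "__main__":
    # Hypothetical examples, not entries from the paper's dataset.
    references = ["Rebar installation", "Concrete pouring"]
    raw = "rebar instal."
    for measure in (levenshtein_similarity, sequence_similarity):
        print(measure.__name__, "->", best_match(raw, references, measure))
```

Passing the similarity function as an argument keeps the matching step identical across measures, so that any difference in results comes from the measure itself rather than from the surrounding code.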