Title |
Comparing String Similarity Algorithms for Recognizing Task Names Found in Construction Documents |
Authors |
Jeong, Sangwon ; Jeong, Kichang |
DOI |
http://dx.doi.org/10.6106/KJCEM.2020.21.6.125 |
Keywords |
Natural Language Processing; String Approximation; String Matching; String Similarity |
Abstract |
The natural language found in construction documents deviates substantially from the wording recommended by the authorities. This lack of consistency discourages integrated research on automation and will hurt the industry's productivity in the long run. This research compares multiple string similarity (string matching) algorithms by their performance in recognizing the same task name written in multiple different ways. We also aim to start a debate on how prevalent this deviation is. Finally, we composed a small dataset that associates construction task names found in practice with corresponding task names whose formatting is less cluttered. We expect that this dataset can be used to validate future natural language processing approaches. |
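As an illustration of the comparison described in the abstract, the following minimal sketch matches a noisy task name against a list of clean task names using two string similarity measures. The abstract does not say which algorithms the paper evaluates, so both measures here (normalized Levenshtein distance and the ratio from Python's `difflib.SequenceMatcher`) are assumptions chosen for illustration. All function names and the example task names are hypothetical and are not taken from the paper's dataset.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and replace every non-alphanumeric character with a space,
    then collapse runs of whitespace. This is an assumed preprocessing step."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def sequence_matcher_similarity(a: str, b: str) -> float:
    """Similarity ratio computed by the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(raw: str, canonical: list[str], similarity) -> str:
    """Return the canonical name most similar to the raw name."""
    query = normalize(raw)
    return max(canonical, key=lambda c: similarity(query, normalize(c)))


if __name__ == "__main__":
    # Hypothetical clean task names and one noisy variant.
    canonical = ["Rebar Installation", "Concrete Pouring", "Formwork Removal"]
    raw = "REBAR-INSTALL. (2F)"
    for measure in (levenshtein_similarity, sequence_matcher_similarity):
        print(measure.__name__, "->", best_match(raw, canonical, measure))
```

Passing the similarity function as a parameter keeps the matching step identical across measures, so differences in results can be attributed to the measure itself. Scoring each measure against a labeled set of noisy and clean name pairs would then give a per-algorithm accuracy.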