The Evaluation of Text String Matching Algorithms as an Aid to Image Search

Ochelska-Mierzejewska, Joanna

doi:https://doi.org/10.34658/jacs.2018.26.1.33-62

Files

2_Evaluat_text_Ocelska-Mierzejewska_2018.pdf (674 KB)

Date

2018

Authors

Ochelska-Mierzejewska, Joanna

Publisher

Wydawnictwo Politechniki Łódzkiej
Lodz University of Technology Press

Abstract

The main goal of this paper is to analyse intelligent text string matching methods (such as fuzzy sets and relations) and evaluate their usefulness for image search. The study examines the ability of different algorithms to handle multi-word and multi-sentence queries. Eight similarity measures (N-gram, Levenshtein distance, Jaro coefficient, Dice coefficient, Overlap coefficient, Euclidean distance, Cosine similarity and Jaccard similarity) are employed to analyse the algorithms in terms of time complexity and accuracy of results. The outcomes are used to develop a hierarchy of methods, illustrating their usefulness for image search. The search response time increases significantly for data sets containing several thousand images. The findings indicate that the analysed algorithms do not fulfil the response-time requirements of professional applications. Due to its limitations, the proposed system should be considered only as an illustration of a novel solution with prospects for further development. The use of Polish as the language of the experiments affects the accuracy of the measures. This limitation seems easy to overcome for languages with simpler grammar rules (e.g. English).
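
As a rough illustration of the kind of measures named in the abstract, the following minimal Python sketch implements five of them (Levenshtein distance, and the Jaccard, Dice, Overlap and Cosine measures) and uses one to rank image descriptions against a query. It is not the paper's implementation: the function names, the use of character bigrams as the compared units, and the sample captions are all assumptions made for this example.

```python
from collections import Counter
import math


def ngrams(text, n=2):
    """Return the character n-grams of a lowercased string (assumed unit of comparison)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def jaccard(a, b, n=2):
    """|X ∩ Y| / |X ∪ Y| over n-gram sets."""
    x, y = set(ngrams(a, n)), set(ngrams(b, n))
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def dice(a, b, n=2):
    """2|X ∩ Y| / (|X| + |Y|) over n-gram sets."""
    x, y = set(ngrams(a, n)), set(ngrams(b, n))
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def overlap(a, b, n=2):
    """|X ∩ Y| / min(|X|, |Y|) over n-gram sets."""
    x, y = set(ngrams(a, n)), set(ngrams(b, n))
    if not x or not y:
        return 0.0
    return len(x & y) / min(len(x), len(y))


def cosine(a, b, n=2):
    """Cosine of the angle between n-gram count vectors."""
    x, y = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    dot = sum(x[g] * y[g] for g in x)
    norm = math.sqrt(sum(v * v for v in x.values())) * math.sqrt(
        sum(v * v for v in y.values())
    )
    return dot / norm if norm else 0.0


def rank(query, captions, measure=dice):
    """Sort hypothetical image captions by descending similarity to the query."""
    return sorted(captions, key=lambda c: measure(query, c), reverse=True)


if __name__ == "__main__":
    captions = ["a red car", "blue sky", "red cat on a carpet"]
    for caption in rank("red car", captions):
        print(f"{dice('red car', caption):.2f}  {caption}")
```

Scoring every caption against every query in this way takes time proportional to the size of the collection, which is consistent with the abstract's remark that response time grows significantly for several thousand images.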

Keywords

text comparison, N-gram, Levenshtein distance, Jaro coefficient, Dice’s coefficient, Overlap coefficient, Euclidean distance, Cosine similarity, Jaccard similarity

Citation

Ochelska-Mierzejewska, J. (2018). The Evaluation of Text String Matching Algorithms as an Aid to Image Search. Journal of Applied Computer Science, 26(1), 33-62. https://doi.org/10.34658/jacs.2018.26.1.33-62

URI

http://hdl.handle.net/11652/3882
https://doi.org/10.34658/jacs.2018.26.1.33-62

Collections

2018, Vol. 26, No. 1
Articles (WFTIMS)
