Engineering the counting filter for string matching algorithms




Tytuł czasopisma

ISSN czasopisma

Tytuł tomu


Wydawnictwo Politechniki Łódzkiej
Lodz University of Technology Press


We consider a new approach to the popular counting filter technique for approximate pattern matching. Our solution is based on q-grams combined with alphabet reduction by bin packing and using Streaming SIMD Extensions (SSE). We present a few variants that use the mentioned techniques and discuss pros and cons of them considering two approximate pattern matching problems. The first one is the well- known matching with k-differences and the second one is a biological problem of DNA sequence mutation called matching with inversions and translocations. The experimental results show the effectiveness of our ideas that speed up the counting filter and reduce the number of verifications by orders of magnitude.


Słowa kluczowe

approximate pattern matching, counting filter, string algorithms, q-grams, inversions and translocations, k-differences, SSE, przybliżone dopasowanie do wzorca, filtr liczący, algorytmy ciągów, q-gramy, inwersje i translokacje, k-różnice


Susik R., Grabowski Sz., Engineering the counting filter for string matching algorithms. W: Algorithms, Networking and Sensing for Data Processing, Mobile Computing and Applications, Romanowski A. (red.), Sankowski D. (red), Sikora J. (red.), Lodz University of Technology Press, Łódź 2016, s. 125-140, ISBN 978-83-7283-738-7.