Foundations and Trends® in Signal Processing > Vol 6 > Issue 2–3

Pattern Matching in Compressed Texts and Images

Don Adjeroh, Lane Department of Computer Science and Electrical Engineering, West Virginia University, USA, don@csee.wvu.edu Tim Bell, Department of Computer Science and Software Engineering, University of Canterbury, New Zealand, tim.bell@canterbury.ac.nz Amar Mukherjee, School of Electrical Engineering and Computer Science, University of Central Florida, USA, amar@eecs.ecf.edu
 
Suggested Citation
Don Adjeroh, Tim Bell and Amar Mukherjee (2013), "Pattern Matching in Compressed Texts and Images", Foundations and Trends® in Signal Processing: Vol. 6: No. 2–3, pp 97-241. http://dx.doi.org/10.1561/2000000038

Published: 25 Jul 2013
© 2013 D. Adjeroh, T. Bell and A. Mukherjee
 
Subjects
Search,  Text mining,  Information retrieval,  Data compression,  Image and Video Retrieval
 

Free Preview:

Article Help

Share

Download article
In this article:
1. Introduction
2. Search Strategies
3. Relationship Between Searching and Compression
4. Searching Compressed Data: Performance Measurement
5. Searching Compressed Data: Text
6. Searching Compressed Data: Images
7. Directions for Further Research
8. Conclusion
Acknowledgments
References

Abstract

This review provides a survey of techniques for pattern matching in compressed text and images. Normally compressed data needs to be decompressed before it is processed, but if the compression has been done in the right way, it is often possible to search the data without having to decompress it, or at least only partially decompress it. The problem can be divided into lossless and lossy compression methods, and then in each of these cases the pattern matching can be either exact or inexact. Much work has been reported in the literature on techniques for all of these cases, including algorithms that are suitable for pattern matching for various compression methods, and compression methods designed specifically for pattern matching. This work is surveyed in this review. The review also exposes the important relationship between pattern matching and compression, and proposes some performance measures for compressed pattern matching algorithms. Ideas and directions for future work are also described.

DOI:10.1561/2000000038
ISBN: 978-1-60198-684-9
147 pp. $99.00
Buy book
 
ISBN: 978-1-60198-685-6
147 pp. $230.00
Buy E-book
Table of contents:
1. Introduction
2. Search Strategies
3. Relationship Between Searching and Compression
4. Searching Compressed Data: Performance Measurement
5. Searching Compressed Data: Text
6. Searching Compressed Data: Images
7. Directions for Further Research
8. Conclusion
Acknowledgments
References

Pattern Matching in Compressed Texts and Images

Pattern Matching in Compressed Texts and Images surveys and appraises techniques for pattern matching in compressed text and images. Normally compressed data needs to be decompressed before it is processed. If however the compression has been done in the right way, it is often possible to search the data without having to decompress it, or, at least, only partially decompress it. The problem can be divided into lossless and lossy compression methods, and then in each of these cases the pattern matching can be either exact or inexact. Much work has been reported in the literature on techniques for all of these cases. It includes algorithms that are suitable for pattern matching for various compression methods, and compression methods designed specifically for pattern matching. This monograph provides a survey of this work while also identifying the important relationship between pattern matching and compression, and proposing some performance measures for compressed pattern matching algorithms.

Pattern Matching in Compressed Texts and Images is an excellent reference text for anyone who has an interest in the problem of searching compressed text and images. It concludes with a particularly insightful section on the ideas and research directions that are likely to occupy researchers in this field in the short and long term

 
SIG-038