EPIIC: a novel encoding pluggable lossless data compression algorithm


Thesis Type: Master's

Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2018

Student: TAYLAN İSMAİL DOĞAN

Advisor: YUSUF SAHİLLİOĞLU

Abstract:

Encoding pluggable inverted index compression (EPIIC) is a novel lossless data compression algorithm that applies a pipeline of conventional compression techniques to files that have been transformed into a structure similar to an inverted index. The novelty of this study lies in compressing the positions (indexes) of the hexadecimals that make up a file, rather than compressing the original file directly. By leveraging the underlying inverted index structure, we avoid storing the positional data of the most frequent hexadecimal in a file. Moreover, a slightly modified variant of run-length encoding is used to make the data even more compressible. As a result, this new notion of compression is observed to perform on a par with widely known algorithms such as LZMA and bzip2, especially on text and XML files.