It is known that universal compression of strings generated by i.i.d. sources over infinite alphabets entails infinite per-symbol redundancy. Continuing previous work [1], we consider alternative compression schemes that decompose the description of such strings into a description of the symbols appearing in the string and a description of the arrangement the symbols form. We consider two descriptions of the symbol arrangement: shapes and patterns. Roughly speaking, shapes describe the relative magnitude of the symbols, while patterns describe only the order in which they appear. We prove that the per-symbol worst-case redundancy of compressing shapes is a positive constant less than one, and that the per-symbol redundancy of compressing patterns diminishes to zero as the blocklength increases. We also mention some results on sequential pattern compression.
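As a rough illustration of the two arrangement descriptions, the sketch below computes a pattern and a shape for a short string. The function names `pattern` and `shape` are hypothetical. The pattern replaces each symbol by the index of its first appearance. The shape is rendered here as the rank of each symbol among the distinct symbols of the string, which is only one reading of "relative magnitude" and is an assumption, not the paper's formal definition.

```python
def pattern(s):
    """Replace each symbol by the 1-based index of its first appearance."""
    first_seen = {}
    out = []
    for sym in s:
        if sym not in first_seen:
            # A new symbol gets the next unused index.
            first_seen[sym] = len(first_seen) + 1
        out.append(first_seen[sym])
    return out


def shape(s):
    """Replace each symbol by its 1-based rank among the distinct symbols.

    Assumption: "relative magnitude" is taken to mean rank order under
    the alphabet's natural ordering.
    """
    ranks = {sym: r for r, sym in enumerate(sorted(set(s)), start=1)}
    return [ranks[sym] for sym in s]


print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
print(shape("abracadabra"))    # [1, 2, 5, 1, 3, 1, 4, 1, 2, 5, 1]
```

Under this reading, two strings that differ only by a relabeling of symbols share a pattern, while they share a shape only if the relabeling preserves the ordering of the symbols.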