|
Problem 98: Lempel-Ziv-Welch Decoding |
|
The Lempel-Ziv-Welch compression method is based on a type of Lempel-Ziv factorisation. It consists in encoding repeating factors of the input text by their code in a dictionary $D$ of words. The dictionary, initialised with all the letters of the alphabet $A$, is prefix-closed: every prefix of a word in the dictionary is in it.
Here is the encoding algorithm in which $\textit{code}_D(w)$ is the index of the factor $w$ in the dictionary $D$.
\begin{algorithmic}
\STATE $D\leftarrow A$
\STATE $w\leftarrow \mbox{first letter of } input$
\WHILE{$\mbox{not end of } input$}
\STATE $a\leftarrow \mbox{next letter of } input$
\IF{$wa\in D$}
\STATE $w\leftarrow wa$
\ELSE
\STATE write $\textit{code}_D(w)$
\STATE $D\leftarrow D\cup\{wa\}$
\STATE $w\leftarrow a$
\ENDIF
\ENDWHILE
\STATE write $\textit{code}_D(w)$
\end{algorithmic}
The decompression algorithm reads the sequence of codes produced by the encoder and updates the dictionary similarly to the way the encoder does.
\begin{algorithmic}
\STATE $D\leftarrow A$
\WHILE{$\mbox{not end of } input$}
\STATE $i\leftarrow \mbox{next code of } input$
\STATE $w\leftarrow \mbox{factor of code } i \mbox{ in } D$
\STATE write $w$
\STATE $a\leftarrow \mbox{first letter of next decoded factor}$
\STATE $D\leftarrow D\cup\{wa\}$
\ENDWHILE
\end{algorithmic}
The question highlights the only critical situation encountered by the decoder. The property provides the element to ensure it can correctly decode its input.