16.3 English Translation Added #302

Open
wants to merge 1 commit into
base: master
31 changes: 31 additions & 0 deletions C16-Greedy-Algorithms/16.3.md
@@ -7,6 +7,11 @@ Prove that a binary tree that is not full cannot correspond to an optimal prefix

更正式的,假设一个节点N不是满的(即它只有一个儿子),不妨设它的父节点为P,它只有左儿子L。则我们总可以将N删掉,并将L直接和P相连,得到一个新的树。这棵新树仍对应一个合法的编码,且N下面所有节点的深度都减少了1,从而新树是比原树更优的一种编码。这与原树是最优编码矛盾。因此一棵不满的二叉树一定不于最优编码对应。

#### Translation
In an optimal prefix code, no internal node of the code tree can be left with a single child, so the tree must be a full binary tree.

More formally, suppose a node N is not full (i.e., it has exactly one child). Without loss of generality, let its parent be P and let its only child be its left child L. Then we can delete N and connect L directly to P to obtain a new tree (if N is the root, L simply becomes the new root). The new tree still corresponds to a valid prefix code, and the depth of every node below N decreases by 1, so, with all character frequencies positive, the new tree has strictly lower cost than the original tree. This contradicts the assumption that the original tree is optimal. Therefore, a binary tree that is not full cannot correspond to an optimal prefix code.

### Exercises 16.3-2
***
What is an optimal Huffman code for the following set of frequencies, based on the first 8 Fibonacci numbers?
@@ -24,6 +29,12 @@ Can you generalize your answer to find the optimal code when the frequencies are

H(n) = 0, H(k) = H(k+1) + 2<sup>n-k</sup> (0 < k < n), H(0) = H(1) + 1

#### Translation

Let the frequencies be the first n+1 Fibonacci numbers f(0) = 1, f(1) = 1, ..., f(n) = f(n-2) + f(n-1), and let the corresponding Huffman codewords be H(0), H(1), ..., H(n). Reading each codeword as a binary number, the figure above can be summarized as:

H(n) = 0, H(k) = H(k+1) + 2<sup>n-k</sup> (0 < k < n), H(0) = H(1) + 1
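
As a quick sanity check of this recurrence, the following Python sketch runs Huffman's algorithm on the first 8 Fibonacci numbers and compares the codeword lengths with those the recurrence implies (length n-k+1 for k >= 1, and length n for k = 0). The helper names `fibonacci` and `huffman_code_lengths` are illustrative and not part of the original solution.

```python
import heapq
from itertools import count


def fibonacci(m):
    """Return the first m Fibonacci numbers with f(0) = f(1) = 1."""
    fib = [1, 1]
    while len(fib) < m:
        fib.append(fib[-1] + fib[-2])
    return fib[:m]


def huffman_code_lengths(freqs):
    """Return the codeword length of each symbol under Huffman's algorithm."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    # Each heap entry: (weight, tiebreak, indices of the symbols in the subtree).
    heap = [(w, next(tiebreak), [i]) for i, w in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every leaf under the merged node moves one level down
        heapq.heappush(heap, (w1 + w2, next(tiebreak), s1 + s2))
    return lengths


n = 7
lengths = huffman_code_lengths(fibonacci(n + 1))
# Lengths implied by the recurrence: H(0) has n bits, H(k) has n - k + 1 bits.
expected = [n] + [n - k + 1 for k in range(1, n + 1)]
print(lengths)              # [7, 7, 6, 5, 4, 3, 2, 1]
print(lengths == expected)  # True
```

The lengths depend only on the shape of the tree, which here is a chain: each merged node is combined with the next Fibonacci number.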

### Exercises 16.3-3
***
Prove that the total cost of a tree for a code can also be computed as the sum, over all internal nodes, of the combined frequencies of the two children of the node.
@@ -46,27 +57,47 @@ T = f(1)d(1) + ... + f(i)d(i+1) + f(i+1)d(i) + ... + f(n)d(n)

有T - S = (f(i) - f(i+1))(d(i+1) - d(i)) < 0 <=> T < S。所以交换后得到一个代价更小的编码。这样,我们可以一直交换,直到所有编码长度变为递增的,且其对应的编码也是最优的。

#### Translation

Let the character frequencies be f(1), f(2), ..., f(n) and the corresponding codeword lengths be d(1), d(2), ..., d(n), where f(1) >= f(2) >= ... >= f(n). If the codeword lengths are not monotonically non-decreasing, then there exists an i such that d(i) > d(i+1). We can exchange the codewords assigned to characters i and i+1. The total cost before the exchange is

S = f(1)d(1) + ... + f(i)d(i) + f(i+1)d(i+1) + ... + f(n)d(n)

The total cost after the exchange is

T = f(1)d(1) + ... + f(i)d(i+1) + f(i+1)d(i) + ... + f(n)d(n)

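Then T - S = (f(i) - f(i+1))(d(i+1) - d(i)) <= 0, because f(i) >= f(i+1) and d(i+1) < d(i); that is, T <= S. The exchange therefore never increases the cost, and since the original code is optimal, the new code is optimal as well. Each exchange removes one inversion from the sequence of lengths, so we can repeat it until the codeword lengths are non-decreasing, and the resulting code is still optimal.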
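
The exchange argument can be sketched in Python as a bubble-sort pass over the codeword lengths that checks the cost after every swap. The function names and the sample frequencies and lengths are illustrative assumptions, not data from the original solution.

```python
def total_cost(freqs, lengths):
    """Cost of a code: sum of frequency times codeword length."""
    return sum(f * d for f, d in zip(freqs, lengths))


def sort_lengths_by_exchange(freqs, lengths):
    """Swap adjacent lengths with d[i] > d[i+1] until none remain.

    Assumes freqs is sorted in non-increasing order.
    """
    lengths = list(lengths)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(lengths) - 1):
            if lengths[i] > lengths[i + 1]:
                before = total_cost(freqs, lengths)
                lengths[i], lengths[i + 1] = lengths[i + 1], lengths[i]
                # The exchange never increases the cost (T <= S).
                assert total_cost(freqs, lengths) <= before
                swapped = True
    return lengths


freqs = [45, 16, 13, 12, 9, 5]   # non-increasing frequencies (sample values)
lengths = [1, 3, 3, 4, 3, 4]     # a length assignment with one inversion
fixed = sort_lengths_by_exchange(freqs, lengths)
print(fixed)                                                  # [1, 3, 3, 3, 4, 4]
print(total_cost(freqs, lengths), total_cost(freqs, fixed))   # 227 224
```

Because the swaps only permute the multiset of lengths, the result still satisfies the Kraft inequality whenever the input does, so it remains realizable as a prefix code.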

### Exercises 16.3-5
***
Suppose we have an optimal prefix code on a set C = {0, 1, ..., n - 1} of characters and we wish to transmit this code using as few bits as possible. Show how to represent any optimal prefix code on C using only 2n - 1 + n ⌈lg n⌉ bits. (Hint: Use 2n - 1 bits to specify the structure of the tree, as discovered by a walk of the tree.)

### `Answer`
用**2n-1**位表示树的结构,内部节点用1表示,叶子节点用0表示.用nlog(n)为表示字母序列,每个字母的二进制编码长度为log(n),总共需要nlog(n)位.

#### Translation

The structure of the tree is represented with **2n-1** bits: a full binary tree with n leaves has n-1 internal nodes, so a preorder walk that writes 1 for each internal node and 0 for each leaf emits exactly 2n-1 bits. The characters are then listed in the order in which their leaves are visited, each written in binary with ⌈lg n⌉ bits, for n⌈lg n⌉ bits in total.
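
A minimal Python sketch of this representation follows. It assumes the code tree is given as nested 2-tuples with integer characters 0..n-1 at the leaves and that n >= 2; `encode_tree` and `decode_tree` are hypothetical helper names chosen for illustration.

```python
def encode_tree(tree, n):
    """Serialize a full binary code tree over characters 0..n-1 as a bit string."""
    width = (n - 1).bit_length()  # equals ceil(lg n) for n >= 2
    shape, labels = [], []

    def walk(node):
        if isinstance(node, tuple):   # internal node: write 1, then both subtrees
            shape.append("1")
            walk(node[0])
            walk(node[1])
        else:                         # leaf: write 0 and record its character
            shape.append("0")
            labels.append(f"{node:0{width}b}")

    walk(tree)
    return "".join(shape) + "".join(labels)


def decode_tree(bits, n):
    """Rebuild the nested-tuple tree from the bit string made by encode_tree."""
    width = (n - 1).bit_length()
    shape, labels = bits[:2 * n - 1], bits[2 * n - 1:]
    chars = iter(int(labels[i:i + width], 2) for i in range(0, len(labels), width))
    pos = 0

    def build():
        nonlocal pos
        bit = shape[pos]
        pos += 1
        if bit == "1":
            left = build()
            right = build()
            return (left, right)
        return next(chars)

    return build()


tree = ((0, (3, 1)), (2, 4))   # sample tree on n = 5 characters
bits = encode_tree(tree, 5)
print(len(bits))                     # 24, i.e. 2*5 - 1 + 5*3
print(decode_tree(bits, 5) == tree)  # True
```

The first 2n-1 bits alone determine the shape of the tree, which is why the character labels can follow as a flat list with no separators.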

### Exercises 16.3-6
***
Generalize Huffman's algorithm to ternary codewords (i.e., codewords using the symbols 0, 1, and 2), and prove that it yields optimal ternary codes.

### `Answer`
那就推广到树的结点有三个孩子结点,证明过程同引理16.3的证明.

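#### Translation

Generalize the algorithm so that each internal node of the tree has three children: at each step, merge the three nodes of lowest frequency. If the number of characters is even, first add a dummy character of frequency 0 so that every merge can take exactly three nodes. The proof follows the same lines as the proof of Lemma 16.3.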
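
One way to sketch the ternary generalization in Python is shown below. It computes only codeword lengths, and it assumes the usual fix for an even number of characters: padding with one zero-frequency dummy so that every merge takes exactly three nodes. The function name and the sample frequencies are illustrative.

```python
import heapq
from itertools import count


def ternary_huffman_lengths(freqs):
    """Codeword lengths from a ternary Huffman merge (three nodes per step)."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), [i]) for i, w in enumerate(freqs)]
    # Assumption: each merge removes 2 nodes net, so an even node count is
    # padded with one zero-frequency dummy that carries no real symbol.
    if len(heap) % 2 == 0:
        heap.append((0, next(tiebreak), []))
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        merged_weight, merged_symbols = 0, []
        for _ in range(3):
            weight, _tb, symbols = heapq.heappop(heap)
            merged_weight += weight
            merged_symbols += symbols
        for i in merged_symbols:
            lengths[i] += 1  # every leaf under the new node moves one level down
        heapq.heappush(heap, (merged_weight, next(tiebreak), merged_symbols))
    return lengths


freqs = [1, 1, 2, 3, 5, 8]   # six characters, so one dummy is added
lengths = ternary_huffman_lengths(freqs)
print(lengths)                                       # [3, 3, 2, 2, 1, 1]
print(sum(f * d for f, d in zip(freqs, lengths)))    # 29
```

Without the dummy, the last merge for an even number of characters would combine only two nodes at the root, wasting one of the shortest codeword slots.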

### Exercises 16.3-7
***
Suppose a data file contains a sequence of 8-bit characters such that all 256 characters are about as common: the maximum character frequency is less than twice the minimum character frequency. Prove that Huffman coding in this case is no more efficient than using an ordinary 8-bit fixed-length code.

### `Answer`
此时生成的Huffman树是一颗满二叉树,跟固定长度编码一致.

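#### Translation

The Huffman tree generated in this case is a complete binary tree with all 256 leaves at depth 8, so every codeword has 8 bits, exactly as in the fixed-length code. The reason is that any merged pair has frequency at least twice the minimum, which exceeds the maximum, so all 256 characters are paired up before any merged node is used; the 128 resulting nodes satisfy the same condition, and the argument repeats level by level.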
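
The claim can be checked empirically with a short Python sketch. The frequencies below are made-up values chosen only to satisfy the condition that the maximum is less than twice the minimum, and `huffman_code_lengths` is an illustrative helper name.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return the codeword length of each symbol under Huffman's algorithm."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), [i]) for i, w in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every leaf under the merged node moves one level down
        heapq.heappush(heap, (w1 + w2, next(tiebreak), s1 + s2))
    return lengths


# Hypothetical frequencies for 256 characters: min 256, max 511 < 2 * 256.
freqs = [256 + i for i in range(256)]
lengths = huffman_code_lengths(freqs)
print(max(freqs) < 2 * min(freqs))  # True
print(set(lengths))                 # {8}
```

Every character receives an 8-bit codeword, so the Huffman code uses the same number of bits as the fixed-length code.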

### Exercises 16.3-8
***
Show that no compression scheme can expect to compress a file of randomly chosen 8-bit characters by even a single bit. (Hint: Compare the number of files with the number of possible encoded files.)