All files are in eight bit bytes. The Huffman table can be any vaild table including dynamic huffman tables. There should between 128 and 513 symbols represented by the table. This is to force a tree path that has at least an 8 bit stream of bits to represent the least likely occuring symbol and to force the shortest path to a leaf from any internal node to a leaf of eight bits or less. I don't count the top node as an internal node. In my code there are exactly 256 leaves all the time. Note the numbers between 128 and 513 were to guarantee the "one to one" nature of the compression decompress routines. A tree as small as 9 leaves could work if there was a path of 8 in it. Also a tree as large as any finite N number of leaves could be such that the shortest path to any leaf from an internal node is 8 or less.

The rules for the current method are as follows ( looking at decompression since like describing inverse of a nonsingular marix it might be better to look at matrix multiply)

__Fact one: __My huffman tree such that the all zero token is at least 8 bits long.

__Fact two:__ My huffman tree such that the shortest path form any internal node
to a leaf never contains a string of more than 8 ones next to each other.

__Fact three:__ A one byte huffman file has no hidden bits and decompresses
to one byte since the start tree has 256 leaves all at a lenght of 8 bits.

__Rule one: __To decompress if the tokens end such that file being decompressed
end on a full byte your are done no problem.

Example ...1 11110000 positions labeled 0 to 7 and in position 3 the token S = 10000 is last decoded you have ended decompression last symbol
out is a "S".

__Rule two: __If the last huffman token does not start in the last 7 bits of last byte of compressed file and yet when at end of byte your at an internal
node of the huffman tree you supply the missing ones since there are at most
8 ones that where chopped off during the compression of that compressed file.

Example ...1 11110000 the token E = 111100001111 was last token used
in making the compressed file and started on bit 0 of the last byte of file so the
last symbol out of the decompression is "E" the trailing ones where chopped off.

__Rule Three:__ If the last huffman token does start in the last 7 bits of the
last byte and ends at an internal node of the tree but the portion of the
token on the last byte of compressed file has at least one or more bits
that are the one bit. Then as in rule two you supply the hidden ones that
where chopped off the missing file.

Example ...1 11110000 postion 3 the start of last token and in this
case X = 100001 so in this case the symbol used in the
rule 1 example could not have existed in the current
huffman tree so token did not end at file end and the
hidden ones would be supplied and last
symbol out is "X"

__Rule Four:__ ( hardest rule of all) If the last byte has the last token start in
the last 7 bits of the last byte and only contains zeros on that byte
. You assume that there are hidden ones
that where chopped off during compression. Only if the the file that is
one token shorter than this one would have had hidden bits cutoff.
This rule means that the all zero start of a last token in the last 7 bits
could depend on what would have happened to a the next shorter file
and that file status could be a function even farther back in the file.
It was the lack of focus on my part that made me miss this
recursive rule for a while. And I kept getting more and more complicated
specail cases. I hope this solves this.

Example ...1 11110000 M =000011 starts in position 4 but previous token
was F = 11111 so since if file shorter by one token cut off would
have occurred there fore the M the last appearant token is
used and is the last charater out..

__Rule Five: __The only case left is the case where a token starts some
where in the last 7 postions of the last byte and contains all zero
bits in its positions of that last byte. And that the previous file had
it ended on the previous token would not have ones cut off during
compression. In this case and only this case the 1 to 7 trailing
zeros which are not a complete symbol mark the end of file and
are not used to form a new character for the output file.

Example ...1 11110000 M=0011and starts in position 6 but the
existance of this M not important since F = 1100 is previous token
and no cutoff would have occured so the 0000 at end
tells us that the F was the actual last character
in the uncompressed file.

here they are ASCII ARMOR 10 NEW NO PAGE YET

here they are ASCII ARMOR 26 NEW NO PAGE YET

ENTER here LINK TO MY ENCRYPTION PAGES

ENTER here MY NEW STYLE BIJECTIVE ARITHMETIC CODING
* updated july 28,2004
*

ENTER here MY NEW STYLE BIJECTIVE REVERSE ARITHMETIC CODING
* updated August 10,2004
*

ENTER here SCOTT STYLE BIEJCTIVE LZW COMPRESSION

ENTER here VITTER STYLE ADAPTIVE HUFFMAN COMPRESSOR

ENTER here FIRST ORDER BIJECTIVE STATIC ENGLISH COMPRESSOR

ENTER here SECOND ORDER BIJECTIVE STATIC ENGLISH COMPRESSOR

** OUTDATED:**MODED FAST RUSSIAN HUFFMAN CODE

this works on my machine but not on everyones I plan to update this by moding his current version some day

ENTER here THE BIJECTIVE COMPRESSION PROGRAM THAT STARTED IT ALL

ENTER here BWT COMPRESSION FOR ENCRYPTION

ENTER here for dicussion of the "DSC" The data sensitive combiner

ENTER here for alternate ''one to one'' HUFFMAN CODING

ENTER here for my best''one to one'' conditioned HUFFMAN CODING

** OUTDATED:** ENTER here for conditioned HUFFMAN CODING

** OUTDATED:** ENTER here for ''one to one'' conditioned HUFFMAN CODING

ENTER here for "one to one" FINITELY ODD Transforms and "one to one adaptive Huffman Compression

UNADULTERATED RLE COMPRESSION page is here

ENTER here for rejected on first pass ACM paper

ENTER here for Matt Timmermans site

David A. Scott, biject.bwts@gmail.com