David Scott's BIJECTIVE STATIC ARITHMETIC COMPRESSION for SECOND ORDER ENGLISH with SPACES
files updated on June 10, 2006
This page describes a static entropy coder with second order frequencies
based on ENGLISH. The complete package is in the bijective second order english arithmetic compressor.
It is bijective, so any 8-bit binary file can be thought of as a
compressed file made entirely of the letters:
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
and
single embedded spaces -- not leading or trailing spaces.
If you need to know, this is based on a slight editing of
my first order bijective entropy compressor.
Go to that page for further information. The tables
for this compressor came from
http://www.data-compression.com/english.html. I
made this since some people wanted spaces, and I felt that I might as well make it second
order.
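The idea of second order frequencies can be illustrated with a minimal sketch. It assumes that "second order" here means each symbol's frequency is conditioned on the symbol just before it, over the 27-symbol alphabet of A-Z plus space. The function names, the add-one smoothing, the start-of-message convention, and the table built from a sample sentence are all hypothetical stand-ins; the real compressor uses fixed tables taken from the page linked above.

```python
import math
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # 26 letters plus space

def bigram_counts(sample: str) -> Counter:
    """Count how often each symbol follows each other symbol in the sample."""
    return Counter(zip(sample, sample[1:]))

def ideal_bits(message: str, counts: Counter) -> float:
    """Ideal code length of message, in bits, under the conditional model."""
    bits = 0.0
    prev = " "  # assumption: the first symbol is treated as following a space
    for ch in message:
        # Add-one smoothing (an assumption) keeps unseen pairs encodable.
        row_total = sum(counts[(prev, c)] for c in ALPHABET) + len(ALPHABET)
        p = (counts[(prev, ch)] + 1) / row_total
        bits += -math.log2(p)
        prev = ch
    return bits

# Stand-in statistics built from one sentence, not the real English tables.
counts = bigram_counts("THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG")
likely = ideal_bits("THE", counts)
unlikely = ideal_bits("XQZ", counts)
```

With these stand-in counts, a common sequence such as "THE" costs fewer bits than an unusual one such as "XQZ", which is the effect a second order table is meant to capture.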
HERE ARE SOME EXAMPLES:
TEST FILE 1 TAKEN FROM John Savard's Site with added spaces
0000 20 4E 4F 57 20 49 53 20 20 54 48 45 20 54 49 4D * NOW IS THE TIM*
0010 45 20 0D 0A . . . . . . . . . . . . *E ..*
number of bytes is 20
THIS IS FIRST ORDER BIJECTIVE ARITHMETIC COMPRESSION OF FILE 1
0000 E2 B9 5C FF 06 B4 . . . . . . . . . . *..\...*
number of bytes is 6
THIS IS THE UNCOMPRESS. NOTE: ONLY USES LETTERS "A-Z"
0000 4E 4F 57 49 53 54 48 45 54 49 4D 45 . . . . *NOWISTHETIME*
number of bytes is 12
THIS IS SECOND ORDER BIJECTIVE ARITHMETIC COMPRESSION OF FILE 1
NOTE 6 BYTES LIKE THE FIRST ORDER BUT INCLUDES 3 SPACES
0000 8A B9 8A 46 E5 28 . . . . . . . . . . *...F.(*
number of bytes is 6
THIS IS UNCOMPRESS OF ABOVE. USES LETTERS "A-Z" and single internal spaces
0000 4E 4F 57 20 49 53 20 54 48 45 20 54 49 4D 45 . *NOW IS THE TIME*
number of bytes is 15
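The difference between the 20-byte TEST FILE 1 and its 15-byte round trip can be reproduced with a short sketch of the alphabet restriction. This is not the compressor itself. The function name `normalize` is hypothetical, and it assumes that bytes other than A-Z and space (such as CR and LF) are simply dropped, which matches the dumps above but is not confirmed by the source code.

```python
def normalize(raw: bytes) -> bytes:
    """Reduce raw bytes to A-Z words joined by single spaces."""
    # Assumption: any byte that is not A-Z or a space is discarded.
    kept = bytes(b for b in raw if b == 0x20 or 0x41 <= b <= 0x5A)
    # split() with no argument splits on runs of spaces and drops empty
    # pieces, which removes leading, trailing, and doubled spaces.
    return b" ".join(kept.split())

# The 20 bytes of TEST FILE 1, copied from the hex dump above.
file1 = bytes.fromhex(
    "20 4E 4F 57 20 49 53 20 20 54 48 45 20 54 49 4D 45 20 0D 0A"
)
round_trip = normalize(file1)
```

Here `round_trip` holds the same 15 bytes as the second order uncompress of FILE 1.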
FILE 2 A TEST OF ALL THE CHARACTERS "A-Z"
0000 20 54 48 45 20 51 55 49 43 4B 20 42 52 4F 57 4E * THE QUICK BROWN*
0010 20 46 4F 58 20 4A 55 4D 50 53 20 4F 56 45 52 20 * FOX JUMPS OVER *
0020 54 48 45 20 4C 41 5A 59 20 44 4F 47 0D 0A . . *THE LAZY DOG..*
number of bytes is 46
THIS IS FIRST ORDER BIJECTIVE ARITHMETIC COMPRESSION OF FILE 2
0000 4C 31 D9 EC E6 2E FA AB 1C 29 10 53 97 56 98 1E *L1.......).S.V..*
0010 A0 9C 3F 9C 10 C6 E8 . . . . . . . . . *..?....*
number of bytes is 23
THE FIRST ORDER UNCOMPRESS. NOTE: NO SPACES
0000 54 48 45 51 55 49 43 4B 42 52 4F 57 4E 46 4F 58 *THEQUICKBROWNFOX*
0010 4A 55 4D 50 53 4F 56 45 52 54 48 45 4C 41 5A 59 *JUMPSOVERTHELAZY*
0020 44 4F 47 . . . . . . . . . . . . . *DOG*
number of bytes is 35
THIS IS SECOND ORDER BIJECTIVE ARITHMETIC COMPRESSION OF FILE 2
NOTE SHORTER THAN FIRST ORDER AND INCLUDES 8 SPACE CHARACTERS
0000 CC D6 0B 7E 4F 0C 96 81 4D 45 FE E1 E1 1B DF 1C *...~O...ME......*
0010 54 82 AE 91 30 . . . . . . . . . . . *T...0*
number of bytes is 21
THIS IS UNCOMPRESS OF ABOVE. USES LETTERS "A-Z" and single internal spaces
0000 54 48 45 20 51 55 49 43 4B 20 42 52 4F 57 4E 20 *THE QUICK BROWN *
0010 46 4F 58 20 4A 55 4D 50 53 20 4F 56 45 52 20 54 *FOX JUMPS OVER T*
0020 48 45 20 4C 41 5A 59 20 44 4F 47 . . . . . *HE LAZY DOG*
number of bytes is 43
THIS IS A TEST MESSAGE BETWEEN TWO LOVERS USING SIMPLE STEALTH ENCRYPTION.
IT WILL LOOK LIKE AN EASIER ENCRYPTION, BUT UNLESS ONE KNOWS WHAT YOU
DID IT IS VERY HARD TO BREAK.
0000 20 49 20 57 49 4C 4C 20 53 45 45 20 20 59 4F 55 * I WILL SEE YOU*
0010 20 41 54 20 4E 4F 4F 4E 20 20 44 4F 20 4E 4F 54 * AT NOON DO NOT*
0020 20 54 45 4C 4C 20 59 4F 55 52 20 44 41 44 0D 0A * TELL YOUR DAD..*
0030 0D 0A . . . . . . . . . . . . . . *..*
number of bytes is 50
THIS IS FILE AFTER ARI2A.EXE
0000 73 46 A2 F7 86 95 1C 35 34 DC D9 8E C2 2D 87 78 *sF.....54....-.x*
0010 86 FC CB CC . . . . . . . . . . . . *....*
number of bytes is 20
THIS IS FILE AFTER UNARIA.EXE
0000 55 56 45 54 54 4F 48 42 45 57 45 53 53 4C 45 41 *UVETTOHBEWESSLEA*
0010 4E 52 54 41 53 4E 45 4E 46 4E 49 45 49 4F 4C 46 *NRTASNENFNIEIOLF*
0020 4D 46 52 41 54 48 49 . . . . . . . . . *MFRATHI*
number of bytes is 39
THIS IS AFTER THEIR SECRET EDIT: ADD "SEX" AT THE FRONT AND MOVE THE FIRST LETTER UP BY 3.
THAT MEANS THE U GOES TO X. IF IT WAS AN A IT WOULD GO TO D, A Z TO C, AND SO ON.
0000 53 45 58 58 56 45 54 54 4F 48 42 45 57 45 53 53 *SEXXVETTOHBEWESS*
0010 4C 45 41 4E 52 54 41 53 4E 45 4E 46 4E 49 45 49 *LEANRTASNENFNIEI*
0020 4F 4C 46 4D 46 52 41 54 48 49 0D 0A . . . . *OLFMFRATHI..*
number of bytes is 44
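The secret edit in this example, and the reverse edit the receiver applies, can be sketched as follows. The function names are hypothetical, and the wraparound from Z back to A is taken from the "Z TO C" remark in the caption. The sketch works on the letters only and ignores the trailing CR LF seen in the dump.

```python
PREFIX = "SEX"  # marker word used in the example
SHIFT = 3       # how far the first letter is moved

def shift_letter(ch: str, k: int) -> str:
    """Move an uppercase letter k places along A-Z, wrapping around."""
    return chr((ord(ch) - ord("A") + k) % 26 + ord("A"))

def add_secret(letters: str) -> str:
    """Sender side: raise the first letter by SHIFT, then prepend PREFIX."""
    return PREFIX + shift_letter(letters[0], SHIFT) + letters[1:]

def drop_secret(letters: str) -> str:
    """Receiver side: strip PREFIX, then lower the first letter by SHIFT."""
    if not letters.startswith(PREFIX):
        raise ValueError("expected the message to start with the prefix")
    body = letters[len(PREFIX):]
    return shift_letter(body[0], -SHIFT) + body[1:]

# Letters copied row by row from the hex dumps before and after the edit.
before = "UVETTOHBEWESSLEA" + "NRTASNENFNIEIOLF" + "MFRATHI"
after = "SEXXVETTOHBEWESS" + "LEANRTASNENFNIEI" + "OLFMFRATHI"
edited = add_secret(before)
restored = drop_secret(edited)
```

Here `edited` equals the 42 letters shown in the dump above, and `restored` equals the original 39 letters.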
AFTER ARIA.EXE
0000 15 9D C0 7A D3 A3 3E 8F 41 2E FB DE 06 B4 26 A7 *...z..>.A.....&.*
0010 57 8D 62 44 B1 70 A0 . . . . . . . . . *W.bD.p.*
number of bytes is 23
**** FOLLOWING IS WHAT IS SENT *****
AFTER UNARI2A.EXE. SEND THIS IN AN EMAIL. THE REST OF THE MESSAGE COULD BE LOWER CASE
0000 41 54 48 45 47 20 53 4E 43 52 45 56 45 53 20 4E *ATHEG SNCREVES N*
0010 20 54 48 41 47 4C 49 4E 47 20 49 4E 44 20 52 4C * THAGLING IND RL*
0020 59 53 48 45 52 41 53 20 4D 4D 4F 57 20 48 45 20 *YSHERAS MMOW HE *
0030 53 20 53 20 49 41 . . . . . . . . . . *S S IA*
number of bytes is 54
THIS IS AFTER ARI2A.EXE. YOU RUN THIS WHEN YOU GET THE MESSAGE
0000 15 9D C0 7A D3 A3 3E 8F 41 2E FB DE 06 B4 26 A7 *...z..>.A.....&.*
0010 57 8D 62 44 B1 70 A0 . . . . . . . . . *W.bD.p.*
number of bytes is 23
AFTER UNARIA.EXE
0000 53 45 58 58 56 45 54 54 4F 48 42 45 57 45 53 53 *SEXXVETTOHBEWESS*
0010 4C 45 41 4E 52 54 41 53 4E 45 4E 46 4E 49 45 49 *LEANRTASNENFNIEI*
0020 4F 4C 46 4D 46 52 41 54 48 49 . . . . . . *OLFMFRATHI*
number of bytes is 42
AFTER YOU DO THE SECRET EDITING "DROP SEX AND LOWER FIRST LETTER BY 3"
0000 55 56 45 54 54 4F 48 42 45 57 45 53 53 4C 45 41 *UVETTOHBEWESSLEA*
0010 4E 52 54 41 53 4E 45 4E 46 4E 49 45 49 4F 4C 46 *NRTASNENFNIEIOLF*
0020 4D 46 52 41 54 48 49 0D 0A . . . . . . . *MFRATHI..*
number of bytes is 41
AFTER ARIA.EXE
0000 73 46 A2 F7 86 95 1C 35 34 DC D9 8E C2 2D 87 78 *sF.....54....-.x*
0010 86 FC CB CC . . . . . . . . . . . . *....*
number of bytes is 20
AFTER UNARI2A.EXE **NOTE NO LEADING OR TRAILING SPACES**
**ALSO ONLY SINGLE SPACES BETWEEN WORDS**
0000 49 20 57 49 4C 4C 20 53 45 45 20 59 4F 55 20 41 *I WILL SEE YOU A*
0010 54 20 4E 4F 4F 4E 20 44 4F 20 4E 4F 54 20 54 45 *T NOON DO NOT TE*
0020 4C 4C 20 59 4F 55 52 20 44 41 44 . . . . . *LL YOUR DAD*
number of bytes is 43
IF YOU WANT REAL ENCRYPTION YOU CAN CHANGE THE ARRAYS IN THE SOURCE CODE
AND THE POSITIONS OF THE SLOTS WHERE THE LETTERS ARE. ALSO, YOU CAN USE
DIFFERENT STATS FOR THE FREQUENCIES.
good luck
David A. Scott