SCOTT's "DSC" data sensitive combiner

Last updated 16 September 2000

The files for this section are here: bwtdsc.zip

****USE AT YOUR OWN RISK IN A DIRECTORY BY ITSELF****

 This set of files is so you can use or test DSC right
away. The programs are:

ADDLONG takes a file and creates a new one with an unsigned
   long N attached.
   To run it type "ADDLONG"

SUBLONG takes a file that has a valid unsigned long attached, in
   the correct range. It displays the value and creates a file
   with the long stripped off.
   To run it type "SUBLONG"

ROTATEN takes a file, rotates it by N bytes, and creates a new
   file with N attached to the end as an unsigned long.
    

UNROTAT does the reverse of ROTATEN and shows the value
   of N used for the rotation. It writes the shortened file out.

BWTQ is a full BWT program with only one unsigned long field
   added for the location of "last"
   to run type "BWTQ filein fileout"

UNBWTQ is the reverse of BWTQ
   to run type "UNBWTQ filein fileout"
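
A rough Python sketch of what ADDLONG and SUBLONG do to a file may help
when testing. It is not the code from bwtdsc.zip. It assumes the long is
4 bytes, stored low byte first at the end of the file, which is what the
dumps further down this page show, and it assumes the valid range is
0 to N-1 where N is the number of data bytes. The names add_long and
sub_long are made up for this sketch.

```python
import struct

LONG_SIZE = 4  # assumption: 4-byte unsigned long, as seen in the dumps


def add_long(data: bytes, n: int) -> bytes:
    """Append n to data as a little-endian unsigned long (ADDLONG-like)."""
    if not 0 <= n < len(data):
        raise ValueError("n must be in the range 0 .. len(data)-1")
    return data + struct.pack("<I", n)


def sub_long(blob: bytes):
    """Strip the trailing long; return (data, n) (SUBLONG-like)."""
    if len(blob) <= LONG_SIZE:
        raise ValueError("file too short to hold data plus a long")
    data, tail = blob[:-LONG_SIZE], blob[-LONG_SIZE:]
    (n,) = struct.unpack("<I", tail)
    if n >= len(data):
        raise ValueError("attached long is out of range")
    return data, n
```

Note that the range check in sub_long is what makes most files invalid
as input to SUBLONG, which is the gap DSC and UNDSC are meant to close.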

****DSC***THE DATA SENSITIVE COMBINER******
This program takes any file of the form produced by BWTQ, ROTATEN
   or ADDLONG and optimally combines the value with the file
   so that it adds as little overhead as possible. The binary
   file DSC produces can then be processed by either encryption or
   compression, as it adds minimal overhead.
   UNDSC takes "any file", including this text file, and produces
   a file that has an unsigned long attached in the correct range.

DSC and UNDSC are such that every possible binary file maps
  to a form with a long attached when running UNDSC, and DSC
  maps the files of the form with a long attached onto every possible
  binary file. Note that the long ranges in value from 0 to N-1, where
  N is the number of bytes.

  For example, take a 256 byte file. Since there are 256 possible
  values for the pointer, one might think that you always need 257
  bytes for the output. This is wrong: the most you need is 257 bytes,
  but many files map to something smaller.
  To run type "DSC filein fileout"
  To run the reverse type "UNDSC filein fileout"


Where this would be of most use is in compression, where one must
store the N that is added to the file when a standard
BWT is done. I use it to point to "last", but one could modify it to
point to the middle or whatever floats your boat. Now suppose all the
stages from this point on are bijective. They are not in BWTCODE.ZIP,
since the RLE and ARITHMETIC routines used there are not. But if they
are bijective then you can get better compression, and if later
an enemy tests a key with a quantum computer or whatever, it's nice
to have more than one set of files that could have been encrypted
to the encrypted file the attacker is looking at. If you're in the UK
and you have to supply a key, you need test only a small number of
keys to get an inverse: the number of keys that need to be tested
is roughly 1 to N, where N is the number of bytes in the message.
Whereas if you used BWTCODE.ZIP there is likely only one key that
can decrypt the file. That is the one that you may go to jail over if
you can't find it.


EXAMPLE:
Here is file1.dat from your package. Note that it is
137 characters long.

0000  49 20 61 6D 20 74 72 79 69 6E 67 20 74 6F 20 6C  *I am trying to l*
0010  65 61 72 6E 20 73 70 61 6E 69 73 68 20 61 6E 64  *earn spanish and*
0020  20 69 74 20 69 73 20 76 65 72 79 20 64 69 66 66  * it is very diff*
0030  69 63 75 6C 74 2E 20 54 68 65 20 6F 6E 6C 79 0D  *icult. The only.*
0040  0A 70 68 72 61 73 65 20 49 20 68 61 76 65 20 6C  *.phrase I have l*
0050  65 61 72 6D 65 64 20 69 73 3A 20 56 69 76 61 20  *earmed is: Viva *
0060  4D 65 78 69 63 6F 20 43 61 62 72 6F 6E 65 73 21  *Mexico Cabrones!*
0070  20 42 75 74 20 77 68 61 74 20 64 6F 65 73 20 69  * But what does i*
0080  74 20 6D 65 61 6E 3F 0D 0A  .  .  .  .  .  .  .  *t mean?..*
 number of bytes is 137 

Let's run ROTATEN with an offset of 48. Here is the
dump of the result. The last four bytes, 30 00 00 00, are N = 48.


0000  69 63 75 6C 74 2E 20 54 68 65 20 6F 6E 6C 79 0D  *icult. The only.*
0010  0A 70 68 72 61 73 65 20 49 20 68 61 76 65 20 6C  *.phrase I have l*
0020  65 61 72 6D 65 64 20 69 73 3A 20 56 69 76 61 20  *earmed is: Viva *
0030  4D 65 78 69 63 6F 20 43 61 62 72 6F 6E 65 73 21  *Mexico Cabrones!*
0040  20 42 75 74 20 77 68 61 74 20 64 6F 65 73 20 69  * But what does i*
0050  74 20 6D 65 61 6E 3F 0D 0A 49 20 61 6D 20 74 72  *t mean?..I am tr*
0060  79 69 6E 67 20 74 6F 20 6C 65 61 72 6E 20 73 70  *ying to learn sp*
0070  61 6E 69 73 68 20 61 6E 64 20 69 74 20 69 73 20  *anish and it is *
0080  76 65 72 79 20 64 69 66 66 30 00 00 00  .  .  .  *very diff0...*
 number of bytes is 141 
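
The rotation in the dump above can be reproduced with a few lines of
Python. This is only a sketch of what the dump shows, not the ROTATEN
source: it assumes a left rotation by N bytes and a 4-byte long stored
low byte first. The names rotate_n and unrotate are made up here.

```python
import struct


def rotate_n(data: bytes, n: int) -> bytes:
    """Rotate data left by n bytes and append n as a 4-byte long."""
    if not 0 <= n < len(data):
        raise ValueError("n must be in the range 0 .. len(data)-1")
    return data[n:] + data[:n] + struct.pack("<I", n)


def unrotate(blob: bytes):
    """Undo rotate_n; return (original data, n)."""
    if len(blob) <= 4:
        raise ValueError("file too short to hold data plus a long")
    data = blob[:-4]
    (n,) = struct.unpack("<I", blob[-4:])
    if n >= len(data):
        raise ValueError("attached long is out of range")
    k = len(data) - n  # the original start now sits k bytes in
    return data[k:] + data[:k], n
```

With the 137 bytes of file1.dat and n = 48, rotate_n gives 141 bytes
starting with "icult." and ending in 30 00 00 00, as in the dump.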

Now run DSC and look at the dump of the output.
 The file happens to be back at 137 bytes, so in this
 example nothing was added to the length of the file.
 This is not always the case, but it will never increase
 by more bytes than would be necessary if one
 added the length of the preceding file in bytes. That
 is, files of 2**8 characters or less increase by at most
 one byte, while files up to 2**16 increase by at most 2 bytes,
 and so on. But the whole binary file space is used.

 

0000  69 63 75 6C 74 2E 20 54 68 65 20 6F 6E 6C 79 0D  *icult. The only.*
0010  0A 70 68 72 61 73 65 20 49 20 68 61 76 65 20 6C  *.phrase I have l*
0020  65 61 72 6D 65 64 20 69 73 3A 20 56 69 76 61 20  *earmed is: Viva *
0030  4D 65 78 69 63 6F 20 43 61 62 72 6F 6E 65 73 21  *Mexico Cabrones!*
0040  20 42 75 74 20 77 68 61 74 20 64 6F 65 73 20 69  * But what does i*
0050  74 20 6D 65 61 6E 3F 0D 0A 49 20 61 6D 20 74 72  *t mean?..I am tr*
0060  79 69 6E 67 20 74 6F 20 6C 65 61 72 6E 20 73 70  *ying to learn sp*
0070  61 6E 69 73 68 20 61 6E 64 20 69 74 20 69 73 20  *anish and it is *
0080  76 65 72 79 20 64 69 66 61  .  .  .  .  .  .  .  *very difa*
 number of bytes is 137 
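
The worst-case growth stated before this dump is easy to tabulate. The
sketch below only computes that bound (the smallest number of bytes that
can number N different pointer values); it says nothing about how DSC
actually combines the pointer, and the function name is made up for
illustration.

```python
def max_added_bytes(n: int) -> int:
    """Smallest k with 256**k >= n, for a data file of n bytes."""
    if n < 1:
        raise ValueError("file must have at least one byte")
    k = 0
    while 256 ** k < n:
        k += 1
    return k


# Compare with the 4 bytes that a raw unsigned long always costs.
for size in (137, 256, 257, 65536):
    print(size, "bytes: at most", max_added_bytes(size), "added, versus 4")
```

For the 137 byte example the bound is one byte, and in this particular
case DSC did better than the bound and added none.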

Now let's take the same starting file and run BWTQ.
In this case the file looks a lot different, but now
we have the nasty pointer to combine.

0000  0D 0D 3F 79 21 6F 65 61 2E 3A 49 68 79 74 49 74  *..?y!oea.:IhytIt*
0010  64 64 73 65 6F 74 65 6E 67 6D 73 74 73 74 73 6E  *ddseotengmststsn*
0020  20 20 0A 20 20 20 20 76 43 20 65 20 70 65 65 72  *  .    vC e peer*
0030  68 68 61 69 69 65 6E 20 20 73 76 68 6D 6C 6C 6D  *hhaiien  svhmllm*
0040  76 6F 6E 4D 69 66 6E 73 77 20 54 70 78 66 64 79  *vonMifnsw Tpxfdy*
0050  20 20 6E 20 20 56 20 20 75 6E 61 20 72 72 61 61  *  n  V  una rraa*
0060  6F 69 61 6F 63 74 64 72 20 73 0A 68 61 61 62 65  *oiaoctdr s.haabe*
0070  74 65 69 65 69 61 69 20 61 69 69 75 6C 20 20 63  *teieiai aiiul  c*
0080  42 69 61 20 20 65 6C 72 72 22 00 00 00  .  .  .  *Bia  elrr"...*
 number of bytes is 141
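
For readers who want to check the dump above by hand, here is a small
textbook BWT in Python. It is a slow sketch and not the BWTQ source. It
sorts all rotations of the file, outputs the last column, and returns
the row where the unrotated file lands, which is also where the last
byte of the original shows up in the output. That reading of "last" is
an assumption, but it matches the pointer 22 00 00 00 (34) in the dump.
The names bwt and inverse_bwt are made up here.

```python
def bwt(data: bytes):
    """Return (last column, row index of the unrotated input)."""
    n = len(data)
    if n == 0:
        raise ValueError("empty input")
    doubled = data + data
    # Sort rotation start positions by the rotation they begin.
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last_column = bytes(doubled[i + n - 1] for i in order)
    return last_column, order.index(0)


def inverse_bwt(last_column: bytes, index: int) -> bytes:
    """Rebuild the input from the last column and the row index."""
    n = len(last_column)
    # A stable sort of positions by byte value pairs first and last column.
    order = sorted(range(n), key=lambda i: last_column[i])
    lf = [0] * n
    for first_row, last_row in enumerate(order):
        lf[last_row] = first_row
    out = bytearray(n)
    row = index
    for k in range(n - 1, -1, -1):  # fill the output from the end backwards
        out[k] = last_column[row]
        row = lf[row]
    return bytes(out)
```

Sorting whole rotations like this is fine for a 137 byte test file but
far too slow for large inputs, where a real BWT uses a suffix sort.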

Next run DSC, the "data sensitive combiner", to combine
the pointer to "last" in an optimal way. The result is
below.
 
0000  0D 0D 3F 79 21 6F 65 61 2E 3A 49 68 79 74 49 74  *..?y!oea.:IhytIt*
0010  64 64 73 65 6F 74 65 6E 67 6D 73 74 73 74 73 6E  *ddseotengmststsn*
0020  20 20 0A 20 20 20 20 76 43 20 65 20 70 65 65 72  *  .    vC e peer*
0030  68 68 61 69 69 65 6E 20 20 73 76 68 6D 6C 6C 6D  *hhaiien  svhmllm*
0040  76 6F 6E 4D 69 66 6E 73 77 20 54 70 78 66 64 79  *vonMifnsw Tpxfdy*
0050  20 20 6E 20 20 56 20 20 75 6E 61 20 72 72 61 61  *  n  V  una rraa*
0060  6F 69 61 6F 63 74 64 72 20 73 0A 68 61 61 62 65  *oiaoctdr s.haabe*
0070  74 65 69 65 69 61 69 20 61 69 69 75 6C 20 20 63  *teieiai aiiul  c*
0080  42 69 61 20 20 65 6C 72 45  .  .  .  .  .  .  .  *Bia  elrE*
 number of bytes is 137 

 In this example the BWT output also has the pointer combined in, and
again there is no increase over the original
file's data length. The above is just a random test case.
Take Care
David A. Scott

