The Nature of Information V

George Hrabovsky

MAST

Introduction

In this article I will write some Mathematica code for determining the entropy of a string.

Characterizing Probabilities

Mathematica 8 allows you to deal with probabilities in a symbolic way.

We begin by determining a probability. Suppose we choose a probability distribution, such as the Gaussian (normal) distribution. What is the probability of finding a value x such that 1 ≤ x ≤ 4?

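The original input and output are not reproduced here. A minimal sketch of the computation, assuming it uses the Probability function introduced in Mathematica 8, is:

(* symbolic probability that a normally distributed value x lies in 1 <= x <= 4;
   the result is returned in terms of Erfc *)
Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[μ, σ]]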

Here μ is the mean of the distribution and σ is the standard deviation. Erfc is the complementary error function, which for a complex number z is

Erfc(z) = 1 - Erf(z),

where

Erf(z) = (2/√π) ∫_0^z e^(-t²) dt.

We can plot how this probability varies over possible values of μ and σ.

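The original plot is not shown. One way to visualize the dependence of this probability on μ and σ (the plot ranges here are assumptions) is:

(* surface of the probability as a function of the mean μ and standard deviation σ *)
prob = Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[μ, σ]];
Plot3D[prob, {μ, 0, 3}, {σ, 0.1, 2}, AxesLabel -> {"μ", "σ", "P"}]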

From this we can choose μ = 0.5 and σ = 0.5.

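The numerical evaluation is not shown in this version; a sketch with those values substituted is below. The quoted value follows from the normal CDF, not from the original output.

(* probability of 1 <= x <= 4 for μ = 0.5, σ = 0.5 *)
Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[0.5, 0.5]]
(* ≈ 0.158655 *)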

The Entropy and Strings

From this probability we can compute the entropy.

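The original expression is not shown. Assuming the entropy of an outcome with probability p is taken to be -log2(p) (an assumption about the formula used), a sketch is:

(* entropy, in bits, of the event computed above *)
p = Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[0.5, 0.5]];
-Log[2, p]
(* ≈ 2.66 bits under this assumption *)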

This can be extended to strings. Say we have four symbols, each drawn from the same distribution.

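The original definitions are not shown. One hypothetical way to assign a probability to each of four symbols drawn from the same normal distribution is to split the real line into four intervals; the interval boundaries below are illustrative assumptions, not the author's values.

(* hypothetical probabilities for four symbols, one per interval *)
dist = NormalDistribution[0.5, 0.5];
p1 = Probability[x <= 0, x \[Distributed] dist];
p2 = Probability[0 < x <= 0.5, x \[Distributed] dist];
p3 = Probability[0.5 < x <= 1, x \[Distributed] dist];
p4 = Probability[x > 1, x \[Distributed] dist];
{p1, p2, p3, p4}
(* ≈ {0.159, 0.341, 0.341, 0.159}; the four probabilities sum to 1 *)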

The probability of the string is then the product of the individual symbol probabilities.

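Assuming the symbols are independent and that the particular string is, say, symbol 2, symbol 3, symbol 3, symbol 2 (both assumptions), a sketch is:

(* probability of the hypothetical string s2 s3 s3 s2 *)
pString = p2 p3 p3 p2
(* ≈ 0.0136 with the probabilities assumed above *)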

The entropy of the string follows from this probability.

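Again taking the entropy to be -log2 of the probability (the same assumption as before):

(* entropy of the string, in bits *)
entropyString = -Log[2, pString]
(* ≈ 6.2 bits for the hypothetical string above *)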


The actual number of strings of this length is:

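A sketch of the count, assuming an alphabet of four symbols and strings of length four (consistent with the 256 quoted below):

(* total number of length-4 strings over a 4-symbol alphabet *)
4^4
(* 256 *)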

This tells us that there are 256 possible strings, while the effective number of strings is

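Assuming the effective number of strings is 2 raised to the string entropy, which is the usual typical-set count (an assumption about the formula used here):

(* effective number of strings *)
2^entropyString
(* ≈ 74 for the hypothetical values above, well below 256 *)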

which is much lower than the actual number of strings. This reduction in the number of strings we need to consider reduces the possible error.
