Page - 27 - in Applied Interdisciplinary Theory in Health Informatics - Knowledge Base for Practitioners
triple $(m, A_M, P_M)$ where $m$ is a “random variable” that can take on one of a number of
possible values from an alphabet (a set of legal characters) $A_M = \{m_1, m_2, \ldots, m_K\}$ with
respective probabilities $P_M = \{p_1, p_2, \ldots, p_K\}$. That is to say, the probability that $m = m_k$
for some $1 \le k \le K$ is $p_k$. We also require that $p_k \ge 0$ for all $k$, and $\sum_{k=1}^{K} p_k = 1$.
A measure H(M) on the ensemble M can then be defined which is the average
Shannon information content of an outcome:
Eq 2. $H(M) \equiv -\sum_{k=1}^{K} p_k \log_2 p_k$
Strictly, this is simply providing us with the expected value of the information
content in a message m that has been received from the ensemble M. However, the form
of Equation 2 is identical (apart from a constant) to the definition of entropy in the
statistical mechanics model of thermodynamics:
$S = -k_B \sum_i p_i \ln(p_i)$
Here $p_i$ represents the probability of a certain microstate of the thermodynamic
system under consideration, and the sum is over all possible microstates. The natural
logarithm is used in thermodynamics, but the different base of the logarithm,
together with Boltzmann’s constant $k_B$, simply provides a scaling between $S$
and $H$.
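The scaling can be made explicit. The following is a short derivation, assuming the same probabilities appear in both expressions (the microstates playing the role of the outcomes of the ensemble) and using the identity $\ln p = \ln 2 \cdot \log_2 p$:

```latex
\begin{align*}
S &= -k_B \sum_{k} p_k \ln p_k \\
  &= -k_B \ln 2 \sum_{k} p_k \log_2 p_k \\
  &= (k_B \ln 2)\, H(M)
\end{align*}
```

Under that assumption, one bit of Shannon entropy corresponds to $k_B \ln 2$ in thermodynamic units.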
By analogy with the form of this version of Boltzmann’s equation, and the fact that
the ensemble M in some sense represents the possible states of the system (a person in
our case) under observation, H(M) is referred to as the (Shannon) entropy of that
ensemble. As with the Shannon information content, it has the unit of bits (when
using logarithms to base 2).
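Equation 2 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function name `shannon_entropy`, the tolerance `tol`, and the two coin ensembles are hypothetical choices made for the example. The checks mirror the two requirements placed on the probabilities of an ensemble (non-negative, summing to 1).

```python
import math

def shannon_entropy(probabilities, tol=1e-9):
    """Return H(M) = -sum_k p_k * log2(p_k), in bits."""
    if any(p < 0 for p in probabilities):
        raise ValueError("every p_k must be non-negative")
    if abs(sum(probabilities) - 1.0) > tol:
        raise ValueError("the p_k must sum to 1")
    # Outcomes with p_k = 0 are skipped, using the convention 0 * log2(0) = 0.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical ensembles, not taken from the text.
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

h_fair = shannon_entropy(fair_coin)
h_biased = shannon_entropy(biased_coin)

print(f"fair coin:   {h_fair:.3f} bits")
print(f"biased coin: {h_biased:.3f} bits")
```

The fair coin gives the larger entropy: the more evenly the probability is spread over the outcomes, the more information an outcome carries on average.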
Let us look at a couple of general-purpose examples to gain a little more intuition
about how Equation 2 might be used before moving back to a diagnostic setting.
Consider an ensemble M in which an outcome is simply a character drawn at random
from an English document. That is, the random variable m will be instantiated by
selecting at random a character from an English document where $A_M = \{a, b, c, d, e, \ldots, x, y, z, \_\}$.
We will not distinguish upper- and lower-case letters, but we do include the
use of a space character, “_”. $P_M = \{.0575, .0128, .0263, .0285, .0913, \ldots, .0007, .1928\}$ (see footnote 2)
are the respective $p_k$ for $1 \le k \le 27$.
Using the figures provided, it can be calculated that the outcome m = “z” has a
Shannon information content of 10.4 bits, while the outcome m = “e” has an information
content of 3.5 bits. Overall, our English-language document has an entropy of 4.1 bits per character.
The full table of probabilities and corresponding measures of information content can be
found in [7].
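The per-character figures can be checked with a short sketch. It assumes the standard definition of Shannon information content, $h = \log_2(1/p)$, and uses only the rounded probabilities quoted above; the names `quoted_probabilities` and `information_content` are hypothetical. The characters elided from the list are deliberately left out, so the overall entropy of 4.1 bits cannot be reproduced from this excerpt alone.

```python
import math

# Probabilities quoted in the text (rounded to four decimal places).
# The characters elided by "…" in the text are deliberately left out.
quoted_probabilities = {
    "a": 0.0575, "b": 0.0128, "c": 0.0263, "d": 0.0285, "e": 0.0913,
    "z": 0.0007, "_": 0.1928,
}

def information_content(p):
    """Shannon information content, in bits, of an outcome with probability p."""
    return math.log2(1.0 / p)

info_bits = {ch: information_content(p) for ch, p in quoted_probabilities.items()}

for ch, bits in info_bits.items():
    print(f"{ch!r}: p = {quoted_probabilities[ch]:.4f}, h = {bits:.2f} bits")
```

With these rounded inputs, “e” comes out at about 3.5 bits and “z” at about 10.5 bits. The latter is close to, but not exactly, the quoted 10.4 bits; the small gap is presumably due to rounding of the probability for “z”.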
Let us examine this a little more. Providing a clear semantics for Shannon entropy is
still a matter of debate (see, for example, p. 65 of [8]). Although it has the same form as
thermodynamic entropy, it does not, for example, have the same units, as we have
discussed; Equation 2 has units of bits, whilst Boltzmann’s entropy has units of Joules
2 These values were estimated by the late David MacKay for use in his textbook Information Theory, Inference and
Learning Algorithms (Cambridge University Press, 2003). His choice of text from which to estimate the probabilities,
The Frequently Asked Questions Manual for Linux, of course means that these probabilities are conditional on
the assumption that this text is representative of the distribution of letters in an English-language document.
P. Krause / Information Theory and Medical Decision Making 27
- Title: Applied Interdisciplinary Theory in Health Informatics
- Subtitle: Knowledge Base for Practitioners
- Authors: Philip Scott, Nicolette de Keizer, Andrew Georgiou
- Publisher: IOS Press BV
- Location: Amsterdam
- Date: 2019
- Language: English
- License: CC BY-NC 4.0
- ISBN: 978-1-61499-991-1
- Size: 16.0 x 24.0 cm
- Pages: 242
- Category: Computer Science