per Kelvin. However, although it would be wrong to say that Shannon entropy is “the
same thing” as “entropy”, it would be equally wrong to say they are unrelated: the two
equations only differ by a constant (which defines the scale of measurement), and one
can begin to reconcile the two if one relates the probabilities of the microstates of the
system under consideration with the probabilities of the symbols generated by that
system. Indeed, Jaynes argued in depth that the information theoretic view of entropy
was a generalisation of thermodynamic entropy [3][4]. We implicitly advocate the same
position in the context of medical diagnosis.
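Written side by side (taking the Gibbs form of thermodynamic entropy as the point of comparison, an assumption on our part rather than a restatement of the chapter's own equation), the relationship is:

H(M) = -\sum_{k=1}^{K} p(m_k) \log_2 p(m_k)

S = -k_B \sum_{i} p_i \ln p_i = (k_B \ln 2) \left( -\sum_{i} p_i \log_2 p_i \right)

so the two expressions differ only by the constant factor k_B \ln 2 (in joules per kelvin), which is what fixes the scale of measurement.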
Going back to our document example: if we take a new document, pick a character at random and that character turns out to be a “z”, a character with one of the lowest probabilities of occurrence in a typical English document, then it provides us with more information about the document (relative to a “normal” document) than if we had received an “e”.
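As a minimal numeric sketch of this contrast (the single-character probabilities below are rough assumed values for typical English text, not figures quoted in this chapter):

import math

# Rough single-character probabilities for typical English text
# (illustrative assumptions, not values taken from the chapter).
p = {"e": 0.12, "z": 0.0007}

# Surprisal (self-information) of observing a single character: -log2 p
for ch, prob in p.items():
    print(f"'{ch}': {-math.log2(prob):.1f} bits")

Observing the “e” yields about 3 bits, observing the “z” about 10 bits: the rarer the character, the larger the surprisal -log2 p, which is the sense in which the “z” tells us more.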
Two general properties are also worth noting. Firstly, if only one outcome in an
ensemble M has a non-zero probability of occurring (in which case, its probability must
be 1), then:
Property 1: H(M) = 0
(By convention, if p(mk) = 0, then 0 × log2(0) ≡ 0.)
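To spell the step out: if a single outcome mj has p(mj) = 1 and every other outcome has probability zero, then Equation 2 collapses to

H(M) = -1 \cdot \log_2 1 - \sum_{k \neq j} 0 \cdot \log_2 0 = 0 - 0 = 0

using the convention just stated.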
At the other end of the scale, H(M) is maximised if all of the outcomes are equally likely. An expression for this value is quite easy to derive. Let our ensemble Me have K possible outcomes. Then we must have, for all k, p(mk) = 1/K. Substituting this into Equation 2, we get:
H(M_e) = -\sum_{k=1}^{K} \frac{1}{K} \log_2 \frac{1}{K} = \frac{1}{K} \log_2(K) \sum_{k=1}^{K} 1 = \log_2(K)
(Noting that log2(1/K) = -log2(K), and that log2(K) is a constant and so can be factored outside the summation.) So,
Property 2: H(Me) = log2(K) if all K outcomes are equally likely
In the case of our English document example, if the characters were uniformly distributed, then we would have H(Muniform) = log2(27) ≈ 4.75 bits. This is slightly higher than the figure for our representative English-language document (4.1 bits).
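Both properties, together with the 27-character comparison, can be checked directly. The following is a minimal sketch in Python (the function entropy and the example probability lists are ours, not the chapter's):

import math

def entropy(probs):
    # Shannon entropy in bits (Equation 2), with the 0 * log2(0) = 0 convention.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Property 1: a single certain outcome gives zero entropy.
print(entropy([1.0, 0.0, 0.0]))              # 0.0

# Property 2: K equally likely outcomes give log2(K).
K = 27                                       # the 27 characters of the document example
print(entropy([1.0 / K] * K), math.log2(K))  # both approximately 4.75 bits

The skewed character distribution of real English text is what pulls its entropy down from this maximum of about 4.75 bits to the 4.1 bits quoted above.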
Returning to the application of this to medical diagnosis, we can interpret these two
situations as follows:
• H = 0 if only one message/positive test result is possible. That is, a specific diagnosis has been confirmed.
• H is at its maximum when all messages are equally likely. That is, we are in a state of complete ignorance about the patient’s internal state.
From this we can see that the challenge of diagnosis is to reduce the entropy to as
close to zero as possible, and to select tests so that the result of each test (what we are
calling “messages” here) maximises the reduction of entropy.
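As a sketch of what selecting the test that maximises the expected reduction in entropy can look like computationally (the diagnoses, prior probabilities and test likelihoods below are invented purely for illustration):

import math

def entropy(probs):
    # Shannon entropy in bits, with the 0 * log2(0) = 0 convention.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical prior over three candidate diagnoses.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}

# Hypothetical test: probability of a positive result under each diagnosis.
p_pos_given = {"A": 0.9, "B": 0.2, "C": 0.1}

# Probability of each test outcome (each "message").
p_pos = sum(prior[d] * p_pos_given[d] for d in prior)
p_neg = 1.0 - p_pos

# Posterior over diagnoses for each outcome (Bayes' rule).
post_pos = {d: prior[d] * p_pos_given[d] / p_pos for d in prior}
post_neg = {d: prior[d] * (1.0 - p_pos_given[d]) / p_neg for d in prior}

# Expected entropy after the test, weighted by how likely each outcome is.
h_before = entropy(prior.values())
h_after = p_pos * entropy(post_pos.values()) + p_neg * entropy(post_neg.values())
print(f"expected entropy reduction: {h_before - h_after:.3f} bits")

Repeating this calculation for each candidate test and choosing the one with the largest expected reduction (the mutual information between the test result and the patient’s state) is the test-selection strategy described above.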
Two points should be emphasised here before we move on:
1. We are equating the probability of occurrence of messages with the probability
of microstates of the patient under examination, to justify the usage of the term
“entropy”;