Shannon, Claude Elwood (1916-2001).
Eric Weisstein's World of Scientific Biography
- American mathematician and father of information theory.
- From an early age, Shannon showed an affinity for both engineering and mathematics.
- While at MIT, Shannon studied with both Wiener and Bush.
- He combined mathematical theories with engineering principles to set the stage for the development of the digital computer and the modern digital communication revolution.
- Core concepts: relay switches, Boolean logic & information entropy (a toy switching sketch follows after this list).
- In 1941 he began a serious study of communication problems, partly motivated by the demands of the war effort.
This research resulted in the classic paper "A Mathematical Theory of Communication," published in 1948.
- This pioneering paper begins by observing that 'the fundamental problem of communication is that of reproducing at one point exactly or approximately a message selected at another point.'
- “Whatever came up, he engaged it with joy, and he attacked it with some surprising resource--which might be some new kind of technical concept or a hammer and saw with some scraps of wood.”
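
As a toy illustration of the relay-switch and Boolean-logic connection noted above (a hypothetical Python sketch, not Shannon's own notation): in his master's thesis Shannon observed that a series connection of switches conducts only if every switch is closed (AND), while a parallel connection conducts if any switch is closed (OR), so switching circuits can be analyzed with Boolean algebra.

```python
# Toy sketch (assumed example): series/parallel switch connections as Boolean AND/OR.
# Switch state: True = closed (conducts), False = open.

def series(*switches):
    """A series connection conducts only if every switch is closed (logical AND)."""
    return all(switches)

def parallel(*switches):
    """A parallel connection conducts if any switch is closed (logical OR)."""
    return any(switches)

def circuit(a, b, c):
    """Example circuit: (a AND b) in parallel with ((NOT a) AND c)."""
    return parallel(series(a, b), series(not a, c))

# Enumerate the truth table of the example circuit.
for a in (False, True):
    for b in (False, True):
        for c in (False, True):
            print(f"a={a!s:<5} b={b!s:<5} c={c!s:<5} -> conducts: {circuit(a, b, c)}")
```
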
Wyner, A. D. "The Significance of Shannon's Work."
- Shannon saw the communication process as essentially stochastic in nature.
The semantic meaning of information plays no role in the theory.
- The theory seeks to answer questions such as: how rapidly or reliably can the information from the source be transmitted over the channel, when one is allowed to optimize with respect to the encoder/decoder?
- His solution has two parts.
First he gives a fundamental limit which, for example, might say that for a given source and channel, it is impossible to achieve a fidelity or reliability level better than a certain value.
Second, he shows that for large encoder delays, it is possible to achieve performance that is essentially as good as the fundamental limit.
- One of Shannon's most brilliant insights was the separation of problems like these (where the encoder must take both the source and channel into account) into two coding problems.
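
In modern textbook notation (not taken from Wyner's essay), the two quantities behind these fundamental limits are the entropy of the source and the capacity of the channel:

```latex
\[
  H(X) = -\sum_{x} p(x)\,\log_2 p(x) \qquad \text{(source entropy, bits per symbol)}
\]
\[
  C = \max_{p(x)} I(X;Y) \qquad \text{(channel capacity, bits per channel use)}
\]
```

For well-behaved discrete sources and channels, the two-part answer sketched above then reads: transmission with arbitrarily small error probability is impossible when H(X) > C, and becomes possible, with long enough codes (large encoder delay), when H(X) < C.
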
Claude Shannon, Wikipedia
- He graduated in 1936 with two bachelor's degrees, one in electrical engineering and one in mathematics.
- He soon began his graduate studies in electrical engineering at the Massachusetts Institute of Technology (MIT), where he worked on Vannevar Bush's differential analyzer, an early analog computer.
- Shannon's work became the foundation of practical digital circuit design when it became widely known in the electrical engineering community during and after World War II.
- This research resulted in Shannon's doctor of philosophy (Ph.D.) thesis at MIT in 1940, called An Algebra for Theoretical Genetics.
- In 1940, Shannon became a National Research Fellow at the Institute for Advanced Study in Princeton, New Jersey.
Shannon worked freely across disciplines, and began to shape the ideas that would become Information Theory.
- Shannon then joined Bell Labs to work on fire-control systems and cryptography during World War II, under a contract with section D-2 (Control Systems section) of the National Defense Research Committee (NDRC).
- Inside the volume on fire control, a special essay titled Data Smoothing and Prediction in Fire-Control Systems, coauthored by Shannon, Ralph Beebe Blackman, and Hendrik Wade Bode, formally treated the problem of smoothing the data in fire-control by analogy with "the problem of separating a signal from interfering noise in communications systems."
- At the close of the war, he prepared a classified memorandum for Bell Telephone Labs entitled "A Mathematical Theory of Cryptography," dated September 1945.
Claude E. Shannon: A Retrospective on His Life, Work, and Impact.
Robert Gallager.
- His mind was always full of new ideas, so many of his results were never published.
- While he was doing his Ph.D. research, he was also becoming interested in the fundamental problems of communication, starting to nibble around the edges of what would later become his monumental “A Mathematical Theory of Communication.”
- During the war, Shannon also became interested in cryptography.
He realized that the fundamental issues in cryptography were closely related to the ideas he was developing about communication theory.
- Since he reported these ideas [entropy] first in his classified cryptography paper, some people supposed that he first developed them there.
In fact, he worked them out first in the communication context, but he was not yet ready to write up his mathematical theory of communication.
- He had been working on this project, on and off, for eight years.
- There [Bell Labs] was a remarkable group of brilliant people to interact with, and he tended to quickly absorb what they were working on and suggest totally new approaches.
- His style was not that of the expert who knows all the relevant literature in a field and suggests appropriate references.
Rather, he would strip away all the complexity from the problem and then suggest some extremely simple and fundamental new insight.
- He felt no obligation to work on topics of value to the Bell System, and the laboratory administration was happy for him to work on whatever he chose.
- In the years immediately after the publication of [A mathematical theory of communication], Claude had an amazingly diverse output of papers on switching, computing, artificial intelligence, and games.
- He did not teach regular courses, and did not really like to talk about the same subject again and again.
His mind was always focused on new topics he was trying to understand.
- Before 1948, there was only the fuzziest idea of what a message was.
There was some rudimentary understanding of how to transmit a waveform and process a received waveform, but there was essentially no understanding of how to turn a message into a transmitted waveform.
There was some rudimentary understanding of various modulation techniques, such as amplitude modulation, frequency modulation, and pulse code modulation (PCM), but little basis on which to compare them.
- The use of simple toy models to study real situations appears not to have been common in engineering and science before Shannon’s work.
Earlier authors in various sciences used simple examples to develop useful mathematical techniques, but then focused on an assumed “correct” model of reality.
In contrast, Shannon was careful to point out that even a Markov source with a very large state space would not necessarily be a faithful model of English text (or of any other data).
- The purpose of a model is to provide intuition and insight.
Analysis of the model gives precise answers about the behavior of the model, but can give only approximate answers about reality.
- In summary, data sources are modeled as discrete stochastic processes in [1], and primarily as finite-state ergodic Markov sources (the corresponding entropy formula is recalled at the end of this section).
- No doubt Shannon saw that it was necessary to exclude considerations of delay and complexity in order to achieve a simple and unified theory.
- “In our age, when human knowledge is becoming more and more specialized, Claude Shannon is an exceptional example of a scientist who combines deep abstract mathematical thought with a broad and at the same time very concrete understanding of vital problems of technology."
- In fact, Shannon’s discoveries were not bolts from the blue.
He worked on and off on his fundamental theory of communication from 1940 until 1948, and he returned in the 1950s and 1960s to make improvements on it.
- He was fascinated not by problems that required intricate tools for solution, but rather by simple new problems where the appropriate approach and formulation were initially unclear.
- Shannon’s research style combined the very best of engineering and mathematics.
- The combination of engineering and mathematics, the delight in elegant ideas, and the effort to unify ideas and tools are relatively common traits that are also highly admired by others.
- In graduate school, doctoral students write a detailed proposal saying what research they plan to do. ... It is a much less reasonable approach to Shannon-style research, since writing sensibly about uncharted problem areas is quite difficult until the area becomes somewhat organized, and at that time the hardest part of the research is finished.
- This [Shannon-style research] is the kind of research that turns an area from an art into a science.
- Shannon rarely wrote about his research goals.
In learning to do Shannon-style research, however, writing about goals in poorly understood areas is very healthy.
Such writing helps in sharing possible approaches to a new area with others.
It also helps in acquiring the good instincts needed to do Shannon style research.
- In the early years after 1948, many people, particularly those in the softer sciences, were entranced by the hope of using information theory to bring some mathematical structure into their own fields.
In many cases, these people did not realize the extent to which the definition of information was designed to help the communication engineer send messages rather than to help people understand the meaning of messages.
- Shannon-style research, namely, basic research that creates insight into how to view a complex, messy system problem, moves slowly.
It requires patience and reflection.
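
Tying back to the modeling note above (sources as finite-state ergodic Markov processes), in notation close to that of [1], with P_i the stationary probability of state i and p_i(j) the probability that state i produces symbol j, the entropy of such a source per symbol is

```latex
\[
  H = \sum_{i} P_i H_i = -\sum_{i} P_i \sum_{j} p_i(j)\,\log_2 p_i(j)
\]
```
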
Creative Thinking, Claude Elwood Shannon: Miscellaneous Writings.
LCCN: 94148004
- The first one that I might speak of is the idea of simplification.
Suppose that you are given a problem to solve, I don't care what kind of problem -- a machine to design, or a physical theory to develop, or a mathematical theorem to prove, or something of that kind -- probably a very powerful approach to this is to attempt to eliminate everything from the problem except the essentials; that is, cut it down to size.
Almost every problem that you come across is befuddled with all kinds of extraneous data of one sort or another; and if you can bring this problem down into the main issues, you can see more clearly what you're trying to do and perhaps find a solution.
Now in so doing you may have stripped away the problem you're after.
You may have simplified it to a point that it doesn't even resemble the problem that you started with; but very often if you can solve this simple problem, you can add refinements to the solution of this until you get back to the solution of the one you started with.
A Conversation with Claude Shannon: one man's approach to problem solving.
Robert Price, 1984, IEEE.
- When I went to take my National Research Fellowship under Weyl [1940], I told him that I wanted to work on information, the measurement of information, and how much is required.
I told him that I had already read Hartley’s paper [R. V. L. Hartley, “Transmission of information,” Bell Syst.
Tech. J., vol. 7, pp. 535ff, 1928.], and that it had been an important influence in my life. ... That paper struck me as important in this area.
- I would say that it was in 1940 that I first started modeling information as a stochastic process or a probabilistic process.
- What would be the simplest source you might have, the simplest thing you were trying to send? I would think of tossing a coin, heads or tails, and then to try to send that stream of data.
- My first getting at that was information theory, and I used cryptography as a way of legitimizing the work.
- My mind wanders around, and I conceive of different things day and night. ... and I’m not caring whether someone is working on it or not. [This is not going to work for me.]
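
A worked version of the coin-toss source mentioned above, in standard notation rather than anything from the interview: for a coin that comes up heads with probability p, the entropy per toss is

```latex
\[
  H(p) = -p\,\log_2 p - (1-p)\,\log_2 (1-p)
\]
```

A fair coin (p = 1/2) gives H = 1 bit per toss, the hardest stream to compress; a heavily biased coin (p = 0.9) gives H ≈ 0.47 bits per toss, so on average the stream can be encoded in fewer than one binary digit per symbol.
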
The making of information theory, Erico Marui Guizzo, 2003.
- In fact, the connection between the two fields is so direct that many believe that information theory originated in cryptography.
This is not true, as Shannon was thinking about the problems of communication well before coming to Bell Labs.
- In the 1920s, he [Nyquist] studied the transmission of signals in the telegraph system with a mathematical tool known as Fourier analysis, which decomposes a complicated signal into a sum of simpler components.
- "The crucial point is that a finite amount of information implies an essentially discrete message variable," wrote James Massey, a former professor of electrical engineering at ETH, Zurich's Swiss Federal Institute of Technology, in 1984.
[The process of searching for a solution is like cutting fat from a huge piece of meat: as you put constraints on the originally opaque problem, you realize that what has been discarded is insignificant to the result.
Thus, without compromising the quality of the solution, we've vastly reduced the "solution space", though that space is as opaque a concept as the formulation of the problem.
As we cut fat from the body, step by step, we gradually get a clear picture of the essence.]
- Shannon began his paper by noting that frequently the messages produced by an information source have meaning.
That is, they refer to things or concepts that "make sense" to people.
But from the engineering standpoint, Shannon observed, these "semantic aspects" were not important -- agreeing with Hartley's attempt to eliminate the "psychological factors" in communication.
For Shannon, too, any message selected should be considered "valid". What is meaningful to one person -- a certain kind of music, a text in a foreign language -- can be meaningless to another.
And a system should be able to transmit any message. [engineering standpoint; insignificant aspects; generality of system]
- Shannon then concluded that a stochastic process -- in particular, one special kind of stochastic process known as a Markov process -- could be a satisfactory model of English text.
In more general terms, he noted, a sufficiently complex stochastic process could represent satisfactorily any discrete source.
- In fact, the philosophical implications of the second law were vast; every physicist had his own interpretation.
- [The concept of information entropy was put forward independently within information theory; it is not a simple analogy to thermodynamic entropy.]
- In fact, the paper did have holes. ... But this very approach to the problem of communication was perhaps the main reason why Shannon's work was so successful.
Shannon was a mathematician and an electrical engineer. ... He had to avoid certain mathematical formalisms and move ahead in his theory.
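
A minimal sketch (an assumed example, not taken from Guizzo or from Shannon's paper) of the kind of Markov approximation of text discussed above: estimate character-to-character transition frequencies from a sample and generate new text by walking the chain.

```python
# Minimal sketch (assumed example): a first-order Markov model of text.
import random
from collections import defaultdict

def fit_transitions(text):
    """For each character, collect the characters observed to follow it."""
    transitions = defaultdict(list)
    for current, nxt in zip(text, text[1:]):
        transitions[current].append(nxt)
    return transitions

def generate(transitions, start, length=60):
    """Generate text by repeatedly sampling a successor of the current character."""
    out = [start]
    while len(out) < length:
        followers = transitions.get(out[-1])
        if not followers:  # no observed successor: stop early
            break
        out.append(random.choice(followers))
    return "".join(out)

sample = "the fundamental problem of communication is that of reproducing a message"
model = fit_transitions(sample)
print(generate(model, "t"))
```

Higher-order versions, conditioning on the previous two or three characters, produce output that looks progressively more like English, which is the sense in which a sufficiently complex stochastic process can represent a discrete source.
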
🏷 Category=Communication Theory