Monday, September 29, 2008

Mathematics = Language of Interdisciplinarity?

Michael Mitzenmacher had an interesting post last week on his blog, My Biased Coin. Mitzenmacher is a professor of computer science at Harvard University. He's written a few extremely interesting papers on power laws which we'll probably read later in the term.

Mitzenmacher was writing about attending the opening of Microsoft Research New England. The theme of the opening symposium was interdisciplinary research. Mitzenmacher writes that

a natural question was how such research should be encouraged. Nobody on the panel seemed to make what I thought was an obvious point: one way to encourage such research is to make sure people across the sciences are well trained in mathematics and (theoretical) computer science. Interdisciplinary research depends on finding a common language, which historically has been mathematics, but these days more and more involves algorithms, complexity, and programming.

Mitzenmacher then goes on to describe a subsequent talk by Erik Demaine. The abstract of the talk is:

Theoretical computer science, and the algorithmic way of thinking, transcends our traditional boundaries. I believe that algorithms are relevant to every discipline of study, and will give eclectic examples from the arts and sciences to business and society. The examples span the spectrum from serious topics like protein folding and decoding Inka khipu to fun topics like juggling and magic.

There's a link to a video of Demaine's talk here, although I can't get the video to work right now.

I find myself quite intrigued by this and I'm not quite sure what to think, although I think I'm basically in agreement. Having a solid base in math, statistics, and some computer programming/computer science strikes me as almost indispensable for work in most sciences and social sciences. Quite generally, I think that having a strong understanding of these areas greatly expands the sort of scientific problems one can tackle. And it certainly increases the range of other scientists one can talk to and the depths that those conversations can go.

So as far as the sciences go, I'm basically quite comfortable with Mitzenmacher's statement. I wonder, though, how his statement might have to be amended to apply to the humanities. Is there a "common language" that helps philosophers, anthropologists, historians, and literary theorists research together? Is this language broad enough to include political scientists, psychologists, or economists? My guess is that there is a semi-canonical body of thinkers or schools of thought that scholars in these areas would all be familiar with, and which could serve as a useful touchstone or frame of reference for interdisciplinary collaboration. But I'm not sure, as I'm not even close to an expert in these fields.

As for the centrality of computer science and mathematics for science, I worry sometimes that COA could be doing more to prepare students in these areas. We've graduated lots of students who are very well prepared in math and who have gone on to make good use of their math backgrounds in grad school. But I think we can do better. Part of the problem is that we could use a few more classes in math, statistics, and computer science. But I also think that there might be a subtle bias against mathematics, a perhaps unspoken idea that if you learn too much math you'll lose your sense of creativity and joy and acquire a simplistic and reductionist approach to everything. Needless to say, I disagree.

Anyway, partly inspired by Mitzenmacher, for the next two classes I want to attempt to present a sort of crash course or primer in interdisciplinary probability and stochastic processes. There are just some good, basic, widely applicable things about this area that I think (almost) every scientist and social scientist should know. We'll see how it goes. Fasten your seat belts...

Monday, September 22, 2008

Simple Networks for a Brighter Tomorrow

Lately I have been giving much thought to the psychological applications of graph theory to neural networks. If indeed a neural network is an accurate model of the human brain, what may we discover by analyzing it as a graph? Is the brain a small-world graph? Does it have significant clumping? If its degree distribution, as I hypothesize, is not Poisson, perhaps it isn't arbitrary after all, and what we thought was subjectivity is actually just the logic of pathways. But undoubtedly, since it is probably not an Erdős-Rényi graph, there's something there. The possibilities are limitless. And limiting: “although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool.” Jeepers. I suddenly feel very small.
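For anyone wondering what "Poisson" means here: in an Erdős-Rényi random graph every pair of nodes is linked independently with the same probability, and the resulting degree distribution is close to Poisson. Below is a minimal Python sketch of that baseline. The graph size, link probability, and function names are my own illustrative choices, not anything measured from a brain; a real network whose degree histogram looks very different from the right-hand column is probably not wired at random.

```python
import math
import random
from collections import Counter

def erdos_renyi_degrees(n, p, seed=0):
    """Degrees of a G(n, p) random graph: each pair is linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

n, p = 1000, 0.006          # illustrative sizes, nowhere near brain-scale
degrees = erdos_renyi_degrees(n, p)
mean_degree = sum(degrees) / n
counts = Counter(degrees)

# Compare the observed fraction of nodes with degree k to the Poisson prediction.
for k in range(13):
    observed = counts.get(k, 0) / n
    print(f"k={k:2d}  observed={observed:.3f}  poisson={poisson_pmf(k, mean_degree):.3f}")
```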

Sunday, September 21, 2008

Fearful Parents Increasing the Risk of Infectious Disease

A recent report by the American Medical Association warns that the number of parents opting not to vaccinate their children is leading to an increase in outbreaks of preventable infectious diseases. They cite recent measles outbreaks around the United States. Apparently most of the children infected had not been vaccinated against the disease.

Vaccinations work on two levels: they protect the individuals that receive the vaccines and they protect others in the community by decreasing the number of individuals in a community who are susceptible to a disease. In theory, if a large enough percentage of a group is vaccinated, there is a really low chance that the un-vaccinated individuals will be exposed to the disease, creating group immunity. This provides the reasoning behind allowing parents with particular "religious" beliefs to opt out of vaccinating their children. As long as the majority of a population is immune to a disease, the rest of the population is also protected. The problem comes when there is no longer a critical number of people being vaccinated.
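The "large enough percentage" can be made concrete under a strong simplifying assumption: a well-mixed population in which every infected person would infect R0 others if nobody were immune, and a vaccine that always works. If a fraction v is immune, each case infects about R0 * (1 - v) others, and outbreaks fizzle once that number drops below 1, which gives v > 1 - 1/R0. Here is a small Python sketch; the function name and the R0 values are illustrative and not taken from the AMA report.

```python
def herd_immunity_threshold(r0):
    """Smallest immune fraction v with r0 * (1 - v) <= 1 in a well-mixed population."""
    if r0 <= 1:
        # The disease cannot sustain itself even with nobody immune.
        return 0.0
    return 1.0 - 1.0 / r0

# Illustrative values of R0; highly contagious diseases sit at the upper end.
for r0 in (2, 5, 15):
    print(f"R0 = {r0:2d}: about {herd_immunity_threshold(r0):.0%} must be immune")
```

The well-mixed assumption is exactly the one questioned in the rest of this post: the threshold says nothing about where the susceptible people are.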

My first thought was that an increasing number of parents must be choosing not to vaccinate their children. There have been a lot of stories about the dangers of vaccines, so perhaps these are scaring parents away. Then I saw a 2004 report from the CDC stating that the number of children being vaccinated is increasing. While that report is a few years old, perhaps the overall percentage of vaccinated children is not the problem. Perhaps the recent outbreaks have more to do with the distribution and connectivity of the susceptibles than their number.

Even if the number of parents who choose not to vaccinate their children is a fairly small percentage of the total population, they are likely not distributed evenly. Certain beliefs are likely to be clustered in certain geographic areas, which may lead to a clustering of susceptible individuals. The availability of health care will likely leave some populations more vaccinated than others. Siblings of susceptible children are likely to also be susceptible. Additionally, one parent who doesn't believe in vaccinating their children may convince other parents that vaccination is unnecessary, leading to the growth of an un-vaccinated and highly susceptible cluster.

The AMA study states that many of the current measles outbreaks are linked to international travel. Some parents may be lulled into a sense of security by the fact that most of these diseases are very rare in the United States, and so the chance of their children being exposed seems really small. However, it only takes one individual to serve as a connection between an infected group and a susceptible group. According to the small world theory, these diseases are likely far closer than we would like to believe.

While the small world theory suggests that we are all closely connected, clusters with similar beliefs are likely more closely connected than other clusters within the population. For example, parents who choose not to vaccinate their children may also tend to send their children to the same types of schools or summer camps, attend similar events and choose similar vacation destinations. All of this conspires to create well connected clusters of un-vaccinated individuals. While only a small percentage of the population may be included, the infection of any one of the individuals could lead to a large outbreak among the susceptible clusters.
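To see why the arrangement of susceptibles can matter more than their number, here is a toy Python sketch. Everything in it is a hypothetical setup of my own: 10,000 people on a grid, each in contact with four neighbours, 5% unvaccinated, a perfectly effective vaccine, and a disease that always spreads between susceptible neighbours. Under those assumptions, the largest connected group of susceptibles is the worst-case outbreak started by a single case.

```python
import random

def largest_cluster(susceptible, size):
    """Size of the largest group of susceptible cells connected through grid neighbours."""
    seen = set()
    best = 0
    for start in susceptible:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        count = 0
        while stack:
            x, y = stack.pop()
            count += 1
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                cell = (nx, ny)
                if 0 <= nx < size and 0 <= ny < size and cell in susceptible and cell not in seen:
                    seen.add(cell)
                    stack.append(cell)
        best = max(best, count)
    return best

size = 100               # 100 x 100 grid = 10,000 people (hypothetical)
n_unvaccinated = 500     # 5% of the population

# Case 1: unvaccinated people scattered uniformly at random.
rng = random.Random(1)
all_cells = [(x, y) for x in range(size) for y in range(size)]
scattered = set(rng.sample(all_cells, n_unvaccinated))

# Case 2: the same number of people, grouped into five 10 x 10 communities.
corners = [(5, 5), (5, 60), (45, 30), (70, 70), (80, 10)]
clustered = {(cx + dx, cy + dy) for cx, cy in corners for dx in range(10) for dy in range(10)}

largest_scattered = largest_cluster(scattered, size)
largest_clustered = largest_cluster(clustered, size)
print("largest connected susceptible group, scattered:", largest_scattered)
print("largest connected susceptible group, clustered:", largest_clustered)
```

The vaccination rate is 95% in both cases. In the clustered case one infection can reach a whole community of 100, while the scattered susceptibles are mostly walled off by immune neighbours.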

There is an interesting dynamic of self-defeating, infectious nodes within a larger network of the population. If you picture a network of people in the United States and their daily social connections, most of the individuals are vaccinated against these diseases. There are some individuals scattered throughout the population that remain susceptible, but since they are far outnumbered by immune individuals, the population as a whole remains immune. Now, imagine that from each of the susceptible nodes spreads the infectious belief of not vaccinating their children. Over time, clusters of susceptibles will form. Perhaps if disease appears in one cluster, the rest of the cluster will again be convinced to vaccinate their children. There must be a fine balance between the infection rate of disease and the popularity of not vaccinating children. Perhaps it is an ongoing cycle: the more distant the disease seems to be, the more contagious the practice of not vaccinating, the more susceptible the population becomes until it becomes infected, then the popularity of the vaccine increases until the disease once again seems distant. This creates interesting challenges of convincing parents to vaccinate their children even when the threat of disease seems so minimal.

Friday, September 19, 2008

The 2 Königsbergs

In The structure and function of complex networks, one of M. E. J. Newman’s first assertions is that Leonhard Euler’s “1735 solution of the Königsberg bridge problem is often cited as the first true proof in the theory of networks.” Dave also presented Euler’s solution in class today, advocating the importance of Euler’s solution to the field of networks. Nonetheless, this is a fairly bold assertion. Learning more about the origins of the field of networks, while perhaps slightly arbitrary, struck me as profoundly interesting. It seemed that some investigation was merited.

This research quickly yielded Euler’s original Latin publication in PDF form, as well as an informative Wikipedia entry on the Seven Bridges of Königsberg problem. Perusing the Latin publication by Euler is an interesting experience – particularly seeing Euler’s original drawings of the problem. Euler employs the universal point labeling system in his proof – A, B, C, D… – which apparently transcends time and language. It also left me wondering if Euler himself ever drew a figure of nodes and edges to represent the problem, or if the jump from 1736 to 20th century logic came later. Euler's original paper is interesting to ponder for a few moments before reading the Wikipedia entry.

Wikipedia additionally features a nice entry on Euler – a renowned Swiss mathematician and physicist who is famous for Euler's formula in complex analysis, among other discoveries. For those who are unaware (or need refreshing), the Seven Bridges of Königsberg was a problem involving seven bridges crossing the Pregel River in Königsberg, Prussia, via two large islands in the middle of the river. The object was to determine if it was possible to go for a walk, crossing each bridge exactly once, beginning and ending in the same location. Königsberg was the capital of East Prussia until WW II, when it was occupied by the Soviet Army and renamed Kaliningrad. Kaliningrad still exists today – although it’s unclear if all the bridges survived the Soviet Army. The following, however, is a nice picture of one of the islands from the Königsberg bridge problem.

[Image: IMG 6448.jpg (source: Wikipedia)]

Coincidentally, Königsberg was also the name of a German cruiser in WW I. It’s unclear how or why Königsberg the city and Königsberg the German warship were related (or maybe it’s just a small world?), but the warship faced a similar fate as the city: it was scuttled in 1915 after being crippled by British warships. Thus, neither the city nor the warship Königsberg remains.

The end of this story, which many are familiar with, is that Euler solved the Bridges of Königsberg problem. He did so by reducing the system of bridges to what we would now call a network: a system of nodes and edges. Ultimately, he argued that the feasibility of such a walk depended entirely on the degrees of the nodes in the system. A walk crossing every bridge exactly once exists only if there are 0 or 2 nodes of odd degree, and a walk that also returns to its starting point requires that there be none. In Königsberg all four land masses had odd degree, so no such walk was possible. This discovery has led to the existence of the so-called “Euler path” or Eulerian trail, as well as the Eulerian circuit, and the beginning of network theory. Euler and his paths and circuits remain entirely alive and relevant in modern science. A search for Euler circuits on Google Scholar brings back 27,700 articles.
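Euler's degree test is simple enough to check in a few lines of Python. The sketch below encodes the seven bridges as a list of edges between four land masses. The labels A through D follow the usual textbook layout (A for the central island) and the function names are my own, so treat this as an illustration rather than a transcription of Euler's paper. The test also assumes the network is connected, which is true for Königsberg.

```python
from collections import Counter

# The seven bridges as a multigraph: A is the central island, B and C are the
# two river banks, and D is the second island (labels chosen for illustration).
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges, in sorted order."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)

def classify(edges):
    """Apply Euler's degree test; assumes the graph is connected."""
    odd = len(odd_degree_nodes(edges))
    if odd == 0:
        return "closed walk possible (Eulerian circuit)"
    if odd == 2:
        return "open walk possible (Eulerian trail)"
    return "no walk crosses every bridge exactly once"

print(odd_degree_nodes(bridges))
print(classify(bridges))
```

Removing or adding a single bridge changes the degrees of two land masses, which is why a small change to the map can turn an impossible walk into a possible one.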

Wednesday, September 17, 2008

The Future of the Internet

There is only one machine.
The web is its OS.
All screens will look into the One.
No bits will live outside the One.
To share is to gain.
Let the One read it.
The One is us.


This is Kevin Kelly's vision of the Internet in a not so distant future--the next 5,000 days. I know what you are thinking: he's some techno-cult leader promising us all lower mortgages, better sex lives and digital salvation. I thought so too, but no--Kelly is the Executive Editor at Wired, and is known for his deep and accurate insights into the future of technology, specifically our favorite network: the Web. At the 2007 TED conference Kelly gave a presentation called "Predicting the Next 5,000 Days of the Web" where he gave his vision for the future of the Internet, which is exciting, powerful and scary (all at once). What is interesting from the point of view of network studies is his view that everything will be part of the Web--EVERYTHING--all machines, all data, all systems. It will be a single global machine that is smarter, more personalized and more ubiquitous, where a sense of unity will emerge, a sense of consciousness. As Kelly says, "we all thought the Internet was going to be TV but better." Perhaps the next 5,000 days of the Web will be much more than the "Web but better." Perhaps one day there will no longer be networks but only The Network, or as Kelly calls it, "The One."

Click here to watch his presentation; it is about 20 minutes long. It is also worth checking out the TED website, which has a huge collection of amazing thinkers giving short presentations on all sorts of really cool stuff.

Kelly first gives some really interesting statistics that help us get a sense of the size of the Internet. These are some informal metrics similar to the metrics we have been talking about in class. Internet = 100 billion clicks/day, 55 trillion links (edges), 2 billion chips, 2 million emails/sec, 8 TB/sec, 65 billion phone calls/day, 600 billion RFID tags. He is giving data on both the physical network (the Internet) and the data that is on it (WWW, VoIP, etc.). In addition, he says all of this consumes 5% of the world's electricity.

He then goes on to make some (fairly controversial) comparisons between the complexity of the Internet and the human mind. Though the numbers of nodes and edges in the brain (neurons) and on the Internet (wires and switches) may be the same, I personally consider the current emergent properties of the two networks to be very dissimilar, though I suppose it depends on how one defines "intelligence". He predicts that by 2040 the power of the Internet will equal the processing power of humanity, which he supports by noting that, unlike computers, our brains aren't doubling in power every two years. Perhaps by 2040 complexity will indeed be proved to be the cause of life.

From his point of view, this "new" Web will be different in three ways:
  1. Embodiment
  2. Restructuring
  3. Codependency
Embodiment means that the web will really start to have a physical form. It will be a collection of all the machines (computers, cell phones, cameras, microphones, GPS, cars, etc.) that can speak to the network, or what Kelly calls the Cloud. Think of the Cloud this way: when you check your email, where is it sitting waiting for you to eagerly log on and read it? The Cloud. The beauty of the Cloud is that it really doesn't matter where it is; what matters is that you can access it from anywhere, and with an ever growing list of devices, especially wireless ones. Devices that communicate with the Cloud serve two functions: they act as windows into the Web, as well as acting as the eyes (cameras) and ears (microphones) of the web. In Kelly's view the web will no longer just be a set of web sites and links between them, but all things will be synchronized with the web so that everything "lives" on the web as well as in physical reality. It will become an "Internet of things." The web will branch out and include all things in its network; nothing will be offline. The network will become a network of networks. From the network's point of view, we humans will just be "extended sensing tools". As much as the Internet will serve us, we will serve it.

Restructuring is a process where the nodes of the Internet gain an ever increasing level of granularity. The early Internet was computers linking to computers. Now we have the WWW, which is pages linking to pages. The Internet of the future will be data linking to data; ideas will link with ideas. Everything will be linked to, and in turn be defined by links to other things. The fourth stage will be an Internet of things, where physical things become linked to the web.

The last change the Internet will see will be Codependency. This means that we rely on the Internet for our very survival as much as the Internet relies on us for its existence. Kelly posits that using the Internet as a tool is no different than using written language as a tool. The codependency process has just one catch: you have to be willing to have your data shared. For the Internet to really become a tool that humanity depends on, we can have no secrets. "Total personalization will require total transparency." Having information shared in such an open way could lead to abuses and privacy issues. I think as we become more codependent, Internet privacy will become a much debated issue. We already have seen abuses with telecom companies allowing the US government to secretly monitor international calls of US citizens. With Codependency, the line between the virtual and real world will be very fuzzy: "we will become the web." The moral ramifications of this synergy will be huge, and need to be scrutinized very carefully to ensure responsible access and usage of information.

I don't think we can fully anticipate what the Web will look like in the future--I don't think anybody really knows. What this presentation does show us, however, is that at one point there was no Internet, and then it emerged, and began evolving to become the Web that we all know and love. From Kelly's perspective the Internet seems to be evolving and becoming more complex, as if it were an organism. The power of the Internet is greater than the sum of its parts, as is any organism where life is the emergent property. Emergent properties emerge (for lack of a better word) from the interactions of all of the nodes. There seem to be three properties of network evolution: more nodes, more edges, and the emergent properties that come from the interaction between nodes.

If you subscribe to evolution (design without a designer) then the web looks very much like a life form. It began as very simple networks and has grown into something that is evolving and growing on its own. This raises an interesting question regarding networks: is there a point where a network becomes sufficiently complex and large that really unexpected and amazing things start to happen? Like life, or the Web.

Kelly is not positing that the Web is an organism in the traditional sense. I think he is positing that it is an organism in the non-traditional sense, which started like all networks do, biological or technological: with a random linking of two nodes. He is making us think about what it means to be intelligent, what it means to be conscious and ultimately what it means to be One.

"The first person to buy a fax machine was an idiot." But the second...

Tuesday, September 16, 2008

Students in the 21st Century

Tuesday in class I made some remarks about dragging COA and its students into the 21st century. I also used this phrase last Spring in my Differential Equations course. I thought it might be a good exercise for me to say a little bit more about what I mean by this. Perhaps it will be of interest to others, too.

It might be easier to say what I don't mean. I'm not concerned that students keep up with all the latest electronic gadgets. And I certainly don't want to encourage students to unquestioningly accept new technology. I don't think debates about whether or not it's better to read the New York Times online or on paper are very interesting. And I'm not interested in virtual reality, Second Life, computer art, website design, or whether or not instant messaging will destroy this generation's ability to write grammatically.

What I am interested in is the vast amount of new technology that's actually very, very useful. How can this technology -- the myriad "web 2.0" apps and gizmos -- help us do the important and fun work we want to do? This technology isn't an end in and of itself, although it may seem like it. But it's a potentially powerful tool. Which tools are good, and which should be avoided? Moreover, like it or not, much of this new technology is here to stay. Email, the web, Wikipedia, social networking, and so on, aren't going anywhere.

We live in a world that is "information rich." Email, blogs, podcasts, online journals and newspapers: there is a dizzying array of information to which we have easy access. A lot of this information comes streaming at us, whether we want it or not. This information is, of course, of wildly varying quality. Lots of it is distracting and irrelevant, and some is pure rubbish. But there's so much good stuff out there, too, that it would be self-defeating to turn one's back on the sea of online information.

I worry occasionally that students get strong messages that the internet is never to be trusted, and that real knowledge comes in books or at least printed on paper. Or that electronic media is a little frivolous and library and book research is more serious and scholarly. The result is that sometimes students reflexively turn away from the online world. I want students to turn toward it and embrace it, at least long enough to see what it has to offer. I love books -- my home and office are full of them -- and I love libraries. But if I solely relied on paper resources it would be almost impossible to stay current in my fields. And it would take an enormous amount of paper.

I think it is important to gain skills and learn how to use tools to efficiently sample a lot of the new (and old) knowledge and ideas that are being produced. Even more important is being able to sort, index, store, share, and re-find references and resources that are useful. My experience has been that many COA students (and faculty) are unaware of lots of tools and strategies for working in an information rich environment.

I am hard pressed to think of many jobs/careers/callings which don't require some sort of facility with lots of different forms of (mostly) electronic communication and reading. There will almost surely be some fields or bodies of knowledge that students will need to keep up with: the art scene in Chicago, or politics in Nebraska or Nigeria, or an academic field (usually more than one), or the goings on in one's professional societies or associations, and so on. I want students to have good strategies and techniques for doing so efficiently and smartly. And email can be soul-crushing and time consuming, but it's here to stay. Better get strategies and techniques for dealing with it. Lots of it.

Ultimately, it's not up to me to determine what strategies people adopt to navigate this information-rich world. This will vary lots from person to person. But I do think it's appropriate for me to pester students to think about these issues and gently coerce them into trying some different approaches. In fact, the more I think about it, the more I worry that I would be remiss if I didn't do so.

These thoughts feel a little incoherent to me. I'd welcome questions and comments. If there is interest, I may follow up this post with a few others concerning some more specific (and practical) thoughts I have about particular strategies for working in information-rich settings.