Notes on Kleinberg’s “The Convergence of Social and Technological Networks”

A few notes on Jon Kleinberg survey article “The Convergence of Social and Technological Networks”, CACM, Nov. 2008, pp. 66-72.

 Collecting social-network data has traditionally been hard work […] Social interaction in online settings, on the other hand, leaves extensive digital traces by its very nature. […] the beginnings of a new research area—one that analyzes and builds theories of large social systems by using their reflections in massive datasets.”

 

 

Two major areas of social network investigation

 

  1. Small-world principle, decentralized search, 6 degrees of separation, peer-to-peer
  2. Social contagion, spread of ideas, diffusion of innovations, influencers

 

 

Small world

 

There really are “6 degrees of separation” in human social networks, and those short paths are findable by the people in the network using only local information.  The classic experiment by Milgram, asked someone in, Tulsa, say, to route a letter to someone in Boston by mailing it to someone the Tulsan knew on a first-name basis, who would forward it on to someone he/she knew, etc.

 

(That term “short path” is used throughout, but I don’t think it’s formally defined in this article.  I assume it means something like a path that has many standard deviations fewer segments than the mean of a randomly chosen path between the source and target.)

 

Although “adding even a small number of random social connections to a highly clustered network causes a rapid transition to a small world, with short paths appearing between most pairs of people,” that alone is not enough to make these paths findable with only local information. For example, if the random connections were distributed uniformly across all pairs in the whole network, the short paths would seem arbitrary, perhaps from Tulsa to Miami back to Omaha and then to Boston.

 

I interpret this section to be asking –

 

    If the local choice is a greedy one that chooses the connection that most improves the closeness to the target, which distribution of random connections is most likely to lead to paths that are short?

 

and answering –

 

   An inverse-square power law — if Alice lives twice as far from you as Bob does, then you’re twice as likely to be connected to Bob.

 

(A version of this reference is available here.)

 

Real social networks obey this inverse-square power law.  What underlying processes cause social networks to grow this way?  That’s still an open question. (It’s possibly related to the “preferential attachment” mentioned in the concluding “Further Directions” section.)

 

(On ‘inverse-square’ I still have some confusion though.  What is the metric? Most of the time in this article it seems physical distance, perhaps adjusted for population densities.  But I personally communicate more with technologists in Boston than I do neighbors on my block.  Milgram-type experiments “have tended to be the most successful when the target is affluent and socially accessible”. I’d suppose that my neighbors and I are OK in that regard, yet they are not honestly a significant part of my social network.  Maybe I suppose wrong, especially about my social accessibility? Or maybe I should be asking, ‘What is the social network?’, too, not just ‘What is the metric?’.

 

And Kleinberg writes “Other research using online data has considered how friendship and communication depend on nongeographic notions of ‘distance.’ For example, the probability that you know someone is affected by whether you and they have similar occupations, cultural backgrounds, or roles within a large organization. Adamic and Adar studied how communication depends on one such kind of distance: they measured how the rate of email messaging between employees of a corporate research lab fell off as they looked at people who were farther and farther apart in the organizational hierarchy. Here too, this rate approximated an analogue of the inverse-square law— in a form adapted to hierarchies—although the messages in the researchers’ data were skewed a bit more toward long-range contacts in the organization than short-range ones.”)

 

 

 

Influence

 

How does information and influence diffuse across a social network?

 

Not via short paths.  Analysis of chain letter data show that “Rather than fanning out widely, reaching many people with only a few degrees of separation, the chain letter spread in a deep and narrow pattern, with many paths consisting of several hundred steps.” ( ‘Why?’ is an open question, but Kleinberg discusses several hypotheses.)

 

Naturally, people are influenced by their neighbors in the social network — the more friends you observe doing something the more inclined you are to do it – and, naturally, there’s a diminishing returns effect – if 99 friends are doing it, the seeing the 100th one do it doesn’t change your behavior much.  (Not mentioned, but commonsense says to me that you are more likely to copy people you identify with, but, on the other hand, may not matter because the people in your social network tend anyway to be people you identify with.)

 

However, there’s an important psychological effect – the 0-1-2 effect — that runs contrary to diminishing returns, because

 

        “the probability of joining an activity when two friends have done so is significantly more than twice the probability of joining when only one has done so”.  

 

(Denning defines innovation as “fostering a change of practice in a community”.  The 0-1-2 effect suggests that a successful innovator must not merely make initial converts, but also encourage them to be visible about it.)

 

This is related to “triadic closure” – “links are much more likely to form between two people when they have a friend in common”.

 

The last paragraph of this section seems worth focusing on –

 

         “Large-scale social contagion data […] provides the opportunity to identify highly influential sets of people in a social network—the set of people who would trigger the largest cascade if they were to adopt an innovation. The search for such influential sets is a computationally difficult problem, although recent work has shown that when social influence follows the kind of ‘diminishing returns’ pattern discussed here, it is possible to find approximate methods with provable guarantees”

 

 

Effects mentioned in “Further Directions” section

 

If the number of people in the network grows –

 

  1.     “densification” – the number of links per person also grows
  2.     “shrinking diameters” – the length of the shortest path can decrease

 

 

It’s hard (or perhaps impossible) to predict the growth of a network in detail, because the growth is sensitive to initial conditions.

 

 

Advertisements

1 Comment

  1. According to Marián Boguñá and Dmitri Krioukov

    Greedy-routing paths are asymptotically shortest paths in scale-free, strongly clustered networks.

    Random scale-free networks are ultrasmall worlds. The average length of the shortest paths in networks of size N scales as lnlnN. Here we show that these ultrasmall worlds can be navigated in ultrashort time. Greedy routing on scale-free networks embedded in metric spaces finds paths with the average length scaling also as lnlnN. Greedy routing uses only local information to navigate a network. Nevertheless, it finds asymptotically the shortest paths, a direct computation of which requires global topology knowledge. Our findings imply that the peculiar structure of complex networks ensures that the lack of global topological awareness has asymptotically no impact on the length of communication paths. These results have important consequences for communication systems such as the Internet, where maintaining knowledge of current topology is a major scalability bottleneck.

    A popular exposition is this by Mark Buchanan, who writes

    The internet has a “small world” architecture: in other words, it takes only a handful of steps to link any two parts, even though the network is enormous. But just because such short paths exist doesn’t mean they are easy to find.

    and

    Their idea was to embed the network in a multidimensional space, giving each node a set of coordinates locating its position. Information is then routed from a given node to another by choosing whichever of its neighbours is closest to the destination in the imaginary space.

    For networks with the internet’s “small world” structure, this simple trick, known as “greedy routing”, is enough to ensure that information follows a short path, even when the network grows extremely large. “It’s possible to navigate the network efficiently using only local information,” Boguñás says.”

    Like

Tell me (anonymous OK)

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s