Here's what I've come up with so far, in trying to create some basic, "common-sense" standards for transforming social graphs. I'm trying to specify some logical maintenance processes that practically everybody can agree make sense to perform regularly on our social graphs.
The two recalculation processes described below will operate on graph X, which contains, as its values, a set of Internet addresses, and, associated with each address, a number between 0 and 1, with the sum of these numbers at a given time approximating 1. Assume that each of these addresses points to another graph, which can be retrieved, and which I'll call a "2nd-order graph" from the perspective of graph X.
Process 1:
Graph X will take new values using the values of all of its 2nd-order graphs. The number in graph X associated (before the recalculation) with a given address will determine the weight of the contribution of the values of the 2nd-order graph retrieved at that address to the values of the recalculated graph X. So each address-number pair in each 2nd-order graph will contribute to the number for that same address in the recalculated graph X an amount equal to the number in that pair (in the 2nd-order graph), multiplied by the number associated in graph X (before the recalculation) with the 2nd-order graph containing the pair. We can expect that the address representing graph X itself will appear, not self-referentially in graph X, but frequently in second-order graphs. When this happens, it contributes to the number for each address that appeared in graph X before the recalculation, an amount equal to the address's initial number, multiplied by the number in graph X for the 2nd-order graph that is pointing back to graph X, multiplied by the number in that 2nd-order graph for the address pointing to graph X.
Process 2:
Graph X will be recalculated based on the user's input. This process will not look at the contents (values) of the 2nd-order graphs. Assume that each pixel on the user's screen at any time will be associated with one particular address in graph X. Then, a selection of any pixel via an input action such as a mouse click will increase the number in graph X for the corresponding address, and will accommodate this larger number by decreasing the numbers for the rest of the addresses. If you click on a pixel and the number for the corresponding address goes from 0.1 to 0.2, then the number for each of the rest of the addresses will become eight-ninths as big as it was.
- Algorithm Alpha :-)
Storage size
(Anonymous)
2011-07-31 06:53 am (UTC)
On every iteration of your algorithm, this entire social graph will need to be processed. This is not a local transformation; it is a global one. Even if you aggressively prune the connections somehow, it's not clear that this algorithm will be useful, or will even converge to a stable result rather than going haywire. It's also unclear how well this would compare to existing algorithms for statistical classification, friend-based recommender systems, and so on.
Have you considered making a small toy graph and actually trying this out? I know it's hard to get real data, but without some kind of experimentation, this is just another vague graph algorithm.
Re: Storage size
2011-07-31 09:37 am (UTC)
We might cap the number of vertices (addresses) in a graph at something reasonable, like 10. After running Process 1, put all the addresses in graph X in decreasing order of the numerical values associated with them, retain the first 10, and prune the other 90. (Although you'll be starting with fewer than 100 if some addresses appeared in more than one 2nd-order graph.)
We discard most of the data with each iteration, but we're only trying to generate a snapshot of "what's up" at this moment -- to automate the generation of such snapshots in such a way that our interfaces can display rich, flexible landscapes based on the patterns that emerge from the snapshots' transformations and interconnections.
More recently I've tended to lean towards the idea of using social graphs that don't have any numerical values associated with the addresses. Algorithm Beta may instead simply convert an ordered one-dimensional array of addresses into another ordered one-dimensional array of addresses.