Data Viz

August 7, 2011

The picture below is a visualization of the complexity of algebra. The vertices are theorems and the edges between theorems are dependencies. Technically the edges should be directed, since if Theorem A depends on Theorem B, we shouldn’t have it the other way around too!

This comes from data mining my husband’s open source Stacks Project; I should admit that, even though I suggested the design of the picture, I didn’t implement it! My husband used graphviz to generate this picture – it puts heavily connected things in the middle and less connected things on the outside. I’ve also used graphviz to visualize the connections in databases (MySQL automatically generates the graph).
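For readers who want to try something similar, here is a minimal sketch of turning a list of dependencies into Graphviz input. It is not the script used for the picture above: the function name `to_dot` and the sample labels are made up for illustration, and it only writes the DOT text that Graphviz reads.

```python
def to_dot(edges, directed=False):
    """Return DOT source for a list of (a, b) dependency pairs."""
    keyword, arrow = ("digraph", "->") if directed else ("graph", "--")
    lines = [f"{keyword} dependencies {{"]
    lines.append("  node [shape=point];")  # draw each theorem as a small dot
    for a, b in sorted(set(edges)):        # drop duplicate pairs
        lines.append(f'  "{a}" {arrow} "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data: each pair links two results that depend on one another.
edges = [
    ("lemma-a", "lemma-b"),
    ("lemma-a", "theorem-c"),
    ("lemma-b", "theorem-c"),
]
dot_source = to_dot(edges)
print(dot_source)
```

Saving the output as `deps.dot` and running `neato -Tpng deps.dot -o deps.png` renders it. The post doesn't say which Graphviz layout engine produced the picture; `neato` is just one of the force-directed options.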

Here’s another picture which labels each vertex with a tag. I designed the tag system, which gives each theorem a unique identifier; the hope is that people will be willing to refer to the theorems in the project even though their names and theorem numbers may change (e.g. Theorem 1.3.3 may become Theorem 1.3.4 if someone adds a new result in that section). It’s also directed, showing you dependency (Theorem A points to Theorem B if you need Theorem A to prove Theorem B). This visualizes the results needed to prove Chow’s Lemma:
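Picking out "the results needed to prove" one theorem amounts to walking the directed graph backwards from that theorem. A rough sketch, assuming the edges are available as pairs (a, b) meaning "a is needed to prove b"; the function name and the single-letter labels are placeholders, not real tags from the project:

```python
from collections import defaultdict


def dependencies_of(edges, target):
    """Return every result needed, directly or indirectly, to prove target."""
    needed_by = defaultdict(set)  # result -> its direct prerequisites
    for a, b in edges:
        needed_by[b].add(a)

    seen = set()
    stack = [target]
    while stack:  # depth-first walk against the arrows
        node = stack.pop()
        for prereq in needed_by[node]:
            if prereq not in seen:
                seen.add(prereq)
                stack.append(prereq)
    return seen


# Placeholder labels: A -> B -> C, A -> C, and an unrelated D -> E.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")]
deps = dependencies_of(edges, "C")
print(sorted(deps))
```

Keeping only the edges whose endpoints both lie in `deps | {"C"}` gives the subgraph to draw.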

  1. Aaron
    August 7, 2011 at 10:57 pm

    Cool graph. What do the loops mean?


    • August 8, 2011 at 11:55 am

      Yeah, that’s weird and shouldn’t happen. I see two of them. This graph was made a long time ago and after I started making the graphs with the tags I discovered a couple of lemmas (don’t remember which) where I mistakenly referred to the lemma itself (by copying the wrong latex label in the \ref{-} invocation)!

      But actually, I just modified my script that checks for forward references to also look at self-referential lemmas — and it found a bunch more. I’ll fix them. Thanks!
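A check of this kind can be sketched in a few lines. This is not the script mentioned in the reply above, only an illustration under one assumption: the document has already been reduced to a list of (label, labels cited in its proof) pairs in the order the results appear. All names here are hypothetical.

```python
def check_references(entries):
    """Find self-references and forward references.

    entries is a list of (label, refs) pairs in document order.
    Returns (self_refs, forward_refs); forward_refs holds (label, ref)
    pairs where ref is defined later in the document than label.
    """
    position = {label: i for i, (label, _) in enumerate(entries)}
    self_refs, forward_refs = [], []
    for i, (label, refs) in enumerate(entries):
        for ref in refs:
            if ref == label:
                self_refs.append(label)
            elif ref in position and position[ref] > i:
                forward_refs.append((label, ref))
    return self_refs, forward_refs


entries = [
    ("lemma-one", []),
    ("lemma-two", ["lemma-one", "lemma-two"]),  # cites itself
    ("lemma-three", ["lemma-four"]),            # cites a later result
    ("lemma-four", ["lemma-one"]),
]
self_refs, forward_refs = check_references(entries)
print(self_refs, forward_refs)
```

A self-reference shows up as a loop in the picture, and a forward reference as an edge pointing the wrong way in document order.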


  2. Richard Séguin
    August 7, 2011 at 11:41 pm

    Nice! I wish there was some software that could produce graphs such as these based on the \ref pointers in a .tex document. In general, it would help detect accidental forward references due to moving text around, dead ends, and general messiness. I’ve been working on a document that currently has 221 propositions, lemmas and corollaries, and producing graphs such as these by hand is a nightmare. I was horrified when I inserted a proposition on page 172 and realized that there was a counterexample in contradiction. Due to the complexity of all of the dependencies in the document, I knew that finding the problem could be like finding a needle in a haystack. Manually graphing all of the backwards dependencies would have taken a huge amount of time and a large sheet of paper. As it happens, I accidentally found the problem on page 83 when I wanted to use that proposition for something else, and I was saved. I traced dependencies forward (much less daunting than tracing backwards) and wound up removing several pages.

    When I initially visited this page a while ago, several links on the page seemed to get redirected to a phishing page. The links are working now, but I seem to have lost my email and name auto-fills.
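On the wish for software that builds such graphs from the \ref pointers in a .tex file: a very rough starting point is to scan the source for \label and \ref and attribute each \ref to the most recent \label. That attribution rule is a simplifying assumption (it presumes every statement carries a label and its proof follows directly), and the function name and sample text below are invented for illustration.

```python
import re

# Matches \label{...} and \ref{...}; it does not match \eqref or \pageref.
TOKEN = re.compile(r"\\(label|ref)\{([^}]*)\}")


def ref_edges(tex):
    """Return (cited, citing) pairs, tying each ref to the latest label."""
    edges = []
    current = None
    for kind, name in TOKEN.findall(tex):
        if kind == "label":
            current = name
        elif current is not None:
            edges.append((name, current))
    return edges


sample = r"""
\begin{lemma}\label{lemma-a} Statement. \end{lemma}
\begin{lemma}\label{lemma-b} Statement. \end{lemma}
\begin{proof} By Lemma \ref{lemma-a}. \end{proof}
\begin{theorem}\label{theorem-c} Statement. \end{theorem}
\begin{proof} Combine Lemmas \ref{lemma-a} and \ref{lemma-b}. \end{proof}
"""
edges = ref_edges(sample)
print(edges)
```

Each pair reads "the first label is cited by the second", so the list can be fed straight into a graph drawing tool. Real documents will need more care, for instance with \ref commands in running text between results.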


    • Richard Séguin
      August 8, 2011 at 12:56 am

      Correction: I should have said “malware” rather than “phishing”.

      Also, the document of mine that I talked about has 1024 \ref pointers.

      I just attempted to submit this comment without noticing that my email address and name were not remembered from the above post.


  3. August 8, 2011 at 3:21 pm

    You guys should make a new T-shirt with that diagram. I would buy one (already people ask me about the other T-shirt — most recently a clerk at Trader Joe’s).


  4. ben
    August 9, 2011 at 2:55 pm

    The visualization of Chow’s Lemma looks like the northern hemisphere.


  5. Amber
    August 9, 2011 at 10:22 pm

    So Stephen Wolfram is trying to create a computer program which starts with a set of axioms and generates a whole field of math, basically just applying different subsets of the axioms to see what theorems come out of it.

    Seeing the complexity of math makes me seriously question whether he can do that. But on top of this, think how many man-hours have been put into adding new vertices. If we suddenly had a computer program which could generate math, without the beauty, elegance, and personal satisfaction that comes from mathematics, where would we, as mathematicians, be? Imagine if your drawing could be created in only a few days, instead of a few centuries… I certainly don’t want that to happen, oddly enough, but would love to hear your input, Cathy.


  6. Jonah S
    August 20, 2011 at 7:37 pm

    Fascinating, but also intimidating for one who hopes to learn algebraic geometry 🙂


  7. vcvpaiva
    May 3, 2012 at 9:07 am

    Very nice idea, thanks for the picture! but surely it ought to be possible to use colours to make this more informative, no? I mean it would be nice to see say theorems that only apply to modules in a different colour than ones that applied to groups, perhaps by clustering using keywords or even just words in the abstracts…


  8. vcvpaiva
    May 3, 2012 at 9:15 am

    Very nice idea, thanks for the picture! but surely the graph could be made much more informative by using colours, perhaps via AMS subclassifications or even better by clustering concepts/keywords from the abstracts, no? This way theorems that only applied to modules say, would be a different colour than ones that were about groups and we would be able to see more of the structure of algebra…


  9. Pablo Padilla
    February 16, 2013 at 11:52 pm


