Behind The Scenes Of “The Whole Brilliant Enterprise”
Or, how to distill 11,000 pages of text into a single graphic
The final graphic
For the July issue of Popular Science, we—the Office for Creative Research—created a data visualization celebrating NASA’s long history of aerospace innovation. Since 1959, NASA has published a document called “Astronautics & Aeronautics Chronology” nearly every year, compiling news coverage of science, technology, and policy at the agency. In these compilations, NASA is reporting its own history. What kinds of stories do these documents hold? How has their language changed over the last six decades? To explore these questions, we created “The Whole Brilliant Enterprise,” a text-based visualization drawn from—by our count—4,861,706 words of NASA history.
The first step was to dig through the NASA chronologies by hand. We discovered that while the reports were an extremely descriptive history of aerospace, they lacked a hierarchy—they were simply straightforward timelines recounting events. A story about the hiring of a new NASA employee might appear alongside a story of a shuttle launch, representing chronological order but not relative importance. That mixed-up quality makes the documents wonderful to skim, but difficult to visualize.
To address the hierarchy issue, we turned to the archives of The New York Times, seeking out NASA-related headlines and articles. We took the articles’ placement in the paper of record—was it front-page news or did the story appear at the back of a section?—as a proxy for cultural impact. Then, we mapped that importance rating back onto the NASA archives, and used it to pull out the text of just the most consequential stories to act as the foundation of the visualization. It was in compiling these results that we realized that the piece should not be a rigid timeline of key NASA events, but instead a rolling impression of the agency’s eras, created by displaying some of the more popular and important terms within the articles.
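As an illustration only, the idea of using newspaper placement as an importance proxy might be sketched as follows. This is not the code behind the graphic. The `Article` record, the section weights, the `1 / page` scoring rule, and matching stories to chronology entries by date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    """Hypothetical record for one newspaper story."""
    published: date
    section: str   # "A" is assumed to be the front section
    page: int      # page within the section, starting at 1
    headline: str


def placement_score(article: Article) -> float:
    """Assumed rule: front section and earlier pages score higher."""
    section_weight = 1.0 if article.section == "A" else 0.5
    return section_weight / article.page


def importance_by_day(articles):
    """Keep the best placement score seen on each calendar day."""
    scores = {}
    for a in articles:
        best = scores.get(a.published, 0.0)
        scores[a.published] = max(best, placement_score(a))
    return scores


def top_entries(entries, scores, keep=2):
    """Rank chronology entries, given as (date, text), by their day's score."""
    ranked = sorted(entries, key=lambda e: scores.get(e[0], 0.0), reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    # Made-up sample data, for demonstration only.
    articles = [
        Article(date(1969, 7, 21), "A", 1, "Moon landing"),
        Article(date(1969, 7, 22), "C", 20, "Staff appointment"),
    ]
    entries = [
        (date(1969, 7, 22), "New employee hired"),
        (date(1969, 7, 21), "Apollo 11 lands"),
    ]
    print(top_entries(entries, importance_by_day(articles), keep=1))
```

A real pipeline would need fuzzier matching than a shared date, since a chronology entry and the news story about it may not land on the same day.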
Once we had the structure in place, the challenge became finding the balance between a term’s chronological location and the type size that would represent its place in the “cultural impact” hierarchy. We also had to space the individual terms evenly along a curved path. It took many iterations of the code that generated the graphic to strike that balance, but eventually we settled on a process that produced an image with the character that we had originally envisioned.
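One common way to space items evenly along a curve is to sample the curve as a polyline and step along it by arc length. The sketch below shows that general technique, plus a simple linear mapping from an importance score to type size. It is not the process used for the graphic: the function names, the 0 to 1 score range, and the point-size limits are assumptions chosen for the example.

```python
import math
from bisect import bisect_left


def cumulative_lengths(points):
    """Running arc length at each vertex of a polyline."""
    total = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        total.append(total[-1] + math.hypot(x1 - x0, y1 - y0))
    return total


def point_at(points, lengths, s):
    """Linearly interpolate the position at arc length s."""
    s = min(max(s, 0.0), lengths[-1])
    i = min(max(bisect_left(lengths, s), 1), len(points) - 1)
    seg = lengths[i] - lengths[i - 1]
    t = 0.0 if seg == 0 else (s - lengths[i - 1]) / seg
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def evenly_spaced(points, count):
    """Return `count` positions (count >= 2) at equal arc-length steps."""
    lengths = cumulative_lengths(points)
    step = lengths[-1] / (count - 1)
    return [point_at(points, lengths, k * step) for k in range(count)]


def type_size(score, min_pt=6.0, max_pt=36.0):
    """Assumed linear mapping from a 0..1 score to a point size."""
    score = min(max(score, 0.0), 1.0)
    return min_pt + score * (max_pt - min_pt)


if __name__ == "__main__":
    # Sample a gentle sine curve, then place five anchors along it.
    curve = [(x / 10, math.sin(x / 10)) for x in range(0, 63)]
    for x, y in evenly_spaced(curve, 5):
        print(round(x, 3), round(y, 3))
```

Stepping by arc length rather than by the curve's parameter keeps the gaps between terms visually equal even where the curve bends sharply.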
We followed a circuitous path to generate the graphic—the extent of which is evident in our sketches [below]—but we felt it was an appropriate process given the breadth of the archive. The value of our explorations is—like the histories themselves—more striking when viewed in hindsight.
A Small Gallery of Our Sketches and In-Progress Images
Counting the number of NASA-related New York Times stories
A selection of NASA-related New York Times stories, plotted by their length and location in the paper
A quick visualization of the interconnectedness of select New York Times story abstracts on different NASA topics
Identifying the most important terms and beginning to sort them by topic
A process shot, as we calculate allowable text heights along the curves of the graphic
A study of our path-generation algorithm for the flare of “-ing” words running across the background
Distributing the curves that will corral the text for each topic in the final graphic
Finding the perpendicular lines at the curves’ inflection points, for running text along the curves later
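For readers curious about the last sketch, a perpendicular to a sampled curve can be estimated by taking the direction between a point's neighbors as the tangent and rotating it 90 degrees. The snippet below illustrates that general approach. It is not the code used for the graphic, and `unit_normals` is a hypothetical helper name.

```python
import math


def unit_normals(points):
    """Estimate a unit normal at each vertex of a sampled curve."""
    normals = []
    last = len(points) - 1
    for i in range(len(points)):
        # Tangent estimate: direction between the neighboring samples
        # (one-sided at the two ends of the curve).
        x0, y0 = points[max(i - 1, 0)]
        x1, y1 = points[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            normals.append((0.0, 0.0))  # degenerate: repeated points
        else:
            # Rotate the tangent 90 degrees counter-clockwise.
            normals.append((-dy / length, dx / length))
    return normals


if __name__ == "__main__":
    curve = [(x / 4, math.sin(x / 4)) for x in range(0, 26)]
    for (x, y), (nx, ny) in zip(curve[::5], unit_normals(curve)[::5]):
        print(round(x, 2), round(y, 2), round(nx, 3), round(ny, 3))
```

The angle of each normal, for example `math.atan2(ny, nx)`, could then set the rotation of a glyph or word placed at that point.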