In a prime example of cosmic timeliness, shortly after Hurricane Helene made landfall, an embargoed article landed in our inboxes. It argued that deaths due to tropical cyclones are likely undercounted because of the long-term ramifications of living through one. I took on the challenge of making the graphics for Andrea's coverage of this embargoed article.
As I wrote about on the project page for these graphics, members of the press receive notification of peer-reviewed articles that are about to be published, but are not allowed to publish anything themselves on the research until the embargo date. This is usually done for research that the authors think will receive widespread attention (or when the embargo itself is being used in an effort to generate more attention).
We picked up the story on a Thursday and the embargo was for the following Wednesday. Andrea and I zeroed in on the overlapping line chart (Figure 3C) and the demographics (Figure 4B) as the data of most relevance to her coverage of the article.


This meant I had three and a half days to turn around two graphics. It was the fastest turn-around time I have faced as an intern so far. And during the process, I found an error in the data that delayed the process as I waited for the corresponding author to respond. And then I was also sick. It was a long 3.5 days, haha.
While remaking the swirly figure in R, I realized that the labels of the hurricanes in my figure did not match the labels of the hurricanes in the original figure. We first became aware of the issue because the Hurricane David label pointed to a different line.
First, I triple-checked that I had merged the datasets correctly. The first had mortality numbers by tropical cyclone ID (which included month and year) and the second was a look-up table of the IDs and names, if available, also provided by the authors. I generated a third dataset of names, landfall dates, and categories scraped from Wikipedia because I initially wanted to color-code the graphic not by decade but by intensity. It's entirely possible I had flubbed something when merging the datasets, but I didn't see any errors. I reached out to the corresponding author and indeed, the data they provided for the press package had an indexing error that misaligned the mortality data and the labels. So, go me!
With the correct data, I could proceed with remaking the figures for a general audience. Thankfully, I already had a bit of practice with this from remaking graphics for the Moore's Law article. The major differences this time were that these were peer-reviewed and published figures (the Moore's Law figures were not published in a peer-reviewed journal), and we were choosing which of multiple subfigures to feature and reproduce (whereas the Moore's Law figures were all standalone graphics).
I used the provided data and remade the figures from scratch in R, then exported them as SVG files and brought those into Adobe Illustrator for design tweaks.
The Science Behind the Story
In the wake of a deadly hurricane, it is comparatively easy to count the people who succumbed to drowning and other direct causes of death. However, tropical cyclones have lasting consequences that survivors wrestle with long after the storm has dissipated. These include damaged infrastructure, lost jobs, and damaged food supplies, among others. Hospitals might be damaged, and ambulances might be unable to reach those in need because of collapsed roads. Individuals may end up using retirement savings to pay for repairs, and without those funds they might not be able to weather future health issues, leading to premature deaths. Those who choose to relocate lose support structures and community. The local government will likely allocate funds to rebuilding infrastructure, leaving initiatives that might have improved the long-term health of residents unfunded.
While immediate deaths due to tropical cyclones are known, storms continue to be deadly long after the skies have cleared.
To estimate this 'excess mortality,' the researchers compared the mortality rates of states that deal with hurricanes to those without, generated statistical models, and found that tropical cyclones are likely responsible for 3.2-5.1% of all deaths along the Atlantic coast. The models are designed to account for known and unknown differences between the states that might also impact mortality numbers.
The researchers found that on average, tropical cyclones continue to kill for 172 months after they make landfall, peaking around 60 months or 5 years after the storm. The models indicate that 13% of deaths in Florida, 11% in North Carolina, and 9% and 8% in South Carolina and Louisiana respectively are all attributable to tropical cyclones.
That is wild. And scary.
The Candycane Graph
I made a few key design decisions when remaking the overlapping line chart. It shows excess mortalities attributable to each cyclone for years extending past landfall, and the overlapping nature of the lines builds the silhouette to show the cumulative and ongoing toll of past cyclones on American lives.
I started by removing the zeroes in the data. Each cyclone line had these tremendously long tails in the original because the dataset was wide rather than tidy, meaning every cyclone had data for every year, even if it was 0. I didn't feel the need to retain that null information - one could say it was pointless.
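A minimal sketch of that step, written in Python rather than the R used for the real figures. The cyclone names and numbers below are made up for illustration, and the function name is hypothetical; the idea is only to show a wide table being reshaped into long (tidy) rows with the zero entries dropped.

```python
# Hypothetical wide table: one entry per cyclone, one value per year since landfall.
# All names and numbers are invented for illustration.
wide = {
    "cyclone_A": [0, 12, 30, 18, 0, 0],
    "cyclone_B": [0, 0, 5, 9, 4, 0],
}


def to_long_without_zeros(wide_rows):
    """Reshape {cyclone: [value per year]} into (cyclone, year_index, value)
    rows, keeping only the non-zero values."""
    long_rows = []
    for cyclone, values in wide_rows.items():
        for year_index, value in enumerate(values):
            if value != 0:
                long_rows.append((cyclone, year_index, value))
    return long_rows


long_rows = to_long_without_zeros(wide)
for row in long_rows:
    print(row)
```

Each cyclone's line now starts and stops where its data does, instead of trailing along the axis at zero.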
The next and biggest decision was choosing to highlight the category 5 hurricanes. I did this for two main reasons. First, these hurricanes often (though not always) had the highest mortality of the bunch and therefore impacted the overall silhouette the most. Second, I was intensely curious about which of the lines in the original figure were Cat 5s and figured a general audience would be as well. Even if that's not specifically related to the article's main points, why not try to preemptively answer a question that a lot of people will probably have?
I started by coloring all of the cyclones by categorical intensity (1-5), but there were so many storms that never amounted to even a category 1 that the chart was mostly grey.
Then I tried coloring just the Cat 5s, and this draft earned the chart the 'Candycane' nickname.

While we all enjoyed this draft aesthetically, we missed the added context provided by coloring by decade. I decided to try merging the two ideas: coloring by decade but muting everything that wasn't a Cat 5. This worked surprisingly well and felt like a good balance of amplifying the overlapping nature of the chart (which is, after all, the main point) and highlighting particularly notable storms.

And then lastly, several of the purple lines overlapped so much that you couldn't even really differentiate them. To help solve this issue, I added white lines at a slightly thicker stroke behind the purple strokes to help provide a visual offset. A subtle and nifty trick I learned from Jen.
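The halo trick can be sketched in a few lines. This is an illustration in Python that writes plain SVG markup, not the actual R and Illustrator workflow; the coordinates, stroke widths, and the `halo_polyline` helper are all assumptions made up for the example.

```python
def halo_polyline(points, color, width=2.0, halo_extra=1.5):
    """Return SVG markup for one line drawn twice: a slightly thicker white
    stroke first, then the colored stroke on top of it."""
    coords = " ".join(f"{x},{y}" for x, y in points)
    halo = (f'<polyline points="{coords}" fill="none" '
            f'stroke="white" stroke-width="{width + halo_extra}"/>')
    line = (f'<polyline points="{coords}" fill="none" '
            f'stroke="{color}" stroke-width="{width}"/>')
    return halo + "\n" + line


# Two nearly overlapping lines (made-up coordinates).
lines = [
    [(0, 80), (50, 20), (100, 80)],
    [(0, 70), (50, 25), (100, 75)],
]

body = "\n".join(halo_polyline(points, "purple") for points in lines)
svg = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">\n'
       f"{body}\n</svg>")
print(svg)
```

Because each halo is drawn immediately before its own colored stroke, the white edge of a later line cuts across the lines beneath it, which is what makes near-identical strokes readable as separate lines.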

The Lollipop Chart
Regarding the bar chart, the key issue I needed to solve, from a data visualization perspective, was the use of double y-axes. Traditionally, that's no bueno. I was puzzling over the best ways to show both raw numbers and percentage of total together when Amanda suggested using a lollipop chart but scaling the lollipop part to the percentage, an approach she had used in a graphic for a previous article. That was such a good idea I immediately ran with it.
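One way to size circles by percentage is to make the area, rather than the radius, proportional to the value, so a circle representing twice the percentage covers twice the ink. The sketch below assumes that area-based approach; whether the published chart scales area or radius is not stated here, and the function name and sample values are hypothetical.

```python
import math


def circle_radius(percent, max_percent, max_radius):
    """Radius for a circle whose AREA is proportional to `percent`.
    The largest value (`max_percent`) gets `max_radius`."""
    return max_radius * math.sqrt(percent / max_percent)


# Made-up percentages, with the largest circle capped at a radius of 20 units.
for percent in (5, 25, 100):
    print(percent, round(circle_radius(percent, 100, 20), 2))
```

Scaling the radius directly would exaggerate the larger values, because area grows with the square of the radius.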
There weren't any particularly notable intermediate versions of this chart as the process was rather straightforward. I sized the bars to be the width of the text labels and rounded the ends to help provide a cleaner appearance, and kept the lollipop circles transparent to allow the audience to see the underlying data conveyed by the bars.
One of the spacing considerations I routinely deliberate on is how to space out objects of variable sizes. Do you space them out to retain a constant width between objects, or do you place the objects in a regular grid, meaning that larger objects will be closer to their neighbors than smaller objects? The bars below necessitate the use of the grid, meaning that the lollipops sit at variable distances from one another: smaller circles appear farther away from their neighbors than the larger circles. I'm not sure this is one of those topics that has a right answer, but it was certainly on my mind as I was making these charts.
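The two options are easy to compare numerically. This is a small illustrative sketch with made-up radii and hypothetical function names: it places circle centers either on a regular grid or with a constant edge-to-edge gap, then measures the visible gap between neighbors in each layout.

```python
def grid_centers(count, step):
    """Centers on a regular grid: equal distance between centers."""
    return [i * step for i in range(count)]


def constant_gap_centers(radii, gap):
    """Centers placed so the edge-to-edge gap between neighbors is constant."""
    centers = [radii[0]]
    for previous_radius, radius in zip(radii, radii[1:]):
        centers.append(centers[-1] + previous_radius + gap + radius)
    return centers


def edge_gaps(centers, radii):
    """Visible white space between each pair of neighboring circles."""
    return [
        (centers[i + 1] - radii[i + 1]) - (centers[i] + radii[i])
        for i in range(len(centers) - 1)
    ]


radii = [10, 4, 7]  # made-up circle sizes

print(edge_gaps(grid_centers(len(radii), 30), radii))     # gaps vary on the grid
print(edge_gaps(constant_gap_centers(radii, 5), radii))   # gaps are all equal
```

On the grid, the gap next to a small circle is wider than the gap next to a large one, which is the effect described above.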
The colors are obviously connected to the original grey and red candycane draft of the overlapping line chart. However, as we moved back to the colorful version for that chart, I did not see a good reason to add more colors to this graph. And ultimately, I did not see a reason to change the red to another color, either—red is thematically appropriate for a chart about death.


Observant readers will note that the ordering of the lollipops is different between the desktop and mobile versions of this graphic. I needed to do this to accommodate the sizable annotations. For desktop, I used descending and ascending order in the age and race data to create a semi-symmetrical look and used the middle white space for the lengthy annotation about how to interpret the bubbles. For mobile, I reversed the demographic order and sorted the age bars chronologically. This allowed me to make the lengthy annotation full width along the bottom.
Wrap Up
I recognize my contributions were more superficial than material in this project, but my role was still impactful and I love how the graphics turned out. The candycane graph was conceived of by the authors, and the idea for the scaled lollipop chart came from Amanda. However, I was responsible for several design decisions that I believe add clarity and visual intrigue to the charts.
And Sonja Kuijpers herself weighed in on my chart.

And the fact that I did all of this in three and a half days, with delays caused by erroneous data, while being sick? I'm proud of myself.