Visualizing Stop-and-Frisk and Murder Rates in New York City

This article is more than 10 years old.

Debate over the New York Police Department’s controversial stop-and-frisk policy has intensified recently with an editorial attack by The New York Times and an NYPD rebuttal.  One of many major points of contention is whether this tactic can be credited with reducing the NYC murder rate.  The NYPD and its supporters have repeatedly stressed that stop-and-frisk is part of a policy that saves lives.  The Daily News reports that according to the NYPD’s top spokesman, Deputy Commissioner Paul Browne, “Over the past 10 years, there were 5,430 murders in New York City, compared with 11,058 in the decade before Mayor Michael Bloomberg took office.”  Police Commissioner Ray Kelly directly links a significant drop in the city’s murder rate with the stop-and-frisk policy.

Is it so?  Despite all the talk of declining crime and increased numbers of stop-and-frisks, are the two connected?  The short answer is no!  All of the graphs in today's post make it clear that the astronomical increase in stop-and-frisks came well after the significant decrease in the number of murders, and thus cannot be the cause of that drop.

I present these different versions as a means to discuss choices and considerations when plotting two trends simultaneously.  To compare, the most straightforward method is simply to plot the number of murders and the number of stop-and-frisks, as I did in Figure 1. This clearly shows that the number of murders decreased sharply between 1990 and 1998 while the number of stop-and-frisks had a sharp increase beginning in 2002.  (Stop-and-frisk data are not available for years prior to 2002.)

Figure 1. The number of murders and the number of stop-and-frisks in New York City.

I chose two panels since the magnitudes for the two measures are not the same, and plotting them on the same set of axes would hide any variation in the numbers of murders over time.  Although I often recommend including units in axis labels rather than cluttering a figure with repeated percent signs, dollar signs or other units, I deviated from this recommendation here in the y-axis labels--note the multiple 0's and K's--to emphasize the fact that the scale for stop-and-frisks is in thousands while the scale for murders is not.

To demonstrate the problem with dual-axis figures, I plotted both trends in the same panel, shown in Figure 2.  Although it clearly shows the time gap between the decrease in murders and the increase in stop-and-frisks, the left and right scales are independent so the designer can give many different impressions of the data by manipulating the scales. Also, readers place meaning on the point where the curves cross when no meaning exists since the relationship of the two scales is arbitrary.  For these reasons, this figure is not my first choice.

In Figures 1 and 2, I used raw numbers, not rates, and in some cases this can be problematic since it ignores population growth. In this case both trends refer to the same New York City population, so there is less potential for distortion, but if the two trends involved two locations with different population dynamics, it would be critical to consider rate.

To test the effect of using rates rather than raw numbers, I chose to express murder and stop-and-frisk rates using the conventional number per 100,000 population.  If the rates were comparable I would consider using one axis.  In this case, though, the stop-and-frisk rate is so much higher than the murder rate that it requires a separate scale.  For the reasons discussed above, when I have separate scales my first choice is to use two panels, as I did in Figure 3.

As it turns out, Figure 3 looks very similar to Figure 1, showing that adjusting for population did not have much effect in this particular case.
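For readers who want to reproduce the conversion, here is a minimal sketch of the per-100,000 calculation described above. The function name and all of the numbers are hypothetical placeholders chosen for easy arithmetic; they are not the data behind the figures.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical illustrative values, NOT the article's data.
example_population = 8_000_000
example_murders = 600
example_stops = 500_000

murder_rate = rate_per_100k(example_murders, example_population)
stop_rate = rate_per_100k(example_stops, example_population)

print(f"murders per 100,000: {murder_rate}")
print(f"stops per 100,000:   {stop_rate}")
```

Even with made-up inputs, the two rates differ by orders of magnitude, which is why a single shared axis does not work for these two measures.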

Despite my reservations, clients like the simplicity of a one-panel graph, and therefore I worked next at creating a version that minimizes the problems with dual axes.  In Figure 4, I made the separate scales less arbitrary by setting the stop-and-frisk rate scale at 100 times the murder rate scale.  As it turned out, the two trend lines did not cross, so the problem of a meaningless intersection discussed above was averted.
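One way to think about the fixed ratio between the two scales is sketched below. It assumes that the right-hand axis limits are simply the left-hand limits multiplied by 100; the function names and the axis limits are hypothetical, and this is an illustration of the idea rather than the procedure used to draw Figure 4.

```python
def linked_axis_limits(primary_limits, ratio=100):
    """Derive secondary-axis limits as a fixed multiple of the primary limits."""
    low, high = primary_limits
    return (low * ratio, high * ratio)


def position_on_primary_axis(secondary_value, ratio=100):
    """Return where a secondary-axis value lands when read on the primary axis."""
    return secondary_value / ratio


# Hypothetical limits for the murder-rate axis (per 100,000 population).
murder_axis = (0, 30)
stop_axis = linked_axis_limits(murder_axis)

print(stop_axis)
print(position_on_primary_axis(1500))
```

Because the ratio is fixed and stated, a reader can translate any height on the chart from one scale to the other, which is not possible when the two axes are set independently.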

Finally, indexing the data to a given year is a second way to minimize the dual-axis problem.  With indexing, the plotted value represents the ratio of each year's value to the base-year value, so, for example, a plotted value of 3 represents a tripling of the rate compared to the base year.  I chose 2002 to be the base year for two reasons: the year 2002 is the first year for which I have data on stop-and-frisks, and much of the discussion I’ve read focuses on the last decade, since Bloomberg took office. By definition, the plotted values for both murder rate and stop-and-frisk rate are 1 for the base year.  This figure shows that although the stop-and-frisk rate increased sixfold, the murder rate continued the same slight rate of decline during the last decade as it has since 1997.

Figure 5. The murder rate and the stop-and-frisk rate are indexed to 2002 values.
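The indexing step can be sketched as follows: divide every year's value by the base-year value so that the base year plots at 1. The function name and the sample series are hypothetical and use round numbers for illustration only; they are not the values plotted in Figure 5.

```python
def index_to_base(series, base_year):
    """Divide each yearly value by the base-year value, so the base year equals 1."""
    if base_year not in series:
        raise KeyError(f"base year {base_year} not in series")
    base = series[base_year]
    if base == 0:
        raise ValueError("base-year value must be nonzero")
    return {year: value / base for year, value in series.items()}


# Hypothetical rates per 100,000, NOT the article's data.
hypothetical_rates = {2002: 1200.0, 2003: 1800.0, 2004: 3600.0}

indexed = index_to_base(hypothetical_rates, base_year=2002)
for year in sorted(indexed):
    print(year, indexed[year])
```

An indexed value converts to a percentage change from the base year as (value - 1) × 100, so an index of 3 corresponds to a 200 percent increase.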

Which figure would you use? How would you plot this data in a fair and objective manner?

Thanks to Hjalmar Gislason of DataMarket for finding, cleaning, and making data available for the world to use. Thanks to Kaiser Fung and Richard Heiberger for their helpful comments and to Joyce Robbins for suggesting this topic to me.