Black Civil War Soldiers: A Data Exploration in History 214

In the Fall 2014 semester, students in History 214: The American Civil War and Reconstruction got help from DASIL to explore data about black soldiers who enlisted in the U.S. Army during the Civil War.  Where did black soldiers come from?  What kinds of economic and social factors influenced their experiences?  Students read an article by the very creative economic historians Dora L. Costa and Matthew E. Kahn, “Forging A New Identity: The Costs and Benefits of Diversity in Civil War Combat Units for Black Slaves and Freemen” (2004), which used a variety of census, enlistment, and pension data to examine some of the effects of serving in the military on the almost 200,000 black men who enlisted in the U.S. Army after 1862.  Students had also read several articles that used Geographic Information Systems to do spatial analysis, and I was interested in doing an in-class exercise to help them think critically about military data and to introduce them to GIS technology.

DASIL Director Kathy Kamp, Post-Baccalaureate Fellow Sara Sanders, and DASIL student Harry Maher worked with me to design an exercise that could help students to visualize data about black soldiers for themselves and to think about the effects of geography on black enlistment.  We used a dataset on “Union Army Recruits in Black Regiments in the United States Army 1862-1865” compiled by Jacob Metzger and Robert A. Margo available through the ICPSR as a starting point.

The DASIL staff created tables that contained the Metzger and Margo data, along with 1860 census information on population and agricultural production and some tables on black enlistment from Freedom’s Soldiers: The Black Military Experience in the Civil War  (Berlin, Reidy, and Rowland, 1998).  We worked together to create a GIS exercise based on the data that could whet the students’ appetites for working with spatial data in just one class period.

The in-class exercise had two goals: to get the students thinking about the nature of the datasets and to have them use ArcGIS software to create maps that showed the relationship between black enlistment as a proportion of total population by state and cotton production by county.  The Metzger and Margo dataset is based on a judgment sample, so students were able to examine the dataset alongside census records to appreciate that judgment samples are not comprehensive.  Nor do they represent a random, representative sampling of black soldiers.  But used alongside the census data and the tables from Freedom’s Soldiers, the data still helped them form some useful conclusions.

Map showing Number of Black Soldiers and Cotton Production in the United States in 1860

The Unemployment Rate(s)

Every month, the Bureau of Labor Statistics releases information about the US job market in the Current Population Survey (CPS).  Out of the extensive information and data in the report, the media highlights one, and typically only one, piece of information: the ‘average’ unemployment rate for the country.  This single number is then used to draw conclusions about the state and health of the US economy.

This single number, though, masks considerable diversity in the economic condition of individuals and the economic activity in different areas of the country.  These differences can be understood through some simple and straightforward data visualization graphs that DASIL has put together.  In my Introduction to Economics class, I ask students to spend some time familiarizing themselves with the CPS and the associated graphics produced by DASIL to consider the employment outcomes of different groups of individuals.

The interactive graphic created by DASIL displays the unemployment rate over time of individuals with different demographic characteristics.  The graphic focuses on unemployment rates conditioned on race (all race, white, black, Hispanic or other), gender (male or female), age (all age, 15-24, 25-44, 45 and over), and education (all education, no high school, high school, college).  Using the interactive buttons on the website, students can explore how the unemployment rate varies across these different groups. Students will learn, for example, that:

  • Relative to whites, the unemployment rate for blacks is typically twice as high.

Comparison of Black and White Unemployment Rates, August 2005-November 2014

Visualizing Racial Segregation

In honor of this week’s Martin Luther King Jr. Day holiday, we encourage you to explore the continued pattern of racial segregation in housing with this map from the Weldon Cooper Center for Public Service at the University of Virginia. The map has one dot for every person in the United States as of the 2010 Census, with different colored dots for people reporting different races or ethnicities on the Census: red for Asian, orange for Hispanic, green for African American, blue for white, and brown for other. The image below shows St. Louis, Missouri, and its suburbs from this map.

Map of racial segregation in St. Louis, Missouri


Image Copyright, 2013, Weldon Cooper Center for Public Service, Rector and Visitors of the University of Virginia (Dustin A. Cable, creator)

What is the State of the Union Address Really About?

The State of the Union Address has been an American tradition since 1790.  The President of the United States addresses Congress, reports on the current condition of the country, and presents legislative plans.  But does the content of the speeches change over time?  To examine this question, DASIL staff downloaded the full text of every State of the Union Address, and I used NVivo to analyze them.

Some topics have been more pressing in recent times—like terrorism.

Graph of Mentions of Terrorism in State of the Union Addresses

Federal Election Commission Data Forecast: Occasional Clouds

About a century ago, Louis Brandeis – before taking a seat on the U.S. Supreme Court – gave his oft-quoted recipe for transparency: “Sunlight is said to be the best of disinfectants.” Regulation of campaign finance taps into that principle when it mandates disclosure – reporting – to the Federal Election Commission (FEC).

Political committees – like parties, candidates and PACs – have to detail receipts and disbursements in regular reports filed with the FEC, a provision of the regulatory framework that policy makers seem to have gotten right. The data generated, analyzed by legions of scholars, are key for tracking the influence of money in politics. The picture that emerges for those traditional political players may not be pretty, but at least it’s a picture.

Newly active “dark money” organizations, backed by deep pockets, evade FEC disclosure requirements. Hence, the “dark” tag. But disclosure may even be a little shaky for the old-school organizations that report to the FEC. Research by Grinnell College juniors Becca Heller and Emma Lange reveals that the sun may not shine as brightly as it should.

Becca’s and Emma’s Fall 2014 Mentored Advanced Project research examined the Court’s recent decision in McCutcheon v. FEC. Like others, they were intrigued by the impact of the abolition of aggregate contribution limits, which had been in place for decades. Though the jury is still out on McCutcheon’s impact, Becca and Emma identified some data quality issues in the course of their research. They can tell you the rest.

We noticed several irregularities, some possibly due to human error in data entry. Whether the donor, the committee or the FEC was responsible, there were a variety of typos, misspellings and variations in names and occupations in the records. This is more problematic than it might seem, since compiling an individual donor’s history requires matching information across records. These minor errors also make us wonder whether there are inaccuracies in stated donation amounts.
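The matching problem can be illustrated with a minimal Python sketch. It is not the method used in this research. The records, the `normalize_name` and `similarity` helpers, and the donor names are all hypothetical. The sketch shows that simple normalization merges records that differ only in case, punctuation or spacing, while a true misspelling still lands in a separate group and needs a fuzzy comparison (here the standard library's `difflib`) to be noticed.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher


def normalize_name(raw: str) -> str:
    """Lowercase a name, drop punctuation and digits, collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", " ", raw.lower())
    return " ".join(cleaned.split())


def group_by_donor(records):
    """Group contribution amounts under each normalized donor name."""
    groups = defaultdict(list)
    for name, amount in records:
        groups[normalize_name(name)].append(amount)
    return dict(groups)


def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity score between two normalized names."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical records: (name as entered, amount in dollars).
records = [
    ("SMITH, JANE A.", 500),
    ("Smith, Jane A", 1000),
    ("smith  jane a", 250),
    ("Smyth, Jane A.", 100),  # a misspelling, not just a formatting variant
]

grouped = group_by_donor(records)
for donor, amounts in grouped.items():
    print(donor, amounts)

# The misspelled record is still separate; a high similarity score can
# flag it for a human to review rather than merging it automatically.
print(similarity("smith jane a", "smyth jane a"))
```

A high score only suggests that two records may belong to the same person. Two different donors can have nearly identical names, so any automatic merge would need a second field, such as ZIP code or employer, to confirm the match.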

To get a donor’s history, we relied on the donor look-up function of the FEC website. Beyond those typo-related issues, we saw other lapses in reporting. For example, many donors who gave at the $2,600 maximum in the Iowa US Senate contest did not show up in the donor look-up, while their records were included in the campaign’s own disclosure reports.

Occasionally we also saw evidence of apparently illegal donations to candidates, going beyond the $2,600 limit. In time these dollars might be returned to the donor, with an FEC paper trail documenting it. But while we were looking, the data suggested that some donors had given beyond the legal limit.
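A check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in this research: the contribution records and the `over_limit` function are hypothetical, every record is assumed to refer to the same election, donor names are assumed to be already cleaned, and refunds are assumed to appear as negative amounts.

```python
from collections import defaultdict

LIMIT = 2600  # the per-candidate limit cited in the text, in dollars


def over_limit(contributions, limit=LIMIT):
    """Return {(donor, candidate): total} for totals above the limit.

    Assumes every record refers to the same election and that refunds
    are recorded as negative amounts.
    """
    totals = defaultdict(int)
    for donor, candidate, amount in contributions:
        totals[(donor, candidate)] += amount
    return {key: total for key, total in totals.items() if total > limit}


# Hypothetical records: (donor, candidate, amount in dollars).
contributions = [
    ("jane smith", "Candidate A", 2600),
    ("jane smith", "Candidate A", 500),    # pushes the total past the limit
    ("john doe", "Candidate A", 2600),     # exactly at the limit
    ("ann lee", "Candidate B", 2600),
    ("ann lee", "Candidate B", 1000),
    ("ann lee", "Candidate B", -1000),     # refund brings the total back down
]

flagged = over_limit(contributions)
for (donor, candidate), total in flagged.items():
    print(f"{donor} -> {candidate}: ${total}")
```

The refund case shows why timing matters: a donor can appear over the limit in a snapshot of the data and fall back within it once a later refund record is filed.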

In one case – an important one for our research – we couldn’t get to the necessary data because of FEC time lags. Data updates for many committees happen quickly, almost instantaneously upon submission, with one notable exception: reports from Senate campaigns, which are exempt from mandatory electronic disclosure. No, it doesn’t make sense, and the most recent correction attempt never made it out of committee. But it means that updated data are not always available in a timely fashion.

We didn’t run into the Senate campaign data problem, since the candidates we focused on filed electronically. However, we did have a time-lag problem with Joint Fundraising Committees (JFCs), a rather obscure fundraising vehicle, but one hypothesized to become more important in the post-McCutcheon era. The JFC data were not available after the November election, and the FEC – when we inquired by phone – said that it couldn’t even project how long it would take to process the large volume of 2014 JFC reports.

These various campaign finance data problems surprised us. Academics and journalists fail to mention irregularities, though possibly they only emerge when looking at the granular level, as our research did. Maybe there’s an implicit judgment that the problem is minimal, especially against the backdrop of millions of data points. Or maybe everyone just acknowledges that collecting data is a human enterprise, subject to error.

The political significance of these data problems could cut either way. A disclosure system without sufficient transparency might be lacking as a disinfectant. Yet possibly the threat of transparency – as opposed to the reality of it – is the important factor for democracy.