This page lists DASIL-affiliated datasets available for download.
Mass Communications and State Institutions in Wartime China, 1937-45
Dataset 1: Chinese Mass Communications State and City Data
Download: Shape files (2.82MB)
Dataset 2: Aggregated File
National Incident Based Reporting System (NIBRS) Database (2000 – 2015)
These data from NIBRS include the nature and types of specific offenses in the incident, characteristics of victim(s) and offender(s), and types and values of property stolen and recovered.
Dataset 1: Individual Incident Level
Dataset 2: State By Date Level
National Longitudinal Survey of Youth 1979
The National Longitudinal Survey of Youth (1979 – 2012) is a longitudinal project that follows a sample of American youth born between 1957 and 1964, tracking various aspects of their lives from 1979 to 2012. The dataset provided below is a subset of this database, focusing on variables in four main topics: socioeconomic status, employment, education, and marriage. Recommended statistical techniques include multiple regression, time series analysis, logistic regression, and ANOVA.
Dataset 1: Individual by Year Level
Dataset 2: Individual Level
National Longitudinal Survey of Youth 1997
The National Longitudinal Survey of Youth (1997 – 2012) is a longitudinal project that follows a sample of American youth born between 1980 and 1984, tracking various aspects of their lives from 1997 to 2012.
Download: CSV (41.0MB)
New York Police Department Stop-and-Frisk Policy Report
A summarized dataset containing information on stops related to the stop-and-frisk program in New York City, as recorded by police officers. Fields include the race and sex of the suspect, the specific actions taken by the police officer (frisking, searching, and/or arresting the suspect), and the reasons the officer reported for taking them.
Download: CSV (1.76MB)
College Scorecard Data
This dataset is adapted from the U.S. Department of Education’s College Scorecard dataset, which contains information about every higher education institution in the United States for every year available from 1997 to 2015. This subset of the data contains identifying information, average standardized test scores, ethnic and economic demographics, financial and debt information, and admission and completion rates.
Food and Agriculture Data of Africa
These datasets are cleaned versions of the UN Food and Agriculture Organization's statistics on food processing, production, and trade. Observations are at the country level and cover the entire African continent.
Toxic Release Inventory
The Toxic Release Inventory tracks the management of toxic chemicals that may pose a threat to human health and the environment. This dataset records the annual volume of toxic chemicals disposed of and managed by almost 22,000 facilities in the US from 1987 to 2014. These data also include the total release of the 30 most common chemicals tracked by this program, and the total release of metals and carcinogens for each company or facility.
Dataset 1: Company Name Level
Dataset 2: Facility ID Level
County-Level Presidential Election Data 2008 – 2016
This dataset includes county-level Democratic and Republican voter data from the 2008, 2012, and 2016 presidential elections. It also includes county-level data on key social and economic factors, including labor force participation, median household income, educational attainment, poverty, international and domestic migration, population, race, gender, age, per capita income, and occupations.
Atlantic Basin Storms and Eastern/Central Pacific Storms
These two datasets contain measures tallied by storm for Atlantic and Pacific Ocean storms, allowing comparison of storm severity, size, and impact over time.
Fatality Analysis Reporting System (FARS) Data 2010 to 2015
The dataset contains data on fatal motor vehicle crashes within the 50 States, the District of Columbia, and Puerto Rico. To be included in FARS, a crash must involve a motor vehicle traveling on a trafficway customarily open to the public, and must result in the death of an occupant of a vehicle or a non-occupant within 30 days (720 hours) of the crash.
Dataset 1: FARS Data 2015
Dataset 2: FARS Data 2010-2014
Plain text files from Grinnell College’s student newspaper, the Scarlet and Black (1894-2010)
Plain-text files generated via OCR from the digitized Grinnell College Library archive of the student newspaper, the Scarlet and Black. Access via a GitHub repository.