This page is dedicated to DASIL-affiliated data sets available for download.
Mass Communications and State Institutions in Wartime China, 1937-45
Dataset 1: Chinese Mass Communications State and City Data
Download: Shape files (2.82MB)
Dataset 2: Aggregated File
Download: Excel (58.4KB) Codebook
National Incident Based Reporting System (NIBRS) Database (2000 – 2022)
These data from NIBRS include the nature and types of specific offenses in each incident, characteristics of the victim(s) and offender(s), and the types and values of property stolen and recovered.
Dataset 1: Individual Incident Level
Download: CSV (2.1GB) STATA (2.3GB) Codebook
Dataset 2: State By Date Level
Download: CSV (15.0MB) STATA (20.5MB) Codebook
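At roughly 2.1GB, the incident-level CSV may not fit comfortably in memory on every machine. One possible approach is to stream the file row by row with Python's standard library, sketched below. The function name `count_by_column` is illustrative, and UTF-8 encoding is an assumption; consult the codebook for the actual variable names before adapting it.

```python
import csv
from collections import Counter


def count_by_column(path, column):
    """Stream a CSV file and count rows per distinct value of `column`.

    Rows are read one at a time, so the whole file is never loaded into memory.
    The encoding is assumed to be UTF-8; adjust if the file differs.
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row[column]] += 1
    return counts
```

For example, `count_by_column("nibrs_incidents.csv", "state")` would tally incidents per state, where both the file name and the column name are hypothetical placeholders.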
National Longitudinal Survey of Youth 1979
The National Longitudinal Survey of Youth (1979 – 2012) is a longitudinal project that follows a sample of American youth born between 1957 and 1964, tracking various aspects of their lives from 1979 to 2012. The data set provided below is a subset of this database, focusing on variables in four main topics: socioeconomic status, employment, education, and marriage. Recommended statistical techniques for these data include multiple regression, time series analysis, logistic regression, and ANOVA.
Dataset 1: Individual by Year Level
Download: CSV (39.1MB) STATA (39.1MB)
Dataset 2: Individual Level
Download: CSV (412KB) STATA (527KB)
National Longitudinal Survey of Youth 1997
The National Longitudinal Survey of Youth (1997 – 2012) is a longitudinal project that follows a sample of American youth born between 1980 and 1984, tracking various aspects of their lives from 1997 to 2012.
Download: CSV (41.0MB)
New York Police Department Stop-and-Frisk Policy Report
A summarized dataset containing information on stops made under New York City's stop-and-frisk program, as recorded by police officers. Fields include the race and sex of the suspect, the specific actions taken by the police officer (frisking, searching, and/or arresting the suspect), and the reasons the officer reported for taking them.
Download: CSV (1.76MB)
College Scorecard Data
This dataset is adapted from the U.S. Department of Education’s College Scorecard dataset, which contains information about every higher education institution in the United States for every year available from 1997 to 2015. This subset of the data contains identifying information, average standardized test scores, ethnic and economic demographics, financial and debt information, and admission and completion rates.
Download: Excel (27.7 MB) CSV (10.4 MB) STATA (9.88 MB)
Food and Agriculture Data of Africa
These datasets are cleaned versions of the UN Food and Agriculture Organization's statistics on food processing, production, and trade. Observations are at the country level and span the entire African continent.
Download: Crop Processing (340KB) Crop Production (5.92MB) Food Balance (16.6MB) Food Supply: Livestock (2.81MB) Food Supply: Crops (7.56MB) Value of Agriculture Production (9.28MB)
Toxic Release Inventory
The Toxic Release Inventory tracks the management of toxic chemicals that may pose a threat to human health and the environment. This dataset records the annual volume of toxic chemicals disposed of and managed by almost 22,000 facilities in the US from 1987 to 2014. The data also include the total release of the 30 most common chemicals tracked by the program and the total release of metals and carcinogens for each company or facility.
Dataset 1: Company Name Level
Download: CSV (30.1MB) STATA (30.1MB)
Dataset 2: Facility ID Level
Download: CSV (71.5MB) STATA (41.1MB)
County-Level Presidential Election Data 2008 – 2016
This dataset includes county-level Democratic and Republican voter data from the 2008, 2012, and 2016 presidential elections. It also includes county-level data on key social and economic factors, including labor force participation, median household income, educational attainment, poverty, international and domestic migration, population, race, gender, age, per capita income, and occupations.
Download: CSV (935KB) STATA (0.98MB)
Atlantic Basin Storms and Eastern/Central Pacific Storms
These two datasets contain measures tallied by storm for Atlantic and Pacific Ocean storms, allowing comparison of storm severity, size, and impact over time.
Download: Atlantic (56.0KB) Pacific (36.0KB)
Fatality Analysis Reporting System (FARS) Data 2010 to 2015
The dataset contains data on fatal motor vehicle crashes within the 50 States, the District of Columbia, and Puerto Rico. To be included in FARS, a crash must involve a motor vehicle traveling on a trafficway customarily open to the public, and must result in the death of an occupant of a vehicle or a non-occupant within 30 days (720 hours) of the crash.
Dataset 1: FARS Data 2015
Download: CSV (904KB) STATA (988KB)
Dataset 2: FARS Data 2010-2014
Download: CSV (3.89MB) STATA (4.41MB)
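The 30-day inclusion window described above can be expressed as a simple time check. The sketch below is illustrative only: the function name and arguments are hypothetical and do not correspond to FARS variable names, it treats a death at exactly 720 hours as included (an assumption), and it covers only the timing criterion, not the trafficway requirement.

```python
from datetime import datetime, timedelta

# FARS counts a death if it occurs within 30 days (720 hours) of the crash.
FARS_WINDOW = timedelta(hours=720)


def within_fars_window(crash_time, death_time):
    """Return True if the death falls within 720 hours of the crash.

    Assumption: the boundary is inclusive, so a death at exactly
    720 hours after the crash still counts.
    """
    elapsed = death_time - crash_time
    return timedelta(0) <= elapsed <= FARS_WINDOW
```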
Plain text files from Grinnell College’s student newspaper, the Scarlet and Black (1894-2010)
Plain-text files generated via OCR from the digitized Grinnell College Library archive of the student newspaper, the Scarlet and Black. The files are available via a GitHub repository.
Baseball Player Information and Performance Statistics (amateur players drafted in the Rule 4 draft, 2009-2011)
CSV files that include three tables with data on baseball players drafted through the Rule 4 amateur draft between 2009 and 2011:
- “Drafted Player Info”: player draft year, years of eligibility remaining, draft round, draft slot, school type, school drafted out of, signing bonus, and drafting team.
- “Pitching Statistics”: season, age, age differential, team, league, level, affiliation, W, L, W-L%, ERA, RA9, G, GS, GF, CG, SHO, SV, IP, H, R, ER, HR, BB, IBB, SO, HBP, BK, WP, BF, WHIP, H9, HR9, BB9, SO9, SO/W.
- “Offensive Statistics”: season, age, age differential, team, league, level, affiliation, G, PA, AB, R, H, 2B, 3B, HR, RBI, SB, CS, BB, SO, BA, OBP, SLG, OPS, TB, GDP, HBP, SH, SF, IBB.
The datasets, data dictionaries, and other information about the datasets (including sample R scripts for generating visualizations) are available online at the project website.
Combined public datasets relating to various facets of the school-to-prison pipeline
Combined public datasets relating to various facets of the school-to-prison pipeline are available via a GitHub repository.
The Datasets folder includes data from the following organizations:
- Office of Juvenile Justice and Delinquency Prevention
- Iowa Department of Human Services
- Iowa Department of Public Health
- U.S. Census Bureau
- U.S. Dept. of Education Civil Rights Data Collection
- Iowa Department of Education
- Iowa Workforce Development