The National Crime Victimization Survey (NCVS) is the nation's primary source of information on criminal victimization. The NCVS collects information on nonfatal personal crimes and household property crimes both reported and not reported to the police. Survey respondents provide demographic information about themselves and whether they experienced a victimization. For each victimization incident, the NCVS collects information about the offender (e.g., age, race and Hispanic origin, sex, and victim-offender relationship), characteristics of the crime (e.g., time and place of occurrence, use of weapons, nature of the injury, and economic consequences), whether the crime was reported to police, reasons the crime was or was not reported, and victim experiences with the criminal justice system.
The NCVS was originally designed to provide national-level estimates of criminal victimization. Since its inception in 1973, BJS has recognized the need for victimization data at the state and/or local levels. The three major reviews of the NCVS program—Penick and Owens, 1976; Biderman et al., 1986; and Groves and Cork, 2008—point to the demand from local criminal justice administrators for empirical information and data on crime that they can use to inform policy and practice.
Interest in subnational victimization data is met with practical limitations in producing these data. The NCVS is a complex household survey that conducts about 240,000 interviews on criminal victimization each year, covering 160,000 unique persons in about 95,000 households. Administering the NCVS to produce reliable national-level estimates is costly and carries a risk of disclosing sensitive information. These challenges are amplified in producing estimates for lower levels of geography. Thus, options for producing subnational victimization data through the NCVS require careful consideration.
Accordingly, BJS crafted plans to produce subnational crime data through multiple strategies and responses and supported research demonstrating that the NCVS can be enhanced to produce several types of subnational estimates. Since 2012, BJS has worked with subject matter experts to develop various approaches for producing subnational victimization estimates, including:
- Boosting the NCVS sample size in large states to obtain direct state-level estimates
- Obtaining direct estimates in subnational areas using reweighting methodologies and existing NCVS data collected under the national design
- Modeling state-level estimates using existing NCVS sample and external sources of data
- Creating generic areas with geocoded identifiers
- Generating a cost-effective alternative local-area survey based on the NCVS for direct administration within subnational areas.
These approaches are illustrated in the figure below, and the relative benefits and limitations are summarized in Approaches to Subnational Estimation with the NCVS.
Approaches to Subnational Estimation with the NCVS
Beginning in 2016, BJS increased the size of the NCVS core sample in the 22 most populous states, based on preliminary findings from a boost pilot test. Concurrent with the sample boost, BJS also adjusted the allocation of the sample as needed within these 22 states to enhance the representativeness of the NCVS sample relative to the population within each of these states. Together, these changes were designed to enable production of state-level estimates of violent victimization for the 22 states and specific metropolitan areas within those states with three years of aggregated data. BJS decided to focus on the 22 most populous states based on the estimated sample size that would be needed to produce representative estimates with sufficient precision, while balancing the increased costs associated with a larger and more geographically diverse sample. The 22 states identified for state-level estimates are Arizona, California, Colorado, Florida, Georgia, Illinois, Indiana, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Jersey, New York, North Carolina, Ohio, Pennsylvania, Tennessee, Texas, Virginia, Washington, and Wisconsin. These states selected for the sample boost accounted for 79% of the total U.S. population and approximately 80% of violent crime reported in the FBI's Part 1 Uniform Crime Reports.
BJS is working to carefully examine and validate the initial three-year direct NCVS estimates and plans to produce a report on these data. See forthcoming publications for more information.
- The state-level sample design uses direct observation to collect information from sample members within subnational areas to generate estimates.
- It provides the full content of the NCVS core and supplements for subnational areas.
- The boosted sample design is based on the same methodology as the national NCVS, facilitating ease of understanding and replication.
- The sample boost reduces coverage error in the target states and large areas within those states given the goal of producing representative state-level estimates.
- The significant cost involved with expanding the NCVS sample limits the number of states and large areas within states for which reliable estimates can be produced.
- The nature of the sample design generally requires a minimum of three years of data to obtain precise estimates.
- The sample sizes and estimated precision of estimates rely on methodological assumptions about the violent crime rate within states. If the violent crime rate within a state differs significantly from these assumptions, or the analysis involves other crime types or population subgroups, more than three years of data may be required to obtain precise estimates.
- Given that the subnational design is only now emerging, the availability of historical data for subnational areas is limited.
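The link between sample size and precision noted in the list above can be illustrated with a rough calculation. The sketch below is a minimal illustration and not the BJS variance methodology: it treats a victimization rate as a simple proportion, inflates the variance by a hypothetical design effect, and assumes interviews pooled across years are independent. The rate, sample size, and design effect are made-up values.

```python
import math


def rate_se_per_1000(rate_per_1000, n_persons, design_effect=2.0):
    """Approximate standard error of a victimization rate per 1,000 persons.

    Treats the rate as a proportion estimated from n_persons and multiplies
    the simple-random-sampling variance by a design effect. The default
    design effect is a hypothetical value, not an NCVS figure.
    """
    p = rate_per_1000 / 1000.0
    variance = design_effect * p * (1.0 - p) / n_persons
    return 1000.0 * math.sqrt(variance)


# Hypothetical state: 20 violent victimizations per 1,000 persons,
# with 4,000 sampled persons per year.
one_year = rate_se_per_1000(20.0, 4000)
three_years = rate_se_per_1000(20.0, 3 * 4000)

print(f"SE with one year of data:    {one_year:.2f} per 1,000")
print(f"SE with three years of data: {three_years:.2f} per 1,000")
# Under these assumptions the ratio is 1 / sqrt(3).
print(f"Ratio of the two SEs:        {three_years / one_year:.3f}")
```

Under these simplified assumptions, tripling the sample shrinks the standard error by a factor of about 0.58. Rarer crime types or smaller subgroups start from a larger relative error, which is consistent with the caution that more than three years of data may be needed for them.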
To mitigate disclosure concerns, state identifiers are not currently included on NCVS public-use files. Independent state-level analyses must be conducted using the restricted-use data files available at the U.S. Census Bureau headquarters or at a Census Research Data Center.
To address the needs of data users and stakeholders for more localized victimization data and to allow for the evaluation of trends in victimization over time within subnational areas prior to the sample redesign in 2016, BJS explored the feasibility of producing subnational estimates for years prior to 2016 using reweighting methods. These methods involve aggregating data from the national sample over multiple years and recalibrating the distribution of NCVS respondents within a particular subnational area (i.e., state or MSA) for as many known demographic and socioeconomic characteristics as possible. Using this method, victimization estimates have been generated for the 11 largest states and 52 Metropolitan Statistical Areas (MSAs) from 2008-2015.
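The recalibration step described above can be sketched with raking (iterative proportional fitting), a common calibration technique. This is an illustrative sketch only: the BJS reweighting report documents the procedure actually used, which may differ. The variable names (`age`, `tenure`, `victim`, `weight`), the categories, and the control totals below are all hypothetical.

```python
def rake(rows, margins, iterations=25):
    """Adjust respondent weights so weighted totals match known margins.

    rows:    list of dicts, one per respondent, each with a "weight" entry.
    margins: {variable: {category: population_total}} control totals.
    """
    rows = [dict(r) for r in rows]  # copy so the input is left unchanged
    for _ in range(iterations):
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            current = {}
            for r in rows:
                current[r[var]] = current.get(r[var], 0.0) + r["weight"]
            # Scale weights so each category matches its control total.
            for r in rows:
                r["weight"] *= targets[r[var]] / current[r[var]]
    return rows


def weighted_total(rows, var, category):
    """Sum of weights for respondents in one category of a variable."""
    return sum(r["weight"] for r in rows if r[var] == category)


def victimization_rate(rows):
    """Weighted number of victims per 1,000 persons."""
    total = sum(r["weight"] for r in rows)
    victims = sum(r["weight"] * r["victim"] for r in rows)
    return 1000.0 * victims / total


# Hypothetical respondents in one subnational area, pooled over several years.
respondents = [
    {"age": "18-34", "tenure": "own", "victim": 1, "weight": 100.0},
    {"age": "18-34", "tenure": "rent", "victim": 0, "weight": 100.0},
    {"age": "35+", "tenure": "own", "victim": 0, "weight": 100.0},
    {"age": "35+", "tenure": "rent", "victim": 0, "weight": 100.0},
    {"age": "35+", "tenure": "own", "victim": 1, "weight": 100.0},
]

# Hypothetical population totals for the same area (e.g., from census data).
area_margins = {
    "age": {"18-34": 300.0, "35+": 700.0},
    "tenure": {"own": 600.0, "rent": 400.0},
}

reweighted = rake(respondents, area_margins)
print("Rate before reweighting:", victimization_rate(respondents))
print("Rate after reweighting: ", victimization_rate(reweighted))
```

Once the weights are adjusted, the same weighted-rate calculation used for national estimates applies unchanged. Raking can only correct for characteristics that are included in the margins, which is why unobserved attributes can still leave bias in the estimates.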
Additional information about the reweighting methodology and victimization estimates for MSAs and states is available in the BJS third-party report, Estimating Crime Victimization in Large States and MSAs Through Reweighting: Evaluation and Methodology. The accompanying NCVS public-use files for survey years 2000-2015 contain geographic identifiers for the 52 largest MSAs. These files are available here.
- The reweighting approach benefits from using direct observation and, once the weights have been adjusted, standard analysis techniques can be used to produce NCVS victimization estimates.
- Because the reweighting method was devised to work with data from the nationally representative sample, NCVS estimates based on this method can be calculated for multiple years prior to the 2016 redesign, allowing trends in victimization within subnational areas to be examined over time.
- NCVS public-use files for survey years 2000-2015 containing geographic identifiers for the 52 largest MSAs allow analysts to conduct MSA-level analyses without the need to access the restricted-use data files.
- Estimates may not be as robust or accurate as estimates produced after the sample boost due to sampling error and coverage error.
- Sampling error, which occurs in all surveys, is the result of not obtaining data from all persons in the population and can be particularly problematic when sample sizes are small. Although multiple years of data were aggregated to increase the sample sizes in subnational areas, they were often smaller than what is typically used to produce national estimates for the NCVS.
- Coverage error occurs when the sample does not adequately represent the population of interest. Prior to the 2016 redesign, the national sample was not allocated with the goal of representing the population within states. The reweighting process was designed to correct for this type of error for as many characteristics as possible. However, if there are unobserved attributes of the population that were not accounted for during the reweighting process and those characteristics are highly correlated with victimization, estimates may still contain an unknown level of bias. State-based analyses are more likely than MSA-based analyses to be subject to coverage error and, therefore, state-based comparisons should be viewed with greater caution. See reweighting report for more information.
- The different potential error sources make it inadvisable to compare estimates from before the NCVS sample boost to those from after the sample boost. Relative comparisons are also recommended over absolute comparisons. For example, within a given year, one could examine whether a state has a higher or lower rate than the national average, but should not compare a state across two points in time.
State identifiers are currently not included on NCVS public-use files, so state-level analyses must be conducted using the restricted-use data files available at the U.S. Census Bureau or at a Census Research Data Center.
When direct estimation techniques are not feasible to employ due to insufficient sample sizes, the indirect or model-based estimation approach can be used to generate victimization estimates for subnational areas. The model-based approach leverages auxiliary data or data from different time periods to effectively increase the sample size by linking related small areas. Auxiliary data from the American Community Survey, the FBI's Uniform Crime Reporting Program, and the decennial census have been used to augment the NCVS sample to generate model-based estimates of criminal victimization for all 50 states, DC, large metropolitan areas, and counties.
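The idea of borrowing strength from auxiliary data can be sketched with a simple area-level composite estimator. This is a deliberately simplified illustration and not the model used in the reports cited on this page: it fits an unweighted straight line to a single auxiliary variable and takes the between-area model variance as a known input, whereas a full small area model would estimate that variance and typically use several predictors. All numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def composite_estimates(direct, sampling_var, aux, model_var):
    """Blend each direct estimate with a regression-based prediction.

    Areas with a large sampling variance (small samples) lean on the
    prediction; areas with a small sampling variance keep their direct
    estimate. model_var is assumed known here for simplicity.
    """
    intercept, slope = fit_line(aux, direct)
    blended = []
    for d, v, a in zip(direct, sampling_var, aux):
        prediction = intercept + slope * a
        gamma = model_var / (model_var + v)  # weight on the direct estimate
        blended.append(gamma * d + (1.0 - gamma) * prediction)
    return blended


# Hypothetical inputs for four areas.
direct_rates = [30.0, 12.0, 22.0, 18.0]            # survey rates per 1,000
sampling_variances = [25.0, 4.0, 100.0, 9.0]       # variance of each rate
police_rates = [400.0, 150.0, 300.0, 250.0]        # auxiliary crime rates

for area, est in enumerate(
    composite_estimates(direct_rates, sampling_variances, police_rates, 16.0)
):
    print(f"Area {area}: model-based rate {est:.1f} per 1,000")
```

The weight `gamma` makes the trade-off explicit: the noisier an area's direct estimate, the more its model-based estimate depends on the quality of the auxiliary data.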
- The use of auxiliary data allows reliable estimates to be produced in subnational areas that would not be possible with the direct design-based estimation approach, including areas that the NCVS subnational design does not cover.
- Model-based estimates can be produced for multiple years, allowing trends in victimization over time to be analyzed.
- Model-based estimates can also be benchmarked to ensure that subnational estimates are consistent with national totals.
- The process for creating model-based estimates can be extremely labor intensive and estimates are not easily updated as new information from the NCVS or auxiliary sources becomes available.
- In contrast to a data file containing the full scope of NCVS variables and allowing subgroup analyses to be conducted, estimates are limited to those that have been modeled.
- The indirect estimation approach is highly dependent on having high-quality auxiliary data that are predictive of the target outcomes.
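The benchmarking property mentioned in the list above can be illustrated with a ratio adjustment, which is one simple way to force subnational estimates to agree with a national figure. The sketch below assumes this ratio form for illustration; the cited reports may use a different benchmarking method. The rates and populations are hypothetical.

```python
def benchmark_to_national(area_rates, area_pops, national_rate):
    """Scale area rates so their population-weighted mean equals national_rate."""
    implied = sum(r * p for r, p in zip(area_rates, area_pops)) / sum(area_pops)
    factor = national_rate / implied
    return [r * factor for r in area_rates]


# Hypothetical model-based rates per 1,000 and populations for three areas
# that together make up the whole country.
rates = [25.0, 15.0, 20.0]
populations = [2_000_000, 5_000_000, 3_000_000]

adjusted = benchmark_to_national(rates, populations, national_rate=19.0)
print([round(r, 2) for r in adjusted])
```

A single scaling factor preserves the relative ordering of areas while removing any gap between the sum of the parts and the national estimate.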
The subnational geographic identifiers needed to produce the model-based estimates are not available on NCVS public-use files, so analyses must be conducted using the restricted-use data files available at the U.S. Census Bureau or a Census Research Data Center.
Below is a list of reports using or describing the application of the model-based, or small area estimation, methodology for NCVS data:
- Fay, R.E. (2021). Constructing and Disseminating Small Area Estimates from the National Crime Victimization Survey, 2007-2018. Prepared for the Bureau of Justice Statistics. Washington, DC: Westat.
- Liao, D., Zimmer, S., and Berzofsky, M. (2021). Small Area Estimation for the National Crime Victimization Survey: A Guide for Data Processing and Estimation Procedures. Prepared for the Bureau of Justice Statistics. Washington, DC: RTI International.
- Fay, R.E. and Diallo, M.S. (2015). Developmental Estimates of Subnational Crime Rates Based on the National Crime Victimization Survey. Prepared for the Bureau of Justice Statistics. Washington, DC: Westat.
- Fay, R.E., Planty, M., and Diallo, M.S. (2013). Small Area Estimates from the National Crime Victimization Survey. Proceedings of the Survey Research Methods Section, Joint Statistical Meetings, American Statistical Association. Prepared for the Bureau of Justice Statistics. Washington, DC: Westat.
- Fay, R.E. and Diallo, M.S. (2012). Small Area Estimation Alternatives for the National Crime Victimization Survey. Proceedings of the Survey Research Methods Section (pp. 3742-3756), Joint Statistical Meetings, American Statistical Association. Prepared for the Bureau of Justice Statistics. Washington, DC: Westat.
- Fay, R.E. and Li, J. (2011). Predicting Violent Crime Rates for the 2010 Redesign of the National Crime Victimization Survey (NCVS). Proceedings of the Survey Research Methods Section, Joint Statistical Meetings, American Statistical Association. Prepared for the Bureau of Justice Statistics. Washington, DC: Westat.
As part of the subnational program, BJS evaluated the use of generic area typologies based on various geographic, social, economic, or demographic characteristics as a way to provide more localized information about criminal victimization. These generic areas are intended to represent all places that are similar to each other based on a particular set of characteristics (e.g., urban areas in the northeast with a population size of 1 million or more persons). The use of generic areas allows data users to identify the typology which best aligns with their primary area of interest and make relative comparisons against similar and different types of places from that perspective.
- The generic area typologies can be defined to be mutually exclusive and collectively exhaustive such that all subnational areas within the U.S. are covered by one combination of the characteristics chosen.
- NCVS public-use files contain several subnational geographic identifiers that could be combined to create geographic generic area typologies. These subnational geographic identifiers include region, population size, and urbanicity.
- Generic area estimates are less granular and the process assumes that all areas with the same value for the characteristic(s) of interest have similar rates of victimization.
- Because the NCVS was not designed to produce representative estimates for these areas, generic area estimates could suffer from coverage error and result in biased estimates.
- The majority of generic area typologies will have to be developed using the NCVS restricted-use data files available at the U.S. Census Bureau or at a Census Research Data Center.
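Combining region, population size, and urbanicity into a generic area typology can be sketched as a grouped, weighted rate calculation. The field names and category labels below are hypothetical placeholders and are not the variable names used on the NCVS public-use files; the records are made up for illustration.

```python
from collections import defaultdict


def generic_area_rates(records):
    """Weighted victimization rate per 1,000 persons for each generic area.

    A generic area is defined here as one combination of region,
    urbanicity, and population size, so every record falls in exactly
    one area.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # [weighted victimizations, weight]
    for r in records:
        key = (r["region"], r["urbanicity"], r["pop_size"])
        totals[key][0] += r["weight"] * r["victimizations"]
        totals[key][1] += r["weight"]
    return {key: 1000.0 * v / w for key, (v, w) in totals.items()}


# Hypothetical person-level records.
records = [
    {"region": "Northeast", "urbanicity": "urban", "pop_size": "1M+",
     "weight": 1000.0, "victimizations": 1},
    {"region": "Northeast", "urbanicity": "urban", "pop_size": "1M+",
     "weight": 1000.0, "victimizations": 0},
    {"region": "South", "urbanicity": "rural", "pop_size": "<100k",
     "weight": 1500.0, "victimizations": 0},
    {"region": "South", "urbanicity": "rural", "pop_size": "<100k",
     "weight": 500.0, "victimizations": 1},
]

for area, rate in generic_area_rates(records).items():
    print(area, f"{rate:.0f} per 1,000")
```

Because each record maps to a single key, the resulting areas are mutually exclusive and collectively exhaustive for the records supplied. The calculation itself says nothing about whether the sample within each area is representative.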
Below is a list of reports using or describing the application of the generic area approach for NCVS data:
- Violent Victimization in New and Established Hispanic Areas, 2007–2010
- Assessing the Coverage and Reliability of Subnational Geographic Identifiers in the NCVS Public-Use File
- Victimization in Different Types of Areas in the United States: Subnational Findings from the National Crime Victimization Survey, 2010–2015
Another avenue for producing subnational estimates of victimization that BJS explored as part of the subnational estimation program is the administration of the Local Area Crime Survey (LACS). The LACS uses mail data collection procedures and a truncated instrument based on the NCVS, and it was designed to be relatively inexpensive so that it could be administered by states, MSAs, cities, or police jurisdictions. Victimization estimates produced from the LACS are designed to correlate with those from the core NCVS and to support estimates of change over time within an area as well as comparisons across areas. A field test evaluating the feasibility of the LACS was administered in the 40 largest core-based statistical areas (CBSAs) in 2015 and 2016.
The LACS survey kit includes survey questionnaires, a template for a request for proposals from survey vendors, and sample supporting materials from the Field Test. This document offers general guidance on how to use those materials to conduct the survey and analyze the results.
- Estimates from the LACS utilize direct observation with data being collected directly from households and persons in the subnational area of interest.
- The LACS can be administered at different jurisdictional levels depending on the needs of stakeholders.
- Administration of the LACS is less costly than the NCVS.
- The LACS includes measures of community safety and perceptions of police that can be collected alongside victimization data for a particular area.
- Although less expensive to administer than the core NCVS design, the LACS did not support comparable estimates of crime incidence (e.g., the number of violent crimes per 1,000 persons) and exhibited relatively low correlations with estimates from the NCVS and UCR for some crime types.
- The LACS instrument captures fewer victimization incidents, and in less detail, than the core NCVS.
- Unless widely adopted by multiple jurisdictions, the ability to make cross-sectional comparisons across similar subnational areas will be limited.
Below is a list of reports related to the LACS instrument and methodology:
Public-use data from the National Crime Victimization Survey (NCVS) are available from the National Archive of Criminal Justice Data (NACJD) within the Inter-University Consortium for Political and Social Research (ICPSR) in a variety of forms designed to facilitate their accessibility and analytical use. These include the full data collection, which is stored as a hierarchically structured data file, and a variety of smaller extract files.
Access to the geocoded, restricted-use NCVS data files is provided through the Census Bureau’s Federal Statistical Research Data Centers (RDCs). Before researchers can access NCVS microdata, a research proposal must be submitted to and approved by BJS and the Census Bureau. For more information on the proposal process, visit the Center for Economic Studies (CES).
What can subnational data be used for?
Subnational victimization estimates have great potential value for federal and nonfederal data users, stakeholders, and researchers. Federal stakeholders who currently allocate funding or resources for crime victims and crime prevention based on official police crime estimates could use the subnational victimization estimates to understand how the allocation of funding might change if estimates of unreported crime are taken into account. Policymakers could use subnational estimates to examine state and local variation in crimes reported and not reported to the police and make comparisons across states and localities. Law enforcement officials could use the findings to begin to understand differences in rates of crime and reporting to police within and across jurisdictions. The data could also be used in conjunction with official police statistics to better understand the correlation between the NCVS and official police reports of crime. In addition to subnational victimization estimates, perceptions of police and community-related safety measures are also of potential interest to local areas in their efforts to combat and respond to crime.
Why have different methods been used to produce estimates for the same areas?
Each of the potential subnational estimation methodologies has its own set of benefits and limitations. Prior to examining the different approaches, it was unknown how these methods would perform within certain area types. For example, the reweighting methodology is relatively straightforward to implement, but the number of subnational areas that would support estimates was limited due to insufficient sample sizes. In contrast, model-based estimates using existing NCVS data and external sources can be produced for a much greater number of subnational geographies, but implementation of these methods is complicated and difficult to replicate. Although the direct sample boost method may be the most desirable option, it is also the most expensive to implement. Further, the ability to evaluate trends over time is limited by the national-level design used prior to 2016 and the need to aggregate multiple years of data. Data users are encouraged to examine and understand the methodology used to produce the different types of estimates and the inherent limitations of each approach.
Which subnational estimates can be trusted?
Each of the estimation approaches pursued by BJS is based on rigorous statistical methodology, but, as with all survey-based approaches, the resulting estimates are subject to different sources of error, which introduce uncertainty that is reflected in the variance of the estimate. As such, the various subnational estimates should be interpreted as the midpoint of a confidence interval that is likely to contain the true value. This interval may be quite large for some estimates and can vary considerably across the different methodologies and area types, depending on the degree to which a given estimate is subject to the various sources of error. Each report produced using a subnational methodology details considerations that readers should take into account in evaluating the data.
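Reading an estimate as the midpoint of a confidence interval can be made concrete with a short calculation. The sketch below assumes a normal approximation and a 95% confidence level (z = 1.96); the estimates and standard errors are hypothetical, and published reports should be consulted for the standard errors that accompany actual estimates.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Normal-approximation confidence interval centered on the estimate."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


# Two hypothetical areas with the same rate per 1,000 but different precision.
for label, rate, se in [("Large-sample area", 20.0, 1.5),
                        ("Small-sample area", 20.0, 5.0)]:
    low, high = confidence_interval(rate, se)
    print(f"{label}: {rate:.1f} per 1,000 (95% CI {low:.1f} to {high:.1f})")
```

The two areas share the same point estimate, yet the second interval is more than three times as wide, which is why the point estimate alone is not enough to judge an estimate's reliability.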
Can I use the subnational estimates to rank areas across the nation?
Crime estimates, even at the national level, are subject to different types of potential error. The incidence of crime at local levels is correlated with different factors that are difficult to capture within a single survey. Subnational estimates, like national NCVS estimates, are designed to provide an indicator of crime victimization outside of indicators generated by police and to provide data on characteristics of victims and crimes for the population and its subgroups. Therefore, users are cautioned against ranking areas solely based on these data without considering the fuller context of local conditions and characteristics.
Which methodology is appropriate for my analysis?
When choosing a method for a particular analysis or reporting a subnational estimate from a published BJS analysis, it is important to understand the benefits and limitations of each approach. Comparisons of estimates produced from the different methodologies, even for the same area, are discouraged as it is not possible to know how much of the difference is due to different levels of bias across the estimates. Even for estimates produced using the same methodology, relative comparisons rather than absolute comparisons are encouraged as point estimates that may appear different may not actually be statistically different when the uncertainty, or standard error, of the estimates is considered. Also, it is important to note that different methods require different file types in order to be implemented. With the exception of a limited number of generic area typologies and direct estimation with reweighted data for the 52 largest MSAs (public-use file forthcoming), all other methods described require access to the restricted-use data files only available at the Census Bureau or through a Census Research Data Center.
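The point that estimates which appear different may not be statistically different can be checked with a simple test on the difference between two estimates produced by the same methodology. The sketch below assumes the two estimates are independent and approximately normal, which may not hold for all NCVS comparisons; the rates and standard errors are hypothetical.

```python
import math


def difference_test(est_a, se_a, est_b, se_b):
    """Two-sided z-test for the difference between two independent estimates.

    Returns the difference, the z statistic, and the p-value.
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return diff, z, p_value


# Hypothetical rates per 1,000 for two areas, with their standard errors.
diff, z, p = difference_test(24.0, 4.0, 18.0, 3.0)
print(f"Difference: {diff:.1f} per 1,000, z = {z:.2f}, p = {p:.3f}")
if p >= 0.05:
    print("The apparent gap is not statistically significant at the 5% level.")
```

In this example the gap of 6 per 1,000 looks substantial, but once the standard errors are taken into account it cannot be distinguished from sampling variation at the 5% level.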
- Biderman, A.D., Cantor, D., Lynch, J.P., and Martin, E. (1986). Final Report of Research and Development for the Redesign of the National Crime Survey. Prepared for the Bureau of Justice Statistics. Washington, DC: Bureau of Social Science Research.
- Groves, R.M. and Cork, D.L. (2008). Surveying Victims: Options for Conducting the National Crime Victimization Survey. Panel to Review the Programs of the Bureau of Justice Statistics, National Research Council, Committee on National Statistics and Committee on Law and Justice, Division of Behavioral and Social Sciences and Education. Washington, DC: National Academies Press.
- Penick, B.K.E. and Owens, M.E.B. (1976). Surveying Crime. Panel for the Evaluation of Crime Surveys, Committee on National Statistics, Academy of Mathematical and Physical Sciences. Washington, DC: National Academy of Sciences.