STA 9750 Mini-Project #03: Who Goes There?

US Internal Migration and Implications for Congressional Reapportionment in Texas

Author

Raúl J. Solá Navarro

Published

April 25, 2026

Introduction

Between 2020 and 2024, Texas added more people than any other state in the nation. The Sun Belt surge that has reshaped American politics over the past two decades has made Texas the center of the country’s demographic transformation. As the “Lone Star State” gains congressional clout, its northern counterparts watch seats slip away. But what does the data actually say? Where are Texans coming from, where are they going, and can a governor with a targeted advertising budget shift the numbers ahead of the 2030 census?

This report uses American Community Survey (ACS) state-to-state and metro-to-metro migration flow data to quantify those patterns, build population forecasts through 2030, and estimate Texas’s likely congressional apportionment for the 2032 elections. We close with a data-driven advertising strategy designed to maximize the state’s political footprint.


Reproducibility note: All external files are cached in data/mp03/. If you are running this document for the first time, the files will be downloaded automatically. Subsequent renders use the cached copies and require no internet access.
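
The caching behavior described in the note can be sketched with a small helper. This is a hypothetical illustration rather than the report's actual download code: the function name `ensure_cached()`, its arguments, and the example URL are all assumptions, and only the `data/mp03/` directory comes from the note itself.

```r
# Hypothetical helper: download a file only when no cached copy exists.
# `fetch` defaults to utils::download.file and can be swapped out for testing.
ensure_cached <- function(url,
                          cache_dir = file.path("data", "mp03"),
                          fetch = utils::download.file) {
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    # First render: create the cache directory and fetch the file.
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    fetch(url, destfile = dest, mode = "wb")
  }
  # Later renders skip the download and reuse the cached path.
  dest
}

# Usage (placeholder URL, not a real Census endpoint):
# flows_path <- ensure_cached("https://example.com/state_flows_2024.xlsx")
```

Keeping the existence check in one place means every data-loading chunk can call the same helper and stay runnable offline once the cache is populated.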

State Population Data

Texas has added more residents than any other state over the past decade. The table below ranks states by 2024 population. Texas sits firmly in second place with over 31 million residents, trailing only California. For growth rates and absolute population added since 2015, see Q1 of the Exploratory Data Analysis section.

Code
# B01003_001 = Total Population
# Note: the standard 2020 ACS-1 release was cancelled because of low
# COVID-19 response rates, so 2020 is skipped below.

pop_raw <- map(
  c(2015:2019, 2021:2024),
  \(yr) get_acs(
    geography   = "state",
    variables   = c(total_population = "B01003_001"),
    year        = yr,
    survey      = "acs1",
    geometry    = TRUE,
    cache_table = TRUE
  ) |>
    mutate(year = yr)
) |>
  list_rbind()

state_pop <- pop_raw |>
  tigris::shift_geometry() |>
  filter(!NAME %in% c("District of Columbia", "Puerto Rico")) |>
  transmute(
    geoid            = GEOID,
    state_name       = NAME,
    state_abbr       = state.abb[match(NAME, state.name)],
    year             = year,
    total_population = estimate,
    pop_moe          = moe,
    geometry         = geometry
  )
Code
state_pop_2024 <- state_pop |>
  filter(year == 2024) |>
  arrange(desc(total_population))

bind_rows(
  slice_head(state_pop_2024, n = 10),
  slice_tail(state_pop_2024, n = 10)
) |>
  st_drop_geometry() |>
  mutate(rank = c(1:10, 41:50)) |>
  select(rank, state_name, state_abbr, total_population) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Populations, 2024 ACS-1 Estimates**"),
    subtitle = "Largest and smallest states by total population"
  ) |>
  cols_label(
    state_name       = "State",
    state_abbr       = "",
    total_population = "Population"
  ) |>
  fmt_integer(total_population) |>
  tab_row_group(label = "10 Smallest States", rows = 11:20) |>
  tab_row_group(label = "10 Largest States",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  tab_footnote(
    footnote  = "Texas highlighted in blue.",
    locations = cells_column_labels(state_name)
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 1-Year Estimates (2024), Table B01003."
  )
State Populations, 2024 ACS-1 Estimates
Largest and smallest states by total population
State [1] Population
10 Largest States
1 California CA 39,431,263
2 Texas TX 31,290,831
3 Florida FL 23,372,215
4 New York NY 19,867,248
5 Pennsylvania PA 13,078,751
6 Illinois IL 12,710,158
7 Ohio OH 11,883,304
8 Georgia GA 11,180,878
9 North Carolina NC 11,046,024
10 Michigan MI 10,140,459
10 Smallest States
41 New Hampshire NH 1,409,032
42 Maine ME 1,405,012
43 Montana MT 1,137,233
44 Rhode Island RI 1,112,308
45 Delaware DE 1,051,917
46 South Dakota SD 924,669
47 North Dakota ND 796,568
48 Alaska AK 740,133
49 Vermont VT 648,493
50 Wyoming WY 587,618
1 Texas highlighted in blue.
Source: US Census Bureau, ACS 1-Year Estimates (2024), Table B01003.

The chart below tracks cumulative population growth since 2015 for Texas and four peer states, indexed to 100 at the start of the period. The gap at 2020 appears because the Census Bureau did not release standard ACS-1 estimates for that year, when COVID-19 disrupted data collection. Texas and Florida clearly separate from the pack around 2021, with Florida finishing slightly ahead of Texas in percentage terms.

Code
focus_states <- c("Texas", "California", "Florida", "New York", "Arizona")

pop_indexed <- state_pop |>
  st_drop_geometry() |>
  filter(state_name %in% focus_states) |>
  group_by(state_name) |>
  mutate(
    pop_2015 = total_population[year == 2015],
    index    = 100 * total_population / pop_2015
  ) |>
  ungroup()

us_total <- state_pop |>
  st_drop_geometry() |>
  group_by(year) |>
  summarise(total_population = sum(total_population), .groups = "drop") |>
  mutate(
    state_name = "US Total",
    pop_2015   = total_population[year == 2015],
    index      = 100 * total_population / pop_2015
  )

state_colors <- c(
  "Texas"      = "#002868",
  "Florida"    = "#BF0A30",
  "Arizona"    = "#D4A853",
  "California" = "#2E8B8B",
  "New York"   = "#6B6B6B",
  "US Total"   = "black"
)

state_lty <- c(
  "Texas"      = "solid",
  "Florida"    = "solid",
  "Arizona"    = "solid",
  "California" = "solid",
  "New York"   = "solid",
  "US Total"   = "dashed"
)

bind_rows(pop_indexed, us_total) |>
  ggplot(aes(x = year, y = index,
             colour = state_name, linetype = state_name)) +
  geom_line(linewidth = 1.1) +
  geom_point(size = 2) +
  geom_text(
    data = \(d) d |> filter(year == 2024),
    aes(label = glue("{state_name}\n{round(index, 1)}")),
    hjust = -0.08, size = 3, show.legend = FALSE
  ) +
  scale_colour_manual(values = state_colors) +
  scale_linetype_manual(values = state_lty) +
  scale_x_continuous(
    breaks = 2015:2024,
    limits = c(2015, 2026.5)
  ) +
  scale_y_continuous(
    labels = \(x) paste0(x - 100, "%"),
    breaks = seq(95, 125, 5)
  ) +
  labs(
    title    = "Population Growth Since 2015 (Indexed to 100)",
    subtitle = "Texas and Florida outpace the other comparison states and the national average",
    x        = NULL,
    y        = "Cumulative Growth Since 2015",
    colour   = NULL,
    linetype = NULL,
    caption  = "Source: ACS 1-Year Estimates, Tables B01003 (2015-2024)."
  ) +
  theme_tx() +
  theme(legend.position = "none")

Texas population growth versus the national average and selected comparison states, 2015-2024. The 2020 gap reflects the absence of standard ACS-1 estimates for that year due to COVID-19. Over the full period, Texas outpaced the national average and every comparison state except Florida.
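
As a quick check on the indexing used in the chart, the sketch below recomputes the 2024 index values in base R from the 2015 and 2024 ACS-1 populations reported in this document. The data frame and its column names are illustrative assumptions; only the population figures come from the report.

```r
# 2015 and 2024 ACS-1 populations as reported in this document.
pop <- data.frame(
  state    = c("Texas", "Florida", "Arizona", "California", "New York"),
  pop_2015 = c(27469114, 20271272, 6828065, 39144818, 19795791),
  pop_2024 = c(31290831, 23372215, 7582384, 39431263, 19867248)
)

# Index each state's 2024 population to its own 2015 base (2015 = 100).
pop$index_2024 <- 100 * pop$pop_2024 / pop$pop_2015

# Cumulative growth in percent is simply the index minus 100.
pop$growth_pct <- pop$index_2024 - 100

# Sort from fastest to slowest growing.
print(pop[order(-pop$index_2024), c("state", "index_2024", "growth_pct")])
```

Because each state is divided by its own base-year value, states of very different sizes can share one axis, which is why the y-axis labels can be shown as growth percentages.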

State-to-State Migration Flows

To understand where Texans are coming from, and where they’re going, we turn to the Census Bureau’s state-to-state migration flow files, derived from ACS-1 respondents who reported living in a different state one year prior. We parse the 2023 and 2024 flow files separately (they have different formats) and combine them into a single tidy table.

The table below highlights a striking pattern among the four largest states: California is simultaneously the largest source of in-migrants to Texas (77,161 people) and, by a narrow margin over Florida, the top destination for Texans leaving the state (45,447 versus 45,259). Together the two California-Texas flows total more than 122,000 movers, the busiest corridor among the four states shown.

Code
migration_flows |>
  filter(
    state_current %in% c("Texas", "California", "New York", "Florida"),
    state_1y      %in% c("Texas", "California", "New York", "Florida"),
    year_current  == 2024
  ) |>
  arrange(state_current, state_1y) |>
  mutate(
    label = if_else(
      state_current == state_1y,
      glue("Stayed in {state_current}"),
      glue("{state_1y} → {state_current}")
    )
  ) |>
  select(label, population, year_current, year_1y) |>
  gt() |>
  tab_header(
    title    = md("**State-to-State Migration Flows, 2024**"),
    subtitle = "Selected flows among four largest states"
  ) |>
  cols_label(
    label        = "Flow",
    population   = "People",
    year_current = "Survey Year",
    year_1y      = "Prior Year"
  ) |>
  fmt_integer(population) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = grepl("Texas", label))
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = grepl("Stayed", label),
      columns = label
    )
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 1-Year State-to-State Migration Flow Tables."
  )
State-to-State Migration Flows, 2024
Selected flows among four largest states
Flow People Survey Year Prior Year
Stayed in California 38,325,648 2024 2023
Florida → California 15,988 2024 2023
New York → California 24,927 2024 2023
Texas → California 45,447 2024 2023
California → Florida 36,194 2024 2023
Stayed in Florida 22,289,936 2024 2023
New York → Florida 50,661 2024 2023
Texas → Florida 45,259 2024 2023
California → New York 31,367 2024 2023
Florida → New York 28,080 2024 2023
Stayed in New York 19,210,972 2024 2023
Texas → New York 16,324 2024 2023
California → Texas 77,161 2024 2023
Florida → Texas 52,219 2024 2023
New York → Texas 28,233 2024 2023
Stayed in Texas 30,069,004 2024 2023
Source: US Census Bureau, ACS 1-Year State-to-State Migration Flow Tables.
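
To make the two-way flows easier to compare, the following base R sketch computes Texas's net and gross exchange with each of the other three states, using only the 2024 figures from the table above. The object and column names are illustrative assumptions, not objects from the report's pipeline.

```r
# Flows involving Texas, copied from the 2024 table above.
partners   <- c("California", "Florida", "New York")
into_texas <- c(77161, 52219, 28233)   # partner -> Texas
out_of_tx  <- c(45447, 45259, 16324)   # Texas -> partner

exchange <- data.frame(
  partner = partners,
  inflow  = into_texas,
  outflow = out_of_tx,
  net     = into_texas - out_of_tx,    # positive means Texas gains on balance
  gross   = into_texas + out_of_tx     # total movement in the corridor
)

print(exchange)
```

Net exchange shows who gains on balance, while gross exchange measures how busy a corridor is; the two can rank partners differently, as the Florida row illustrates.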

Metro-Area Migration Flows

State-level data tells us where people are moving in aggregate, but metro-area flows reveal the specific urban corridors driving Texas’s growth. The Houston, Dallas, and Austin metros are among the fastest-growing in the country, but which cities are feeding them? We use the Census Bureau’s ACS-5 metro-to-metro migration API to find out.

The Los Angeles metro sends more people to Texas than any other single metro area: over 27,000 per year during the 2016-2020 period. New York and Chicago follow, which is consistent with the view that much of Texas’s growth comes from residents leaving the high costs and congestion of the country’s three largest legacy metros.

Code
metro_flows |>
  filter(
    metro1_state == "TX",
    !is.na(metro2_state),
    metro2_state != "TX"
  ) |>
  group_by(metro2_name) |>
  summarise(
    total_moved_in = sum(moved_in, na.rm = TRUE),
    .groups = "drop"
  ) |>
  slice_max(total_moved_in, n = 10) |>
  mutate(rank = row_number()) |>
  select(rank, metro2_name, total_moved_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Top 10 Metro Sources of In-Migration to Texas**"),
    subtitle = "Non-Texas metros sending the most residents to Texas, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    metro2_name    = "Origin Metro Area",
    total_moved_in = "People Moved In"
  ) |>
  fmt_integer(total_moved_in) |>
  data_color(
    columns = total_moved_in,
    palette = c("#EEF2FF", "#002868")
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 5-Year Migration Flows API (2016-2020)."
  )
Top 10 Metro Sources of In-Migration to Texas
Non-Texas metros sending the most residents to Texas, ACS 5-Year 2016-2020
Origin Metro Area People Moved In
1 Los Angeles-Long Beach-Anaheim, CA Metro Area 27,445
2 New York-Newark-Jersey City, NY-NJ-PA Metro Area 19,853
3 Chicago-Naperville-Elgin, IL-IN-WI Metro Area 16,253
4 Washington-Arlington-Alexandria, DC-VA-MD-WV Metro Area 14,916
5 Phoenix-Mesa-Chandler, AZ Metro Area 12,695
6 San Francisco-Oakland-Berkeley, CA Metro Area 11,203
7 Miami-Fort Lauderdale-Pompano Beach, FL Metro Area 10,300
8 San Diego-Chula Vista-Carlsbad, CA Metro Area 9,509
9 Denver-Aurora-Lakewood, CO Metro Area 9,222
10 Atlanta-Sandy Springs-Alpharetta, GA Metro Area 9,159
Source: US Census Bureau, ACS 5-Year Migration Flows API (2016-2020).

Exploratory Data Analysis

The following analyses address nine key questions about US internal migration patterns, with particular attention to Texas and the Houston metro area.

Q1: Highest State Population Growth Rates

Which states have grown the fastest over the past decade? We measure growth as the percentage change in total population from 2015 to 2024. Idaho and Utah lead on a percentage basis, but Texas leads all states in absolute population added, with more than 3.8 million people over the period.

Code
growth_rates <- state_pop |>
  st_drop_geometry() |>
  filter(year %in% c(2015, 2024)) |>
  select(state_name, state_abbr, year, total_population) |>
  pivot_wider(names_from = year, values_from = total_population,
              names_prefix = "pop_") |>
  mutate(
    growth_pct = 100 * (pop_2024 - pop_2015) / pop_2015,
    pop_added  = pop_2024 - pop_2015
  ) |>
  arrange(desc(growth_pct))

growth_rates |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  select(rank, state_name, state_abbr, pop_2015, pop_2024,
         growth_pct, pop_added) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Population Growth Rates, 2015-2024**"),
    subtitle = "Fastest and slowest growing states by percentage change"
  ) |>
  cols_label(
    state_name  = "State",
    state_abbr  = "",
    pop_2015    = "2015 Population",
    pop_2024    = "2024 Population",
    growth_pct  = "Growth Rate",
    pop_added   = "People Added"
  ) |>
  fmt_integer(c(pop_2015, pop_2024, pop_added)) |>
  # fmt_number() has no `suffix` argument; `pattern` appends the percent sign.
  fmt_number(growth_pct, decimals = 1, pattern = "{x}%") |>
  tab_row_group(label = "10 Slowest Growing States", rows = 11:20) |>
  tab_row_group(label = "10 Fastest Growing States", rows = 1:10) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  data_color(
    columns = growth_pct,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates, Table B01003 (2015, 2024)."
  )
State Population Growth Rates, 2015-2024
Fastest and slowest growing states by percentage change
State 2015 Population 2024 Population Growth Rate (%) People Added
10 Fastest Growing States
1 Idaho ID 1,654,930 2,001,619 20.9 346,689
2 Utah UT 2,995,919 3,503,613 16.9 507,694
3 Florida FL 20,271,272 23,372,215 15.3 3,100,943
4 Texas TX 27,469,114 31,290,831 13.9 3,821,717
5 Nevada NV 2,890,845 3,267,467 13.0 376,622
6 South Carolina SC 4,896,146 5,478,831 11.9 582,685
7 Delaware DE 945,934 1,051,917 11.2 105,983
8 Arizona AZ 6,828,065 7,582,384 11.0 754,319
9 Washington WA 7,170,351 7,958,180 11.0 787,829
10 Montana MT 1,032,949 1,137,233 10.1 104,284
10 Slowest Growing States
41 Kansas KS 2,911,641 2,970,606 2.0 58,965
42 Hawaii HI 1,431,603 1,446,146 1.0 14,543
43 California CA 39,144,818 39,431,263 0.7 286,445
44 New York NY 19,795,791 19,867,248 0.4 71,457
45 Wyoming WY 586,107 587,618 0.3 1,511
46 Alaska AK 738,432 740,133 0.2 1,701
47 Illinois IL 12,859,995 12,710,158 −1.2 −149,837
48 Louisiana LA 4,670,724 4,597,740 −1.6 −72,984
49 Mississippi MS 2,992,333 2,943,045 −1.6 −49,288
50 West Virginia WV 1,844,128 1,769,979 −4.0 −74,149
Source: ACS 1-Year Estimates, Table B01003 (2015, 2024).

Q2: Migration To and From New York State

New York’s migration picture is clear: it sends far more people to other states than it receives. Neighboring New Jersey is the top destination for departing New Yorkers, followed by Florida. Texas ranks fourth, receiving about 28,000 New Yorkers while sending back only about 16,000, an imbalance that works in favor of Texas’s congressional apportionment at New York’s expense.

Code
ny_in <- migration_flows |>
  filter(state_current == "New York",
         state_1y != "New York",
         year_current == 2024) |>
  arrange(desc(population)) |>
  slice_head(n = 10) |>
  mutate(direction = "Into New York")

ny_out <- migration_flows |>
  filter(state_1y == "New York",
         state_current != "New York",
         year_current == 2024) |>
  arrange(desc(population)) |>
  slice_head(n = 10) |>
  mutate(direction = "Out of New York")

bind_rows(
  ny_in  |> select(direction, state = state_1y,      population),
  ny_out |> select(direction, state = state_current, population)
) |>
  group_by(direction) |>
  mutate(rank = row_number()) |>
  ungroup() |>
  pivot_wider(names_from  = direction,
              values_from = c(state, population),
              names_sep   = "_") |>
  select(rank,
         `state_Into New York`, `population_Into New York`,
         `state_Out of New York`, `population_Out of New York`) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**New York State Migration Flows, 2024**"),
    subtitle = "Top 10 origin states for in-migrants and destination states for out-migrants"
  ) |>
  cols_label(
    `state_Into New York`        = "Origin State",
    `population_Into New York`   = "People",
    `state_Out of New York`      = "Destination State",
    `population_Out of New York` = "People"
  ) |>
  fmt_integer(c(`population_Into New York`, `population_Out of New York`)) |>
  tab_spanner(label   = "Moving INTO New York",
              columns = c(`state_Into New York`, `population_Into New York`)) |>
  tab_spanner(label   = "Moving OUT of New York",
              columns = c(`state_Out of New York`, `population_Out of New York`)) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )
New York State Migration Flows, 2024
Top 10 origin states for in-migrants and destination states for out-migrants
Moving INTO New York
Moving OUT of New York
Origin State People Destination State People
1 New Jersey 36,002 New Jersey 56,799
2 California 31,367 Florida 50,661
3 Florida 28,080 Pennsylvania 29,274
4 Pennsylvania 23,152 Texas 28,233
5 Massachusetts 17,862 Connecticut 25,095
6 Texas 16,324 California 24,927
7 Connecticut 15,160 Massachusetts 18,697
8 Virginia 10,310 South Carolina 18,324
9 North Carolina 10,076 North Carolina 17,978
10 Maryland 8,400 Virginia 16,364
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).

Q3: Migration To and From New York City Metro

At the metro level, NYC’s largest domestic migration partner is a relatively short-distance neighbor: Philadelphia. More consequential for Texas are the strong flows from NYC to Miami, Los Angeles, and Atlanta: no Texas metro appears among NYC’s top ten destinations, so departing residents are dispersing mainly to nearby metros and other Sun Belt cities rather than directly to Texas. Nonetheless, New York City was the 4th-largest source of in-migrants to the Houston metro area specifically.

Code
nyc_pattern <- "New York-Newark-Jersey City"

# Top sources INTO NYC - US metros only, excluding non-metro categories
nyc_in <- metro_flows |>
  filter(
    str_detect(metro1_name, nyc_pattern),
    !str_detect(metro2_name, nyc_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0
  ) |>
  arrange(desc(moved_in)) |>
  slice_head(n = 10) |>
  select(metro = metro2_name, people = moved_in)

# Top destinations OUT of NYC - US metros only
nyc_out <- metro_flows |>
  filter(
    str_detect(metro1_name, nyc_pattern),
    !str_detect(metro2_name, nyc_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_out),
    moved_out > 0
  ) |>
  arrange(desc(moved_out)) |>
  slice_head(n = 10) |>
  select(metro = metro2_name, people = moved_out)

bind_cols(
  nyc_in  |> mutate(rank = row_number()) |>
    select(rank, in_metro = metro, in_people = people),
  nyc_out |>
    select(out_metro = metro, out_people = people)
) |>
  mutate(
    in_metro  = str_remove(in_metro,  " Metro Area$"),
    out_metro = str_remove(out_metro, " Metro Area$")
  ) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**New York City Metro Migration Flows**"),
    subtitle = "Top 10 origin and destination metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    in_metro   = "Origin Metro",
    in_people  = "People",
    out_metro  = "Destination Metro",
    out_people = "People"
  ) |>
  fmt_integer(c(in_people, out_people)) |>
  tab_spanner(label   = "Moving INTO NYC",
              columns = c(in_metro, in_people)) |>
  tab_spanner(label   = "Moving OUT of NYC",
              columns = c(out_metro, out_people)) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )
New York City Metro Migration Flows
Top 10 origin and destination metros, ACS 5-Year 2016-2020
Moving INTO NYC
Moving OUT of NYC
Origin Metro People Destination Metro People
1 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 19,600 Philadelphia-Camden-Wilmington, PA-NJ-DE-MD 35,254
2 Washington-Arlington-Alexandria, DC-VA-MD-WV 11,162 Miami-Fort Lauderdale-Pompano Beach, FL 22,411
3 Boston-Cambridge-Newton, MA-NH 10,019 Washington-Arlington-Alexandria, DC-VA-MD-WV 17,558
4 Los Angeles-Long Beach-Anaheim, CA 9,761 Los Angeles-Long Beach-Anaheim, CA 17,107
5 Poughkeepsie-Newburgh-Middletown, NY 9,492 Bridgeport-Stamford-Norwalk, CT 16,709
6 Miami-Fort Lauderdale-Pompano Beach, FL 8,085 Poughkeepsie-Newburgh-Middletown, NY 15,475
7 Bridgeport-Stamford-Norwalk, CT 7,534 Boston-Cambridge-Newton, MA-NH 15,162
8 Trenton-Princeton, NJ 7,208 Atlanta-Sandy Springs-Alpharetta, GA 12,928
9 San Francisco-Oakland-Berkeley, CA 6,923 Trenton-Princeton, NJ 12,735
10 Albany-Schenectady-Troy, NY 5,512 Allentown-Bethlehem-Easton, PA-NJ 12,284
Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

Q4: States With Highest In-, Out-, and Net Migration

Texas leads all states in net domestic in-migration, with a surplus of nearly 72,000 people in 2024. At the other extreme, California experienced the largest net domestic out-migration in the country, losing roughly 252,000 more residents to other states than it gained. Florida’s large flows in both directions (about 572,000 in and 505,000 out) reflect its role as both a major destination and a pass-through state.

Code
state_migration_totals <- migration_flows |>
  filter(
    state_current != state_1y,
    year_current  == 2024
  ) |>
  group_by(state = state_current) |>
  summarise(total_in = sum(population, na.rm = TRUE), .groups = "drop") |>
  left_join(
    migration_flows |>
      filter(state_current != state_1y, year_current == 2024) |>
      group_by(state = state_1y) |>
      summarise(total_out = sum(population, na.rm = TRUE), .groups = "drop"),
    by = "state"
  ) |>
  mutate(net_migration = total_in - total_out) |>
  arrange(desc(net_migration))

state_migration_totals |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = c("state" = "state_name")
  ) |>
  select(rank, state, state_abbr, total_in, total_out, net_migration) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Migration Totals, 2024**"),
    subtitle = "Highest and lowest net migration states"
  ) |>
  cols_label(
    state         = "State",
    state_abbr    = "",
    total_in      = "In-Migration",
    total_out     = "Out-Migration",
    net_migration = "Net Migration"
  ) |>
  fmt_integer(c(total_in, total_out, net_migration)) |>
  tab_row_group(label = "10 Highest Net Out-Migration", rows = 11:20) |>
  tab_row_group(label = "10 Highest Net In-Migration",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state == "Texas")
  ) |>
  data_color(
    columns = net_migration,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )
State Migration Totals, 2024
Highest and lowest net migration states
State In-Migration Out-Migration Net Migration
10 Highest Net In-Migration
1 Texas TX 553,869 482,029 71,840
2 Florida FL 571,998 505,072 66,926
3 North Carolina NC 297,704 239,780 57,924
4 Arizona AZ 234,723 178,678 56,045
5 South Carolina SC 189,282 134,348 54,934
6 Nevada NV 129,109 88,898 40,211
7 Georgia GA 265,364 228,039 37,325
8 Tennessee TN 191,386 155,499 35,887
9 Oklahoma OK 106,621 72,461 34,160
10 Alabama AL 119,767 97,317 22,450
10 Highest Net Out-Migration
41 Iowa IA 59,199 69,417 −10,218
42 Alaska AK 29,456 39,842 −10,386
43 Louisiana LA 70,847 82,616 −11,769
44 Pennsylvania PA 232,208 247,858 −15,650
45 Colorado CO 181,683 204,323 −22,640
46 Massachusetts MA 150,513 180,497 −29,984
47 New Jersey NJ 149,512 211,969 −62,457
48 Illinois IL 199,235 279,613 −80,378
49 New York NY 280,073 412,187 −132,114
50 California CA 404,745 656,646 −251,901
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).

Q5: Metro Areas With Highest In-, Out-, and Net Migration

At the metro level, Phoenix leads all US metros in net in-migration, followed by Riverside and Dallas-Fort Worth. Notably, two Texas metros, Dallas-Fort Worth and Austin, rank in the top four nationally. The bottom of the table is dominated by large legacy metros: New York, Los Angeles, Chicago, and San Francisco all rank among the largest net losers.

Code
metro_totals <- metro_flows |>
  filter(!is.na(metro1_state), !is.na(metro2_state)) |>
  group_by(metro = metro1_name, state = metro1_state) |>
  summarise(
    total_in  = sum(moved_in,  na.rm = TRUE),
    total_out = sum(moved_out, na.rm = TRUE),
    .groups   = "drop"
  ) |>
  mutate(
    net_migration = total_in - total_out,
    metro_short   = str_remove(metro, " Metro Area$")
  ) |>
  arrange(desc(net_migration))

metro_totals |>
  slice(c(1:10, (nrow(metro_totals) - 9):nrow(metro_totals))) |>
  mutate(rank = c(1:10, (nrow(metro_totals) - 9):nrow(metro_totals))) |>
  select(rank, metro_short, state, total_in, total_out, net_migration) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Metro Area Migration Totals**"),
    subtitle = "Highest and lowest net migration metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    metro_short   = "Metro Area",
    state         = "State",
    total_in      = "In-Migration",
    total_out     = "Out-Migration",
    net_migration = "Net Migration"
  ) |>
  fmt_integer(c(total_in, total_out, net_migration)) |>
  tab_row_group(label = "10 Highest Net Out-Migration", rows = 11:20) |>
  tab_row_group(label = "10 Highest Net In-Migration",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  data_color(
    columns = net_migration,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )
Metro Area Migration Totals
Highest and lowest net migration metros, ACS 5-Year 2016-2020
Metro Area State In-Migration Out-Migration Net Migration
10 Highest Net In-Migration
1 Phoenix-Mesa-Chandler, AZ AZ 182,053 126,179 55,874
2 Riverside-San Bernardino-Ontario, CA CA 178,045 137,731 40,314
3 Dallas-Fort Worth-Arlington, TX TX 215,511 180,491 35,020
4 Austin-Round Rock-Georgetown, TX TX 116,913 85,901 31,012
5 Las Vegas-Henderson-Paradise, NV NV 92,544 61,696 30,848
6 Tampa-St. Petersburg-Clearwater, FL FL 134,637 105,146 29,491
7 Orlando-Kissimmee-Sanford, FL FL 133,976 105,395 28,581
8 Jacksonville, FL FL 73,419 54,772 18,647
9 Deltona-Daytona Beach-Ormond Beach, FL FL 41,276 23,312 17,964
10 Nashville-Davidson--Murfreesboro--Franklin, TN TN 73,461 55,589 17,872
10 Highest Net Out-Migration
383 Detroit-Warren-Dearborn, MI MI 72,716 94,314 −21,598
384 Boston-Cambridge-Newton, MA-NH MA 134,190 160,705 −26,515
385 San Jose-Sunnyvale-Santa Clara, CA CA 78,555 107,431 −28,876
386 Miami-Fort Lauderdale-Pompano Beach, FL FL 135,718 176,510 −40,792
387 San Juan-Bayamón-Caguas, PR PR 20,943 62,430 −41,487
388 Washington-Arlington-Alexandria, DC-VA-MD-WV DC 209,588 251,422 −41,834
389 San Francisco-Oakland-Berkeley, CA CA 162,213 204,077 −41,864
390 Chicago-Naperville-Elgin, IL-IN-WI IL 156,175 245,692 −89,517
391 Los Angeles-Long Beach-Anaheim, CA CA 242,529 375,810 −133,281
392 New York-Newark-Jersey City, NY-NJ-PA NY 235,844 490,636 −254,792
Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

Q6: States With Lowest Proportional In-Migration

What are the “sticky” states where residents are least likely to have moved from elsewhere? California leads this ranking, with just 1.05% of its population having arrived from another state in 2024. Texas ranks 9th on this list despite its rapid growth, reflecting the fact that its enormous existing population base dilutes even large absolute in-migration flows.

Code
migration_flows |>
  filter(year_current == 2024) |>
  group_by(state = state_current) |>
  summarise(
    total_pop = sum(population, na.rm = TRUE),
    moved_in  = sum(population[state_current != state_1y], na.rm = TRUE),
    .groups   = "drop"
  ) |>
  mutate(pct_moved_in = 100 * moved_in / total_pop) |>
  arrange(pct_moved_in) |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = c("state" = "state_name")
  ) |>
  select(rank, state, state_abbr, total_pop, moved_in, pct_moved_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Population Stickiness, 2024**"),
    subtitle = "States ranked by fraction of residents who moved in from another state"
  ) |>
  cols_label(
    state        = "State",
    state_abbr   = "",
    total_pop    = "Total Population",
    moved_in     = "Moved In From Another State",
    pct_moved_in = "% In-Migrants"
  ) |>
  fmt_integer(c(total_pop, moved_in)) |>
  # fmt_number() has no `suffix` argument; `pattern` appends the percent sign.
  fmt_number(pct_moved_in, decimals = 2, pattern = "{x}%") |>
  tab_row_group(label = "10 Least Sticky (Most In-Migration)", rows = 11:20) |>
  tab_row_group(label = "10 Most Sticky (Least In-Migration)", rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state == "Texas")
  ) |>
  data_color(
    columns = pct_moved_in,
    palette = c("#EEF2FF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )
State Population Stickiness, 2024
States ranked by fraction of residents who moved in from another state
State Total Population Moved In From Another State In-Migrants (%)
10 Most Sticky (Least In-Migration)
1 California CA 38,730,393 404,745 1.05
2 Michigan MI 9,988,333 139,882 1.40
3 New York NY 19,491,045 280,073 1.44
4 Louisiana LA 4,520,727 70,847 1.57
5 Illinois IL 12,491,116 199,235 1.60
6 New Jersey NJ 9,309,234 149,512 1.61
7 Ohio OH 11,704,591 197,471 1.69
8 Pennsylvania PA 12,886,895 232,208 1.80
9 Texas TX 30,622,873 553,869 1.81
10 Minnesota MN 5,698,255 103,924 1.82
10 Least Sticky (Most In-Migration)
41 Rhode Island RI 1,090,525 34,860 3.20
42 New Hampshire NH 1,389,355 45,900 3.30
43 North Dakota ND 781,941 27,062 3.46
44 South Carolina SC 5,391,825 189,282 3.51
45 Idaho ID 1,970,568 72,442 3.68
46 Vermont VT 641,786 23,884 3.72
47 Hawaii HI 1,419,055 53,093 3.74
48 Nevada NV 3,205,924 129,109 4.03
49 Alaska AK 724,947 29,456 4.06
50 Wyoming WY 579,872 25,060 4.32
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).

Q7: States Where Migration Drives the Most Growth

Surprisingly, Texas does not appear in the top 15 despite having the highest absolute net in-migration of any state. Texas’s total population growth is so large, driven by natural increase and international immigration, that internal US migration accounts for a relatively modest share of it. By contrast, states like Wyoming and Vermont are growing slowly enough that even modest net migration represents a large fraction of their total population change. Wyoming’s 428% figure (bar capped at 100% for readability) illustrates the extreme case: the state added only 1,511 residents in total, so its net in-migration, which the ratio implies is roughly 6,500 people, is more than four times its total growth.
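
To make the Wyoming ratio concrete, the arithmetic can be sketched as follows. The net-migration figure is not reported in the text; it is backed out here from the rounded 428% share and the 1,511 residents added, so it should be read as an implied approximation rather than a published value.

```r
# Figures quoted in the paragraph above
pop_added       <- 1511  # Wyoming's total population added
migration_share <- 428   # net migration as % of total growth (rounded)

# Implied net migration (approximate, because the 428% share is rounded)
implied_net_migration <- migration_share / 100 * pop_added

# The chart caps bars at 100% for readability, as pmin() does in the plot code
migration_capped <- pmin(migration_share, 100)

round(implied_net_migration)
migration_capped
```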

Code
migration_growth_share <- state_migration_totals |>
  left_join(growth_rates, by = c("state" = "state_name")) |>
  filter(!is.na(pop_added), pop_added > 0) |>
  mutate(migration_share = 100 * net_migration / pop_added) |>
  filter(!is.na(migration_share), is.finite(migration_share)) |>
  arrange(desc(migration_share))

plot_data <- migration_growth_share |>
  slice_max(migration_share, n = 15) |>
  mutate(
    state_label      = glue("{state} ({state_abbr})"),
    is_texas         = state == "Texas",
    migration_capped = pmin(migration_share, 100),
    label_text       = if_else(
      migration_share > 100,
      paste0(round(migration_share, 0), "%*"),
      paste0(round(migration_share, 0), "%")
    )
  )

ggplot(plot_data,
       aes(x = reorder(state_label, migration_share),
           y = migration_capped,
           fill = is_texas)) +
  geom_col(width = 0.7) +
  geom_text(
    aes(label = label_text),
    hjust = -0.1, size = 3.5
  ) +
  scale_fill_manual(
    values = c("FALSE" = "#6B6B6B", "TRUE" = "#002868"),
    guide  = "none"
  ) +
  scale_y_continuous(
    limits = c(0, 120),
    labels = \(x) paste0(x, "%")
  ) +
  coord_flip() +
  labs(
    title    = "Share of Population Growth Attributable to Net In-Migration",
    subtitle = "Top 15 states; bars capped at 100% (* = actual value exceeds cap)",
    x        = NULL,
    y        = "Net Migration as % of Total Population Growth",
    caption  = "Source: ACS 1-Year Migration Flows and Population Estimates (2024).\n* Wyoming actual value = 428%."
  ) +
  theme_tx()

Q8: Texas-Specific Migration Patterns

The more telling pattern for Texas’s congressional position involves Oklahoma and Colorado. Unlike California, where large flows run in both directions and Texas still comes out well ahead, Texans are leaving for Oklahoma and Colorado faster than residents of those states are arriving: 28,074 Texans moved to Oklahoma against 18,579 arrivals, and 27,574 moved to Colorado against 22,986 arrivals. This is a net drain worth targeting. Notably, Oklahoma leans politically closer to Texas than Colorado does, making it a particularly promising market for a retention campaign.

Code
tx_in <- migration_flows |>
  filter(
    state_current == "Texas",
    state_1y      != "Texas",
    year_current  == 2024
  ) |>
  arrange(desc(population)) |>
  slice_head(n = 10)

tx_out <- migration_flows |>
  filter(
    state_1y      == "Texas",
    state_current != "Texas",
    year_current  == 2024
  ) |>
  arrange(desc(population)) |>
  slice_head(n = 10)

bind_cols(
  tx_in  |> mutate(rank = row_number()) |>
    select(rank, in_state = state_1y, in_pop = population),
  tx_out |>
    select(out_state = state_current, out_pop = population)
) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Texas Migration Flows, 2024**"),
    subtitle = "Top 10 origin states for in-migrants and destinations for out-migrants"
  ) |>
  cols_label(
    in_state  = "Origin State",
    in_pop    = "People Moving In",
    out_state = "Destination State",
    out_pop   = "People Moving Out"
  ) |>
  fmt_integer(c(in_pop, out_pop)) |>
  tab_spanner(
    label   = "Moving INTO Texas",
    columns = c(in_state, in_pop)
  ) |>
  tab_spanner(
    label   = "Moving OUT of Texas",
    columns = c(out_state, out_pop)
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )
Texas Migration Flows, 2024
Top 10 origin states for in-migrants and destinations for out-migrants

| Rank | Origin State (Moving INTO Texas) | People Moving In | Destination State (Moving OUT of Texas) | People Moving Out |
|---|---|---:|---|---:|
| 1 | California | 77,161 | California | 45,447 |
| 2 | Florida | 52,219 | Florida | 45,259 |
| 3 | New York | 28,233 | Oklahoma | 28,074 |
| 4 | Louisiana | 24,170 | Colorado | 27,574 |
| 5 | Illinois | 23,509 | Georgia | 20,968 |
| 6 | Colorado | 22,986 | Louisiana | 19,850 |
| 7 | Georgia | 19,481 | Illinois | 18,189 |
| 8 | Oklahoma | 18,579 | North Carolina | 16,498 |
| 9 | Virginia | 17,996 | New York | 16,324 |
| 10 | Washington | 15,937 | Washington | 16,202 |

Source: ACS 1-Year State-to-State Migration Flow Tables (2024).
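
The asymmetry is easier to see as net flows. The sketch below copies the figures from the table above for the nine states that appear in both top-10 lists and subtracts out-migrants from in-migrants. It uses base R only, and the `tx_flows` data frame is an illustrative stand-in for the `migration_flows` object used in the report.

```r
# Values copied from the "Texas Migration Flows, 2024" table.
# Only states that appear in BOTH top-10 lists are included.
tx_flows <- data.frame(
  state     = c("California", "Florida", "New York", "Louisiana",
                "Illinois", "Colorado", "Georgia", "Oklahoma", "Washington"),
  moved_in  = c(77161, 52219, 28233, 24170, 23509, 22986, 19481, 18579, 15937),
  moved_out = c(45447, 45259, 16324, 19850, 18189, 27574, 20968, 28074, 16202)
)

# Net flow: positive means Texas gains residents from that state
tx_flows$net <- tx_flows$moved_in - tx_flows$moved_out

# Sort from largest net loss to largest net gain
tx_flows[order(tx_flows$net), c("state", "net")]
```

On these figures Oklahoma (−9,495) and Colorado (−4,588) are the two largest net losses, while California (+31,714) is by far the largest net gain.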

Q9: Houston Metro Migration Patterns

Houston is Texas’s largest city and the anchor of its second-largest metro area, after Dallas-Fort Worth. Within Texas, Houston exchanges people heavily with Dallas-Fort Worth, Austin, and San Antonio, reflecting how Texans move between the state’s major economic centers. The connection-strength chart below shows which metros are most disproportionately tied to Houston relative to their total out-migration.

Code
houston_pattern <- "Houston-The Woodlands-Sugar Land"

# Top sources INTO Houston - US metros only
houston_in <- metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0
  ) |>
  arrange(desc(moved_in)) |>
  slice_head(n = 10)

# Top destinations OUT of Houston - US metros only
houston_out <- metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_out),
    moved_out > 0
  ) |>
  arrange(desc(moved_out)) |>
  slice_head(n = 10)

bind_cols(
  houston_in  |> mutate(rank = row_number()) |>
    select(rank, in_metro = metro2_name, in_people = moved_in),
  houston_out |>
    select(out_metro = metro2_name, out_people = moved_out)
) |>
  mutate(
    in_metro  = str_remove(in_metro,  " Metro Area$"),
    out_metro = str_remove(out_metro, " Metro Area$")
  ) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Houston Metro Migration Flows**"),
    subtitle = "Top 10 origin and destination metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    in_metro   = "Origin Metro",
    in_people  = "People",
    out_metro  = "Destination Metro",
    out_people = "People"
  ) |>
  fmt_integer(c(in_people, out_people)) |>
  tab_spanner(
    label   = "Moving INTO Houston",
    columns = c(in_metro, in_people)
  ) |>
  tab_spanner(
    label   = "Moving OUT of Houston",
    columns = c(out_metro, out_people)
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )
Houston Metro Migration Flows
Top 10 origin and destination metros, ACS 5-Year 2016-2020

| Rank | Origin Metro (Moving INTO Houston) | People | Destination Metro (Moving OUT of Houston) | People |
|---|---|---:|---|---:|
| 1 | Dallas-Fort Worth-Arlington, TX | 17,235 | Dallas-Fort Worth-Arlington, TX | 17,218 |
| 2 | Austin-Round Rock-Georgetown, TX | 10,056 | Austin-Round Rock-Georgetown, TX | 15,222 |
| 3 | San Antonio-New Braunfels, TX | 9,134 | San Antonio-New Braunfels, TX | 8,932 |
| 4 | New York-Newark-Jersey City, NY-NJ-PA | 5,719 | College Station-Bryan, TX | 6,193 |
| 5 | College Station-Bryan, TX | 4,709 | Beaumont-Port Arthur, TX | 4,765 |
| 6 | Beaumont-Port Arthur, TX | 4,407 | Lubbock, TX | 2,955 |
| 7 | Washington-Arlington-Alexandria, DC-VA-MD-WV | 4,166 | Atlanta-Sandy Springs-Alpharetta, GA | 2,842 |
| 8 | Los Angeles-Long Beach-Anaheim, CA | 4,082 | Denver-Aurora-Lakewood, CO | 2,699 |
| 9 | Chicago-Naperville-Elgin, IL-IN-WI | 4,031 | McAllen-Edinburg-Mission, TX | 2,664 |
| 10 | New Orleans-Metairie, LA | 3,789 | Seattle-Tacoma-Bellevue, WA | 2,504 |

Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

The chart below shows each metro’s draw to Houston, measured as the share of its total out-migration that flows to Houston. Metros near the top of this list are the most fertile advertising targets: their residents already choose Houston at an unusually high rate, which suggests strong word-of-mouth networks and existing community ties.

Code
metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0,
    !is.na(metro2_state)
  ) |>
  left_join(
    metro_totals |> select(metro, total_out),
    by = c("metro2_name" = "metro")
  ) |>
  mutate(
    houston_share = 100 * moved_in / total_out,
    metro_short   = str_remove(metro2_name, " Metro Area$")
  ) |>
  filter(!is.na(houston_share), total_out > 5000) |>
  slice_max(houston_share, n = 15) |>
  ggplot(aes(x = reorder(metro_short, houston_share),
             y = houston_share)) +
  geom_col(fill = "#002868", width = 0.7) +
  geom_text(
    aes(label = paste0(round(houston_share, 1), "%")),
    hjust = -0.1, size = 3
  ) +
  scale_y_continuous(
    limits = c(0, 15),
    labels = \(x) paste0(x, "%"),
    oob    = scales::squish
  ) +
  coord_flip() +
  labs(
    title    = "Metros With Strongest Migration Connection to Houston",
    subtitle = "Houston's share of each metro's total out-migration (metros with >5,000 total out-migrants)",
    x        = NULL,
    y        = "Houston's Share of Origin Metro's Out-Migration",
    caption  = "Source: ACS 5-Year Metro-to-Metro Migration Flows (2016-2020)."
  ) +
  theme_tx()

Metros with the strongest migration connection to Houston, measured as Houston’s share of total out-migration from each origin metro. A high share indicates Houston is a uniquely popular destination.

Population Projections to 2030

To estimate Texas’s congressional apportionment after the 2030 census, we first need to forecast each state’s 2030 population. We use the migration model described in the assignment,¹ which assumes population growth follows:

\[P_{i,t+1} = P_{i,t} \cdot (1 + \gamma) + \sum_{j \neq i} \lambda_{ij} \cdot P_{i,t} \cdot P_{j,t}\]

where \(\gamma\) is a national natural growth rate and \(\lambda_{ij}\) captures the migration flow rate from state \(j\) to state \(i\). We fit parameters using both the 2023 and 2024 flow files and average the lambda values for stability. Because the 2023 file format does not fully capture stationary population, gamma (the natural growth rate) is estimated from 2024 data only, yielding γ = −0.27% per year. This slightly negative value reflects the fact that the ACS domestic flow tables do not capture international immigration, which has been a substantial driver of US population growth. As a result, our projections should be interpreted as conservative estimates on the low end, particularly for high-immigration states like Texas.
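
A single projection step under this formula can be sketched in a few lines of R. The three states, their populations, and the λ values below are made up for illustration. The fitted parameters are not shown in this report, and the signed, antisymmetric λ matrix (λ_ij = −λ_ji, so that one state's gain is another's loss) is an assumption of this sketch rather than a description of the fitted model. Only γ = −0.27% is taken from the text.

```r
# Hypothetical three-state toy example; populations and lambdas are made up
pop   <- c(A = 1000, B = 2000, C = 500)
gamma <- -0.0027  # the national gamma quoted above (-0.27% per year)

# lambda[i, j]: migration-rate term for state i paired with state j
# (illustrative signed values; the diagonal is zero because j != i)
lambda <- matrix(
  c( 0,     1e-6, -2e-6,
    -1e-6,  0,     1e-6,
     2e-6, -1e-6,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(names(pop), names(pop))
)

# One step: P_i * (1 + gamma) + sum_j lambda_ij * P_i * P_j
project_one_year <- function(pop, gamma, lambda) {
  pop * (1 + gamma) + pop * as.vector(lambda %*% pop)
}

pop_next <- project_one_year(pop, gamma, lambda)
pop_next
```

Because the toy λ matrix is antisymmetric, the migration terms cancel across states and the national total changes only through γ. Repeating the step six times would carry a 2024 starting point to 2030.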

The table below shows projected 2030 populations for the 10 largest and 10 smallest states. Under this conservative model, most states show modest population declines driven by the negative gamma. These projections exclude international immigration and should be read as a domestic-migration-only baseline.

Code
pop_2030 <- pop_projections |>
  filter(year == 2030) |>
  arrange(desc(total_population))

pop_2030 |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    pop_2024 |>
      st_drop_geometry() |>
      select(state_name, pop_2024 = total_population),
    by = "state_name"
  ) |>
  mutate(projected_growth = total_population - pop_2024) |>
  select(rank, state_name, state_abbr,
         pop_2024, total_population, projected_growth) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Projected State Populations, 2030**"),
    subtitle = "Based on averaged 2023-2024 migration parameters"
  ) |>
  cols_label(
    state_name       = "State",
    state_abbr       = "",
    pop_2024         = "2024 Population",
    total_population = "Projected 2030 Population",
    projected_growth = "Projected Growth"
  ) |>
  fmt_integer(c(pop_2024, total_population, projected_growth)) |>
  tab_row_group(label = "10 Smallest Projected States", rows = 11:20) |>
  tab_row_group(label = "10 Largest Projected States",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  data_color(
    columns = projected_growth,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates + author projections using averaged 2023-2024 migration parameters."
  )
Projected State Populations, 2030
Based on averaged 2023-2024 migration parameters
State 2024 Population Projected 2030 Population Projected Growth
10 Largest Projected States
1 California CA 39,431,263 38,874,574 −556,689
2 Texas TX 31,290,831 30,543,128 −747,703
3 Florida FL 23,372,215 22,750,215 −622,000
4 New York NY 19,867,248 19,703,296 −163,952
5 Pennsylvania PA 13,078,751 12,929,329 −149,422
6 Illinois IL 12,710,158 12,555,657 −154,501
7 Ohio OH 11,883,304 11,752,521 −130,783
8 Georgia GA 11,180,878 10,926,885 −253,993
9 North Carolina NC 11,046,024 10,702,252 −343,772
10 Michigan MI 10,140,459 10,056,924 −83,535
10 Smallest Projected States
41 New Hampshire NH 1,409,032 1,429,413 20,381
42 Maine ME 1,405,012 1,371,444 −33,568
43 Montana MT 1,137,233 1,121,845 −15,388
44 Rhode Island RI 1,112,308 1,104,582 −7,726
45 Delaware DE 1,051,917 1,019,574 −32,343
46 South Dakota SD 924,669 888,092 −36,577
47 North Dakota ND 796,568 764,302 −32,266
48 Alaska AK 740,133 714,992 −25,141
49 Vermont VT 648,493 643,941 −4,552
50 Wyoming WY 587,618 598,345 10,727
Source: ACS 1-Year Estimates + author projections using averaged 2023-2024 migration parameters.

The chart below places these projections in historical context. Dashed lines indicate projected values for 2025-2030. Under the domestic-only model, Texas and Florida see the largest projected declines among the states tabulated above (about 748,000 and 622,000 residents), while California and New York decline by about 557,000 and 164,000.

Code
focus_states <- c("Texas", "California", "Florida", "New York", "Arizona")

pop_all |>
  filter(state_name %in% focus_states) |>
  ggplot(aes(x = year, y = total_population / 1e6,
             colour = state_name, linetype = type)) +
  geom_rect(
    xmin = 2024.5, xmax = 2030.5,
    ymin = -Inf,   ymax = Inf,
    fill = "#F5F5F5", colour = NA, alpha = 0.05
  ) +
  geom_line(linewidth = 1.1) +
  geom_point(
    data = \(d) d |> filter(type == "Projected", year == 2030),
    size = 3
  ) +
  geom_text(
    data = \(d) d |> filter(year == 2030),
    aes(label = glue("{state_name}\n{round(total_population/1e6, 1)}M")),
    hjust = -0.1, size = 3, show.legend = FALSE
  ) +
  scale_colour_manual(values = c(
    "Texas"      = "#002868",
    "Florida"    = "#BF0A30",
    "Arizona"    = "#D4A853",
    "California" = "#2E8B8B",
    "New York"   = "#6B6B6B"
  )) +
  scale_linetype_manual(
    values = c("Historical" = "solid", "Projected" = "dashed")
  ) +
  scale_x_continuous(
    breaks = c(2015:2019, 2021:2030),
    limits = c(2015, 2032)
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "M")) +
  annotate("text", x = 2027, y = 42,
           label = "← Projected", colour = "#6B6B6B", size = 3) +
  labs(
    title    = "State Population Trajectories: Historical and Projected",
    subtitle = "Dashed lines show model projections from 2025-2030",
    x        = NULL,
    y        = "Population (millions)",
    colour   = NULL,
    linetype = NULL,
    caption  = "Source: ACS 1-Year Estimates (historical); author model projections (2025-2030)."
  ) +
  theme_tx()

Historical and projected populations for Texas and four comparison states, 2015-2030. Shaded region indicates projected years. Under the domestic-migration-only model, Texas is projected at roughly 30.5 million residents in 2030.

Congressional Reapportionment

Under the Huntington-Hill method,² the 435 seats in the House of Representatives are allocated by iteratively assigning each seat to the state with the highest priority value, defined as the state’s population divided by the geometric mean of its current and next seat counts. Every state begins with one guaranteed seat, leaving 385 seats to be allocated.
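
The iterative rule described above can be written as a short loop. This is a minimal sketch, not the code used to produce the results in this report: `huntington_hill()` is a hypothetical helper name, and the three toy populations are invented so the allocation can be checked by hand.

```r
# Huntington-Hill: every state starts with 1 seat; each remaining seat goes to
# the state with the highest priority value pop / sqrt(n * (n + 1)),
# where n is that state's current seat count.
huntington_hill <- function(pop, total_seats = 435) {
  seats <- rep(1L, length(pop))
  names(seats) <- names(pop)
  for (k in seq_len(total_seats - length(pop))) {
    priority <- pop / sqrt(seats * (seats + 1))
    winner <- which.max(priority)
    seats[winner] <- seats[winner] + 1L
  }
  seats
}

# Hypothetical toy populations (not real states), 10 seats to share
toy_pop <- c(A = 6000, B = 3000, C = 1000)
huntington_hill(toy_pop, total_seats = 10)
```

With these toy numbers the seven seats beyond the guaranteed three go to A, A, B, A, A, B, A in that order, giving A six seats, B three, and C one. Applied to 50 state populations with `total_seats = 435`, the same loop performs the 385 allocation steps described above.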

The model projects that only four states change their seat count under the 2030 reapportionment: New York and Michigan each gain a seat, while Texas and Virginia each lose one. Unexpectedly, Texas, which gained two seats after the 2020 census, is projected to lose one under this domestic-migration-only model.

Code
apportionment_change |>
  filter(seat_change != 0 | state_name == "Texas") |>
  arrange(desc(seat_change), desc(pop_2030)) |>
  mutate(rank = row_number()) |>
  select(rank, state_name, state_abbr,
         seats_2024, seats_2030, seat_change, pop_2030) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Projected Congressional Seat Changes, 2030 Reapportionment**"),
    subtitle = "States gaining or losing seats based on 2030 population projections"
  ) |>
  cols_label(
    state_name  = "State",
    state_abbr  = "",
    seats_2024  = "2024 Seats",
    seats_2030  = "Projected 2030 Seats",
    seat_change = "Change",
    pop_2030    = "Projected 2030 Population"
  ) |>
  fmt_integer(pop_2030) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_name == "Texas")
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = seat_change > 0,
      columns = seat_change
    )
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = seat_change < 0,
      columns = seat_change
    )
  ) |>
  data_color(
    columns = seat_change,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: Author projections; Huntington-Hill apportionment method."
  )
Projected Congressional Seat Changes, 2030 Reapportionment
States gaining or losing seats based on 2030 population projections
State 2024 Seats Projected 2030 Seats Change Projected 2030 Population
1 New York NY 27 28 1 19,703,296
2 Michigan MI 13 14 1 10,056,924
3 Texas TX 44 43 -1 30,543,128
4 Virginia VA 12 11 -1 8,618,815
Source: Author projections; Huntington-Hill apportionment method.
Code
tibble(
  metric = c("2024 Apportionment (current)",
             "2030 Projected Apportionment",
             "Projected Change"),
  value  = c(
    as.character(tx_seats_2024),
    as.character(tx_seats_2030),
    if_else(tx_change >= 0,
            paste0("+", tx_change),
            as.character(tx_change))
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Texas Congressional Delegation: 2024 vs. 2030**"),
    subtitle = "Based on domestic migration model projections"
  ) |>
  cols_label(metric = "Metric", value = "Seats") |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = metric == "Projected Change",
      columns = value
    )
  ) |>
  tab_style(
    style = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = metric != "Projected Change")
  ) |>
  tab_source_note(
    "Source: Author projections using Huntington-Hill method.
     Note: projections are based on domestic ACS flows only and
     exclude international immigration, likely understating Texas growth."
  )
Texas Congressional Delegation: 2024 vs. 2030
Based on domestic migration model projections
Metric Seats
2024 Apportionment (current) 44
2030 Projected Apportionment 43
Projected Change -1
Source: Author projections using Huntington-Hill method. Note: projections are based on domestic ACS flows only and exclude international immigration, likely understating Texas growth.

Advertising Strategy to Protect Texas’s Congressional Footprint

Our model projects that Texas is at risk of losing a congressional seat in the reapportionment that follows the 2030 census. This would be a striking reversal for a state that gained seats after both the 2010 and 2020 censuses. The culprits are net out-migration to Sun Belt competitors and the loss of residents from high-cost-of-living metros. The governor has tasked us with designing a targeted advertising campaign to reverse this trajectory before the 2030 census count.

The Strategic Objective

Huntington-Hill outcomes often turn on thin margins: in the 2020 cycle, New York lost its 27th seat by fewer than 100 people. A well-targeted campaign does not need to move mountains; it needs to move the right people to the right places. Our analysis identifies three specific levers:

Lever 1: Strengthen existing in-migration corridors. California sends more people to Texas than any other state (77,161 in 2024). These migrants are already choosing Texas; we want more of them, and we want to retain more of the 45,447 Texans who moved to California in the same year.

Lever 2: Reduce out-migration to Oklahoma and Colorado. These two states are among Texas’s top out-migration destinations (28,074 and 27,574 respectively in 2024) but send fewer people back (18,579 and 22,986). This asymmetry means Texans are leaving for these states without equivalent replacement flows, a net drain of roughly 9,500 and 4,600 residents that is worth targeting.

Lever 3: Target states competing for marginal seats. Our model shows Michigan gaining a seat. Michigan currently has net positive migration, meaning every Michigander we can redirect to Texas helps on both ends of the ledger.

How Much Migration Is Needed?

To determine how many additional residents Texas needs, we incrementally added migrants to the 2030 projection and re-ran the Huntington-Hill allocation until Texas retained 44 seats. The threshold: approximately 197,000 additional net residents above the baseline projection.
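
The search described above can be sketched as a simple loop around an apportionment function. The exact step size and code behind the 197,000 figure are not shown here, so everything below is an assumption for illustration: `apportion()` and `migrants_to_reach()` are hypothetical helper names, the populations are toy values, and the apportionment is written in a compact ranked-priority form that assumes no exact ties between priority values.

```r
# Compact Huntington-Hill: rank every candidate priority value and keep the
# top (total_seats - number of states); assumes no exact ties at the cutoff.
apportion <- function(pop, total_seats = 435) {
  n_extra  <- total_seats - length(pop)
  n        <- seq_len(n_extra)
  priority <- outer(pop, 1 / sqrt(n * (n + 1)))  # rows = states, cols = seat steps
  cutoff   <- sort(priority, decreasing = TRUE)[n_extra]
  seats    <- 1L + as.integer(rowSums(priority >= cutoff))
  names(seats) <- names(pop)
  seats
}

# Add residents to one state in fixed steps until it reaches target_seats.
# Other states' populations are held fixed, as in the description above.
migrants_to_reach <- function(pop, state, target_seats,
                              step = 1000, max_added = 5e6,
                              total_seats = 435) {
  added <- 0
  while (added <= max_added) {
    trial <- pop
    trial[state] <- trial[state] + added
    if (apportion(trial, total_seats)[state] >= target_seats) return(added)
    added <- added + step
  }
  NA_real_  # target not reachable within max_added
}

# Hypothetical toy example: how many residents must B add to win a 4th of 10 seats?
toy_pop <- c(A = 6000, B = 3000, C = 1000)
migrants_to_reach(toy_pop, "B", target_seats = 4, step = 50,
                  max_added = 5000, total_seats = 10)
```

The answer is only as fine as the step size: a smaller `step` narrows the threshold at the cost of more apportionment runs.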

Code
tibble(
  metric = c(
    "Projected 2030 Texas population (baseline)",
    "Population needed to retain 44th seat",
    "Additional net residents required",
    "Years remaining until 2030 census",
    "Annual net migration target"
  ),
  value = c(
    format(
      pop_projections |> filter(year == 2030, state_name == "Texas") |>
        pull(total_population) |> round(),
      big.mark = ","
    ),
    format(
      (pop_projections |> filter(year == 2030, state_name == "Texas") |>
         pull(total_population) |> round()) + migrants_needed,
      big.mark = ","
    ),
    format(migrants_needed, big.mark = ","),
    "4",
    format(round(migrants_needed / 4), big.mark = ",")
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Texas Seat Retention: Migration Target**"),
    subtitle = "How many additional residents does Texas need by 2030?"
  ) |>
  cols_label(metric = "Metric", value = "Value") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = metric == "Additional net residents required",
      columns = everything()
    )
  ) |>
  tab_style(
    style = cell_fill(color = "#EEF2FF"),
    locations = cells_body(
      rows = metric != "Additional net residents required"
    )
  ) |>
  tab_source_note(
    "Source: Author projections using Huntington-Hill method."
  )
Texas Seat Retention: Migration Target
How many additional residents does Texas need by 2030?
Metric Value
Projected 2030 Texas population (baseline) 30,543,128
Population needed to retain 44th seat 30,740,128
Additional net residents required 197,000
Years remaining until 2030 census 4
Annual net migration target 49,250
Source: Author projections using Huntington-Hill method.

Advertising Campaign Design

Code
tibble(
  target_market = c(
    "Los Angeles Metro (CA)",
    "New York City Metro (NY/NJ)",
    "Chicago Metro (IL)",
    "Oklahoma City + Tulsa (OK)",
    "Denver Metro (CO)",
    "Washington DC Metro"
  ),
  rationale = c(
    "Largest single source of in-migrants to Texas (27,445/yr to Houston alone); large population of former Texans",
    "19,853/yr already flowing to Texas; 16,324 Texans moved to NY in 2024 - a prime recapture target",
    "23,509 moved to TX in 2024; Illinois is shrinking - residents are receptive to leaving",
    "28,074 Texans moved here in 2024 - our largest asymmetric out-migration destination",
    "27,574 Texans moved here; high cost of living makes Texas's affordability pitch compelling",
    "14,916/yr to Texas metros; large professional population and high cost of living"
  ),
  campaign_type = c(
    "Digital + outdoor; target tech and entertainment workers",
    "Digital; target finance and media workers priced out of NYC",
    "Digital + radio; target families and blue-collar workers",
    "Retention: target recent TX out-migrants; digital re-engagement",
    "Retention: target outdoor enthusiasts with TX Hill Country pitch",
    "Digital; target federal contractors and policy professionals"
  ),
  target_migrants = c(25000, 15000, 12000, 10000, 8000, 5000)
) |>
  mutate(rank = row_number()) |>
  select(rank, target_market, rationale, campaign_type, target_migrants) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Texas Migration Advertising Strategy**"),
    subtitle = "Targeted markets, rationale, and migration goals"
  ) |>
  cols_label(
    target_market    = "Target Market",
    rationale        = "Data Rationale",
    campaign_type    = "Campaign Approach",
    target_migrants  = "Target Net Migrants"
  ) |>
  fmt_integer(target_migrants) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 2))
  ) |>
  cols_width(
    rationale     ~ px(280),
    campaign_type ~ px(200)
  ) |>
  grand_summary_rows(
    columns  = target_migrants,
    fns      = list("Total Target" ~ sum(.)),
    fmt      = ~ fmt_integer(.)
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flows (2024); ACS 5-Year Metro Flows (2016-2020)."
  )
Texas Migration Advertising Strategy
Targeted markets, rationale, and migration goals

| Rank | Target Market | Data Rationale | Campaign Approach | Target Net Migrants |
|---|---|---|---|---:|
| 1 | Los Angeles Metro (CA) | Largest single source of in-migrants to Texas (27,445/yr to Houston alone); large population of former Texans | Digital + outdoor; target tech and entertainment workers | 25,000 |
| 2 | New York City Metro (NY/NJ) | 19,853/yr already flowing to Texas; 16,324 Texans moved to NY in 2024 - a prime recapture target | Digital; target finance and media workers priced out of NYC | 15,000 |
| 3 | Chicago Metro (IL) | 23,509 moved to TX in 2024; Illinois is shrinking - residents are receptive to leaving | Digital + radio; target families and blue-collar workers | 12,000 |
| 4 | Oklahoma City + Tulsa (OK) | 28,074 Texans moved here in 2024 - our largest asymmetric out-migration destination | Retention: target recent TX out-migrants; digital re-engagement | 10,000 |
| 5 | Denver Metro (CO) | 27,574 Texans moved here; high cost of living makes Texas's affordability pitch compelling | Retention: target outdoor enthusiasts with TX Hill Country pitch | 8,000 |
| 6 | Washington DC Metro | 14,916/yr to Texas metros; large professional population and high cost of living | Digital; target federal contractors and policy professionals | 5,000 |
| | Total Target | | | 75,000 |

Source: ACS 1-Year State-to-State Migration Flows (2024); ACS 5-Year Metro Flows (2016-2020).

Campaign Slogan

Based on our data, the most effective strategy targets people who are already considering a move out of high-cost-of-living cities in California, New York, and Illinois, and lures them with Texas’s combination of economic opportunity, low taxes, and quality of life. Our proposed campaign slogan:

“Everything is bigger in Texas… and that includes your buying power.”

For the retention campaign targeting Texans who moved to Oklahoma and Colorado, we recommend a different message that speaks to people who may have left for specific lifestyle reasons:

“Miss the Space? Texas Never Left.”

Sizing the Campaign

Code
tibble(
  metric = c(
    "Additional net migrants needed to retain 44th seat",
    "Total migrants targeted across six markets",
    "Implied campaign success rate needed",
    "Approximate addressable audience (metro populations)",
    "Estimated cost at $50 per targeted impression"
  ),
  value = c(
    format(migrants_needed, big.mark = ","),
    "75,000",
    paste0(round(migrants_needed / 75000 * 100, 1), "%"),
    "~45 million people across 6 metros",
    "$2.25 billion (over 4 years)"
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Campaign Sizing Summary**"),
    subtitle = "Estimated scale required to retain Texas's 44th congressional seat"
  ) |>
  cols_label(metric = "Metric", value = "Estimate") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 3))
  ) |>
  tab_source_note(
    "Cost estimate is illustrative; actual costs depend on media mix and market conditions."
  )
Campaign Sizing Summary
Estimated scale required to retain Texas's 44th congressional seat
Metric Estimate
Additional net migrants needed to retain 44th seat 197,000
Total migrants targeted across six markets 75,000
Implied campaign success rate needed 262.7%
Approximate addressable audience (metro populations) ~45 million people across 6 metros
Estimated cost at $50 per targeted impression $2.25 billion (over 4 years)
Cost estimate is illustrative; actual costs depend on media mix and market conditions.
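
The ratios in the sizing table are easier to interpret when computed both ways. The sketch below uses only figures reported above: the 197,000 threshold, the 75,000 combined target, and the four years remaining. The variable names are illustrative.

```r
migrants_needed <- 197000  # threshold reported in the seat-retention table
campaign_target <- 75000   # sum of the six market targets
years_remaining <- 4       # years until the 2030 census, as stated above

# Share of the shortfall covered if every market target were fully met
coverage_pct <- 100 * campaign_target / migrants_needed

# The 262.7% row is the inverse ratio: migrants needed relative to the target
required_vs_target_pct <- 100 * migrants_needed / campaign_target

# Annual pace implied by the full requirement
annual_target <- migrants_needed / years_remaining

round(c(coverage = coverage_pct, required_vs_target = required_vs_target_pct), 1)
annual_target
```

Read this way, a figure above 100% in the "success rate" row means that the six-market targets, even if fully achieved, cover about 38% of the residents needed.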

Conclusion

Texas stands at a demographic crossroads. For two consecutive census cycles it has been the nation’s growth engine, adding more residents than any other state. But our model, relying only on domestic migration flows, suggests that growth may not be sufficient to protect all 44 congressional seats in 2032. A well-funded, data-driven advertising campaign targeting the six markets identified above could narrow the gap, although the 75,000 migrants targeted there cover only about 38% of the roughly 197,000 needed, so the campaign would have to scale well beyond these initial targets to close it.

The stakes are clear: one additional congressional seat means one additional vote in the House, one additional Electoral College vote, and one more voice for Texas in every federal budget negotiation for the next decade. At roughly 197,000 net new residents needed, the math is straightforward. The only question is whether the political will exists to pursue it.


Extra Credit #03: State-Level Natural Growth Rates

The baseline model used in our apportionment and advertising strategy sections assumes a single national natural growth rate γ applied uniformly to all states. This is a significant simplification that ignores variation across states in age structure, fertility, and mortality. Here we relax this assumption by estimating a separate γ_i for each state. Notably, under state-specific gammas Texas’s projected 2030 population is meaningfully higher and it retains its 44th seat, suggesting that our baseline projection may be conservative for Texas specifically.

Code
state_gamma |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = "state_name"
  ) |>
  arrange(desc(gamma_i)) |>
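  mutate(rank = row_number()) |>
  # Keep ranks 1-10 and 41-50; rank already carries those values after slicing.
  # Assumes state_gamma holds exactly 50 states, so 41:50 are the lowest ten.
  slice(c(1:10, 41:50)) |>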
  select(rank, state_name, state_abbr, gamma_i, intl_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State-Specific Natural Growth Rates (γ_i), 2024**"),
    subtitle = "States with highest and lowest estimated natural growth rates"
  ) |>
  cols_label(
    state_name = "State",
    state_abbr = "",
    gamma_i    = "γ_i (Natural Growth Rate)",
    intl_in    = "Estimated International Arrivals"
  ) |>
  fmt_percent(gamma_i, decimals = 2) |>
  fmt_integer(intl_in) |>
  tab_row_group(label = "10 Lowest Natural Growth Rates",  rows = 11:20) |>
  tab_row_group(label = "10 Highest Natural Growth Rates", rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_name == "Texas")
  ) |>
  data_color(
    columns = gamma_i,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates + Migration Flows (2024). γ_i estimated as
     (P_2024 - P_2023 - intl_in) / P_2023 for each state."
  )
State-Specific Natural Growth Rates (γ_i), 2024
States with highest and lowest estimated natural growth rates
State γ_i (Natural Growth Rate) Estimated International Arrivals
10 Highest Natural Growth Rates
1 Florida FL 1.11% 510,281
2 Utah UT 0.52% 68,250
3 Texas TX 0.39% 667,958
4 Nevada NV 0.37% 61,543
5 Delaware DE 0.36% 16,268
6 South Carolina SC 0.34% 87,006
7 Idaho ID 0.30% 31,051
8 New Jersey NJ 0.20% 191,617
9 Arizona AZ 0.17% 138,297
10 Massachusetts MA −0.00% 135,047
10 Lowest Natural Growth Rates
41 Vermont VT −0.88% 6,707
42 Virginia VA −0.89% 172,758
43 New Hampshire NH −0.91% 19,677
44 Montana MT −0.94% 15,095
45 Hawaii HI −1.12% 27,091
46 West Virginia WV −1.13% 19,977
47 Alaska AK −1.15% 15,186
48 Louisiana LA −1.16% 77,013
49 South Dakota SD −1.24% 16,712
50 Mississippi MS −1.47% 46,428
Source: ACS 1-Year Estimates + Migration Flows (2024). γ_i estimated as (P_2024 - P_2023 - intl_in) / P_2023 for each state.
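
As a minimal sketch of the formula in the source note, the snippet below computes γ_i for two made-up states. The data frame `toy` and its column names `pop_2023` and `pop_2024` are hypothetical; only `intl_in` and `gamma_i` mirror names used in the report.

```r
# Toy inputs: hypothetical populations and international arrivals,
# not values from the ACS tables used in the report
toy <- data.frame(
  state_name = c("A", "B"),
  pop_2023   = c(1000000, 500000),
  pop_2024   = c(1012000, 498000),
  intl_in    = c(8000, 3000)
)

# gamma_i = (P_2024 - P_2023 - intl_in) / P_2023, as stated in the source note
toy$gamma_i <- (toy$pop_2024 - toy$pop_2023 - toy$intl_in) / toy$pop_2023

print(toy)
```

As written, the formula removes only international arrivals from the year-over-year change, so any net domestic migration stays inside γ_i. That may be worth keeping in mind when reading the rankings above.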
Code
focus_states_ec <- c("Texas", "California", "Florida",
                      "New York", "Arizona", "Illinois")

bind_rows(
  pop_projections |>
    filter(year == 2030, state_name %in% focus_states_ec) |>
    mutate(model = "National γ"),
  pop_projections_ec |>
    filter(year == 2030, state_name %in% focus_states_ec) |>
    mutate(model = "State-specific γ")
) |>
  mutate(
    state_name = factor(state_name,
                        levels = focus_states_ec)
  ) |>
  ggplot(aes(x = state_name,
             y = total_population / 1e6,
             fill = model)) +
  geom_col(position = "dodge", width = 0.6) +
  scale_fill_manual(
    values = c("National γ"       = "#002868",
               "State-specific γ" = "#BF0A30"),
    name   = "Model"
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "M")) +
  labs(
    title    = "2030 Population Projections: National vs. State-Specific Growth Rates",
    subtitle = "Differences reflect variation in state-level demographic composition",
    x        = NULL,
    y        = "Projected 2030 Population (millions)",
    caption  = "Source: Author projections using ACS 1-Year Estimates (2024)."
  ) +
  theme_tx()

Comparison of 2030 population projections under the national gamma model (baseline) versus state-specific gamma values for selected states. Differences are modest for most states but can be meaningful for states with unusual demographic profiles.

Extra Credit #04: Model Validation

How accurate is our projection model? To find out, we test it against history. We start the model from the 2019 ACS populations, the last full year before COVID disrupted census collection, and project forward to 2022, 2023, and 2024. Because no 2019 migration file is parsed here, the growth and migration parameters (γ and λ) are held at the averages used in the main model rather than re-estimated from 2019 data. We then compare those projections to the actual ACS-1 populations reported for those years. Since we already know what happened, this lets us measure how far off the model would have been, giving us a concrete sense of the uncertainty around our 2030 forecast.

Code
# ── Fit parameters using 2019 data ────────────────────────────────────────────
pop_2019 <- state_pop |> filter(year == 2019)
pop_2018 <- state_pop |> filter(year == 2018)

# No 2019 migration file is parsed here, so neither parameter is re-estimated
# from 2019 data. We reuse the main model's averages instead, consistent with
# the assignment's approach of using available data. This means the backtest
# checks the projection mechanics rather than a fully out-of-sample fit.

# Lambda comes from the 2024 data (most recent available) and is held constant
lambda_backtest <- lambda_avg

# gamma_avg stands in for a "historical" gamma estimate
gamma_backtest <- gamma_avg

# ── Project from 2019 → 2022, 2023, 2024 ─────────────────────────────────────
current_bt <- pop_2019 |>
  st_drop_geometry() |>
  select(state_name, state_abbr, total_population)

backtest_projections <- list()

for (yr in 2020:2024) {
  current_bt <- project_one_year_ec03(
    current_bt,
    state_gamma |> mutate(gamma_i = NA),  # use national gamma
    gamma_backtest,
    lambda_backtest
  )
  if (yr >= 2022) {
    backtest_projections[[as.character(yr)]] <-
      current_bt |> mutate(year = yr, type = "Predicted")
  }
}

pop_backtest <- bind_rows(backtest_projections)

# ── Compare to realized populations ──────────────────────────────────────────
realized <- state_pop |>
  st_drop_geometry() |>
  filter(year %in% c(2022, 2023, 2024)) |>
  select(state_name, state_abbr, total_population, year) |>
  mutate(type = "Realized")

validation <- bind_rows(pop_backtest, realized) |>
  pivot_wider(
    id_cols     = c(state_name, state_abbr, year),
    names_from  = type,
    values_from = total_population
  ) |>
  mutate(
    error     = Predicted - Realized,
    abs_error = abs(error),
    pct_error = 100 * error / Realized
  ) |>
  filter(!is.na(Predicted), !is.na(Realized))
Code
# Summary of prediction errors by horizon
validation |>
  group_by(year) |>
  summarise(
    horizon         = first(year) - 2019,
    mean_abs_error  = mean(abs_error, na.rm = TRUE),
    median_pct_error = median(abs(pct_error), na.rm = TRUE),
    max_abs_error   = max(abs_error, na.rm = TRUE),
    worst_state     = state_name[which.max(abs_error)],
    .groups         = "drop"
  ) |>
  select(year, horizon, mean_abs_error,
         median_pct_error, max_abs_error, worst_state) |>
  gt() |>
  tab_header(
    title    = md("**Backtest Prediction Errors by Forecast Horizon**"),
    subtitle = "Model fitted on 2019 data; compared to realized 2022-2024 populations"
  ) |>
  cols_label(
    year             = "Forecast Year",
    horizon          = "Years Ahead",
    mean_abs_error   = "Mean Absolute Error",
    median_pct_error = "Median % Error",
    max_abs_error    = "Max Absolute Error",
    worst_state      = "Worst Predicted State"
  ) |>
  fmt_integer(c(mean_abs_error, max_abs_error)) |>
  # pattern appends the percent sign; fmt_number() has no `suffix` argument
  fmt_number(median_pct_error, decimals = 2, pattern = "{x}%") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  data_color(
    columns = median_pct_error,
    palette = c("#EEF2FF", "#BF0A30")
  ) |>
  tab_source_note(
    "Source: Author backtest using ACS 1-Year Estimates (2019-2024)."
  )
Backtest Prediction Errors by Forecast Horizon
Model fitted on 2019 data; compared to realized 2022-2024 populations
Forecast Year Years Ahead Mean Absolute Error Median % Error Max Absolute Error Worst Predicted State
2022 3 167,560 2.33 1,380,610 Texas
2023 4 217,382 3.06 1,968,983 Texas
2024 5 329,717 4.49 2,870,663 Texas
Source: Author backtest using ACS 1-Year Estimates (2019-2024).
Code
validation |>
  mutate(horizon = paste0(year - 2019, "-year horizon (", year, ")")) |>
  ggplot(aes(x = reorder(state_abbr, abs_error),
             y = abs_error / 1000,
             fill = abs_error / 1000)) +
  geom_col(width = 0.7) +
  scale_fill_gradient(
    low    = "#EEF2FF",
    high   = "#BF0A30",
    guide  = "none"
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "k")) +
  facet_wrap(~ horizon, ncol = 1) +
  coord_flip() +
  labs(
    title   = "Backtest Prediction Errors by State and Horizon",
    subtitle = "Model trained on 2019 data; errors grow with forecast horizon as expected",
    x       = NULL,
    y       = "Absolute Error (thousands)",
    caption = "Source: Author backtest using ACS 1-Year Estimates (2019-2024)."
  ) +
  theme_tx() +
  theme(axis.text.y = element_text(size = 7))

Absolute prediction error by state for each forecast horizon. Errors grow with horizon length as expected. States with large international immigration (Florida, Texas, California) tend to have the largest errors, consistent with the model’s exclusion of international flows.
Code
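# Use the backtest errors at the 3-, 4-, and 5-year horizons to estimate a
# margin of error on the 2030 projections. The 2030 projection is 6 years ahead
# of 2024, so we extrapolate error growth to a 6-year horizon.
# Note: RMSE is pooled across all states, not computed for Texas alone.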

error_by_horizon <- validation |>
  group_by(horizon = year - 2019) |>
  summarise(
    rmse = sqrt(mean(error^2, na.rm = TRUE)),
    .groups = "drop"
  )

# Fit a simple linear model of RMSE ~ horizon
error_model <- lm(rmse ~ horizon, data = error_by_horizon)
projected_rmse_6yr <- predict(error_model,
                               newdata = data.frame(horizon = 6))

tx_2030_proj <- pop_projections |>
  filter(year == 2030, state_name == "Texas") |>
  pull(total_population)

tibble(
  metric = c(
    "Texas projected 2030 population",
    "Estimated margin of error (±1 RMSE, 6-year horizon)",
    "90% confidence interval lower bound",
    "90% confidence interval upper bound"
  ),
  value = c(
    format(round(tx_2030_proj), big.mark = ","),
    format(round(projected_rmse_6yr), big.mark = ","),
    format(round(tx_2030_proj - 1.645 * projected_rmse_6yr), big.mark = ","),
    format(round(tx_2030_proj + 1.645 * projected_rmse_6yr), big.mark = ",")
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Margin of Error on 2030 Texas Population Projection**"),
    subtitle = "Based on backtest RMSE extrapolated to 6-year forecast horizon"
  ) |>
  cols_label(metric = "Metric", value = "Value") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 3, 4))
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(rows = 2)
  ) |>
  tab_source_note(
    "RMSE extrapolated linearly from 3-year backtest errors.
     Confidence interval assumes normally distributed errors."
  )
Margin of Error on 2030 Texas Population Projection
Based on backtest RMSE extrapolated to 6-year forecast horizon
Metric Value
Texas projected 2030 population 30,543,128
Estimated margin of error (±1 RMSE, 6-year horizon) 748,553
90% confidence interval lower bound 29,311,758
90% confidence interval upper bound 31,774,498
RMSE extrapolated linearly from 3-year backtest errors. Confidence interval assumes normally distributed errors.
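
The interval bounds can be checked by hand. The sketch below plugs the rounded table values into the same ±1.645 × RMSE rule; the variable names are illustrative, and `qnorm(0.95)` is shown only to indicate where the 1.645 multiplier comes from under the normality assumption.

```r
# Inputs copied from the table above (rounded to whole persons)
tx_2030 <- 30543128   # Texas projected 2030 population
rmse_6y <- 748553     # extrapolated RMSE at the 6-year horizon

# Two-sided 90% interval under normal errors uses the 95th percentile
z_90 <- qnorm(0.95)   # approximately 1.645

# The report's code uses the rounded multiplier 1.645
lower <- tx_2030 - 1.645 * rmse_6y
upper <- tx_2030 + 1.645 * rmse_6y

format(round(c(lower, upper)), big.mark = ",")
```

Because the RMSE is averaged over all states, this interval describes typical state-level error rather than the error observed for Texas in the backtest.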

Extra Credit #01: The Case for Expanding the House

Note: EC#01: Expanding the House of Representatives

The United States House of Representatives has been fixed at 435 members since the Permanent Apportionment Act of 1929, when the most recent census (1920) had counted about 106 million people. The United States is now home to over 335 million people. The result is a chamber where each member represents, on average, nearly 770,000 constituents, giving the US House one of the highest constituent-to-legislator ratios among wealthy democracies. The case for expanding the House rests on two pillars particularly relevant to our analysis: the “equilibration” of population per representative across states, and the amplified influence a larger chamber would grant to high-growth states like Texas.

Equilibrating Representation Across States

The single most glaring inequity in the current system is the enormous variance in population per representative across states. Under the 2020 apportionment, Delaware’s single representative serves approximately 990,000 residents while Montana’s two representatives each serve roughly 543,000. This disparity (a ratio of nearly 2:1) is a mathematical consequence of allocating whole seats, including the constitutional guarantee of at least one per state, under a fixed 435-seat cap. With a larger House, each seat represents fewer people, so the rounding built into the Huntington-Hill priority formula leaves smaller residual inequities. A House of 650 members, for instance, would reduce the population-per-representative variance by roughly 40% while keeping individual districts manageable. This change could be made by amending the 1929 Act through ordinary legislation, rather than by constitutional amendment. 3
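
To make the mechanism concrete, here is a minimal sketch of Huntington-Hill apportionment in R, run at two house sizes. The function name `huntington_hill` and the four toy populations are hypothetical and are not the report's apportionment code or projections.

```r
# Huntington-Hill: every state starts with one seat, then each remaining seat
# goes to the state with the highest priority value P / sqrt(n * (n + 1)),
# where n is the number of seats the state currently holds.
huntington_hill <- function(populations, house_size) {
  stopifnot(house_size >= length(populations))
  seats <- rep(1L, length(populations))
  names(seats) <- names(populations)
  for (i in seq_len(house_size - length(populations))) {
    priority <- populations / sqrt(seats * (seats + 1))
    winner <- which.max(priority)
    seats[winner] <- seats[winner] + 1L
  }
  seats
}

# Hypothetical populations (not the report's projections)
toy_pop <- c(Large = 9000000, Medium = 3000000, Small = 1000000, Tiny = 600000)

for (size in c(10, 20)) {
  seats    <- huntington_hill(toy_pop, size)
  per_seat <- toy_pop / seats
  cat("House size", size, "- seats:", seats, "\n")
  cat("  max/min persons per seat:",
      round(max(per_seat) / min(per_seat), 2), "\n")
}
```

In this toy case the ratio between the most and least populous districts narrows as the house grows, which is the effect the paragraph above describes. The size of the improvement depends on the population mix, so the toy numbers should not be read as an estimate for the real House.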

Amplified Power for Texas

For Texas specifically, a larger House would be beneficial. Because Texas is comfortably above the minimum-seat threshold, its delegation under the Huntington-Hill method scales roughly in proportion to House size. Our projections show Texas at risk of losing its 44th seat under a 435-seat House. Under a 650-seat House using the same 2030 population projections, Texas would be allocated approximately 66 seats (a net gain of 22), compared to a gain of roughly 15 for California (which currently has 52 seats). This reflects Texas’s faster recent growth: Huntington-Hill awards each additional seat to the state with the highest priority value, its population divided by the geometric mean of its current and next seat counts, so a state whose population has grown relative to its current delegation claims a larger share of the seats created by expansion. In short, expanding the House is one of the few structural reforms that simultaneously improves democratic representation and benefits Texas’s political position.

Extra Credit #02: Spatial Visualizations

Maps and diagrams reveal patterns that tables and numbers alone cannot. The following visualizations show where migration is happening geographically: which states are gaining and losing residents, how congressional seats are shifting, and which migration corridors connect Texas most strongly to the rest of the country.

Map 1: Net Domestic Migration by State, 2024

Code
# Join migration totals to state geometries
map_migration <- state_pop |>
  filter(year == 2024) |>
  left_join(
    state_migration_totals |>
      select(state, net_migration),
    by = c("state_name" = "state")
  )

ggplot(map_migration) +
  geom_sf(aes(fill = net_migration / 1000), colour = "white", linewidth = 0.3) +
  scale_fill_gradient2(
    low      = "#BF0A30",
    mid      = "#F5F5F5",
    high     = "#002868",
    midpoint = 0,
    name     = "Net Migration\n(thousands)",
    labels   = \(x) paste0(ifelse(x > 0, "+", ""), x, "k")
  ) +
  labs(
    title   = "Net Domestic Migration by State, 2024",
    subtitle = "Blue = net in-migration; Red = net out-migration",
    caption = "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  ) +
  theme_void() +
  theme(
    plot.title    = element_text(face = "bold", size = 14,
                                 colour = texas_palette["blue"],
                                 margin = margin(b = 4)),
    plot.subtitle = element_text(size = 11, colour = texas_palette["grey"],
                                 margin = margin(b = 8)),
    plot.caption  = element_text(size = 8, colour = texas_palette["grey"],
                                 hjust = 0),
    legend.position  = "right",
    legend.title     = element_text(size = 9, face = "bold"),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

Net domestic in-migration by state, 2024. Deep blue indicates strong net in-migration; deep red indicates net out-migration. Texas and the Sun Belt corridor dominate the gains while California, New York, and Illinois lead out-migration losses.

Map 2: Projected Congressional Seat Changes, 2030

Code
# Join apportionment changes to state geometries
map_apportionment <- state_pop |>
  filter(year == 2024) |>
  left_join(
    apportionment_change |>
      select(state_name, seat_change),
    by = "state_name"
  )

ggplot(map_apportionment) +
  geom_sf(aes(fill = factor(seat_change,
                             levels = c(-1, 0, 1),
                             labels = c("−1 seat", "No change", "+1 seat"))),
          colour = "white", linewidth = 0.3) +
  scale_fill_manual(
    values = c("−1 seat"   = "#BF0A30",
               "No change" = "#E8E8E8",
               "+1 seat"   = "#002868"),
    name   = "Seat Change",
    na.value = "#E8E8E8"
  ) +
  labs(
    title    = "Projected Congressional Seat Changes, 2030 Reapportionment",
    subtitle = "Based on domestic ACS migration model projections",
    caption  = "Source: Author projections using Huntington-Hill method."
  ) +
  theme_void() +
  theme(
    plot.title    = element_text(face = "bold", size = 14,
                                 colour = texas_palette["blue"],
                                 margin = margin(b = 4)),
    plot.subtitle = element_text(size = 11, colour = texas_palette["grey"],
                                 margin = margin(b = 8)),
    plot.caption  = element_text(size = 8, colour = texas_palette["grey"],
                                 hjust = 0),
    legend.position  = "right",
    legend.title     = element_text(size = 9, face = "bold"),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

Projected congressional seat changes under the 2030 reapportionment. Texas and Virginia lose one seat each. The vast majority of states retain their current delegation.

Chord Diagram: Top Migration Corridors Into and Out of Texas

The chord diagram below shows the 15 largest migration flows involving Texas in 2024. Each chord connects an origin state to a destination state, with width proportional to the number of people moving. Texas’s dominant corridors - California, Florida, New York - are immediately apparent, as is the bidirectional nature of the California-Texas relationship.

Code
# Build edge list for flows involving Texas
texas_flows <- migration_flows |>
  filter(
    year_current == 2024,
    state_current != state_1y,
    (state_current == "Texas" | state_1y == "Texas")
  ) |>
  mutate(from = state_1y, to = state_current) |>
  select(from, to, population) |>
  arrange(desc(population)) |>
  slice_head(n = 15)

all_states <- unique(c(texas_flows$from, texas_flows$to))

g <- graph_from_data_frame(
  d        = texas_flows,
  vertices = data.frame(name = all_states),
  directed = TRUE
)

ggraph(g, layout = "linear", circular = TRUE) +
  geom_edge_arc(
    aes(edge_width = population / 1000,
        edge_alpha = population / max(population)),
    colour      = "#002868",
    show.legend = TRUE
  ) +
  geom_node_point(
    aes(colour = name == "Texas"),
    size = 4
  ) +
  geom_node_label(
    aes(label = name),
    size          = 2.5,
    label.padding = unit(0.15, "lines"),
    fill          = "white",
    colour        = texas_palette["grey"]
  ) +
  scale_edge_width(
    range  = c(0.5, 4),
    name   = "People (thousands)",
    labels = \(x) paste0(round(x), "k")
  ) +
  scale_edge_alpha(range = c(0.3, 0.9), guide = "none") +
  scale_colour_manual(
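    # [[ ]] drops the inner names; c("TRUE" = texas_palette["blue"]) would
    # produce the name "TRUE.blue", which the scale cannot match to TRUE/FALSE
    values = c("TRUE"  = texas_palette[["blue"]],
               "FALSE" = texas_palette[["grey"]]),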
    guide  = "none"
  ) +
  labs(
    title    = "Top 15 Migration Flows Involving Texas, 2024",
    subtitle = "Chord width proportional to number of people moving; Texas highlighted in blue",
    caption  = "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  ) +
  theme_graph(base_family = "sans") +
  theme(
    plot.title      = element_text(face = "bold", size = 14,
                                   colour = texas_palette["blue"]),
    plot.subtitle   = element_text(size = 10,
                                   colour = texas_palette["grey"]),
    plot.caption    = element_text(size = 8,
                                   colour = texas_palette["grey"],
                                   hjust = 0),
    legend.position = "bottom"
  )

Top 15 migration flows involving Texas, 2024. Chord width is proportional to the number of people moving. Texas is shown in blue; other states in grey. The California corridor dominates in both directions.

Generative AI (Claude, Anthropic) was used in this mini-project exclusively for assistance with R code - specifically for debugging data parsing issues with the Census Bureau Excel files, troubleshooting the population projection model, and refining ggplot2 and gt table formatting. All written narrative, analytical interpretations, and conclusions are my own. No AI was used to write or edit any non-code text in this report.

Footnotes

  1. The population projection model and formula were specified by the course instructor as part of the Mini-Project #03 assignment. Our contribution is in fitting the parameters (γ and λ) to the ACS data, implementing the projection in R (both with the use of AI), and interpreting the results.

  2. The Huntington-Hill method is the official algorithm used to apportion US House seats since 1941. It was developed by mathematician Edward V. Huntington and Census Bureau statistician Joseph A. Hill to minimize the relative difference in representation between states. It is mandated by federal law (2 U.S.C. § 2a) and was used most recently following the 2020 census. Its use here was specified by the course instructor as part of the assignment.

  3. The 435-seat limit derives from the Reapportionment Act of 1929 (also known as the Permanent Apportionment Act), Pub. L. 71-13, codified at 2 U.S.C. § 2a. The Constitution (Article I, Section 2) requires only that seats be apportioned among states following each census; it does not specify the total number of seats.