STA 9750 Mini-Project #03: Who Goes There?

US Internal Migration and Implications for Congressional Reapportionment in Texas

Author

Raúl J. Solá Navarro

Published

April 25, 2026

Introduction

Between 2020 and 2024, Texas added more people than any other state in the nation. The Sun Belt surge that has reshaped American politics over the past two decades has put Texas at the center of the country’s demographic transformation. As the Lone Star State gains congressional clout, its northern counterparts watch seats slip away. But what do the data actually say? Where are new Texans coming from, where are departing Texans going, and can a governor with a targeted advertising budget shift the numbers ahead of the 2030 census?

This report uses American Community Survey (ACS) state-to-state and metro-to-metro migration flow data to quantify those patterns, build population forecasts through 2030, and estimate Texas’s likely congressional apportionment for the 2032 elections. We close with a data-driven advertising strategy designed to maximize the state’s political footprint.

Reproducibility note: All external files are cached in data/mp03/. If you are running this document for the first time, the files will be downloaded automatically. Subsequent renders use the cached copies and require no internet access.
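The caching behaviour described in the note can be sketched roughly as follows. This is a minimal illustration, not the report's actual download code: the helper name `cached_download()` and its arguments are hypothetical, and only the `data/mp03/` location comes from the note above.

```r
# Hypothetical helper: fetch a file once, then reuse the cached copy.
cached_download <- function(url, filename,
                            cache_dir = file.path("data", "mp03")) {
  # Create the cache directory on first use (no warning if it already exists).
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(cache_dir, filename)

  if (!file.exists(dest)) {
    # First render: the file is missing, so download it.
    download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  }

  # Later renders: return the cached path without touching the network.
  dest
}
```

Because the network call sits behind the `file.exists()` check, a second render with the cache in place never reaches `download.file()`.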

State Population Data

Texas has added more residents than any other state over the past decade. The table below ranks states by their 2024 population: Texas sits firmly in second place with more than 31 million residents, trailing only California. For growth rates and absolute population added since 2015, see Q1 of the Exploratory Data Analysis section.

Code

# B01003_001 = Total Population
# Note: standard 2020 ACS-1 estimates were not released because of low
# response rates during COVID-19, so 2020 is skipped below.

pop_raw <- map(
  c(2015:2019, 2021:2024),
  \(yr) get_acs(
    geography   = "state",
    variables   = c(total_population = "B01003_001"),
    year        = yr,
    survey      = "acs1",
    geometry    = TRUE,
    cache_table = TRUE
  ) |>
    mutate(year = yr)
) |>
  list_rbind()

state_pop <- pop_raw |>
  tigris::shift_geometry() |>
  filter(!NAME %in% c("District of Columbia", "Puerto Rico")) |>
  transmute(
    geoid            = GEOID,
    state_name       = NAME,
    state_abbr       = state.abb[match(NAME, state.name)],
    year             = year,
    total_population = estimate,
    pop_moe          = moe,
    geometry         = geometry
  )

Code

state_pop_2024 <- state_pop |>
  filter(year == 2024) |>
  arrange(desc(total_population))

bind_rows(
  slice_head(state_pop_2024, n = 10),
  slice_tail(state_pop_2024, n = 10)
) |>
  st_drop_geometry() |>
  mutate(rank = c(1:10, 41:50)) |>
  select(rank, state_name, state_abbr, total_population) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Populations, 2024 ACS-1 Estimates**"),
    subtitle = "Largest and smallest states by total population"
  ) |>
  cols_label(
    state_name       = "State",
    state_abbr       = "",
    total_population = "Population"
  ) |>
  fmt_integer(total_population) |>
  tab_row_group(label = "10 Smallest States", rows = 11:20) |>
  tab_row_group(label = "10 Largest States",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  tab_footnote(
    footnote  = "Texas highlighted in blue.",
    locations = cells_column_labels(state_name)
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 1-Year Estimates (2024), Table B01003."
  )

State Populations, 2024 ACS-1 Estimates
Largest and smallest states by total population
Rank	State¹	Abbr.	Population
10 Largest States
1	California	CA	39,431,263
2	Texas	TX	31,290,831
3	Florida	FL	23,372,215
4	New York	NY	19,867,248
5	Pennsylvania	PA	13,078,751
6	Illinois	IL	12,710,158
7	Ohio	OH	11,883,304
8	Georgia	GA	11,180,878
9	North Carolina	NC	11,046,024
10	Michigan	MI	10,140,459
10 Smallest States
41	New Hampshire	NH	1,409,032
42	Maine	ME	1,405,012
43	Montana	MT	1,137,233
44	Rhode Island	RI	1,112,308
45	Delaware	DE	1,051,917
46	South Dakota	SD	924,669
47	North Dakota	ND	796,568
48	Alaska	AK	740,133
49	Vermont	VT	648,493
50	Wyoming	WY	587,618
¹ Texas highlighted in blue.
Source: US Census Bureau, ACS 1-Year Estimates (2024), Table B01003.

The chart below tracks cumulative population growth since 2015 for Texas and four peer states, indexed to 100 at the start of the period. The gap at 2020 appears because no standard ACS-1 estimates were released for that year after COVID-19 disrupted data collection. Texas and Florida pull clearly away from the pack from about 2021 onward.

Code

focus_states <- c("Texas", "California", "Florida", "New York", "Arizona")

pop_indexed <- state_pop |>
  st_drop_geometry() |>
  filter(state_name %in% focus_states) |>
  group_by(state_name) |>
  mutate(
    pop_2015 = total_population[year == 2015],
    index    = 100 * total_population / pop_2015
  ) |>
  ungroup()

us_total <- state_pop |>
  st_drop_geometry() |>
  group_by(year) |>
  summarise(total_population = sum(total_population), .groups = "drop") |>
  mutate(
    state_name = "US Total",
    pop_2015   = total_population[year == 2015],
    index      = 100 * total_population / pop_2015
  )

state_colors <- c(
  "Texas"      = "#002868",
  "Florida"    = "#BF0A30",
  "Arizona"    = "#D4A853",
  "California" = "#2E8B8B",
  "New York"   = "#6B6B6B",
  "US Total"   = "black"
)

state_lty <- c(
  "Texas"      = "solid",
  "Florida"    = "solid",
  "Arizona"    = "solid",
  "California" = "solid",
  "New York"   = "solid",
  "US Total"   = "dashed"
)

bind_rows(pop_indexed, us_total) |>
  ggplot(aes(x = year, y = index,
             colour = state_name, linetype = state_name)) +
  geom_line(linewidth = 1.1) +
  geom_point(size = 2) +
  geom_text(
    data = \(d) d |> filter(year == 2024),
    aes(label = glue("{state_name}\n{round(index, 1)}")),
    hjust = -0.08, size = 3, show.legend = FALSE
  ) +
  scale_colour_manual(values = state_colors) +
  scale_linetype_manual(values = state_lty) +
  scale_x_continuous(
    breaks = 2015:2024,
    limits = c(2015, 2026.5)
  ) +
  scale_y_continuous(
    labels = \(x) paste0(x - 100, "%"),
    breaks = seq(95, 125, 5)
  ) +
  labs(
    title    = "Population Growth Since 2015 (Indexed to 100)",
    subtitle = "Florida and Texas outpace the other comparison states and the national average",
    x        = NULL,
    y        = "Cumulative Growth Since 2015",
    colour   = NULL,
    linetype = NULL,
    caption  = "Source: ACS 1-Year Estimates, Table B01003 (2015-2024)."
  ) +
  theme_tx() +
  theme(legend.position = "none")

Texas population growth versus the national average and selected comparison states, 2015-2024. No standard ACS-1 estimates were released for 2020, so that year is missing. Over the full period Texas outpaced every comparison state except Florida.
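To make the indexing concrete, here is a minimal base-R sketch that recomputes the 2024 index for four of the charted states from the 2015 and 2024 populations reported in Q1. The data frame `pop` and its column names are illustrative only and are not part of the report's pipeline.

```r
# 2015 and 2024 ACS-1 populations, copied from the Q1 growth table.
pop <- data.frame(
  state    = c("Texas", "Florida", "California", "New York"),
  pop_2015 = c(27469114, 20271272, 39144818, 19795791),
  pop_2024 = c(31290831, 23372215, 39431263, 19867248)
)

# Index each state's 2024 population to its own 2015 base (2015 = 100).
pop$index_2024 <- 100 * pop$pop_2024 / pop$pop_2015

# Subtracting 100 turns the index into cumulative percentage growth,
# which is what the chart's y-axis labels display.
pop$growth_pct <- pop$index_2024 - 100

# Show the states from highest to lowest index.
print(pop[order(-pop$index_2024), c("state", "index_2024", "growth_pct")],
      digits = 4)
```

Indexing puts states of very different sizes on one scale, which is why the chart can compare Texas with much smaller Arizona directly.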

State-to-State Migration Flows

To understand where Texans are coming from, and where they’re going, we turn to the Census Bureau’s state-to-state migration flow files, derived from ACS-1 respondents who reported living in a different state one year prior. We parse the 2023 and 2024 flow files separately (they have different formats) and combine them into a single tidy table.

The table below highlights a striking pattern among the four largest states: California is simultaneously Texas’s largest source of in-migrants (77,161 people) and, narrowly ahead of Florida (45,259), the top destination for Texans leaving the state (45,447 people). With roughly 122,600 movers in the two directions combined, this is the largest migration corridor among the four states.

Code

migration_flows |>
  filter(
    state_current %in% c("Texas", "California", "New York", "Florida"),
    state_1y      %in% c("Texas", "California", "New York", "Florida"),
    year_current  == 2024
  ) |>
  arrange(state_current, state_1y) |>
  mutate(
    label = if_else(
      state_current == state_1y,
      glue("Stayed in {state_current}"),
      glue("{state_1y} → {state_current}")
    )
  ) |>
  select(label, population, year_current, year_1y) |>
  gt() |>
  tab_header(
    title    = md("**State-to-State Migration Flows, 2024**"),
    subtitle = "Selected flows among four largest states"
  ) |>
  cols_label(
    label        = "Flow",
    population   = "People",
    year_current = "Survey Year",
    year_1y      = "Prior Year"
  ) |>
  fmt_integer(population) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = grepl("Texas", label))
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = grepl("Stayed", label),
      columns = label
    )
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 1-Year State-to-State Migration Flow Tables."
  )

State-to-State Migration Flows, 2024
Selected flows among four largest states
Flow	People	Survey Year	Prior Year
Stayed in California	38,325,648	2024	2023
Florida → California	15,988	2024	2023
New York → California	24,927	2024	2023
Texas → California	45,447	2024	2023
California → Florida	36,194	2024	2023
Stayed in Florida	22,289,936	2024	2023
New York → Florida	50,661	2024	2023
Texas → Florida	45,259	2024	2023
California → New York	31,367	2024	2023
Florida → New York	28,080	2024	2023
Stayed in New York	19,210,972	2024	2023
Texas → New York	16,324	2024	2023
California → Texas	77,161	2024	2023
Florida → Texas	52,219	2024	2023
New York → Texas	28,233	2024	2023
Stayed in Texas	30,069,004	2024	2023
Source: US Census Bureau, ACS 1-Year State-to-State Migration Flow Tables.
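The two-way flows in the table above can be netted out pair by pair. The sketch below does this for Texas using only the six Texas-related values from that table. The object names (`flows`, `tx_net`) are illustrative and do not appear in the report's own code.

```r
# Texas-related 2024 flows, copied from the table above.
flows <- data.frame(
  from = c("California", "Florida", "New York", "Texas", "Texas", "Texas"),
  to   = c("Texas", "Texas", "Texas", "California", "Florida", "New York"),
  n    = c(77161, 52219, 28233, 45447, 45259, 16324)
)

# Split into arrivals to and departures from Texas, keyed by partner state.
inflow  <- flows[flows$to == "Texas",   c("from", "n")]
outflow <- flows[flows$from == "Texas", c("to",   "n")]
names(inflow)  <- c("partner", "moved_in")
names(outflow) <- c("partner", "moved_out")

# Join the two directions and compute net and gross exchange per partner.
tx_net <- merge(inflow, outflow, by = "partner")
tx_net$net   <- tx_net$moved_in - tx_net$moved_out
tx_net$gross <- tx_net$moved_in + tx_net$moved_out

tx_net
```

Texas comes out ahead against all three partners, with the California exchange largest on both the net and the gross measure.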

Metro-Area Migration Flows

State-level data tells us where people are moving in aggregate, but metro-area flows reveal the specific urban corridors driving Texas’s growth. The Houston, Dallas, and Austin metros are among the fastest-growing in the country, but which cities are feeding them? We use the Census Bureau’s ACS-5 metro-to-metro migration API to find out.

The Los Angeles metro sends more people to Texas than any other single metro area: over 27,000 per year during the 2016-2020 period. New York and Chicago follow, so Texas’s domestic in-migration draws most heavily on the country’s three largest metros. The pattern is consistent with residents leaving high-cost, congested legacy metros, although the flow data alone cannot show why people moved.

Code

metro_flows |>
  filter(
    metro1_state == "TX",
    !is.na(metro2_state),
    metro2_state != "TX"
  ) |>
  group_by(metro2_name) |>
  summarise(
    total_moved_in = sum(moved_in, na.rm = TRUE),
    .groups = "drop"
  ) |>
  slice_max(total_moved_in, n = 10) |>
  mutate(rank = row_number()) |>
  select(rank, metro2_name, total_moved_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Top 10 Metro Sources of In-Migration to Texas**"),
    subtitle = "Non-Texas metros sending the most residents to Texas, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    metro2_name    = "Origin Metro Area",
    total_moved_in = "People Moved In"
  ) |>
  fmt_integer(total_moved_in) |>
  data_color(
    columns = total_moved_in,
    palette = c("#EEF2FF", "#002868")
  ) |>
  tab_source_note(
    "Source: US Census Bureau, ACS 5-Year Migration Flows API (2016-2020)."
  )

Top 10 Metro Sources of In-Migration to Texas
Non-Texas metros sending the most residents to Texas, ACS 5-Year 2016-2020
Rank	Origin Metro Area	People Moved In
1	Los Angeles-Long Beach-Anaheim, CA Metro Area	27,445
2	New York-Newark-Jersey City, NY-NJ-PA Metro Area	19,853
3	Chicago-Naperville-Elgin, IL-IN-WI Metro Area	16,253
4	Washington-Arlington-Alexandria, DC-VA-MD-WV Metro Area	14,916
5	Phoenix-Mesa-Chandler, AZ Metro Area	12,695
6	San Francisco-Oakland-Berkeley, CA Metro Area	11,203
7	Miami-Fort Lauderdale-Pompano Beach, FL Metro Area	10,300
8	San Diego-Chula Vista-Carlsbad, CA Metro Area	9,509
9	Denver-Aurora-Lakewood, CO Metro Area	9,222
10	Atlanta-Sandy Springs-Alpharetta, GA Metro Area	9,159
Source: US Census Bureau, ACS 5-Year Migration Flows API (2016-2020).

Exploratory Data Analysis

The following analyses address nine key questions about US internal migration patterns, with particular attention to Texas and the Houston metro area.

Q1: Highest State Population Growth Rates

Which states have grown the fastest over the past decade? We measure growth as the percentage change in total population from 2015 to 2024. Idaho and Utah lead on a percentage basis, but Texas leads all states in absolute population added, with more than 3.8 million people over the period.

Code

growth_rates <- state_pop |>
  st_drop_geometry() |>
  filter(year %in% c(2015, 2024)) |>
  select(state_name, state_abbr, year, total_population) |>
  pivot_wider(names_from = year, values_from = total_population,
              names_prefix = "pop_") |>
  mutate(
    growth_pct = 100 * (pop_2024 - pop_2015) / pop_2015,
    pop_added  = pop_2024 - pop_2015
  ) |>
  arrange(desc(growth_pct))

growth_rates |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  select(rank, state_name, state_abbr, pop_2015, pop_2024,
         growth_pct, pop_added) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Population Growth Rates, 2015-2024**"),
    subtitle = "Fastest and slowest growing states by percentage change"
  ) |>
  cols_label(
    state_name  = "State",
    state_abbr  = "",
    pop_2015    = "2015 Population",
    pop_2024    = "2024 Population",
    growth_pct  = "Growth Rate",
    pop_added   = "People Added"
  ) |>
  fmt_integer(c(pop_2015, pop_2024, pop_added)) |>
  # fmt_number() has no `suffix` argument; append the percent sign via `pattern`.
  fmt_number(growth_pct, decimals = 1, pattern = "{x}%") |>
  tab_row_group(label = "10 Slowest Growing States", rows = 11:20) |>
  tab_row_group(label = "10 Fastest Growing States", rows = 1:10) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  data_color(
    columns = growth_pct,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates, Table B01003 (2015, 2024)."
  )

State Population Growth Rates, 2015-2024
Fastest and slowest growing states by percentage change
Rank	State	Abbr.	2015 Population	2024 Population	Growth Rate (%)	People Added
10 Fastest Growing States
1	Idaho	ID	1,654,930	2,001,619	20.9	346,689
2	Utah	UT	2,995,919	3,503,613	16.9	507,694
3	Florida	FL	20,271,272	23,372,215	15.3	3,100,943
4	Texas	TX	27,469,114	31,290,831	13.9	3,821,717
5	Nevada	NV	2,890,845	3,267,467	13.0	376,622
6	South Carolina	SC	4,896,146	5,478,831	11.9	582,685
7	Delaware	DE	945,934	1,051,917	11.2	105,983
8	Arizona	AZ	6,828,065	7,582,384	11.0	754,319
9	Washington	WA	7,170,351	7,958,180	11.0	787,829
10	Montana	MT	1,032,949	1,137,233	10.1	104,284
10 Slowest Growing States
41	Kansas	KS	2,911,641	2,970,606	2.0	58,965
42	Hawaii	HI	1,431,603	1,446,146	1.0	14,543
43	California	CA	39,144,818	39,431,263	0.7	286,445
44	New York	NY	19,795,791	19,867,248	0.4	71,457
45	Wyoming	WY	586,107	587,618	0.3	1,511
46	Alaska	AK	738,432	740,133	0.2	1,701
47	Illinois	IL	12,859,995	12,710,158	−1.2	−149,837
48	Louisiana	LA	4,670,724	4,597,740	−1.6	−72,984
49	Mississippi	MS	2,992,333	2,943,045	−1.6	−49,288
50	West Virginia	WV	1,844,128	1,769,979	−4.0	−74,149
Source: ACS 1-Year Estimates, Table B01003 (2015, 2024).

Q2: Migration To and From New York State

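New York’s migration picture is clear: it sends far more people to other states than it receives. Neighboring New Jersey and Florida are the top two destinations for departing New Yorkers, with Texas fourth at 28,233. Because only 16,324 people moved in the opposite direction, Texas gained roughly 11,900 residents on net from New York in 2024, a shift that helps Texas’s congressional apportionment at New York’s expense.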

Code

ny_in <- migration_flows |>
  filter(state_current == "New York",
         state_1y != "New York",
         year_current == 2024) |>
  arrange(desc(population)) |>
  slice_head(n = 10) |>
  mutate(direction = "Into New York")

ny_out <- migration_flows |>
  filter(state_1y == "New York",
         state_current != "New York",
         year_current == 2024) |>
  arrange(desc(population)) |>
  slice_head(n = 10) |>
  mutate(direction = "Out of New York")

bind_rows(
  ny_in  |> select(direction, state = state_1y,      population),
  ny_out |> select(direction, state = state_current, population)
) |>
  group_by(direction) |>
  mutate(rank = row_number()) |>
  ungroup() |>
  pivot_wider(names_from  = direction,
              values_from = c(state, population),
              names_sep   = "_") |>
  select(rank,
         `state_Into New York`, `population_Into New York`,
         `state_Out of New York`, `population_Out of New York`) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**New York State Migration Flows, 2024**"),
    subtitle = "Top 10 origin states for in-migrants and destination states for out-migrants"
  ) |>
  cols_label(
    `state_Into New York`        = "Origin State",
    `population_Into New York`   = "People",
    `state_Out of New York`      = "Destination State",
    `population_Out of New York` = "People"
  ) |>
  fmt_integer(c(`population_Into New York`, `population_Out of New York`)) |>
  tab_spanner(label   = "Moving INTO New York",
              columns = c(`state_Into New York`, `population_Into New York`)) |>
  tab_spanner(label   = "Moving OUT of New York",
              columns = c(`state_Out of New York`, `population_Out of New York`)) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )

New York State Migration Flows, 2024
Top 10 origin states for in-migrants and destination states for out-migrants
	Moving INTO New York		Moving OUT of New York
Rank	Origin State	People	Destination State	People
1	New Jersey	36,002	New Jersey	56,799
2	California	31,367	Florida	50,661
3	Florida	28,080	Pennsylvania	29,274
4	Pennsylvania	23,152	Texas	28,233
5	Massachusetts	17,862	Connecticut	25,095
6	Texas	16,324	California	24,927
7	Connecticut	15,160	Massachusetts	18,697
8	Virginia	10,310	South Carolina	18,324
9	North Carolina	10,076	North Carolina	17,978
10	Maryland	8,400	Virginia	16,364
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).

Q3: Migration To and From New York City Metro

At the metro level, NYC’s largest domestic migration partner is a relatively short-distance neighbor: Philadelphia. No Texas metro appears among NYC’s top ten destinations. Instead, departing residents concentrate in nearby Northeastern and Mid-Atlantic metros and in Miami, Los Angeles, and Atlanta, while NYC’s roughly 19,900 Texas-bound movers are spread across several Texas metros. Nonetheless, New York City was the 4th-largest source of in-migrants to the Houston metro area specifically.

Code

nyc_pattern <- "New York-Newark-Jersey City"

# Top sources INTO NYC - US metros only, excluding non-metro categories
nyc_in <- metro_flows |>
  filter(
    str_detect(metro1_name, nyc_pattern),
    !str_detect(metro2_name, nyc_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0
  ) |>
  arrange(desc(moved_in)) |>
  slice_head(n = 10) |>
  select(metro = metro2_name, people = moved_in)

# Top destinations OUT of NYC - US metros only
nyc_out <- metro_flows |>
  filter(
    str_detect(metro1_name, nyc_pattern),
    !str_detect(metro2_name, nyc_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_out),
    moved_out > 0
  ) |>
  arrange(desc(moved_out)) |>
  slice_head(n = 10) |>
  select(metro = metro2_name, people = moved_out)

bind_cols(
  nyc_in  |> mutate(rank = row_number()) |>
    select(rank, in_metro = metro, in_people = people),
  nyc_out |>
    select(out_metro = metro, out_people = people)
) |>
  mutate(
    in_metro  = str_remove(in_metro,  " Metro Area$"),
    out_metro = str_remove(out_metro, " Metro Area$")
  ) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**New York City Metro Migration Flows**"),
    subtitle = "Top 10 origin and destination metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    in_metro   = "Origin Metro",
    in_people  = "People",
    out_metro  = "Destination Metro",
    out_people = "People"
  ) |>
  fmt_integer(c(in_people, out_people)) |>
  tab_spanner(label   = "Moving INTO NYC",
              columns = c(in_metro, in_people)) |>
  tab_spanner(label   = "Moving OUT of NYC",
              columns = c(out_metro, out_people)) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )

New York City Metro Migration Flows
Top 10 origin and destination metros, ACS 5-Year 2016-2020
	Moving INTO NYC		Moving OUT of NYC
Rank	Origin Metro	People	Destination Metro	People
1	Philadelphia-Camden-Wilmington, PA-NJ-DE-MD	19,600	Philadelphia-Camden-Wilmington, PA-NJ-DE-MD	35,254
2	Washington-Arlington-Alexandria, DC-VA-MD-WV	11,162	Miami-Fort Lauderdale-Pompano Beach, FL	22,411
3	Boston-Cambridge-Newton, MA-NH	10,019	Washington-Arlington-Alexandria, DC-VA-MD-WV	17,558
4	Los Angeles-Long Beach-Anaheim, CA	9,761	Los Angeles-Long Beach-Anaheim, CA	17,107
5	Poughkeepsie-Newburgh-Middletown, NY	9,492	Bridgeport-Stamford-Norwalk, CT	16,709
6	Miami-Fort Lauderdale-Pompano Beach, FL	8,085	Poughkeepsie-Newburgh-Middletown, NY	15,475
7	Bridgeport-Stamford-Norwalk, CT	7,534	Boston-Cambridge-Newton, MA-NH	15,162
8	Trenton-Princeton, NJ	7,208	Atlanta-Sandy Springs-Alpharetta, GA	12,928
9	San Francisco-Oakland-Berkeley, CA	6,923	Trenton-Princeton, NJ	12,735
10	Albany-Schenectady-Troy, NY	5,512	Allentown-Bethlehem-Easton, PA-NJ	12,284
Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

Q4: States With Highest In-, Out-, and Net Migration

Texas leads all states in net domestic migration, with a surplus of nearly 72,000 people in 2024. At the other extreme, California experienced the largest net domestic outflow in the country, losing about 252,000 more residents to other states than it gained. Florida records the largest gross inflow of any state shown (571,998), but its outflow (505,072) is also larger than Texas’s, which leaves it second in net terms.

Code

state_migration_totals <- migration_flows |>
  filter(
    state_current != state_1y,
    year_current  == 2024
  ) |>
  group_by(state = state_current) |>
  summarise(total_in = sum(population, na.rm = TRUE), .groups = "drop") |>
  left_join(
    migration_flows |>
      filter(state_current != state_1y, year_current == 2024) |>
      group_by(state = state_1y) |>
      summarise(total_out = sum(population, na.rm = TRUE), .groups = "drop"),
    by = "state"
  ) |>
  mutate(net_migration = total_in - total_out) |>
  arrange(desc(net_migration))

state_migration_totals |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = c("state" = "state_name")
  ) |>
  select(rank, state, state_abbr, total_in, total_out, net_migration) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Migration Totals, 2024**"),
    subtitle = "Highest and lowest net migration states"
  ) |>
  cols_label(
    state         = "State",
    state_abbr    = "",
    total_in      = "In-Migration",
    total_out     = "Out-Migration",
    net_migration = "Net Migration"
  ) |>
  fmt_integer(c(total_in, total_out, net_migration)) |>
  tab_row_group(label = "10 Highest Net Out-Migration", rows = 11:20) |>
  tab_row_group(label = "10 Highest Net In-Migration",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state == "Texas")
  ) |>
  data_color(
    columns = net_migration,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )

State Migration Totals, 2024
Highest and lowest net migration states
Rank	State	Abbr.	In-Migration	Out-Migration	Net Migration
10 Highest Net In-Migration
1	Texas	TX	553,869	482,029	71,840
2	Florida	FL	571,998	505,072	66,926
3	North Carolina	NC	297,704	239,780	57,924
4	Arizona	AZ	234,723	178,678	56,045
5	South Carolina	SC	189,282	134,348	54,934
6	Nevada	NV	129,109	88,898	40,211
7	Georgia	GA	265,364	228,039	37,325
8	Tennessee	TN	191,386	155,499	35,887
9	Oklahoma	OK	106,621	72,461	34,160
10	Alabama	AL	119,767	97,317	22,450
10 Highest Net Out-Migration
41	Iowa	IA	59,199	69,417	−10,218
42	Alaska	AK	29,456	39,842	−10,386
43	Louisiana	LA	70,847	82,616	−11,769
44	Pennsylvania	PA	232,208	247,858	−15,650
45	Colorado	CO	181,683	204,323	−22,640
46	Massachusetts	MA	150,513	180,497	−29,984
47	New Jersey	NJ	149,512	211,969	−62,457
48	Illinois	IL	199,235	279,613	−80,378
49	New York	NY	280,073	412,187	−132,114
50	California	CA	404,745	656,646	−251,901
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).
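The in, out, and net totals come from summing the same flow table twice, once by destination and once by origin. The base-R sketch below illustrates the idea on the twelve flows among California, Florida, New York, and Texas shown earlier in the report. Because it covers only this closed four-state subset, its totals are far smaller than the national figures in the table above and are meant only to show the mechanics. The column names mirror `migration_flows`, but the object names `flows` and `totals` are illustrative.

```r
# The twelve 2024 flows among the four largest states (subset only).
flows <- data.frame(
  state_1y      = c("Florida", "New York", "Texas",
                    "California", "New York", "Texas",
                    "California", "Florida", "Texas",
                    "California", "Florida", "New York"),
  state_current = rep(c("California", "Florida", "New York", "Texas"),
                      each = 3),
  population    = c(15988, 24927, 45447,
                    36194, 50661, 45259,
                    31367, 28080, 16324,
                    77161, 52219, 28233)
)

# Arrivals: sum by current state. Departures: sum by state one year ago.
total_in  <- aggregate(population ~ state_current, data = flows, FUN = sum)
total_out <- aggregate(population ~ state_1y,      data = flows, FUN = sum)
names(total_in)  <- c("state", "total_in")
names(total_out) <- c("state", "total_out")

# Join the two directions and take the difference.
totals <- merge(total_in, total_out, by = "state")
totals$net_migration <- totals$total_in - totals$total_out

totals[order(-totals$net_migration), ]
```

In a closed set of states every departure is also an arrival, so the net column sums to zero. The same check on the full table is a quick way to catch a flow that was dropped or double-counted.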

Q5: Metro Areas With Highest In-, Out-, and Net Migration

At the metro level, Phoenix leads all US metros in net in-migration, followed by Riverside and Dallas-Fort Worth. Notably, two Texas metros, Dallas-Fort Worth and Austin, rank in the top four nationally. The bottom of the table is dominated by large legacy metros: New York, Los Angeles, Chicago, and San Francisco post the four largest net losses.

Code

metro_totals <- metro_flows |>
  filter(!is.na(metro1_state), !is.na(metro2_state)) |>
  group_by(metro = metro1_name, state = metro1_state) |>
  summarise(
    total_in  = sum(moved_in,  na.rm = TRUE),
    total_out = sum(moved_out, na.rm = TRUE),
    .groups   = "drop"
  ) |>
  mutate(
    net_migration = total_in - total_out,
    metro_short   = str_remove(metro, " Metro Area$")
  ) |>
  arrange(desc(net_migration))

metro_totals |>
  slice(c(1:10, (nrow(metro_totals) - 9):nrow(metro_totals))) |>
  mutate(rank = c(1:10, (nrow(metro_totals) - 9):nrow(metro_totals))) |>
  select(rank, metro_short, state, total_in, total_out, net_migration) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Metro Area Migration Totals**"),
    subtitle = "Highest and lowest net migration metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    metro_short   = "Metro Area",
    state         = "State",
    total_in      = "In-Migration",
    total_out     = "Out-Migration",
    net_migration = "Net Migration"
  ) |>
  fmt_integer(c(total_in, total_out, net_migration)) |>
  tab_row_group(label = "10 Highest Net Out-Migration", rows = 11:20) |>
  tab_row_group(label = "10 Highest Net In-Migration",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  data_color(
    columns = net_migration,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )

Metro Area Migration Totals
Highest and lowest net migration metros, ACS 5-Year 2016-2020
Rank	Metro Area	State	In-Migration	Out-Migration	Net Migration
10 Highest Net In-Migration
1	Phoenix-Mesa-Chandler, AZ	AZ	182,053	126,179	55,874
2	Riverside-San Bernardino-Ontario, CA	CA	178,045	137,731	40,314
3	Dallas-Fort Worth-Arlington, TX	TX	215,511	180,491	35,020
4	Austin-Round Rock-Georgetown, TX	TX	116,913	85,901	31,012
5	Las Vegas-Henderson-Paradise, NV	NV	92,544	61,696	30,848
6	Tampa-St. Petersburg-Clearwater, FL	FL	134,637	105,146	29,491
7	Orlando-Kissimmee-Sanford, FL	FL	133,976	105,395	28,581
8	Jacksonville, FL	FL	73,419	54,772	18,647
9	Deltona-Daytona Beach-Ormond Beach, FL	FL	41,276	23,312	17,964
10	Nashville-Davidson--Murfreesboro--Franklin, TN	TN	73,461	55,589	17,872
10 Highest Net Out-Migration
383	Detroit-Warren-Dearborn, MI	MI	72,716	94,314	−21,598
384	Boston-Cambridge-Newton, MA-NH	MA	134,190	160,705	−26,515
385	San Jose-Sunnyvale-Santa Clara, CA	CA	78,555	107,431	−28,876
386	Miami-Fort Lauderdale-Pompano Beach, FL	FL	135,718	176,510	−40,792
387	San Juan-Bayamón-Caguas, PR	PR	20,943	62,430	−41,487
388	Washington-Arlington-Alexandria, DC-VA-MD-WV	DC	209,588	251,422	−41,834
389	San Francisco-Oakland-Berkeley, CA	CA	162,213	204,077	−41,864
390	Chicago-Naperville-Elgin, IL-IN-WI	IL	156,175	245,692	−89,517
391	Los Angeles-Long Beach-Anaheim, CA	CA	242,529	375,810	−133,281
392	New York-Newark-Jersey City, NY-NJ-PA	NY	235,844	490,636	−254,792
Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

Q6: States With Lowest Proportional In-Migration

What are the “sticky” states where residents are least likely to have moved from elsewhere? California leads this ranking, with just 1.05% of its population having arrived from another state in 2024. Texas ranks 9th on this list despite its rapid growth, reflecting the fact that its enormous existing population base dilutes even large absolute in-migration flows.

Code

migration_flows |>
  filter(year_current == 2024) |>
  group_by(state = state_current) |>
  summarise(
    total_pop = sum(population, na.rm = TRUE),
    moved_in  = sum(population[state_current != state_1y], na.rm = TRUE),
    .groups   = "drop"
  ) |>
  mutate(pct_moved_in = 100 * moved_in / total_pop) |>
  arrange(pct_moved_in) |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = c("state" = "state_name")
  ) |>
  select(rank, state, state_abbr, total_pop, moved_in, pct_moved_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State Population Stickiness, 2024**"),
    subtitle = "States ranked by fraction of residents who moved in from another state"
  ) |>
  cols_label(
    state        = "State",
    state_abbr   = "",
    total_pop    = "Total Population",
    moved_in     = "Moved In From Another State",
    pct_moved_in = "% In-Migrants"
  ) |>
  fmt_integer(c(total_pop, moved_in)) |>
  # fmt_number() has no `suffix` argument; append the percent sign via `pattern`.
  fmt_number(pct_moved_in, decimals = 2, pattern = "{x}%") |>
  tab_row_group(label = "10 Least Sticky (Most In-Migration)", rows = 11:20) |>
  tab_row_group(label = "10 Most Sticky (Least In-Migration)", rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state == "Texas")
  ) |>
  data_color(
    columns = pct_moved_in,
    palette = c("#EEF2FF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )

State Population Stickiness, 2024
States ranked by fraction of residents who moved in from another state
Rank	State	Abbr.	Total Population	Moved In From Another State	% In-Migrants
10 Most Sticky (Least In-Migration)
1	California	CA	38,730,393	404,745	1.05
2	Michigan	MI	9,988,333	139,882	1.40
3	New York	NY	19,491,045	280,073	1.44
4	Louisiana	LA	4,520,727	70,847	1.57
5	Illinois	IL	12,491,116	199,235	1.60
6	New Jersey	NJ	9,309,234	149,512	1.61
7	Ohio	OH	11,704,591	197,471	1.69
8	Pennsylvania	PA	12,886,895	232,208	1.80
9	Texas	TX	30,622,873	553,869	1.81
10	Minnesota	MN	5,698,255	103,924	1.82
10 Least Sticky (Most In-Migration)
41	Rhode Island	RI	1,090,525	34,860	3.20
42	New Hampshire	NH	1,389,355	45,900	3.30
43	North Dakota	ND	781,941	27,062	3.46
44	South Carolina	SC	5,391,825	189,282	3.51
45	Idaho	ID	1,970,568	72,442	3.68
46	Vermont	VT	641,786	23,884	3.72
47	Hawaii	HI	1,419,055	53,093	3.74
48	Nevada	NV	3,205,924	129,109	4.03
49	Alaska	AK	724,947	29,456	4.06
50	Wyoming	WY	579,872	25,060	4.32
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).

Q7: States Where Migration Drives the Most Growth

Surprisingly, Texas does not appear in the top 15 despite having the highest absolute net in-migration of any state. Texas’s overall population growth is so large, driven by natural increase and international immigration, that internal US migration accounts for a relatively modest share of it. By contrast, states like Wyoming and Vermont are growing slowly enough that even modest net migration represents a large fraction of their total population change. Wyoming’s 428% figure (bar capped at 100% for readability) illustrates the extreme case: the state added only 1,511 residents in total, so net in-migration of roughly 6,500 people is more than four times its total growth, implying that the other components of population change were negative.

Code

migration_growth_share <- state_migration_totals |>
  left_join(growth_rates, by = c("state" = "state_name")) |>
  filter(!is.na(pop_added), pop_added > 0) |>
  mutate(migration_share = 100 * net_migration / pop_added) |>
  filter(!is.na(migration_share), is.finite(migration_share)) |>
  arrange(desc(migration_share))

plot_data <- migration_growth_share |>
  slice_max(migration_share, n = 15) |>
  mutate(
    state_label      = glue("{state} ({state_abbr})"),
    is_texas         = state == "Texas",
    migration_capped = pmin(migration_share, 100),
    label_text       = if_else(
      migration_share > 100,
      paste0(round(migration_share, 0), "%*"),
      paste0(round(migration_share, 0), "%")
    )
  )

ggplot(plot_data,
       aes(x = reorder(state_label, migration_share),
           y = migration_capped,
           fill = is_texas)) +
  geom_col(width = 0.7) +
  geom_text(
    aes(label = label_text),
    hjust = -0.1, size = 3.5
  ) +
  scale_fill_manual(
    values = c("FALSE" = "#6B6B6B", "TRUE" = "#002868"),
    guide  = "none"
  ) +
  scale_y_continuous(
    limits = c(0, 120),
    labels = \(x) paste0(x, "%")
  ) +
  coord_flip() +
  labs(
    title    = "Share of Population Growth Attributable to Net In-Migration",
    subtitle = "Top 15 states; bars capped at 100% (* = actual value exceeds cap)",
    x        = NULL,
    y        = "Net Migration as % of Total Population Growth",
    caption  = "Source: ACS 1-Year Migration Flows and Population Estimates (2024).\n* Wyoming actual value = 428%."
  ) +
  theme_tx()
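
The Wyoming figure can be checked with a few lines of arithmetic. The sketch below uses only the two numbers quoted in the text (1,511 residents added and a 428% share); because the share is rounded, the implied net migration is approximate. The variable names mirror the chart code but are otherwise illustrative.

```r
# Figures quoted in the text for Wyoming
pop_added       <- 1511   # total residents added
migration_share <- 428    # net migration as % of total growth (rounded)

# Invert share = 100 * net_migration / pop_added to recover net migration
net_migration <- migration_share / 100 * pop_added
round(net_migration)

# The chart caps the bar at 100% but keeps the true value in the label
migration_capped <- pmin(migration_share, 100)
label_text <- if (migration_share > 100) {
  paste0(round(migration_share), "%*")
} else {
  paste0(round(migration_share), "%")
}
label_text
```

A share above 100% simply means net migration exceeded total growth, so the remaining components of change must sum to a negative number.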

Q8: Texas-Specific Migration Patterns

The more telling pattern for Texas’s congressional position involves Oklahoma and Colorado. Unlike California, where the flow runs heavily in both directions and nets out in Texas’s favor, Texans are leaving for Oklahoma and Colorado without equivalent numbers coming back: in 2024, 28,074 Texans moved to Oklahoma while only 18,579 moved the other way, and 27,574 moved to Colorado against 22,986 arrivals. This is a net drain worth targeting, and notably, Oklahoma leans politically closer to Texas than Colorado does, making it a particularly promising market for a retention campaign.

Code

tx_in <- migration_flows |>
  filter(
    state_current == "Texas",
    state_1y      != "Texas",
    year_current  == 2024
  ) |>
  arrange(desc(population)) |>
  slice_head(n = 10)

tx_out <- migration_flows |>
  filter(
    state_1y      == "Texas",
    state_current != "Texas",
    year_current  == 2024
  ) |>
  arrange(desc(population)) |>
  slice_head(n = 10)

bind_cols(
  tx_in  |> mutate(rank = row_number()) |>
    select(rank, in_state = state_1y, in_pop = population),
  tx_out |>
    select(out_state = state_current, out_pop = population)
) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Texas Migration Flows, 2024**"),
    subtitle = "Top 10 origin states for in-migrants and destinations for out-migrants"
  ) |>
  cols_label(
    in_state  = "Origin State",
    in_pop    = "People Moving In",
    out_state = "Destination State",
    out_pop   = "People Moving Out"
  ) |>
  fmt_integer(c(in_pop, out_pop)) |>
  tab_spanner(
    label   = "Moving INTO Texas",
    columns = c(in_state, in_pop)
  ) |>
  tab_spanner(
    label   = "Moving OUT of Texas",
    columns = c(out_state, out_pop)
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  )

Texas Migration Flows, 2024
Top 10 origin states for in-migrants and destinations for out-migrants
	Moving INTO Texas		Moving OUT of Texas
	Origin State	People Moving In	Destination State	People Moving Out
1	California	77,161	California	45,447
2	Florida	52,219	Florida	45,259
3	New York	28,233	Oklahoma	28,074
4	Louisiana	24,170	Colorado	27,574
5	Illinois	23,509	Georgia	20,968
6	Colorado	22,986	Louisiana	19,850
7	Georgia	19,481	Illinois	18,189
8	Oklahoma	18,579	North Carolina	16,498
9	Virginia	17,996	New York	16,324
10	Washington	15,937	Washington	16,202
Source: ACS 1-Year State-to-State Migration Flow Tables (2024).
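
Because the table lists inflows and outflows side by side, the net position for each state has to be computed by hand. The sketch below does this for the nine states that appear in both top-10 lists, using the figures from the table; the data frame and its column names are illustrative and not part of the original pipeline.

```r
# States appearing in both the top-10 inflow and top-10 outflow lists (2024)
flows <- data.frame(
  state     = c("California", "Florida", "New York", "Louisiana", "Illinois",
                "Colorado", "Georgia", "Oklahoma", "Washington"),
  moved_in  = c(77161, 52219, 28233, 24170, 23509, 22986, 19481, 18579, 15937),
  moved_out = c(45447, 45259, 16324, 19850, 18189, 27574, 20968, 28074, 16202)
)

# Net flow into Texas: positive means Texas gains on balance
flows$net <- flows$moved_in - flows$moved_out

# Largest net losses first
net_sorted <- flows[order(flows$net), ]
net_sorted
```

Sorted this way, Oklahoma and Colorado come first, with net losses of 9,495 and 4,588 people respectively, while California shows a net gain of 31,714.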

Q9: Houston Metro Migration Patterns

Houston is Texas’s largest city and anchors its second-largest metro area, after Dallas-Fort Worth. Within Texas, Houston exchanges people heavily with Dallas-Fort Worth, Austin, and San Antonio, reflecting how Texans move between the state’s major cities and economic centers. The connection-strength chart below reveals which metros are most disproportionately tied to Houston relative to their total out-migration.

Code

houston_pattern <- "Houston-The Woodlands-Sugar Land"

# Top sources INTO Houston - US metros only
houston_in <- metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0
  ) |>
  arrange(desc(moved_in)) |>
  slice_head(n = 10)

# Top destinations OUT of Houston - US metros only
houston_out <- metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_out),
    moved_out > 0
  ) |>
  arrange(desc(moved_out)) |>
  slice_head(n = 10)

bind_cols(
  houston_in  |> mutate(rank = row_number()) |>
    select(rank, in_metro = metro2_name, in_people = moved_in),
  houston_out |>
    select(out_metro = metro2_name, out_people = moved_out)
) |>
  mutate(
    in_metro  = str_remove(in_metro,  " Metro Area$"),
    out_metro = str_remove(out_metro, " Metro Area$")
  ) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Houston Metro Migration Flows**"),
    subtitle = "Top 10 origin and destination metros, ACS 5-Year 2016-2020"
  ) |>
  cols_label(
    in_metro   = "Origin Metro",
    in_people  = "People",
    out_metro  = "Destination Metro",
    out_people = "People"
  ) |>
  fmt_integer(c(in_people, out_people)) |>
  tab_spanner(
    label   = "Moving INTO Houston",
    columns = c(in_metro, in_people)
  ) |>
  tab_spanner(
    label   = "Moving OUT of Houston",
    columns = c(out_metro, out_people)
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_spanners()
  ) |>
  tab_source_note(
    "Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020)."
  )

Houston Metro Migration Flows
Top 10 origin and destination metros, ACS 5-Year 2016-2020
	Moving INTO Houston		Moving OUT of Houston
	Origin Metro	People	Destination Metro	People
1	Dallas-Fort Worth-Arlington, TX	17,235	Dallas-Fort Worth-Arlington, TX	17,218
2	Austin-Round Rock-Georgetown, TX	10,056	Austin-Round Rock-Georgetown, TX	15,222
3	San Antonio-New Braunfels, TX	9,134	San Antonio-New Braunfels, TX	8,932
4	New York-Newark-Jersey City, NY-NJ-PA	5,719	College Station-Bryan, TX	6,193
5	College Station-Bryan, TX	4,709	Beaumont-Port Arthur, TX	4,765
6	Beaumont-Port Arthur, TX	4,407	Lubbock, TX	2,955
7	Washington-Arlington-Alexandria, DC-VA-MD-WV	4,166	Atlanta-Sandy Springs-Alpharetta, GA	2,842
8	Los Angeles-Long Beach-Anaheim, CA	4,082	Denver-Aurora-Lakewood, CO	2,699
9	Chicago-Naperville-Elgin, IL-IN-WI	4,031	McAllen-Edinburg-Mission, TX	2,664
10	New Orleans-Metairie, LA	3,789	Seattle-Tacoma-Bellevue, WA	2,504
Source: ACS 5-Year Metro-to-Metro Migration Flows API (2016-2020).

The chart below shows each metro’s draw to Houston, measured as the share of its total out-migration that flows specifically to Houston. Metros near the top of this list are the most fertile advertising targets: their residents are already choosing Houston at an unusually high rate, suggesting strong word-of-mouth networks and existing community ties.
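
Stated as a formula, and using notation introduced here for illustration (it does not appear in the data files), the connection measure for an origin metro $m$ is:

\[\text{share}_m = 100 \times \frac{M_{m \to \text{Houston}}}{M_{m \to \text{all}}}\]

where $M_{m \to \text{Houston}}$ is the number of people moving from $m$ to Houston (`moved_in` in the code) and $M_{m \to \text{all}}$ is the metro’s total out-migration (`total_out`, assumed here to count moves to all other metros).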

Code

metro_flows |>
  filter(
    str_detect(metro1_name, houston_pattern),
    !str_detect(metro2_name, houston_pattern),
    str_detect(metro2_name, "Metro Area"),
    !str_detect(metro2_name, "Outside Metro Area"),
    !is.na(moved_in),
    moved_in > 0,
    !is.na(metro2_state)
  ) |>
  left_join(
    metro_totals |> select(metro, total_out),
    by = c("metro2_name" = "metro")
  ) |>
  mutate(
    houston_share = 100 * moved_in / total_out,
    metro_short   = str_remove(metro2_name, " Metro Area$")
  ) |>
  filter(!is.na(houston_share), total_out > 5000) |>
  slice_max(houston_share, n = 15) |>
  ggplot(aes(x = reorder(metro_short, houston_share),
             y = houston_share)) +
  geom_col(fill = "#002868", width = 0.7) +
  geom_text(
    aes(label = paste0(round(houston_share, 1), "%")),
    hjust = -0.1, size = 3
  ) +
  scale_y_continuous(
    limits = c(0, 15),
    labels = \(x) paste0(x, "%"),
    oob    = scales::squish
  ) +
  coord_flip() +
  labs(
    title    = "Metros With Strongest Migration Connection to Houston",
    subtitle = "Houston's share of each metro's total out-migration (metros with >5,000 total out-migrants)",
    x        = NULL,
    y        = "Houston's Share of Origin Metro's Out-Migration",
    caption  = "Source: ACS 5-Year Metro-to-Metro Migration Flows (2016-2020)."
  ) +
  theme_tx()

Metros with the strongest migration connection to Houston, measured as Houston’s share of total out-migration from each origin metro. A high share indicates Houston is a uniquely popular destination.

Population Projections to 2030

To estimate Texas’s congressional apportionment after the 2030 census, we first need to forecast each state’s 2030 population. We use the migration model described in the assignment¹, which assumes population growth follows:

\[P_{i,t+1} = P_{i,t} \cdot (1 + \gamma) + \sum_{j \neq i} \lambda_{ij} \cdot P_{i,t} \cdot P_{j,t}\]

where $\gamma$ is a national natural growth rate and $\lambda_{ij}$ captures the migration flow rate from state $j$ to state $i$. We fit parameters using both the 2023 and 2024 flow files and average the lambda values for stability. Because the 2023 file format does not fully capture stationary population, gamma (the natural growth rate) is estimated from 2024 data only, yielding γ = −0.27% per year. This slightly negative value reflects the fact that the ACS domestic flow tables do not capture international immigration, which has been a substantial driver of US population growth. As a result, our projections should be interpreted as conservative estimates on the low end, particularly for high-immigration states like Texas.
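
To make the recursion concrete, here is a minimal sketch of one projection step on a hypothetical three-state system. The populations and $\lambda$ values are invented for illustration, and the function name `project_step` is not from the original analysis. The toy $\lambda$ matrix is antisymmetric so that one state’s gain is another’s loss, which is an assumption of this example rather than something stated in the model.

```r
# One step of the model as written above:
#   P_i(t+1) = P_i(t) * (1 + gamma) + sum over j != i of lambda_ij * P_i(t) * P_j(t)
project_step <- function(P, gamma, lambda) {
  diag(lambda) <- 0                          # drop the j == i terms
  migration <- as.vector(lambda %*% P) * P   # P_i * sum_j lambda_ij * P_j
  setNames(P * (1 + gamma) + migration, names(P))
}

# Hypothetical three-state system (illustrative values only)
P      <- c(A = 10e6, B = 5e6, C = 1e6)
gamma  <- -0.0027                            # national rate quoted in the text
lambda <- matrix(0, 3, 3, dimnames = list(names(P), names(P)))
lambda["A", "B"] <-  1e-10                   # A gains from B ...
lambda["B", "A"] <- -1e-10                   # ... and B loses the same amount

P_next <- project_step(P, gamma, lambda)
P_next

# Six annual steps take a 2024 starting point to 2030
P_2030 <- P
for (yr in 2025:2030) P_2030 <- project_step(P_2030, gamma, lambda)
round(P_2030)
```

In the first step, state A shrinks by 0.27% and then gains 5,000 people from B, while C, which has no migration links in this toy setup, simply declines at the national rate.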

The table below shows projected 2030 populations for the 10 largest and 10 smallest states. Under this conservative model, most states show modest population declines driven by the negative gamma. This is a reminder that the projections exclude international immigration and should be read as a domestic-migration-only baseline.

Code

pop_2030 <- pop_projections |>
  filter(year == 2030) |>
  arrange(desc(total_population))

pop_2030 |>
  slice(c(1:10, 41:50)) |>
  mutate(rank = c(1:10, 41:50)) |>
  left_join(
    pop_2024 |>
      st_drop_geometry() |>
      select(state_name, pop_2024 = total_population),
    by = "state_name"
  ) |>
  mutate(projected_growth = total_population - pop_2024) |>
  select(rank, state_name, state_abbr,
         pop_2024, total_population, projected_growth) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Projected State Populations, 2030**"),
    subtitle = "Based on averaged 2023-2024 migration parameters"
  ) |>
  cols_label(
    state_name       = "State",
    state_abbr       = "",
    pop_2024         = "2024 Population",
    total_population = "Projected 2030 Population",
    projected_growth = "Projected Growth"
  ) |>
  fmt_integer(c(pop_2024, total_population, projected_growth)) |>
  tab_row_group(label = "10 Smallest Projected States", rows = 11:20) |>
  tab_row_group(label = "10 Largest Projected States",  rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_abbr == "TX")
  ) |>
  data_color(
    columns = projected_growth,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates + author projections using averaged 2023-2024 migration parameters."
  )

Projected State Populations, 2030
Based on averaged 2023-2024 migration parameters
	State		2024 Population	Projected 2030 Population	Projected Growth
10 Largest Projected States
1	California	CA	39,431,263	38,874,574	−556,689
2	Texas	TX	31,290,831	30,543,128	−747,703
3	Florida	FL	23,372,215	22,750,215	−622,000
4	New York	NY	19,867,248	19,703,296	−163,952
5	Pennsylvania	PA	13,078,751	12,929,329	−149,422
6	Illinois	IL	12,710,158	12,555,657	−154,501
7	Ohio	OH	11,883,304	11,752,521	−130,783
8	Georgia	GA	11,180,878	10,926,885	−253,993
9	North Carolina	NC	11,046,024	10,702,252	−343,772
10	Michigan	MI	10,140,459	10,056,924	−83,535
10 Smallest Projected States
41	New Hampshire	NH	1,409,032	1,429,413	20,381
42	Maine	ME	1,405,012	1,371,444	−33,568
43	Montana	MT	1,137,233	1,121,845	−15,388
44	Rhode Island	RI	1,112,308	1,104,582	−7,726
45	Delaware	DE	1,051,917	1,019,574	−32,343
46	South Dakota	SD	924,669	888,092	−36,577
47	North Dakota	ND	796,568	764,302	−32,266
48	Alaska	AK	740,133	714,992	−25,141
49	Vermont	VT	648,493	643,941	−4,552
50	Wyoming	WY	587,618	598,345	10,727
Source: ACS 1-Year Estimates + author projections using averaged 2023-2024 migration parameters.

The chart below places these projections in historical context. Dashed lines indicate projected values for 2025-2030. Of the focus states that appear in the table above, Texas and Florida show the steepest projected declines (roughly 2.4% and 2.7% from 2024), while California and New York decline more slowly (roughly 1.4% and 0.8%) under the domestic-only model.

Code

focus_states <- c("Texas", "California", "Florida", "New York", "Arizona")

pop_all |>
  filter(state_name %in% focus_states) |>
  ggplot(aes(x = year, y = total_population / 1e6,
             colour = state_name, linetype = type)) +
  # annotate() draws the shaded band once, rather than once per data row
  annotate(
    "rect",
    xmin = 2024.5, xmax = 2030.5,
    ymin = -Inf,   ymax = Inf,
    fill = "#F5F5F5"
  ) +
  geom_line(linewidth = 1.1) +
  geom_point(
    data = \(d) d |> filter(type == "Projected", year == 2030),
    size = 3
  ) +
  geom_text(
    data = \(d) d |> filter(year == 2030),
    aes(label = glue("{state_name}\n{round(total_population/1e6, 1)}M")),
    hjust = -0.1, size = 3, show.legend = FALSE
  ) +
  scale_colour_manual(values = c(
    "Texas"      = "#002868",
    "Florida"    = "#BF0A30",
    "Arizona"    = "#D4A853",
    "California" = "#2E8B8B",
    "New York"   = "#6B6B6B"
  )) +
  scale_linetype_manual(
    values = c("Historical" = "solid", "Projected" = "dashed")
  ) +
  scale_x_continuous(
    breaks = c(2015:2019, 2021:2030),
    limits = c(2015, 2032)
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "M")) +
  annotate("text", x = 2027, y = 42,
           label = "← Projected", colour = "#6B6B6B", size = 3) +
  labs(
    title    = "State Population Trajectories: Historical and Projected",
    subtitle = "Dashed lines show model projections from 2025-2030",
    x        = NULL,
    y        = "Population (millions)",
    colour   = NULL,
    linetype = NULL,
    caption  = "Source: ACS 1-Year Estimates (historical); author model projections (2025-2030)."
  ) +
  theme_tx()

Historical and projected populations for Texas and four comparison states, 2015-2030. Shaded region indicates projected years. Under the baseline model, Texas is projected at about 30.5 million residents in 2030.

Congressional Reapportionment

Under the Huntington-Hill method², the 435 seats in the House of Representatives are allocated by iteratively assigning each seat to the state with the highest priority value, defined as the state’s population divided by the geometric mean of its current and next seat count. Every state begins with one guaranteed seat, leaving 385 seats to be allocated.
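
Written as a formula, with $P_i$ denoting state $i$’s population and $n_i$ the number of seats it already holds (the symbol $A_i$ is notation introduced here), the priority value for the state’s next seat is:

\[A_i(n_i) = \frac{P_i}{\sqrt{n_i \,(n_i + 1)}}\]

As a rough illustration using the projected Texas population of 30,543,128, the priority value for a 44th seat would be $30{,}543{,}128 / \sqrt{43 \cdot 44} \approx 702{,}000$. Texas receives that seat only if this value ranks among the 385 highest priority values across all states.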

The model projects that only four states change their seat count under the 2030 reapportionment: New York and Michigan each gain a seat, while Texas and Virginia each lose one. Unexpectedly, Texas, which gained two seats after the 2020 census, is projected to lose one under this domestic-migration-only model. The “2024 Seats” column reflects our modeled allocation rather than the delegation currently seated; Texas holds 38 House seats today.

Code

apportionment_change |>
  filter(seat_change != 0 | state_name == "Texas") |>
  arrange(desc(seat_change), desc(pop_2030)) |>
  mutate(rank = row_number()) |>
  select(rank, state_name, state_abbr,
         seats_2024, seats_2030, seat_change, pop_2030) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Projected Congressional Seat Changes, 2030 Reapportionment**"),
    subtitle = "States gaining or losing seats based on 2030 population projections"
  ) |>
  cols_label(
    state_name  = "State",
    state_abbr  = "",
    seats_2024  = "2024 Seats",
    seats_2030  = "Projected 2030 Seats",
    seat_change = "Change",
    pop_2030    = "Projected 2030 Population"
  ) |>
  fmt_integer(pop_2030) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_name == "Texas")
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = seat_change > 0,
      columns = seat_change
    )
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = seat_change < 0,
      columns = seat_change
    )
  ) |>
  data_color(
    columns = seat_change,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: Author projections; Huntington-Hill apportionment method."
  )

Projected Congressional Seat Changes, 2030 Reapportionment
States gaining or losing seats based on 2030 population projections
	State		2024 Seats	Projected 2030 Seats	Change	Projected 2030 Population
1	New York	NY	27	28	1	19,703,296
2	Michigan	MI	13	14	1	10,056,924
3	Texas	TX	44	43	-1	30,543,128
4	Virginia	VA	12	11	-1	8,618,815
Source: Author projections; Huntington-Hill apportionment method.

Code

tibble(
  metric = c("2024 Apportionment (current)",
             "2030 Projected Apportionment",
             "Projected Change"),
  value  = c(
    as.character(tx_seats_2024),
    as.character(tx_seats_2030),
    if_else(tx_change >= 0,
            paste0("+", tx_change),
            as.character(tx_change))
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Texas Congressional Delegation: 2024 vs. 2030**"),
    subtitle = "Based on domestic migration model projections"
  ) |>
  cols_label(metric = "Metric", value = "Seats") |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = metric == "Projected Change",
      columns = value
    )
  ) |>
  tab_style(
    style = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = metric != "Projected Change")
  ) |>
  tab_source_note(
    "Source: Author projections using Huntington-Hill method.
     Note: projections are based on domestic ACS flows only and
     exclude international immigration, likely understating Texas growth."
  )

Texas Congressional Delegation: 2024 vs. 2030
Based on domestic migration model projections
Metric	Seats
2024 Apportionment (current)	44
2030 Projected Apportionment	43
Projected Change	-1
Source: Author projections using Huntington-Hill method. Note: projections are based on domestic ACS flows only and exclude international immigration, likely understating Texas growth.

Advertising Strategy to Protect Texas’s Congressional Footprint

Our model projects that Texas is at risk of losing a congressional seat in the reapportionment that follows the 2030 census and takes effect for the 2032 elections. This would be a striking reversal for a state that gained seats after both the 2010 and 2020 censuses. The culprit is net out-migration to Sun Belt competitors, alongside high-cost-of-living metros that are themselves losing residents. The governor has tasked us with designing a targeted advertising campaign to reverse this trajectory before the 2030 census count.

The Strategic Objective

The last seats assigned under Huntington-Hill are often decided by tiny margins: after the 2020 census, New York lost its 27th seat by just 89 people. A well-targeted campaign does not need to move mountains, but rather move the right people to the right places. Our analysis identifies three specific levers:

Lever 1: Strengthen existing in-migration corridors. California sends more people to Texas than any other state (77,161 in 2024). These migrants are already choosing Texas; we want more of them, and we want to retain some of the 45,447 Texans who currently move to California each year.

Lever 2: Reduce out-migration to Oklahoma and Colorado. These two states rank third and fourth among Texas’s out-migration destinations (28,074 and 27,574 respectively in 2024) but send fewer people back (18,579 and 22,986). This asymmetry means Texans are leaving for these states without equivalent replacement flows, resulting in a net drain worth targeting.

Lever 3: Target states at the seat margin. Our model shows Michigan gaining a seat. Michigan currently has net positive migration, meaning every Michigander we can redirect to Texas helps on both ends of the ledger: it raises Texas’s priority value while lowering Michigan’s.

How Much Migration Is Needed?

To determine how many additional residents Texas needs, we incrementally added migrants to the 2030 projection and re-ran the Huntington-Hill allocation until Texas retained 44 seats. The threshold: approximately 197,000 additional net residents above the baseline projection.
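
The search described above can be sketched as follows. This is a minimal illustration on a hypothetical three-state, ten-seat example, not the code that produced the 197,000 figure; the function names `huntington_hill` and `residents_needed`, the 1,000-person step, and the toy populations are all assumptions made for this example.

```r
# Huntington-Hill: every state starts with one seat; each remaining seat goes
# to the state with the highest priority value pop / sqrt(n * (n + 1)).
huntington_hill <- function(pop, total_seats = 435) {
  seats <- rep(1L, length(pop))
  names(seats) <- names(pop)
  for (k in seq_len(total_seats - length(pop))) {
    priority <- pop / sqrt(seats * (seats + 1))
    winner <- which.max(priority)
    seats[winner] <- seats[winner] + 1L
  }
  seats
}

# Add residents to one state in fixed steps until it reaches the target seat count.
residents_needed <- function(pop, state, target_seats, total_seats = 435,
                             step = 1000, max_added = 5e6) {
  added <- 0
  while (added <= max_added) {
    trial <- pop
    trial[state] <- trial[state] + added
    if (huntington_hill(trial, total_seats)[state] >= target_seats) return(added)
    added <- added + step
  }
  NA_real_   # target not reachable within max_added
}

# Hypothetical populations (illustrative only)
toy_pop <- c(A = 6e6, B = 3e6, C = 1e6)
seats   <- huntington_hill(toy_pop, total_seats = 10)
seats

# How many more residents would state C need to win a second seat?
needed <- residents_needed(toy_pop, "C", target_seats = 2, total_seats = 10)
needed
```

In this toy setup the seats split 6, 3, and 1, and state C needs 550,000 more residents to take a second seat from A. With a 1,000-person step, the answer is accurate only to the nearest thousand, which matches the "approximately" in the threshold quoted above.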

Code

tibble(
  metric = c(
    "Projected 2030 Texas population (baseline)",
    "Population needed to retain 44th seat",
    "Additional net residents required",
    "Years remaining until 2030 census",
    "Annual net migration target"
  ),
  value = c(
    format(
      pop_projections |> filter(year == 2030, state_name == "Texas") |>
        pull(total_population) |> round(),
      big.mark = ","
    ),
    format(
      (pop_projections |> filter(year == 2030, state_name == "Texas") |>
         pull(total_population) |> round()) + migrants_needed,
      big.mark = ","
    ),
    format(migrants_needed, big.mark = ","),
    "4",
    format(round(migrants_needed / 4), big.mark = ",")
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Texas Seat Retention: Migration Target**"),
    subtitle = "How many additional residents does Texas need by 2030?"
  ) |>
  cols_label(metric = "Metric", value = "Value") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(
      rows    = metric == "Additional net residents required",
      columns = everything()
    )
  ) |>
  tab_style(
    style = cell_fill(color = "#EEF2FF"),
    locations = cells_body(
      rows = metric != "Additional net residents required"
    )
  ) |>
  tab_source_note(
    "Source: Author projections using Huntington-Hill method."
  )

Texas Seat Retention: Migration Target
How many additional residents does Texas need by 2030?
Metric	Value
Projected 2030 Texas population (baseline)	30,543,128
Population needed to retain 44th seat	30,740,128
Additional net residents required	197,000
Years remaining until 2030 census	4
Annual net migration target	49,250
Source: Author projections using Huntington-Hill method.

Advertising Campaign Design

Code

tibble(
  target_market = c(
    "Los Angeles Metro (CA)",
    "New York City Metro (NY/NJ)",
    "Chicago Metro (IL)",
    "Oklahoma City + Tulsa (OK)",
    "Denver Metro (CO)",
    "Washington DC Metro"
  ),
  rationale = c(
    "California is the largest single source of in-migrants to Texas (77,161 in 2024; 4,082/yr from Los Angeles to Houston alone); large population of former Texans",
    "19,853/yr already flowing to Texas; 16,324 Texans moved to NY in 2024 - a prime recapture target",
    "23,509 moved to TX in 2024; Illinois is shrinking - residents are receptive to leaving",
    "28,074 Texans moved here in 2024 - our largest asymmetric out-migration destination",
    "27,574 Texans moved here; high cost of living makes Texas's affordability pitch compelling",
    "14,916/yr to Texas metros; large professional population and high cost of living"
  ),
  campaign_type = c(
    "Digital + outdoor; target tech and entertainment workers",
    "Digital; target finance and media workers priced out of NYC",
    "Digital + radio; target families and blue-collar workers",
    "Retention: target recent TX out-migrants; digital re-engagement",
    "Retention: target outdoor enthusiasts with TX Hill Country pitch",
    "Digital; target federal contractors and policy professionals"
  ),
  target_migrants = c(25000, 15000, 12000, 10000, 8000, 5000)
) |>
  mutate(rank = row_number()) |>
  select(rank, target_market, rationale, campaign_type, target_migrants) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**Texas Migration Advertising Strategy**"),
    subtitle = "Targeted markets, rationale, and migration goals"
  ) |>
  cols_label(
    target_market    = "Target Market",
    rationale        = "Data Rationale",
    campaign_type    = "Campaign Approach",
    target_migrants  = "Target Net Migrants"
  ) |>
  fmt_integer(target_migrants) |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 2))
  ) |>
  cols_width(
    rationale     ~ px(280),
    campaign_type ~ px(200)
  ) |>
  grand_summary_rows(
    columns  = target_migrants,
    fns      = list("Total Target" ~ sum(.)),
    fmt      = ~ fmt_integer(.)
  ) |>
  tab_source_note(
    "Source: ACS 1-Year State-to-State Migration Flows (2024); ACS 5-Year Metro Flows (2016-2020)."
  )

Texas Migration Advertising Strategy
Targeted markets, rationale, and migration goals
	Target Market	Data Rationale	Campaign Approach	Target Net Migrants
1	Los Angeles Metro (CA)	California is the largest single source of in-migrants to Texas (77,161 in 2024; 4,082/yr from Los Angeles to Houston alone); large population of former Texans	Digital + outdoor; target tech and entertainment workers	25,000
2	New York City Metro (NY/NJ)	19,853/yr already flowing to Texas; 16,324 Texans moved to NY in 2024 - a prime recapture target	Digital; target finance and media workers priced out of NYC	15,000
3	Chicago Metro (IL)	23,509 moved to TX in 2024; Illinois is shrinking - residents are receptive to leaving	Digital + radio; target families and blue-collar workers	12,000
4	Oklahoma City + Tulsa (OK)	28,074 Texans moved here in 2024 - our largest asymmetric out-migration destination	Retention: target recent TX out-migrants; digital re-engagement	10,000
5	Denver Metro (CO)	27,574 Texans moved here; high cost of living makes Texas's affordability pitch compelling	Retention: target outdoor enthusiasts with TX Hill Country pitch	8,000
6	Washington DC Metro	14,916/yr to Texas metros; large professional population and high cost of living	Digital; target federal contractors and policy professionals	5,000
Total Target	—	—	—	75,000
Source: ACS 1-Year State-to-State Migration Flows (2024); ACS 5-Year Metro Flows (2016-2020).

Campaign Slogan

Based on our data, the most effective strategy targets people who are already considering a move out of high-cost-of-living cities in California, New York, and Illinois, and lures them with Texas’s combination of economic opportunity, low taxes, and quality of life. Our proposed campaign slogan:

“Everything is bigger in Texas… and that includes your buying power.”

For the retention campaign targeting Texans who moved to Oklahoma and Colorado, we recommend a different message that speaks to people who may have left for specific lifestyle reasons:

“Miss the Space? Texas Never Left.”

Sizing the Campaign

Code

tibble(
  metric = c(
    "Additional net migrants needed to retain 44th seat",
    "Total migrants targeted across six markets",
    "Residents needed as % of campaign target",
    "Approximate addressable audience (metro populations)",
    "Estimated cost at $50 per targeted impression"
  ),
  value = c(
    format(migrants_needed, big.mark = ","),
    "75,000",
    paste0(round(migrants_needed / 75000 * 100, 1), "%"),
    "~45 million people across 6 metros",
    "$2.25 billion (over 4 years)"
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Campaign Sizing Summary**"),
    subtitle = "Estimated scale required to retain Texas's 44th congressional seat"
  ) |>
  cols_label(metric = "Metric", value = "Estimate") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 3))
  ) |>
  tab_source_note(
    "Cost estimate is illustrative; actual costs depend on media mix and market conditions."
  )

Campaign Sizing Summary
Estimated scale required to retain Texas's 44th congressional seat
Metric	Estimate
Additional net migrants needed to retain 44th seat	197,000
Total migrants targeted across six markets	75,000
Residents needed as % of campaign target	262.7%
Approximate addressable audience (metro populations)	~45 million people across 6 metros
Estimated cost at $50 per targeted impression	$2.25 billion (over 4 years)
Cost estimate is illustrative; actual costs depend on media mix and market conditions.
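
The sizing figures follow from simple arithmetic on numbers already reported in this section. The sketch below reproduces them and adds the complementary view, the share of the gap that the six-market target would cover; the variable names are illustrative.

```r
migrants_needed <- 197000   # threshold from the seat-retention search
campaign_target <- 75000    # sum of the six market targets
audience        <- 45e6     # approximate addressable audience
cost_per_person <- 50       # illustrative unit cost in dollars

needed_vs_target <- 100 * migrants_needed / campaign_target   # the 262.7% in the table
coverage         <- 100 * campaign_target / migrants_needed   # share of the gap covered
shortfall        <- migrants_needed - campaign_target
total_cost       <- audience * cost_per_person

round(needed_vs_target, 1)
round(coverage, 1)
shortfall
total_cost / 1e9   # in billions of dollars
```

Read this way, the 262.7% figure means the requirement is about 2.6 times the campaign target: the six markets as sized would cover roughly 38% of the gap, leaving a shortfall of 122,000 residents.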

Conclusion

Texas stands at a demographic crossroads. For two consecutive census cycles it has been the nation’s growth engine, adding more residents than any other state. But our model, relying only on domestic migration flows, suggests that growth may not be sufficient to protect all 44 modeled congressional seats in 2032. A well-funded, data-driven advertising campaign targeting the six markets identified above could narrow the gap, though its 75,000-migrant target covers only about 38% of the roughly 197,000 additional residents needed, so the effort would have to be scaled up substantially to close it.

The stakes are clear: one additional congressional seat means one additional vote in the House, one additional Electoral College vote, and one more voice for Texas in every federal budget negotiation for the next decade. At roughly 197,000 net new residents needed, the math is straightforward. The only question is whether the political will exists to pursue it.

Extra Credit #03: State-Level Natural Growth Rates

The baseline model used in our apportionment and advertising strategy sections assumes a single national natural growth rate γ applied uniformly to all states. This is a significant simplification that ignores variation across states in age structure, fertility, and mortality. Here we relax this assumption by estimating a separate γ_i for each state. Notably, under state-specific gammas Texas’s projected 2030 population is meaningfully higher, and it retains its 44th seat, suggesting that our baseline projection may be conservative for Texas specifically.
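
Written out, the estimator described in the source note of the table below is:

\[\gamma_i = \frac{P_{i,2024} - P_{i,2023} - I_i}{P_{i,2023}}\]

where $P_{i,t}$ is state $i$’s population in year $t$ and $I_i$ is its estimated international arrivals (`intl_in` in the code; the symbol $I_i$ is notation introduced here).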

Code

state_gamma |>
  left_join(
    state_pop |> st_drop_geometry() |>
      filter(year == 2024) |>
      select(state_name, state_abbr),
    by = "state_name"
  ) |>
  arrange(desc(gamma_i)) |>
  mutate(rank = row_number()) |>   # rank 1 = highest γ_i
  slice(c(1:10, 41:50)) |>         # keep ranks 1-10 and 41-50; rank is already set
  select(rank, state_name, state_abbr, gamma_i, intl_in) |>
  gt(rowname_col = "rank") |>
  tab_header(
    title    = md("**State-Specific Natural Growth Rates (γ_i), 2024**"),
    subtitle = "States with highest and lowest estimated natural growth rates"
  ) |>
  cols_label(
    state_name = "State",
    state_abbr = "",
    gamma_i    = "γ_i (Natural Growth Rate)",
    intl_in    = "Estimated International Arrivals"
  ) |>
  fmt_percent(gamma_i, decimals = 2) |>
  fmt_integer(intl_in) |>
  tab_row_group(label = "10 Lowest Natural Growth Rates",  rows = 11:20) |>
  tab_row_group(label = "10 Highest Natural Growth Rates", rows = 1:10)  |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_row_groups()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = state_name == "Texas")
  ) |>
  data_color(
    columns = gamma_i,
    palette = c("#BF0A30", "#FFFFFF", "#002868")
  ) |>
  tab_source_note(
    "Source: ACS 1-Year Estimates + Migration Flows (2024). γ_i estimated as
     (P_2024 - P_2023 - intl_in) / P_2023 for each state."
  )

State-Specific Natural Growth Rates (γ_i), 2024
States with highest and lowest estimated natural growth rates
Rank	State	Abbr.	γ_i (Natural Growth Rate)	Estimated International Arrivals
10 Highest Natural Growth Rates
1	Florida	FL	1.11%	510,281
2	Utah	UT	0.52%	68,250
3	Texas	TX	0.39%	667,958
4	Nevada	NV	0.37%	61,543
5	Delaware	DE	0.36%	16,268
6	South Carolina	SC	0.34%	87,006
7	Idaho	ID	0.30%	31,051
8	New Jersey	NJ	0.20%	191,617
9	Arizona	AZ	0.17%	138,297
10	Massachusetts	MA	−0.00%	135,047
10 Lowest Natural Growth Rates
41	Vermont	VT	−0.88%	6,707
42	Virginia	VA	−0.89%	172,758
43	New Hampshire	NH	−0.91%	19,677
44	Montana	MT	−0.94%	15,095
45	Hawaii	HI	−1.12%	27,091
46	West Virginia	WV	−1.13%	19,977
47	Alaska	AK	−1.15%	15,186
48	Louisiana	LA	−1.16%	77,013
49	South Dakota	SD	−1.24%	16,712
50	Mississippi	MS	−1.47%	46,428
Source: ACS 1-Year Estimates + Migration Flows (2024). γ_i estimated as (P_2024 - P_2023 - intl_in) / P_2023 for each state.
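
The formula in the source note can be sketched on made-up numbers. The data frame below and its column names `pop_2023` and `pop_2024` are hypothetical; only `intl_in` and `gamma_i` mirror names used in the report, and the populations are invented for illustration.

```r
# Hypothetical inputs: two invented states, not real ACS figures
toy <- data.frame(
  state_name = c("State A", "State B"),
  pop_2023   = c(1000000, 2000000),
  pop_2024   = c(1015000, 1990000),
  intl_in    = c(5000, 10000)
)

# gamma_i = (P_2024 - P_2023 - intl_in) / P_2023, as stated in the source note
toy$gamma_i <- (toy$pop_2024 - toy$pop_2023 - toy$intl_in) / toy$pop_2023

toy[, c("state_name", "gamma_i")]
```

Here State A comes out at 1% and State B at -1%. As written, the residual captures every population change other than international arrivals, so it should be read as an approximation of natural growth rather than a direct count of births minus deaths.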

Code

focus_states_ec <- c("Texas", "California", "Florida",
                      "New York", "Arizona", "Illinois")

bind_rows(
  pop_projections |>
    filter(year == 2030, state_name %in% focus_states_ec) |>
    mutate(model = "National γ"),
  pop_projections_ec |>
    filter(year == 2030, state_name %in% focus_states_ec) |>
    mutate(model = "State-specific γ")
) |>
  mutate(
    state_name = factor(state_name,
                        levels = focus_states_ec)
  ) |>
  ggplot(aes(x = state_name,
             y = total_population / 1e6,
             fill = model)) +
  geom_col(position = "dodge", width = 0.6) +
  scale_fill_manual(
    values = c("National γ"       = "#002868",
               "State-specific γ" = "#BF0A30"),
    name   = "Model"
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "M")) +
  labs(
    title    = "2030 Population Projections: National vs. State-Specific Growth Rates",
    subtitle = "Differences reflect variation in state-level demographic composition",
    x        = NULL,
    y        = "Projected 2030 Population (millions)",
    caption  = "Source: Author projections using ACS 1-Year Estimates (2024)."
  ) +
  theme_tx()

Comparison of 2030 population projections under the national gamma model (baseline) versus state-specific gamma values for selected states. Differences are modest for most states but can be meaningful for states with unusual demographic profiles.

Extra Credit #04: Model Validation

How accurate is our projection model? To find out, we test it against history. We start the model from the 2019 ACS populations, the last full year before COVID disrupted census collection, and project forward to 2022, 2023, and 2024. Because no 2019 migration file is parsed here, the backtest reuses the averaged γ and λ parameters rather than refitting them on 2019 data, so it is not a strictly out-of-sample test. We then compare those projections to the actual ACS-1 populations reported for those years. Since we already know what happened, this lets us measure how far off the model would have been, giving us a concrete sense of the uncertainty around our 2030 forecast.

Code

# ── Backtest parameters ───────────────────────────────────────────────────────
# Starting populations come from the 2019 ACS.
pop_2019 <- state_pop |> filter(year == 2019)

# No 2019 migration file is parsed here, so γ and λ cannot be refit on 2019
# data. As a substitute we hold both at their averaged values (gamma_avg and
# lambda_avg), consistent with the assignment's approach of using available data.
# Caveat: these averages draw on more recent data, so the backtest is not
# strictly out-of-sample.
gamma_backtest  <- gamma_avg
lambda_backtest <- lambda_avg

# ── Project from 2019 → 2022, 2023, 2024 ─────────────────────────────────────
current_bt <- pop_2019 |>
  st_drop_geometry() |>
  select(state_name, state_abbr, total_population)

backtest_projections <- list()

for (yr in 2020:2024) {
  current_bt <- project_one_year_ec03(
    current_bt,
    state_gamma |> mutate(gamma_i = NA),  # use national gamma
    gamma_backtest,
    lambda_backtest
  )
  if (yr >= 2022) {
    backtest_projections[[as.character(yr)]] <-
      current_bt |> mutate(year = yr, type = "Predicted")
  }
}

pop_backtest <- bind_rows(backtest_projections)

# ── Compare to realized populations ──────────────────────────────────────────
realized <- state_pop |>
  st_drop_geometry() |>
  filter(year %in% c(2022, 2023, 2024)) |>
  select(state_name, state_abbr, total_population, year) |>
  mutate(type = "Realized")

validation <- bind_rows(pop_backtest, realized) |>
  pivot_wider(
    id_cols     = c(state_name, state_abbr, year),
    names_from  = type,
    values_from = total_population
  ) |>
  mutate(
    error     = Predicted - Realized,
    abs_error = abs(error),
    pct_error = 100 * error / Realized
  ) |>
  filter(!is.na(Predicted), !is.na(Realized))

Code

# Summary of prediction errors by horizon
validation |>
  group_by(year) |>
  summarise(
    horizon         = first(year) - 2019,
    mean_abs_error  = mean(abs_error, na.rm = TRUE),
    median_pct_error = median(abs(pct_error), na.rm = TRUE),
    max_abs_error   = max(abs_error, na.rm = TRUE),
    worst_state     = state_name[which.max(abs_error)],
    .groups         = "drop"
  ) |>
  select(year, horizon, mean_abs_error,
         median_pct_error, max_abs_error, worst_state) |>
  gt() |>
  tab_header(
    title    = md("**Backtest Prediction Errors by Forecast Horizon**"),
    subtitle = "Model fitted on 2019 data; compared to realized 2022-2024 populations"
  ) |>
  cols_label(
    year             = "Forecast Year",
    horizon          = "Years Ahead",
    mean_abs_error   = "Mean Absolute Error",
    median_pct_error = "Median % Error",
    max_abs_error    = "Max Absolute Error",
    worst_state      = "Worst Predicted State"
  ) |>
  fmt_integer(c(mean_abs_error, max_abs_error)) |>
  # fmt_number() has no `suffix` argument; `pattern` appends the percent sign
  fmt_number(median_pct_error, decimals = 2, pattern = "{x}%") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  data_color(
    columns = median_pct_error,
    palette = c("#EEF2FF", "#BF0A30")
  ) |>
  tab_source_note(
    "Source: Author backtest using ACS 1-Year Estimates (2019-2024)."
  )

Backtest Prediction Errors by Forecast Horizon
Model fitted on 2019 data; compared to realized 2022-2024 populations
Forecast Year	Years Ahead	Mean Absolute Error	Median % Error	Max Absolute Error	Worst Predicted State
2022	3	167,560	2.33%	1,380,610	Texas
2023	4	217,382	3.06%	1,968,983	Texas
2024	5	329,717	4.49%	2,870,663	Texas
Source: Author backtest using ACS 1-Year Estimates (2019-2024).

Code

validation |>
  mutate(horizon = paste0(year - 2019, "-year horizon (", year, ")")) |>
  ggplot(aes(x = reorder(state_abbr, abs_error),
             y = abs_error / 1000,
             fill = abs_error / 1000)) +
  geom_col(width = 0.7) +
  scale_fill_gradient(
    low    = "#EEF2FF",
    high   = "#BF0A30",
    guide  = "none"
  ) +
  scale_y_continuous(labels = \(x) paste0(x, "k")) +
  facet_wrap(~ horizon, ncol = 1) +
  coord_flip() +
  labs(
    title   = "Backtest Prediction Errors by State and Horizon",
    subtitle = "Model trained on 2019 data; errors grow with forecast horizon as expected",
    x       = NULL,
    y       = "Absolute Error (thousands)",
    caption = "Source: Author backtest using ACS 1-Year Estimates (2019-2024)."
  ) +
  theme_tx() +
  theme(axis.text.y = element_text(size = 7))

Absolute prediction error by state for each forecast horizon. Errors grow with horizon length as expected. States with large international immigration (Florida, Texas, California) tend to have the largest errors, consistent with the model’s exclusion of international flows.

Code

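# Use the backtest errors at the 3-, 4-, and 5-year horizons to estimate a margin
# of error on the 2030 projection. That projection is 6 years ahead of 2024, so we
# extrapolate error growth linearly. The RMSE pools errors across all states, so
# it is a rough margin rather than a Texas-specific one.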

error_by_horizon <- validation |>
  group_by(horizon = year - 2019) |>
  summarise(
    rmse = sqrt(mean(error^2, na.rm = TRUE)),
    .groups = "drop"
  )

# Fit a simple linear model of RMSE ~ horizon
error_model <- lm(rmse ~ horizon, data = error_by_horizon)
projected_rmse_6yr <- predict(error_model,
                               newdata = data.frame(horizon = 6))

tx_2030_proj <- pop_projections |>
  filter(year == 2030, state_name == "Texas") |>
  pull(total_population)

tibble(
  metric = c(
    "Texas projected 2030 population",
    "Estimated margin of error (±1 RMSE, 6-year horizon)",
    "90% confidence interval lower bound",
    "90% confidence interval upper bound"
  ),
  value = c(
    format(round(tx_2030_proj), big.mark = ","),
    format(round(projected_rmse_6yr), big.mark = ","),
    format(round(tx_2030_proj - 1.645 * projected_rmse_6yr), big.mark = ","),
    format(round(tx_2030_proj + 1.645 * projected_rmse_6yr), big.mark = ",")
  )
) |>
  gt() |>
  tab_header(
    title    = md("**Margin of Error on 2030 Texas Population Projection**"),
    subtitle = "Based on backtest RMSE extrapolated to 6-year forecast horizon"
  ) |>
  cols_label(metric = "Metric", value = "Value") |>
  tab_style(
    style = list(
      cell_fill(color = "#002868"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_column_labels()
  ) |>
  tab_style(
    style     = cell_fill(color = "#EEF2FF"),
    locations = cells_body(rows = c(1, 3, 4))
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "#BF0A30"),
      cell_text(color = "white", weight = "bold")
    ),
    locations = cells_body(rows = 2)
  ) |>
  tab_source_note(
    "RMSE extrapolated linearly from 3-year backtest errors.
     Confidence interval assumes normally distributed errors."
  )

Margin of Error on 2030 Texas Population Projection
Based on backtest RMSE extrapolated to 6-year forecast horizon
Metric	Value
Texas projected 2030 population	30,543,128
Estimated margin of error (±1 RMSE, 6-year horizon)	748,553
90% confidence interval lower bound	29,311,758
90% confidence interval upper bound	31,774,498
RMSE extrapolated linearly from 3-year backtest errors. Confidence interval assumes normally distributed errors.
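
The two steps behind this table, extrapolating RMSE to a longer horizon and turning it into an interval, can be sketched in a few lines. The RMSE values in the first step are hypothetical placeholders, not the report's backtest results; the second step uses only the figures reported in the table, and the `toy_` names are illustrative.

```r
# Step 1: linear extrapolation of RMSE to a 6-year horizon.
# These RMSE values are hypothetical placeholders chosen to lie on a straight line.
toy_errors <- data.frame(horizon = c(3, 4, 5),
                         rmse    = c(300000, 400000, 500000))
toy_model  <- lm(rmse ~ horizon, data = toy_errors)
toy_rmse_6 <- unname(predict(toy_model, newdata = data.frame(horizon = 6)))

# Step 2: rebuild the interval bounds from the table's own reported figures.
tx_2030  <- 30543128   # Texas projected 2030 population (from the table)
rmse_6yr <- 748553     # extrapolated 6-year RMSE (from the table)
z_90     <- 1.645      # normal multiplier used in the report for a 90% interval

lower <- round(tx_2030 - z_90 * rmse_6yr)
upper <- round(tx_2030 + z_90 * rmse_6yr)
format(c(lower, upper), big.mark = ",")
```

With only three horizons available, the linear fit is a rough guide to how error grows, and the normal multiplier assumes the errors are approximately normally distributed, as the source note states.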

Extra Credit #01: The Case for Expanding the House

The United States House of Representatives has been fixed at 435 members since the Permanent Apportionment Act of 1929, when the most recent census (1920) had counted about 106 million people. The United States is now home to over 335 million people. The result is a chamber where each member represents, on average, nearly 770,000 constituents, giving the US House one of the highest constituent-to-member ratios among wealthy democracies. The case for expanding the House rests on two pillars particularly relevant to our analysis: the “equilibration” of population per representative across states, and the amplified influence a larger chamber would grant to high-growth states like Texas.

Equilibrating Representation Across States

The most glaring inequity in the current system is the wide variation in population per representative across states. Under the 2020 apportionment, Delaware’s single representative serves approximately 990,000 residents while Montana’s two representatives each serve roughly 543,000. This disparity (a ratio of about 1.8:1) is a direct mathematical consequence of the constitutional guarantee of at least one seat per state combined with the need to allocate whole seats under a fixed 435-seat cap. With a larger House, each seat covers fewer people, so the rounding to whole seats under the Huntington-Hill priority formula leaves smaller residual inequities. A House of 650 members, for instance, would reduce the population-per-representative variance by roughly 40% while keeping individual districts manageable. This change could be made by ordinary legislation amending the 1929 Act, rather than by constitutional amendment.³
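
To make the priority formula concrete, here is a minimal sketch of Huntington-Hill apportionment in base R. The function name `huntington_hill` and the three toy populations are assumptions for illustration; this is not the implementation used elsewhere in the report.

```r
# Minimal Huntington-Hill apportionment (illustrative sketch).
# `populations` is a named numeric vector; `house_size` is the total seat count.
huntington_hill <- function(populations, house_size) {
  stopifnot(house_size >= length(populations))
  seats <- rep(1L, length(populations))   # every state starts with one seat
  names(seats) <- names(populations)
  while (sum(seats) < house_size) {
    # Priority value: population divided by the geometric mean of the
    # state's current seat count and its next seat count
    priority <- populations / sqrt(seats * (seats + 1))
    winner <- which.max(priority)          # highest priority gets the next seat
    seats[winner] <- seats[winner] + 1L
  }
  seats
}

# Hypothetical populations, not real state data
toy_pop <- c(A = 9000000, B = 3000000, C = 1000000)
huntington_hill(toy_pop, 13)
```

With these toy inputs the 13 seats split 9, 3, and 1, matching the 9:3:1 population ratio exactly. Real state populations rarely divide so cleanly, which is where the residual inequities discussed above come from.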

Amplified Power for Texas

For Texas specifically, a larger House would be beneficial. Because Texas is far above the one-seat minimum, its delegation under the Huntington-Hill method scales roughly in proportion to House size. Our projections show Texas at risk of losing its 44th seat under a 435-seat House. Under a 650-seat House using the same 2030 population projections, Texas would be allocated approximately 66 seats (a net gain of 22), compared to a gain of roughly 15 for California (which currently has 52 seats). The difference reflects Texas’s faster recent growth: Huntington-Hill awards each successive seat to the state with the highest priority value, which depends on population relative to seats already assigned, so a state whose share of the national population has risen claims a larger share of the seats created by House expansion. In short, expanding the House is one of the few structural reforms that simultaneously improves democratic representation and benefits Texas’s political position.
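
As a back-of-the-envelope check on the Texas figure, proportional scaling of a 44-seat delegation from a 435-seat to a 650-seat House can be computed directly. This is only an approximation under the assumption that Texas's population share is unchanged; it is not a Huntington-Hill run on the projected populations, and the variable names are illustrative.

```r
# Proportional scaling of Texas's delegation with House size (approximation only)
tx_seats_435 <- 44    # Texas seats used as the baseline in this report
house_now    <- 435
house_big    <- 650

tx_seats_650 <- tx_seats_435 * house_big / house_now
round(tx_seats_650)                 # approximate seats in a 650-member House
round(tx_seats_650) - tx_seats_435  # approximate net gain
```

The result rounds to 66 seats, a gain of 22, in line with the allocation reported above.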

Extra Credit #02: Spatial Visualizations

Maps and diagrams reveal patterns that tables and numbers alone cannot. The following visualizations show where migration is happening geographically: which states are gaining and losing residents, how congressional seats are shifting, and which migration corridors connect Texas most strongly to the rest of the country.

Map 1: Net Domestic Migration by State, 2024

Code

# Join migration totals to state geometries
map_migration <- state_pop |>
  filter(year == 2024) |>
  left_join(
    state_migration_totals |>
      select(state, net_migration),
    by = c("state_name" = "state")
  )

ggplot(map_migration) +
  geom_sf(aes(fill = net_migration / 1000), colour = "white", linewidth = 0.3) +
  scale_fill_gradient2(
    low      = "#BF0A30",
    mid      = "#F5F5F5",
    high     = "#002868",
    midpoint = 0,
    name     = "Net Migration\n(thousands)",
    labels   = \(x) paste0(ifelse(x > 0, "+", ""), x, "k")
  ) +
  labs(
    title   = "Net Domestic Migration by State, 2024",
    subtitle = "Blue = net in-migration; Red = net out-migration",
    caption = "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  ) +
  theme_void() +
  theme(
    plot.title    = element_text(face = "bold", size = 14,
                                 colour = texas_palette["blue"],
                                 margin = margin(b = 4)),
    plot.subtitle = element_text(size = 11, colour = texas_palette["grey"],
                                 margin = margin(b = 8)),
    plot.caption  = element_text(size = 8, colour = texas_palette["grey"],
                                 hjust = 0),
    legend.position  = "right",
    legend.title     = element_text(size = 9, face = "bold"),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

Net domestic migration by state, 2024. Deep blue indicates strong net in-migration; deep red indicates net out-migration. Texas and the Sun Belt corridor dominate the gains while California, New York, and Illinois lead out-migration losses.

Map 2: Projected Congressional Seat Changes, 2030

Code

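# Join apportionment changes to state geometries.
# Note: the factor() call below only labels changes of -1, 0, and +1; any other
# value becomes NA and is drawn in the same grey as "No change".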
map_apportionment <- state_pop |>
  filter(year == 2024) |>
  left_join(
    apportionment_change |>
      select(state_name, seat_change),
    by = "state_name"
  )

ggplot(map_apportionment) +
  geom_sf(aes(fill = factor(seat_change,
                             levels = c(-1, 0, 1),
                             labels = c("−1 seat", "No change", "+1 seat"))),
          colour = "white", linewidth = 0.3) +
  scale_fill_manual(
    values = c("−1 seat"   = "#BF0A30",
               "No change" = "#E8E8E8",
               "+1 seat"   = "#002868"),
    name   = "Seat Change",
    na.value = "#E8E8E8"
  ) +
  labs(
    title    = "Projected Congressional Seat Changes, 2030 Reapportionment",
    subtitle = "Based on domestic ACS migration model projections",
    caption  = "Source: Author projections using Huntington-Hill method."
  ) +
  theme_void() +
  theme(
    plot.title    = element_text(face = "bold", size = 14,
                                 colour = texas_palette["blue"],
                                 margin = margin(b = 4)),
    plot.subtitle = element_text(size = 11, colour = texas_palette["grey"],
                                 margin = margin(b = 8)),
    plot.caption  = element_text(size = 8, colour = texas_palette["grey"],
                                 hjust = 0),
    legend.position  = "right",
    legend.title     = element_text(size = 9, face = "bold"),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

Projected congressional seat changes under the 2030 reapportionment. Texas and Virginia lose one seat each. The vast majority of states retain their current delegation.

Chord Diagram: Top Migration Corridors Into and Out of Texas

The chord diagram below shows the 15 largest migration flows involving Texas in 2024. Each chord connects an origin state to a destination state, with width proportional to the number of people moving. Texas’s dominant corridors (California, Florida, and New York) are immediately apparent, as is the bidirectional nature of the California-Texas relationship.

Code

# Build edge list for flows involving Texas
texas_flows <- migration_flows |>
  filter(
    year_current == 2024,
    state_current != state_1y,
    (state_current == "Texas" | state_1y == "Texas")
  ) |>
  mutate(from = state_1y, to = state_current) |>
  select(from, to, population) |>
  arrange(desc(population)) |>
  slice_head(n = 15)

all_states <- unique(c(texas_flows$from, texas_flows$to))

g <- graph_from_data_frame(
  d        = texas_flows,
  vertices = data.frame(name = all_states),
  directed = TRUE
)

ggraph(g, layout = "linear", circular = TRUE) +
  geom_edge_arc(
    aes(edge_width = population / 1000,
        edge_alpha = population / max(population)),
    colour      = "#002868",
    show.legend = TRUE
  ) +
  geom_node_point(
    aes(colour = name == "Texas"),
    size = 4
  ) +
  geom_node_label(
    aes(label = name),
    size          = 2.5,
    label.padding = unit(0.15, "lines"),
    fill          = "white",
    colour        = texas_palette["grey"]
  ) +
  scale_edge_width(
    range  = c(0.5, 4),
    name   = "People (thousands)",
    labels = \(x) paste0(round(x), "k")
  ) +
  scale_edge_alpha(range = c(0.3, 0.9), guide = "none") +
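  scale_colour_manual(
    # [[ ]] drops the palette names; with [ ] the values would be named
    # "TRUE.blue" / "FALSE.grey" and would not match the TRUE/FALSE levels
    values = c("TRUE"  = texas_palette[["blue"]],
               "FALSE" = texas_palette[["grey"]]),
    guide  = "none"
  ) +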
  labs(
    title    = "Top 15 Migration Flows Involving Texas, 2024",
    subtitle = "Chord width proportional to number of people moving; Texas highlighted in blue",
    caption  = "Source: ACS 1-Year State-to-State Migration Flow Tables (2024)."
  ) +
  theme_graph(base_family = "sans") +
  theme(
    plot.title      = element_text(face = "bold", size = 14,
                                   colour = texas_palette["blue"]),
    plot.subtitle   = element_text(size = 10,
                                   colour = texas_palette["grey"]),
    plot.caption    = element_text(size = 8,
                                   colour = texas_palette["grey"],
                                   hjust = 0),
    legend.position = "bottom"
  )

Top 15 migration flows involving Texas, 2024. Chord width is proportional to the number of people moving. Texas is shown in blue; other states in grey. The California corridor dominates in both directions.

AI Usage Statement

Generative AI (Claude, Anthropic) was used in this mini-project exclusively for assistance with R code, specifically for debugging data parsing issues with the Census Bureau Excel files, troubleshooting the population projection model, and refining ggplot2 and gt table formatting. All written narrative, analytical interpretations, and conclusions are my own. No AI was used to write or edit any non-code text in this report.

Footnotes

1. The population projection model and formula were specified by the course instructor as part of the Mini-Project #03 assignment. Our contribution is in fitting the parameters (γ and λ) to the ACS data, implementing the projection in R (both with the use of AI), and interpreting the results.
2. The Huntington-Hill method is the official apportionment algorithm used for the US House since 1941. It was developed by mathematician Edward V. Huntington and Census Bureau statistician Joseph A. Hill to minimize the relative difference in representation between states. It is mandated by federal law (2 U.S.C. § 2a) and was used most recently following the 2020 census. Its use here was specified by the course instructor as part of the assignment.
3. The 435-seat limit derives from the Reapportionment Act of 1929, Pub. L. 71-13, codified at 2 U.S.C. § 2a. The Constitution (Article I, Section 2) requires only that seats be apportioned among states following each census; it does not specify the total number of seats.