Discussing MMR vaccines with an LLM after a brief content review increases vaccination intent among hesitant parents

Analysis of a Preregistered Two‑Arm Active Control Online RCT, with a Durability Check

Author

Scott J. Forman

Published

September 19, 2025

Abstract

We conducted a preregistered two‑arm online RCT (N = 180 MMR‑hesitant U.S. parents of young children) comparing a timer‑gated four‑panel MMR content carousel followed by an LLM‑guided conversation against a structure‑matched active control on an unrelated topic (car‑seat safety). An ANCOVA on post‑intervention intention, controlling for baseline intention, estimates an adjusted arm effect of β̂ ≈ 1.03 points (95% CI 0.72, 1.34) on a 1–7 vaccination‑intention scale. An exploratory delayed follow‑up on a separate sample (N = 66) yields a similar estimate of β̂ ≈ 1.09 (95% CI 0.52, 1.66). Conclusion: a brief intervention combining persuasive content with an LLM conversation significantly increases MMR intention relative to control, with signs of durability over a period of several days.

Preregistration

OSF preregistration: Using Conversational AI to Support Parental MMR Decision‑Making: An Active‑Control Randomized Trial
Deviations: (i) An initial outcome‑page failure prevented immediate post‑intent collection for a subset; those participants were later contacted in a rescue follow‑up. Confirmatory inference uses only the clean re‑run per preregistration; the rescue durability analysis is exploratory. (ii) Compensation increased across batches to maintain enrollment ($2.50 → $3.50 → $4.50) without changing the protocol or analysis plan.
Confirmatory set and exclusions: The preregistration planned hard exclusions for obvious bots or exposure failures. In practice, no participant could be unambiguously classified as either, so the confirmatory set includes all randomized participants who completed the study. We report a sensitivity analysis excluding two borderline cases; results are unchanged.

Methods

Recruitment: U.S.-resident parents of at least one child born in 2019 or later, who indicated less than complete confidence in vaccine safety, and who had not participated in one of our previous studies, were recruited via Prolific.
Flow: After consenting, participants were shown a mock‑appointment page with two buttons: “I have questions or concerns about MMR” and “No concerns about the MMR vaccine.” Participants who clicked “No concerns about the MMR vaccine” were screened out and awarded a small payment. Participants who clicked “I have questions or concerns about MMR” were asked to imagine an upcoming appointment with a pediatric medical provider, and indicate on a scale of 1-7 the likelihood that they would have their child receive a dose of the MMR vaccine at that visit. Those at the ceiling (7) exited and were granted a small payment. Those with baseline ≤ 6 were randomized 1:1 to Experimental or Control, and completed a short pre-intervention demographic survey. All randomized participants saw the matched carousel and interactive chat segment, then answered the vaccination intent question a second time.
Structure: Both study arms used the same interface and engagement rules. Participants first saw a brief, scrollable information carousel, followed by an interactive conversation segment. The only difference between arms was the content topic: MMR vaccine (Experimental) versus car‑seat safety (Active Control).
Outcome: Post‑intervention MMR intention (1–7). The primary analysis adjusted for baseline intention.
Primary model: ANCOVA post_intent ~ arm_coded + pre_intent with HC3 robust SEs on participants with baseline head‑room (≤ 6) who met preregistered engagement rules. We retained all randomized participants; two cases appeared borderline low‑quality and were excluded in a sensitivity check, which did not materially change results.
A separate exploratory “durability” analysis used delayed post‑intent from the “rescue” dataset.
Batches: Recruitment proceeded in three batches with rising compensation to maintain enrollment when it slowed: Batch 2B at $2.50, Batch 2B2 at $3.50, and Batch 2B3 at $4.50. Each relaunch followed the same protocol and analysis plan.
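
The screening and allocation logic described under Flow can be sketched in a few lines of R. This is an illustrative sketch only: the helper names route_participant and assign_arm are hypothetical, the "eligible" label is ours (completers are logged as "complete" in the exit paths), and the study's actual allocation mechanism is not shown in this report.

```r
# Hypothetical helper: classify one participant by the screening rules in Methods.
# `has_concerns` is the mock-appointment button choice; `baseline` is the 1-7 intent.
route_participant <- function(has_concerns, baseline = NA_real_) {
  if (!has_concerns) return("confirm")   # "No concerns" button: screened out
  stopifnot(!is.na(baseline), baseline >= 1, baseline <= 7)
  if (baseline == 7) return("ceiling")   # no head-room: exits before randomization
  "eligible"                             # baseline <= 6: proceeds to randomization
}

# Simple balanced 1:1 allocation for n eligible participants (assumption:
# a shuffled alternating sequence; the real mechanism may differ).
assign_arm <- function(n, seed = 1) {
  set.seed(seed)
  sample(rep(c("control", "mmr"), length.out = n))
}

paths <- c(
  route_participant(FALSE),
  route_participant(TRUE, 7),
  route_participant(TRUE, 4)
)
arms <- assign_arm(10)
print(paths)
print(table(arms))
```

The arm labels "control" and "mmr" match the assigned_condition values used in the import code.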

LLM Settings

All LLM conversations were powered by Claude 4.0 Sonnet via the Anthropic API (model: claude-sonnet-4-20250514). Generation settings: temperature = 1, max_tokens = 4096, thinking_enabled = TRUE, thinking_budget = 1024. Prompts followed an identical motivational‑interviewing style; only the topic‑specific elements of the prompts differed by arm.
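
As a rough illustration of how these settings might map onto a Messages API request, the sketch below builds a request body as an R list. It assumes that the report's thinking_enabled and thinking_budget correspond to the API's thinking object (type and budget_tokens). The helper name build_request and the placeholder prompt strings are hypothetical; the study's actual client code is not shown in this report.

```r
# Sketch of a request body as an R list (not the study's code).
# The system prompt and user text are placeholders.
build_request <- function(system_prompt, user_text) {
  list(
    model = "claude-sonnet-4-20250514",
    max_tokens = 4096,
    temperature = 1,
    # Assumed mapping of thinking_enabled = TRUE, thinking_budget = 1024
    thinking = list(type = "enabled", budget_tokens = 1024),
    system = system_prompt,
    messages = list(list(role = "user", content = user_text))
  )
}

req <- build_request("<arm-specific system prompt>", "Hello")

# The thinking budget has to fit inside the overall token limit.
stopifnot(req$thinking$budget_tokens < req$max_tokens)
str(req, max.level = 1)
```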

Data Import

Here we load all data sources used in the analysis.

Main RCT‑2 Analysis Views

freeze_files <- vapply(params$main_freeze_dirs, resolve_latest_csv, FUN.VALUE = character(1))
load_tbl_main <- tibble(batch = vapply(freeze_files, infer_batch_label, FUN.VALUE = character(1)),
                        file  = freeze_files)
knitr::kable(load_tbl_main, tbl_fmt, caption = "Selected analysis‑view files (Main RCT‑2)") %>% style_tbl()

Selected analysis‑view files (Main RCT‑2)
batch	file
2B	/Users/scott/Projects/verum-analysis/experiments/45ff14/data_freezes/experiment_analysis_view_exp-45ff14_20250910-153854.csv
2B2	/Users/scott/Projects/verum-analysis/experiments/5bb2fd/data_freezes/experiment_analysis_view_exp-5bb2fd_20250910-153904.csv
2B3	/Users/scott/Projects/verum-analysis/experiments/8bb589/data_freezes/experiment_analysis_view_exp-8bb589_20250910-095801.csv

read_main <- function(fpath) {
  df <- readr::read_csv(fpath, show_col_types = FALSE)
  df$batch <- infer_batch_label(fpath)
  if ("prolific_pid" %in% names(df)) {
    df <- df %>% mutate(pid_hash = hash_pid(prolific_pid)) %>% select(-prolific_pid)
  } else if (!"pid_hash" %in% names(df)) {
    df$pid_hash <- NA_character_
  }
  df %>% mutate(
    pre_intent  = suppressWarnings(as.numeric(pre_intent)),
    post_intent = suppressWarnings(as.numeric(post_intent)),
    arm_fct = factor(assigned_condition, levels = c("control", "mmr"), labels = c("Control", "Treatment")),
    arm_coded = dplyr::if_else(assigned_condition == "mmr", 1, 0, missing = NA_real_),
    time_seconds = suppressWarnings(as.numeric(time_seconds)),
    chat_turns = suppressWarnings(as.numeric(chat_turns)),
    chat_user_chars = suppressWarnings(as.numeric(chat_user_chars))
  )
}

trial_main_raw <- purrr::map_dfr(freeze_files, read_main)

trial_main <- trial_main_raw %>%
  select(pid_hash, user_id, batch, arm_fct, arm_coded,
         pre_intent, post_intent,
         time_seconds, chat_turns, chat_user_chars)

coverage_main <- tibble(
  N_rows = nrow(trial_main),
  N_with_pre  = sum(!is.na(trial_main$pre_intent)),
  N_with_post = sum(!is.na(trial_main$post_intent))
)
knitr::kable(coverage_main, tbl_fmt, caption = "Main freezes: coverage of pre/post intention") %>% style_tbl()

Main freezes: coverage of pre/post intention
N_rows	N_with_pre	N_with_post
180	180	180

These denormalized analysis views are the batch‑level source used for the confirmatory models (one row per randomized participant, with assigned condition, pre/post intention, engagement metrics, and batch labels).

Durability (Rescue) Files

freeze_file_rescue <- here::here(params$durability_freeze_dir, params$durability_freeze_csv)
prolific_file_rescue <- here::here(params$durability_freeze_dir, params$durability_prolific_csv)
knitr::kable(tibble(file_role=c("rescue_analysis_view","rescue_prolific_csv"), file=c(freeze_file_rescue, prolific_file_rescue)), tbl_fmt, caption = "Durability input files") %>% style_tbl()

Durability input files
file_role	file
rescue_analysis_view	/Users/scott/Projects/verum-analysis/experiments/ab8086/data_freezes/experiment_analysis_view_exp-ab8086_nopost.csv
rescue_prolific_csv	/Users/scott/Projects/verum-analysis/experiments/ab8086/data_freezes/prolific_export_68a63717859761977be4e572.csv

stopifnot(fs::file_exists(freeze_file_rescue))
stopifnot(fs::file_exists(prolific_file_rescue))

freeze_df <- readr::read_csv(freeze_file_rescue, show_col_types = FALSE)
if ("prolific_pid" %in% names(freeze_df)) {
  freeze_df <- freeze_df %>% mutate(pid_hash = hash_pid(prolific_pid))
} else {
  freeze_df$pid_hash <- NA_character_
}

freeze_df <- freeze_df %>%
  mutate(
    pre_intent  = suppressWarnings(as.numeric(pre_intent)),
    arm_fct = factor(assigned_condition, levels = c("control", "mmr"), labels = c("Control", "Treatment")),
    arm_coded = dplyr::if_else(assigned_condition == "mmr", 1, 0, missing = NA_real_)
  )

prol_raw_rescue <- readr::read_csv(prolific_file_rescue, show_col_types = FALSE)
# Assumes the delayed intent question is the last column of the Prolific export
intent_col <- names(prol_raw_rescue)[ncol(prol_raw_rescue)]

prol_rescue <- prol_raw_rescue %>%
  rename(
    prolific_pid = `Participant id`,
    prolific_completed_at = `Completed at`
  ) %>%
  mutate(
    rescue_intent_raw = .data[[intent_col]],
    prolific_completed_at = lubridate::ymd_hms(prolific_completed_at, quiet = TRUE),
    pid_hash = hash_pid(prolific_pid),
    intent_post_rescue = suppressWarnings(as.numeric(stringr::str_extract(as.character(rescue_intent_raw), "[1-7]")))
  ) %>%
  select(pid_hash, intent_post_rescue, prolific_completed_at)

trial_rescue <- freeze_df %>%
  left_join(prol_rescue, by = "pid_hash") %>%
  mutate(
    intervention_completed_at_parsed = lubridate::ymd_hms(intervention_completed_at, quiet = TRUE),
    days_delay = as.numeric(difftime(prolific_completed_at, intervention_completed_at_parsed, units = "days"))
  ) %>%
  transmute(
    pid_hash,
    user_id,
    arm_fct,
    arm_coded,
    intent_pre = pre_intent,
    intent_post_rescue,
    time_seconds, chat_turns, chat_user_chars,
    prolific_completed_at,
    intervention_completed_at = intervention_completed_at_parsed,
    days_delay
  )

coverage_rescue <- tibble(
  N_rows = nrow(trial_rescue),
  N_with_pre  = sum(!is.na(trial_rescue$intent_pre)),
  N_with_post_rescue = sum(!is.na(trial_rescue$intent_post_rescue))
)
knitr::kable(coverage_rescue, tbl_fmt, caption = "Rescue set: coverage of pre and delayed post‑intent") %>% style_tbl()

Rescue set: coverage of pre and delayed post‑intent
N_rows	N_with_pre	N_with_post_rescue
90	90	66

Rescue follow-up response by assigned arm
arm_fct	N	Responded	Rate
Control	42	34	81.0%
Treatment	48	32	66.7%

The rescue analysis_view contains baseline and process fields from the affected run; the Prolific export holds delayed post‑intent collected later. We link them by hashed Prolific IDs and compute the delay in days between intervention completion and the post-intervention intent data collection.
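
The linkage and delay computation described above can be illustrated with base R on toy data. All values below are made up, and the hashing step is omitted (the report's hash_pid helper is not shown), so the toy pid_hash strings are stand-ins.

```r
# Toy stand-ins for the two rescue sources (all values are invented).
freeze_toy <- data.frame(
  pid_hash = c("a1", "b2", "c3"),
  intervention_completed_at = c("2025-08-20 12:00:00",
                                "2025-08-20 15:30:00",
                                "2025-08-21 09:00:00"),
  stringsAsFactors = FALSE
)
prolific_toy <- data.frame(
  pid_hash = c("a1", "c3"),
  prolific_completed_at = c("2025-08-24 00:00:00", "2025-08-23 09:00:00"),
  intent_post_rescue = c(5, 3),
  stringsAsFactors = FALSE
)

# Left join: non-responders (b2) are kept with NA follow-up fields.
linked <- merge(freeze_toy, prolific_toy, by = "pid_hash", all.x = TRUE)

# Delay in days between intervention completion and the delayed intent response.
linked$days_delay <- as.numeric(difftime(
  as.POSIXct(linked$prolific_completed_at, tz = "UTC"),
  as.POSIXct(linked$intervention_completed_at, tz = "UTC"),
  units = "days"
))
print(linked[, c("pid_hash", "intent_post_rescue", "days_delay")])
```

Keeping non-responders in the joined table is what allows the response rate by arm to be computed from the same object.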

Prolific Exports

prolific_files <- vapply(params$main_freeze_dirs, resolve_latest_prolific, FUN.VALUE = character(1))
prolific_tbl <- tibble(batch = vapply(prolific_files, infer_batch_label, FUN.VALUE = character(1)),
                      file  = prolific_files)
knitr::kable(prolific_tbl, tbl_fmt, caption = "Resolved Prolific exports (latest by dir)") %>% style_tbl()

Resolved Prolific exports (latest by dir)
batch	file
2B	/Users/scott/Projects/verum-analysis/experiments/45ff14/data_freezes/prolific_export_68a76e72f871f463046d251c.csv
2B2	/Users/scott/Projects/verum-analysis/experiments/5bb2fd/data_freezes/prolific_export_68b21405c16a8fdf41af130d.csv
2B3	/Users/scott/Projects/verum-analysis/experiments/8bb589/data_freezes/prolific_export_68b8b8dffb63864bf9db9ea6.csv

read_prolific <- function(fpath) {
  if (is.na(fpath) || !fs::file_exists(fpath)) return(NULL)
  df <- readr::read_csv(fpath, show_col_types = FALSE)
  df$batch <- infer_batch_label(fpath)
  df %>% transmute(
    batch,
    submission_id = `Submission id`,
    prolific_pid = `Participant id`,
    status = Status,
    started_at = `Started at`,
    completed_at = `Completed at`,
    time_taken_s = suppressWarnings(as.numeric(`Time taken`)),
    completion_code = `Completion code`
  )
}

prolific_raw <- purrr::map(prolific_files, read_prolific) %>% purrr::compact() %>% bind_rows()

# Note: factor() coerces any status string not listed in status_levels to NA;
# such rows are tabulated under NA rather than "MISSING".
status_levels <- c("RETURNED","SCREENED OUT","TIMED OUT","REJECTED","APPROVED","MISSING")
prolific_clean <- prolific_raw %>%
  mutate(status = ifelse(is.na(status) | status == "", "MISSING", status),
         status = factor(status, levels = status_levels),
         batch = factor(batch, levels = c("2B","2B2","2B3")))

counts_long <- prolific_clean %>% count(status, batch, name = "N")
batch_totals <- counts_long %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop")

# Compute row-wise percentages in long form, then pivot to labels
counts_long_labeled <- counts_long %>%
  left_join(batch_totals, by = "batch") %>%
  mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_),
         label = sprintf("%d (%.1f%%)", N, pct)) %>%
  select(status, batch, label)

# Numeric wide for totals, labeled wide for display
counts_wide_num <- counts_long %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(status)
counts_wide_lbl <- counts_long_labeled %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") %>% arrange(status)

counts_wide_lbl$Total <- rowSums(counts_wide_num %>% select(any_of(levels(prolific_clean$batch))))

# Total row
grand <- tibble(status = "Total")
for (b in levels(prolific_clean$batch)) {
  bt <- batch_totals$batch_total[batch_totals$batch == b]
  grand[[b]] <- sprintf("%d (100.0%%)", bt)
}
grand$Total <- sum(counts_wide_lbl$Total)
fmt_tbl <- bind_rows(counts_wide_lbl, grand)

knitr::kable(fmt_tbl, tbl_fmt, caption = "Prolific statuses by batch with totals (N and % of batch total)") %>% style_tbl()

Prolific statuses by batch with totals (N and % of batch total)
status	2B	2B2	2B3	Total
RETURNED	89 (30.1%)	41 (20.8%)	32 (19.9%)	162
SCREENED OUT	122 (41.2%)	88 (44.7%)	77 (47.8%)	287
REJECTED	2 (0.7%)	5 (2.5%)	4 (2.5%)	11
APPROVED	76 (25.7%)	57 (28.9%)	45 (28.0%)	178
NA	7 (2.4%)	6 (3.0%)	3 (1.9%)	16
Total	296 (100.0%)	197 (100.0%)	161 (100.0%)	654

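These Prolific exports summarize recruitment statuses by batch (e.g., APPROVED, RETURNED). Statuses with zero counts are omitted, and the NA row collects submissions whose status string is not among the levels listed in the code. The exports are used for flow and recruitment descriptives only, not for outcome modeling.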

Exit Paths

resolve_exit_paths <- function(dir_path) {
  fp <- here::here(dir_path, "exit_paths.csv")
  if (fs::file_exists(fp)) fp else NA_character_
}
exit_files <- vapply(params$main_freeze_dirs, resolve_exit_paths, FUN.VALUE = character(1))
knitr::kable(tibble(batch = vapply(exit_files, infer_batch_label, FUN.VALUE = character(1)), file = exit_files), tbl_fmt, caption = "Resolved exit_paths.csv per batch") %>% style_tbl()

Resolved exit_paths.csv per batch
batch	file
2B	/Users/scott/Projects/verum-analysis/experiments/45ff14/data_freezes/exit_paths.csv
2B2	/Users/scott/Projects/verum-analysis/experiments/5bb2fd/data_freezes/exit_paths.csv
2B3	/Users/scott/Projects/verum-analysis/experiments/8bb589/data_freezes/exit_paths.csv

read_exit <- function(fpath) {
  if (is.na(fpath) || !fs::file_exists(fpath)) return(NULL)
  readr::read_csv(fpath, show_col_types = FALSE) %>% mutate(batch = infer_batch_label(fpath))
}
exit_paths <- purrr::map(exit_files, read_exit) %>% purrr::compact() %>% bind_rows()

if (nrow(exit_paths) > 0) {
  exit_paths <- exit_paths %>%
    mutate(pid_hash = hash_pid(prolific_pid),
           path = factor(completion_pathway, levels = c("confirm","ceiling","complete"))) %>%
    filter(!is.na(path))
  exit_counts <- exit_paths %>% count(path, batch, name = "N")
  exit_totals <- exit_counts %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop")

  # Compute percentages in long form then pivot to labels
  exit_long_labeled <- exit_counts %>%
    left_join(exit_totals, by = "batch") %>%
    mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_),
           label = sprintf("%d (%.1f%%)", N, pct)) %>%
    select(path, batch, label)

  exit_wide_num <- exit_counts %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(path)
  exit_wide_lbl <- exit_long_labeled %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") %>% arrange(path)

  batches_fac <- levels(factor(exit_paths$batch))
  exit_wide_lbl$Total <- rowSums(exit_wide_num %>% select(any_of(batches_fac)))

  grand_e <- tibble(path = "Total")
  for (b in batches_fac) {
    bt <- exit_totals$batch_total[exit_totals$batch == b]
    grand_e[[b]] <- sprintf("%d (100.0%%)", bt)
  }
  grand_e$Total <- sum(exit_wide_lbl$Total)
  fmt_exit <- bind_rows(exit_wide_lbl, grand_e)

  knitr::kable(fmt_exit, tbl_fmt, caption = "Exit pathways by batch (DB): N and % of batch total (confirm → ceiling → complete)") %>% style_tbl()
}

Exit pathways by batch (DB): N and % of batch total (confirm → ceiling → complete)
path	2B	2B2	2B3	Total
confirm	123 (58.6%)	93 (58.5%)	68 (53.1%)	284
ceiling	10 (4.8%)	8 (5.0%)	15 (11.7%)	33
complete	77 (36.7%)	58 (36.5%)	45 (35.2%)	180
Total	210 (100.0%)	159 (100.0%)	128 (100.0%)	497

Exit paths provide the database source‑of‑truth for participant flow through the mock‑appointment step and pre-intervention intent survey (“confirm” [no concerns about MMR], “ceiling” [pre-intervention intent = 7], “complete”). We use these counts for the reach metric and in the CONSORT‑style flow diagram.

Sample Composition

We summarize randomized participants from the analysis view, as counts and percentages within each batch.

by_batch_arm <- trial_main %>% count(batch, arm_fct, name = "N")
by_batch_tot <- by_batch_arm %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop")

# Percentages in long form, then pivot to labels
by_long_lbl <- by_batch_arm %>%
  left_join(by_batch_tot, by = "batch") %>%
  mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_),
         label = sprintf("%d (%.1f%%)", N, pct)) %>%
  select(arm_fct, batch, label)

main_wide_num <- by_batch_arm %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(arm_fct)
main_wide_lbl <- by_long_lbl %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") %>% arrange(arm_fct)

main_wide_lbl$Total <- rowSums(main_wide_num %>% select(-arm_fct))

# Totals row
grand_row <- tibble(arm_fct = "Total")
for (b in unique(by_batch_arm$batch)) {
  denom <- by_batch_tot$batch_total[by_batch_tot$batch == b]
  grand_row[[b]] <- sprintf("%d (100.0%%)", denom)
}
grand_row$Total <- sum(main_wide_lbl$Total)
fmt_main <- bind_rows(main_wide_lbl, grand_row)

knitr::kable(fmt_main, tbl_fmt, caption = "Randomized sample by arm and batch (N and % of batch total)") %>% style_tbl()

Randomized sample by arm and batch (N and % of batch total)
arm_fct	2B	2B2	2B3	Total
Control	38 (49.4%)	29 (50.0%)	22 (48.9%)	89
Treatment	39 (50.6%)	29 (50.0%)	23 (51.1%)	91
Total	77 (100.0%)	58 (100.0%)	45 (100.0%)	180

Demographics

We summarize basic demographics using fields captured in the pre‑survey JSON (age, gender, political ideology).

# Prepare demographics ---------------------------------------------------
trial_demo <- trial_main_raw
# Demographics come from the pre-survey JSON when available; gender is stored under "sex"
if ("pre_json" %in% names(trial_demo)) {
  parsed <- purrr::map(trial_demo$pre_json, safe_parse)
  trial_demo$age      <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "age")
  trial_demo$gender   <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "sex")
  trial_demo$ideology <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "political_leaning")
} else {
  if (!"age" %in% names(trial_demo) && "Age" %in% names(trial_demo)) trial_demo$age <- trial_demo$Age
  if (!"gender" %in% names(trial_demo) && "sex" %in% names(trial_demo)) trial_demo$gender <- trial_demo$sex
  if (!"ideology" %in% names(trial_demo) && "political_ideology" %in% names(trial_demo)) trial_demo$ideology <- trial_demo$political_ideology
}

democat <- function(df, var) {
  if (!var %in% names(df)) return(tibble())
  df %>% filter(!is.na(.data[[var]]), .data[[var]] != "") %>%
    count(.data[[var]], name = "N") %>% arrange(desc(N)) %>%
    mutate(Percent = 100 * N / sum(N), Variable = var) %>%
    rename(Level = !!var)
}

age_tbl  <- democat(trial_demo, "age")
gend_tbl <- democat(trial_demo, "gender")
ideo_tbl <- democat(trial_demo, "ideology")

The sample skews female (68.9%). Ideology leans conservative (51.7%). Ages cluster in 25-34 (51.1%) and 35-44 (34.4%).

Age

Age distribution (counts and %)
Level	N	Percent
25-34	92	51.1%
35-44	62	34.4%
45-54	14	7.8%
18-24	8	4.4%
55-64	4	2.2%

if (nrow(age_tbl) > 0) {
  ggplot(age_tbl, aes(x = reorder(Level, -N), y = N)) +
    geom_col(fill = "#5B8E7D") +
    labs(x = NULL, y = "N") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 20, hjust = 1))
}

Figure: Age distribution (counts). Bar chart of participant age categories; the sample skews to 25–34 and 35–44.

Gender

Gender distribution (counts and %)
Level	N	Percent
female	124	68.9%
male	55	30.6%
prefer-not	1	0.6%

if (nrow(gend_tbl) > 0) {
  ggplot(gend_tbl, aes(x = reorder(Level, -N), y = N)) +
    geom_col(fill = "#7F7F7F") +
    labs(x = NULL, y = "N") +
    theme_minimal()
}

Figure: Gender distribution (counts). Bar chart of participant gender categories.

Political ideology

Political ideology distribution (counts and %)
Level	N	Percent
conservative	93	51.7%
moderate	52	28.9%
liberal	32	17.8%
prefer-not	3	1.7%

if (nrow(ideo_tbl) > 0) {
  ggplot(ideo_tbl, aes(x = reorder(Level, -N), y = N)) +
    geom_col(fill = "#6A3D9A") +
    labs(x = NULL, y = "N") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 20, hjust = 1))
}

Figure: Political ideology distribution (counts). Bar chart of participant political ideology categories.

Overview of Participant Flow

flowchart TD
A[Started study: 654] --> A0[Abandoned: 157]
A --> B[Consented: 497]
B --> S[Screened out: 317]
S --> C[No concerns about the MMR vaccine: 284]
S --> D[Ceiling: 33]
B --> E[Randomized: 180]
E --> A1[Allocated to Control: 89]
E --> A2[Allocated to Treatment: 91]

Among those who reached the mock‑appointment screen (N = 497), 42.9% (95% CI [38.5, 47.3]%) clicked “I have questions or concerns about MMR” (N = 213) and 57.1% (95% CI [52.7, 61.5]%) clicked “No concerns about the MMR vaccine” (N = 284). Within the questions/concerns branch, 15.5% (95% CI [10.9, 21.1]%) selected a baseline of 7 (N = 33) and 84.5% (95% CI [78.9, 89.1]%) had baseline ≤ 6 and proceeded to the intervention (N = 180). Overall, reach to the intervention was 36.2% (95% CI [32.0, 40.6]%) of those at the mock‑appointment step.
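
The flow percentages above can be recomputed from the counts in the exit-path table. The sketch below uses exact (Clopper-Pearson) binomial intervals from base R's binom.test. This is an assumption, since the report does not state which interval method produced the quoted CIs; the helper name prop_ci is hypothetical.

```r
# Percentage with an exact binomial CI (assumed method; not confirmed by the report).
prop_ci <- function(x, n, conf.level = 0.95) {
  bt <- binom.test(x, n, conf.level = conf.level)
  c(pct = 100 * x / n, low = 100 * bt$conf.int[1], high = 100 * bt$conf.int[2])
}

flow <- rbind(
  concerns_of_screen  = prop_ci(213, 497),  # clicked "questions or concerns"
  ceiling_of_concerns = prop_ci(33, 213),   # baseline intent = 7
  reach_of_screen     = prop_ci(180, 497)   # randomized to the intervention
)
print(round(flow, 1))
```

The three counts are linked: 213 = 497 − 284 participants entered the concerns branch, and 180 = 213 − 33 of them had baseline head-room.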

Batch Comparability

We test arm balance across batches and baseline comparability. Arm allocation is balanced across batches (chi-squared p = 0.994); baseline intention is similar across batches (ANOVA p = 0.321); and the estimated arm effect is stable across batches (Arm × Batch interaction p = 0.943 for 2B2 and p = 0.563 for 2B3, each relative to 2B).

trial_bc <- trial_main %>% mutate(batch_fct = factor(batch, levels = sort(unique(batch))))

# 1) Design balance: Arm × Batch counts and test ------------------------
arm_by_batch <- trial_bc %>% count(batch_fct, arm_fct, name = "N") %>% tidyr::pivot_wider(names_from = arm_fct, values_from = N, values_fill = 0)
knitr::kable(arm_by_batch, tbl_fmt, caption = "Arm counts by batch") %>% style_tbl()

Arm counts by batch
batch_fct	Control	Treatment
2B	38	39
2B2	29	29
2B3	22	23

tab <- trial_bc %>% count(batch_fct, arm_fct) %>% tidyr::pivot_wider(names_from = arm_fct, values_from = n, values_fill = 0) %>% select(-batch_fct) %>% as.matrix()
if (all(tab >= 5)) {
  test_arm <- chisq.test(tab)
  test_name <- "Chi-squared"
} else {
  test_arm <- fisher.test(tab)
  test_name <- "Fisher exact"
}
cat(sprintf("Arm~Batch %s p-value: %.3g\n\n", test_name, test_arm$p.value))

Arm~Batch Chi-squared p-value: 0.994

# 2) Baseline comparability across batches ------------------------------
base_by_batch <- trial_bc %>% summarise(N = sum(!is.na(pre_intent)), mean_pre = mean(pre_intent, na.rm=TRUE), sd_pre = sd(pre_intent, na.rm=TRUE), .by = batch_fct)
knitr::kable(base_by_batch, tbl_fmt, digits = 2, caption = "Baseline intention by batch") %>% style_tbl()

Baseline intention by batch
batch_fct	N	mean_pre	sd_pre
2B	77	3.91	1.89
2B2	58	3.43	1.89
2B3	45	3.82	1.83

aov_pre <- aov(pre_intent ~ batch_fct, data = trial_bc)
cat("ANOVA for baseline intention across batches:\n"); print(summary(aov_pre))

ANOVA for baseline intention across batches:

             Df Sum Sq Mean Sq F value Pr(>F)
batch_fct     2    8.0   4.017   1.145  0.321
Residuals   177  621.2   3.509

# 3) Treatment effect stability (Arm × Batch in ANCOVA) -----------------
analysis_bc <- trial_bc %>% filter(!is.na(post_intent), !is.na(pre_intent), pre_intent <= 6) %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5,
         engagement_met = (coalesce(chat_turns, 0) >= 3 & coalesce(chat_points, 0) >= 100)) %>%
  filter(engagement_met)

mod_int <- lm(post_intent ~ pre_intent + arm_coded * batch_fct, data = analysis_bc)
cat("\nANCOVA with Arm × Batch interaction (HC3):\n")


ANCOVA with Arm × Batch interaction (HC3):

print(hc3(mod_int))


t test of coefficients:

                        Estimate Std. Error t value  Pr(>|t|)    
(Intercept)             0.044968   0.186083  0.2417 0.8093330    
pre_intent              1.008784   0.045729 22.0601 < 2.2e-16 ***
arm_coded               0.971629   0.258973  3.7519 0.0002393 ***
batch_fct2B2           -0.040774   0.217222 -0.1877 0.8513262    
batch_fct2B3           -0.122763   0.106576 -1.1519 0.2509601    
arm_coded:batch_fct2B2  0.028674   0.402701  0.0712 0.9433185    
arm_coded:batch_fct2B3  0.201847   0.347880  0.5802 0.5625206    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Confirmatory Analysis

H1 – Arm effect on post‑intervention intention

We estimate the adjusted arm difference using a simple ANCOVA on participants with baseline head‑room (≤ 6) who met preregistered engagement rules, and find that the treatment increased MMR‑intention by about a full point on the 1–7 scale (adjusted for baseline). All participants are retained; two borderline low‑quality cases do not affect results.

post_intent = β0 + β1·arm_coded + β2·pre_intent + ε

arm_coded = 1 for Treatment (MMR chat), 0 for Control.
We report HC3 robust SEs, a 95% CI for β1, and partial η² as an effect‑size.
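
For readers unfamiliar with HC3, the sketch below computes the HC3 covariance matrix by hand in base R on simulated data. The report's own hc3() helper is not shown, so hc3_vcov here is a hypothetical stand-in; its result should agree with sandwich::vcovHC(mod, type = "HC3"), which the report uses for the confidence interval.

```r
# HC3 sandwich estimator: (X'X)^-1 X' diag(e_i^2 / (1 - h_i)^2) X (X'X)^-1
hc3_vcov <- function(mod) {
  X <- model.matrix(mod)
  e <- residuals(mod)
  h <- hatvalues(mod)
  bread <- solve(crossprod(X))
  meat <- crossprod(X * (e / (1 - h)))  # rows of X scaled by e_i / (1 - h_i)
  bread %*% meat %*% bread
}

# Simulated data with unequal error variance by arm (illustration only).
set.seed(42)
n <- 200
toy <- data.frame(
  arm_coded = rep(0:1, each = n / 2),
  pre_intent = sample(1:6, n, replace = TRUE)
)
toy$post_intent <- toy$pre_intent + toy$arm_coded +
  rnorm(n, sd = 1 + 0.5 * toy$arm_coded)

mod <- lm(post_intent ~ arm_coded + pre_intent, data = toy)
V <- hc3_vcov(mod)
se_hc3 <- sqrt(diag(V))

# 95% CI for the arm effect using the residual degrees of freedom
est <- unname(coef(mod)["arm_coded"])
ci <- est + c(-1, 1) * qt(0.975, mod$df.residual) * se_hc3["arm_coded"]
print(c(estimate = est, conf.low = ci[1], conf.high = ci[2]))
```

HC3 inflates each squared residual by 1 / (1 − h)², so high-leverage observations contribute more to the variance estimate than under the classical formula.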

# Construct the single confirmatory analysis set -------------------------
analysis_confirm <- trial_main %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5,
         engagement_met = (coalesce(chat_turns, 0) >= 3 & coalesce(chat_points, 0) >= 100)) %>%
  filter(!is.na(post_intent), !is.na(pre_intent), pre_intent <= 6, engagement_met)

cat(sprintf("Confirmatory sample size: %d\n", nrow(analysis_confirm)))

Confirmatory sample size: 180

stopifnot(nrow(analysis_confirm) > 0)

mod_h1 <- lm(post_intent ~ arm_coded + pre_intent, data = analysis_confirm)

cat("\nHC3 robust coefficients (ANCOVA):\n")


HC3 robust coefficients (ANCOVA):

print(hc3(mod_h1))


t test of coefficients:

              Estimate Std. Error t value  Pr(>|t|)    
(Intercept) -0.0029969  0.1612087 -0.0186    0.9852    
arm_coded    1.0312818  0.1584499  6.5086 7.547e-10 ***
pre_intent   1.0099595  0.0428955 23.5446 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Robust 95% CI for arm effect -----------------------------------------
vc <- sandwich::vcovHC(mod_h1, type = "HC3")
co <- coef(mod_h1)
se <- sqrt(diag(vc))
df <- mod_h1$df.residual
crit <- qt(0.975, df)

ci_tbl <- tibble(
  term = names(co),
  estimate = unname(co),
  std.error = unname(se),
  conf.low = estimate - crit * std.error,
  conf.high = estimate + crit * std.error
) %>% dplyr::filter(term == "arm_coded")

knitr::kable(ci_tbl, tbl_fmt, caption = "Arm effect (Treatment vs Control) with HC3 SEs and 95% CI") %>% style_tbl()

Arm effect (Treatment vs Control) with HC3 SEs and 95% CI
term	estimate	std.error	conf.low	conf.high
arm_coded	1.031282	0.1584499	0.7185877	1.343976

if (requireNamespace("effectsize", quietly = TRUE)) {
  library(effectsize)
  cat("\nPartial Eta^2 (ANCOVA):\n")
  print(effectsize::eta_squared(mod_h1, partial = TRUE))
}


Partial Eta^2 (ANCOVA):
# Effect Size for ANOVA (Type I)

Parameter  | Eta2 (partial) |       95% CI
------------------------------------------
arm_coded  |           0.23 | [0.14, 1.00]
pre_intent |           0.77 | [0.72, 1.00]

- One-sided CIs: upper bound fixed at [1.00].

coef_df <- ci_tbl %>% transmute(term = "Treatment vs Control", estimate, conf.low, conf.high)
p_main_effect <- ggplot(coef_df, aes(x = term, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.15, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  coord_flip() +
  labs(x = NULL, y = "Arm effect (β̂, 95% CI)") +
  theme_minimal()
save_pdf(p_main_effect, "main-effect", width_in = 5, height_in = 2.5)
p_main_effect

Figure: Adjusted arm effect (Treatment vs Control) with 95% CI from the primary ANCOVA. Horizontal coefficient plot; the dashed reference line marks no effect, and the interval does not cross it.

Standardized arm effect from ANCOVA
metric	value
Cohen's d (ANCOVA, residual SD)	0.984
Hedges' g (small-sample corrected)	0.980

Method: unstandardized arm coefficient divided by model residual SD (ANCOVA), with Hedges’ correction J. This standardizes the adjusted mean difference.
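
A minimal sketch of this standardization follows, assuming the common approximation J = 1 − 3 / (4·df − 1) with df taken as the model's residual degrees of freedom (177 for N = 180 and three coefficients). The report does not show its own code for this step, so the helper names hedges_j and standardized_effect are hypothetical.

```r
# Small-sample correction factor (common approximation; assumed here)
hedges_j <- function(df) 1 - 3 / (4 * df - 1)

# d = arm coefficient / residual SD of the ANCOVA; g = J * d
standardized_effect <- function(mod, term = "arm_coded") {
  d <- unname(coef(mod)[term]) / sigma(mod)
  c(d = d, g = d * hedges_j(mod$df.residual))
}

# Arithmetic check against the reported values: d = 0.984 with df = 177
g_check <- 0.984 * hedges_j(177)
print(round(g_check, 3))

# Usage on simulated data (illustration only)
set.seed(7)
toy <- data.frame(arm_coded = rep(0:1, each = 50),
                  pre_intent = sample(1:6, 100, replace = TRUE))
toy$post_intent <- toy$pre_intent + toy$arm_coded + rnorm(100)
es <- standardized_effect(lm(post_intent ~ arm_coded + pre_intent, data = toy))
print(round(es, 3))
```

With df = 177 the correction is under half a percent, which is why d and g differ only in the third decimal place.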

Residual Diagnostics

We summarize how far model predictions are from the observed post‑intent (after accounting for baseline). Smaller, centered residuals indicate the model is fitting sensibly.

Residual summary for primary ANCOVA
Metric	Value
Min	-3.09
1st Qu.	-0.09
Median	-0.05
Mean	0.00
3rd Qu.	-0.01
Max	4.96
SD	1.04
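
A table of this shape can be produced directly from a fitted model. The sketch below uses simulated data and a hypothetical helper, residual_summary; it is not the report's own code, which is not shown for this step.

```r
# Five-number summary plus mean and SD of the model residuals
residual_summary <- function(mod) {
  r <- residuals(mod)
  q <- quantile(r, c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
  data.frame(
    Metric = c("Min", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max", "SD"),
    Value = round(c(q[1], q[2], q[3], mean(r), q[4], q[5], sd(r)), 2)
  )
}

# Simulated data (illustration only)
set.seed(11)
toy <- data.frame(arm_coded = rep(0:1, each = 60),
                  pre_intent = sample(1:6, 120, replace = TRUE))
toy$post_intent <- toy$pre_intent + toy$arm_coded + rnorm(120)
mod <- lm(post_intent ~ arm_coded + pre_intent, data = toy)

res_tbl <- residual_summary(mod)
print(res_tbl)
```

For any least-squares fit with an intercept the residual mean is zero by construction, so the informative rows are the quantiles and the SD.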

Sensitivity: Rank-Inverse-Normal (RIN) transform

As a robustness check, we apply a RIN transform to both intention variables and re-estimate the model.

rin <- function(x) {
  # Rank-Inverse-Normal transform with Blom adjustment
  r <- rank(x, ties.method = "average", na.last = "keep")
  n <- sum(!is.na(x))
  qnorm((r - 3/8) / (n + 1/4))
}

rin_data <- analysis_confirm %>%
  mutate(pre_rin = rin(pre_intent), post_rin = rin(post_intent))

mod_rin <- lm(post_rin ~ arm_coded + pre_rin, data = rin_data)
co_rin <- lmtest::coeftest(mod_rin, vcov = sandwich::vcovHC(mod_rin, type = "HC3"))

arm_est_rin <- unname(coef(mod_rin)["arm_coded"])
arm_se_rin  <- sqrt(diag(sandwich::vcovHC(mod_rin, type = "HC3")))["arm_coded"]
df_rin <- mod_rin$df.residual
crit_rin <- qt(0.975, df_rin)
ci_low_rin <- arm_est_rin - crit_rin * arm_se_rin
ci_high_rin <- arm_est_rin + crit_rin * arm_se_rin

knitr::kable(tibble(
  term = "arm_coded",
  estimate = arm_est_rin,
  std.error = arm_se_rin,
  conf.low = ci_low_rin,
  conf.high = ci_high_rin
), tbl_fmt, digits = 3, caption = "RIN-transform sensitivity: arm effect (HC3) with 95% CI") %>% style_tbl()

RIN-transform sensitivity: arm effect (HC3) with 95% CI
term	estimate	std.error	conf.low	conf.high
arm_coded	0.473	0.073	0.328	0.618

cat(paste0("RIN sensitivity estimates the adjusted arm effect on the standardized z-scale (",
           sprintf("%.3f", arm_est_rin),
           "; 95% CI ", sprintf("%.3f", ci_low_rin), ", ", sprintf("%.3f", ci_high_rin),
           "), with direction and inference consistent with the primary ANCOVA."))

RIN sensitivity estimates the adjusted arm effect on the standardized z-scale (0.473; 95% CI 0.328, 0.618), with direction and inference consistent with the primary ANCOVA.

Note: RIN estimates are in standardized z-units (not 1–7 scale).

Sensitivity: Excluding two borderline low‑quality cases

We re-estimate the primary model after excluding the two borderline low‑quality cases. Results are materially unchanged: the arm coefficient is 1.053 (HC3 SE 0.159), compared with 1.031 in the full confirmatory set.

HC3 robust coefficients (sensitivity)
term	estimate	std.error	t.value	p.value
(Intercept)	-0.031	0.160	-0.193	0.847
arm_coded	1.053	0.159	6.641	<0.001
pre_intent	1.018	0.043	23.701	<0.001

Descriptive Outcomes

We summarize within-arm change. Control shows essentially no change (Δ ≈ 0.03), while Treatment increases on average (Δ ≈ 1.07).

Show code

# Per-arm descriptive means and within-arm changes with SDs
sumA <- analysis_confirm %>%
  mutate(delta = post_intent - pre_intent) %>%
  group_by(arm_fct) %>%
  summarise(
    n = n(),
    pre_mean = mean(pre_intent, na.rm = TRUE),
    pre_sd = sd(pre_intent, na.rm = TRUE),
    post_mean = mean(post_intent, na.rm = TRUE),
    post_sd = sd(post_intent, na.rm = TRUE),
    delta_mean = mean(delta, na.rm = TRUE),
    delta_sd = sd(delta, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  rowwise() %>%
  mutate(
    crit = qt(0.975, df = n - 1),
    pre_ci_low = pre_mean - crit * pre_sd / sqrt(n),
    pre_ci_high = pre_mean + crit * pre_sd / sqrt(n),
    post_ci_low = post_mean - crit * post_sd / sqrt(n),
    post_ci_high = post_mean + crit * post_sd / sqrt(n),
    delta_ci_low = delta_mean - crit * delta_sd / sqrt(n),
    delta_ci_high = delta_mean + crit * delta_sd / sqrt(n)
  ) %>%
  ungroup()

fmt_tbl <- sumA %>%
  transmute(
    Arm = as.character(arm_fct),
    `Pre mean (SD)` = sprintf("%.2f (%.2f)", pre_mean, pre_sd),
    `Post mean (SD)` = sprintf("%.2f (%.2f)", post_mean, post_sd),
    `Δ Post–Pre (SD)` = sprintf("%.2f (%.2f)", delta_mean, delta_sd)
  )

knitr::kable(fmt_tbl, tbl_fmt, caption = "Per-arm descriptive means and within-arm changes (SDs)") %>% style_tbl()

Per-arm descriptive means and within-arm changes (SDs)
Arm	Pre mean (SD)	Post mean (SD)	Δ Post–Pre (SD)
Control	3.69 (1.84)	3.72 (1.99)	0.03 (0.70)
Treatment	3.78 (1.91)	4.85 (2.33)	1.07 (1.30)

Responder Rates (Δ ≥ +1 point)

Show code

# Define responders within confirmatory analysis set -------------------------
resp_df <- analysis_confirm %>%
  mutate(delta = post_intent - pre_intent,
         responder = delta >= 1)

by_arm_resp <- resp_df %>% count(arm_fct, responder, name = "N") %>%
  group_by(arm_fct) %>% mutate(total = sum(N), pct = 100 * N / total) %>% ungroup()

# Simple proportions by arm ---------------------------------------------------
prop_by_arm <- resp_df %>% group_by(arm_fct) %>%
  summarise(Responders = sum(responder, na.rm = TRUE), N = n(), Percent = 100 * Responders / N, .groups = 'drop') %>%
  mutate(Arm = arm_fct) %>% select(Arm, Responders, N, Percent)

prop_tbl_fmt <- prop_by_arm %>%
  transmute(Arm, `Responders / N` = sprintf("%d / %d", Responders, N), `Percent` = sprintf("%.1f%%", Percent))

knitr::kable(prop_tbl_fmt, tbl_fmt, caption = "Responder rates (Δ ≥ +1) by arm") %>% style_tbl()

Responder rates (Δ ≥ +1) by arm
Arm	Responders / N	Percent
Control	11 / 89	12.4%
Treatment	58 / 91	63.7%
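
The responder counts above can also be compared formally. The sketch below is an illustrative add-on and not part of the preregistered analysis: it feeds the reported counts (58/91 and 11/89) into base R's `prop.test()` to obtain the risk difference and an approximate 95% CI.

```r
# Reported responder counts from the table above
responders <- c(Treatment = 58, Control = 11)
totals     <- c(Treatment = 91, Control = 89)

# Observed proportions and their difference (Treatment minus Control)
props <- responders / totals
risk_diff <- unname(props["Treatment"] - props["Control"])

# Two-sample test of equal proportions (continuity correction is the base R default)
pt <- prop.test(x = responders, n = totals)

cat(sprintf("Risk difference: %.1f percentage points\n", 100 * risk_diff))
cat(sprintf("Approx. 95%% CI: %.1f to %.1f points\n",
            100 * pt$conf.int[1], 100 * pt$conf.int[2]))
```

The responder threshold (Δ ≥ +1) dichotomizes a 1–7 outcome, so this comparison is best read as a descriptive complement to the ANCOVA rather than as a separate confirmatory test.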

Show code

# Compact bar (no CIs) -------------------------------------------------------
plot_prop <- prop_by_arm %>% mutate(Arm = factor(Arm, levels = c("Control","Treatment")))
p_responders <- ggplot(plot_prop, aes(x = Arm, y = Percent, fill = Arm)) +
  geom_col(width = 0.6) +
  scale_fill_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  labs(x = NULL, y = "Responders (Δ ≥ +1) %") +
  theme_minimal() +
  theme(legend.position = "none")
save_pdf(p_responders, "responder-rates", width_in = 6, height_in = 3)
p_responders

Show code

plot_df <- sumA %>%
  transmute(arm_fct, Pre = pre_mean, Post = post_mean) %>%
  tidyr::pivot_longer(cols = c(Pre, Post), names_to = "time", values_to = "mean") %>%
  mutate(time = factor(time, levels = c("Pre","Post")))

p_prepost_means <- ggplot(plot_df, aes(x = time, y = mean, group = arm_fct, color = arm_fct)) +
  geom_line(linewidth = 0.7) +
  geom_point(size = 2) +
  facet_wrap(~ arm_fct, nrow = 1) +
  scale_color_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  scale_y_continuous(limits = c(1, 7), breaks = 1:7) +
  labs(x = NULL, y = "Mean intent (1–7)") +
  theme_minimal() +
  theme(legend.position = "none")
save_pdf(p_prepost_means, "prepost-means", width_in = 6, height_in = 3)
p_prepost_means

Two side-by-side line plots showing mean pre and post intention for Control and Treatment arms on the 1–7 scale; Treatment increases more than Control. — Pre vs Post means by arm (full 1–7 scale)

Show code

delta_df <- analysis_confirm %>%
  transmute(arm_fct, delta = post_intent - pre_intent)

p_delta_violin <- ggplot(delta_df, aes(x = arm_fct, y = delta, fill = arm_fct)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_violin(trim = TRUE, alpha = 0.45, color = NA) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.9, color = "#444444") +
  scale_fill_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  labs(x = NULL, y = "Change in intent (Post - Pre)") +
  theme_minimal() +
  theme(legend.position = "none")
save_pdf(p_delta_violin, "delta-violin", width_in = 6, height_in = 3)
p_delta_violin

Violin and box plots of change in intention (Post - Pre) by arm; dashed line at zero. — Distribution of within-participant changes by arm (violin with boxplot)

Durability Analysis

We encountered an outcome‑page implementation failure in the first run that prevented collection of immediate post‑intent. To recover the primary outcome, we contacted those participants later via Prolific for a short follow‑up survey that included the same 1–7 intention item. We linked respondents deterministically by hashed Prolific IDs and computed the delay between their original intervention completion and the follow‑up response. This section estimates the arm effect using the delayed post and examines whether the effect appears durable over the observed follow‑up window. This analysis is exploratory only and the participants in this sample are excluded from the confirmatory analysis.
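
A minimal sketch of the linkage and delay computation described above, using base R only. The IDs, timestamps, and intention values are made up for illustration, and the report's own pipeline may differ in detail (for example, in how timestamps are parsed).

```r
# Toy stand-ins for the affected-run records and the later follow-up export
# (IDs, timestamps, and values are made up for illustration)
baseline <- data.frame(
  pid_hash = c("a1", "b2", "c3"),
  intent_pre = c(3, 5, 2),
  intervention_completed_at = c("2025-08-01 12:00:00",
                                "2025-08-01 18:00:00",
                                "2025-08-02 09:00:00")
)
followup <- data.frame(
  pid_hash = c("a1", "c3"),
  intent_post_rescue = c(4, 2),
  prolific_completed_at = c("2025-08-03 00:00:00", "2025-08-04 09:00:00")
)

# Deterministic link on the hashed ID; non-responders keep NA for the delayed outcome
linked <- merge(baseline, followup, by = "pid_hash", all.x = TRUE)

# Parse timestamps in a fixed time zone and express the delay in days
t0 <- as.POSIXct(linked$intervention_completed_at, tz = "UTC")
t1 <- as.POSIXct(linked$prolific_completed_at, tz = "UTC")
linked$days_delay <- as.numeric(difftime(t1, t0, units = "days"))

print(linked[, c("pid_hash", "intent_post_rescue", "days_delay")])
```

Keeping non-responders in the linked table (with missing delayed outcomes) makes follow-up attrition visible by arm before any model is fit.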

We estimate the adjusted arm effect using the delayed post‑intent (N = 66). The estimate is 1.087 (95% CI 0.517, 1.658; p < 0.001).

Show code

ci_D_tbl <- tibble(
  term = "arm_coded",
  estimate = arm_est_D,
  std.error = arm_se_D,
  conf.low = ci_low_D,
  conf.high = ci_high_D,
  p.value = p_D
)
knitr::kable(ci_D_tbl, tbl_fmt, digits = 3, caption = "Rescue arm effect (HC3) with 95% CI") %>% style_tbl()

Rescue arm effect (HC3) with 95% CI
term	estimate	std.error	conf.low	conf.high	p.value
arm_coded	1.087	0.286	0.517	1.658	<0.001
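
As a rough sketch of what an HC3-robust ANCOVA estimate of this kind involves, the following base-R example fits `post ~ arm + pre` on simulated data and computes the HC3 covariance by hand. All values are synthetic and the variable names are placeholders; in the report itself this step is handled by `sandwich::vcovHC(type = "HC3")`.

```r
set.seed(1)

# Synthetic data only: 66 rows, binary arm, 1-6 baseline, noisy outcome
n    <- 66
arm  <- rep(c(0, 1), length.out = n)
pre  <- sample(1:6, n, replace = TRUE)
post <- 0.2 + 1.0 * arm + 0.9 * pre + rnorm(n, sd = 1.2)

fit <- lm(post ~ arm + pre)

# HC3 sandwich: (X'X)^-1 X' diag(e_i^2 / (1 - h_i)^2) X (X'X)^-1
X     <- model.matrix(fit)
e     <- residuals(fit)
h     <- hatvalues(fit)
bread <- solve(crossprod(X))
meat  <- crossprod(X * (e / (1 - h)))  # row i of X scaled by e_i / (1 - h_i)
V_hc3 <- bread %*% meat %*% bread

# t-based 95% CI for the arm coefficient using the robust SE
arm_idx <- which(colnames(X) == "arm")
est  <- unname(coef(fit)[arm_idx])
se   <- sqrt(diag(V_hc3))[arm_idx]
crit <- qt(0.975, df.residual(fit))
print(c(estimate = est, conf.low = est - crit * se, conf.high = est + crit * se))
```

HC3 inflates each squared residual by its leverage, which makes it more conservative than the unadjusted sandwich estimator in small samples such as the rescue set.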

Follow‑up Timing

Show code

# Delay distribution summary and plot -------------------------------------
delay_summary <- dur %>%
  summarise(
    N = n(),
    mean_days = mean(days_delay, na.rm = TRUE),
    median_days = median(days_delay, na.rm = TRUE),
    min_days = min(days_delay, na.rm = TRUE),
    max_days = max(days_delay, na.rm = TRUE)
  )
knitr::kable(delay_summary, tbl_fmt, digits = 1, caption = "Follow‑up delay (days): N, mean, median, min, max") %>% style_tbl()

Follow‑up delay (days): N, mean, median, min, max
N	mean_days	median_days	min_days	max_days
66	2.3	1.9	0.3	7.1

Histogram of days between intervention and rescue follow‑up

Show code

# Narrower bins without annotations for clarity ---------------------------
binw <- max(0.25, diff(range(dur$days_delay, na.rm = TRUE)) / 20)

ggplot(dur, aes(x = days_delay)) +
  geom_histogram(binwidth = binw, fill = "#7DA0B1", color = "white", boundary = 0) +
  labs(x = "Days between intervention completion and follow-up", y = "N") +
  theme_minimal()

Histogram of follow-up delays in days with most responses clustered at shorter delays. — Histogram of days between intervention and rescue follow‑up

Effect by Follow‑up Window

Show code

# Data-driven binning by quantiles with minimum size ----------------------
q <- quantile(dur$days_delay, probs = c(0, 1/3, 2/3, 1), na.rm = TRUE)
q <- unique(as.numeric(q))
if (length(q) < 4) {
  q <- unique(as.numeric(quantile(dur$days_delay, probs = c(0, 0.5, 1), na.rm = TRUE)))
}
if (length(q) <= 2) {
  q <- c(min(dur$days_delay, na.rm = TRUE), max(dur$days_delay, na.rm = TRUE))
}

# Ensure strictly increasing breaks
eps <- 1e-6
for (i in 2:length(q)) if (q[i] <= q[i-1]) q[i] <- q[i-1] + eps

# Construct labels
fmtd <- function(x) sprintf("%.1f", x)
labels <- NULL
if (length(q) >= 3) {
  labels <- c(
    paste0("≤", fmtd(q[2]), "d"),
    if (length(q) == 4) paste0(fmtd(q[2]), "–", fmtd(q[3]), "d") else NULL,
    paste0(">", fmtd(q[length(q)-1]), "d")
  )
} else {
  labels <- c(paste0("≤", fmtd(q[2]), "d"))
}

# Cut into bins
brks <- q
if (length(q) == 3) {
  dur$delay_bin2 <- cut(dur$days_delay, breaks = brks, include.lowest = TRUE, right = TRUE, labels = labels)
} else if (length(q) >= 4) {
  dur$delay_bin2 <- cut(dur$days_delay, breaks = brks, include.lowest = TRUE, right = TRUE,
                        labels = c(labels[1], labels[2], labels[3]))
} else {
  dur$delay_bin2 <- factor(rep(labels[1], nrow(dur)), levels = labels)
}

# Bin-wise ANCOVA estimates
bin_levels <- levels(dur$delay_bin2)
bin_tbl <- purrr::map_dfr(bin_levels, function(lb) {
  dfb <- dur %>% filter(delay_bin2 == lb)
  if (nrow(dfb) < 10 || length(unique(dfb$arm_coded)) < 2) {
    return(tibble(bin = lb, estimate = NA_real_, se = NA_real_, conf.low = NA_real_, conf.high = NA_real_, p.value = NA_real_, n = nrow(dfb)))
  }
  m <- lm(intent_post_rescue ~ arm_coded + intent_pre, data = dfb)
  V <- sandwich::vcovHC(m, type = "HC3")
  est <- coef(m)["arm_coded"]
  se  <- sqrt(diag(V))["arm_coded"]
  df  <- m$df.residual
  crit <- qt(0.975, df)
  tibble(
    bin = lb,
    estimate = unname(est),
    se = unname(se),
    conf.low = est - crit * se,
    conf.high = est + crit * se,
    p.value = lmtest::coeftest(m, vcov = V)["arm_coded", 4],
    n = nrow(dfb)
  )
})

knitr::kable(bin_tbl, tbl_fmt, digits = 3, caption = "Arm effect by follow-up window (ANCOVA with HC3)") %>% style_tbl()

Arm effect by follow-up window (ANCOVA with HC3)
bin	estimate	se	conf.low	conf.high	p.value	n
≤1.6d	1.006	0.692	-0.443	2.455	0.163	22
1.6–2.1d	0.944	0.425	0.055	1.832	0.039	22
>2.1d	1.588	0.591	0.352	2.824	0.015	22

Adjusted arm effect by follow-up window using quantile-based bins; points show estimates and bars show 95% CIs; dashed line at zero.

Show code

ggplot(bin_tbl, aes(x = bin, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  labs(x = "Follow-up window", y = "Adjusted arm effect on post-intent") +
  theme_minimal()

Three-bin plot of adjusted arm effects by follow-up delay window with 95% confidence intervals; all point estimates are positive, and the interval for the shortest window is the widest and crosses zero. — Adjusted arm effect by follow-up window using quantile-based bins; points show estimates and bars show 95% CIs; dashed line at zero.

Taken together, the point estimates are positive in all three tertile windows (n = 22 each): 1.01 at ≤1.6 days, 0.94 at 1.6–2.1 days, and 1.59 at >2.1 days. The 95% CI for the shortest window includes zero (−0.44, 2.46), while the two later windows exclude it. With only 22 participants per window, every interval is wide and the intervals overlap substantially, so this pattern is consistent with persistence over the observed follow-up period (up to 7.1 days) but cannot rule out modest decay.

Show code

# Comparison figure: immediate vs rescue overall and windows --------------
imm_vc <- sandwich::vcovHC(mod_h1, type = "HC3")
imm_est <- unname(coef(mod_h1)["arm_coded"])
imm_se  <- sqrt(diag(imm_vc))["arm_coded"]
imm_df  <- mod_h1$df.residual
imm_crit <- qt(0.975, imm_df)

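comp_tbl <- tibble(
  group = c("Immediate (primary)", "Rescue overall", paste0("Rescue ", bin_tbl$bin)),
  estimate = c(imm_est, arm_est_D, bin_tbl$estimate),
  se = c(imm_se, arm_se_D, bin_tbl$se),
  # Use the t-based 95% limits reported in the tables rather than a normal (1.96) approximation
  conf.low = c(imm_est - imm_crit * imm_se, ci_low_D, bin_tbl$conf.low),
  conf.high = c(imm_est + imm_crit * imm_se, ci_high_D, bin_tbl$conf.high)
) %>% mutate(
  group = factor(group, levels = rev(group))
)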

p_durability_compare <- ggplot(comp_tbl, aes(x = group, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  coord_flip() +
  labs(x = NULL, y = "Adjusted arm effect (95% CI)") +
  theme_minimal()
save_pdf(p_durability_compare, "durability-compare", width_in = 6, height_in = 3)
p_durability_compare

Horizontal dot-and-errorbar chart comparing the immediate primary effect, the overall rescue effect, and the rescue effects within delay windows; all estimates are positive. — Comparison of adjusted arm effects: immediate (primary) and rescue overall, alongside rescue window-specific estimates with 95% CIs.

Engagement in the Experimental Arm

We explore whether greater chat engagement within the Experimental arm is associated with higher post-intent or change in intent (post minus pre), adjusting for baseline intention. These are descriptive associations and do not imply causality.

In the Experimental arm, chat engagement shows a near-zero association with post-intent that is not statistically significant: β = -0.0007 per chat point (95% CI -0.0026, 0.0012; p = 0.48).

Experimental arm: ANCOVA with chat_points (HC3)
term	estimate	std.error	p.value
(Intercept)	1.2427	0.5314	NA
pre_intent	0.9944	0.0868	0.0000
chat_points	-0.0007	0.0010	0.4845

Scatter of chat engagement (capped) versus change in intention within the Treatment arm with a fitted regression line. — Experimental arm: chat points vs change in intent (with linear fit)

Individual Trajectories

Show code

traj_points <- analysis_confirm %>%
  transmute(arm_fct,
            pre = pre_intent,
            post = post_intent,
            pid = pid_hash) %>%
  tidyr::pivot_longer(c(pre, post), names_to = "time", values_to = "intent") %>%
  mutate(time = factor(time, levels = c("pre","post"), labels = c("Pre","Post")))

# To avoid overplotting on integer scale 
set.seed(123)
traj_points$intent_j <- traj_points$intent + runif(nrow(traj_points), -0.03, 0.03)

ggplot(traj_points, aes(x = time, y = intent_j, group = pid, color = arm_fct)) +
  geom_line(alpha = 0.2) +
  scale_color_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  scale_y_continuous(breaks = 1:7, limits = c(1,7)) +
  facet_wrap(~ arm_fct, nrow = 1) +
  labs(x = NULL, y = "Intention (1–7)") +
  theme_minimal() +
  theme(legend.position = "none")

Slopegraph showing each participant's pre and post intention connected by a line, faceted by Control and Treatment; most Treatment lines slope upward. — Individual participant trajectories from Pre to Post by arm

Discussion

A combination of persuasive informational content and a focused motivational-interviewing-style engagement with an LLM produced a clear and significant increase in vaccination intention among MMR-hesitant U.S. parents relative to a structure-matched active non-vaccine-related child safety information control. The increase in intent appears to persist over several days.

Intent Increase

Among participants with baseline head‑room (pre ≤ 6), mean intention in the Treatment arm rose by Δ ≈ 1.07 points, slightly more than a full point on the seven‑point scale, while Control was essentially unchanged (Δ ≈ 0.03). The ANCOVA arm effect (Treatment vs Control) is β̂ ≈ 1.03 with a 95% CI of 0.72–1.34, which excludes 0. In the Treatment arm, 63.7% of participants increased their vaccination intention by at least one point, versus 12.4% in Control.
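
For reference, the ANCOVA behind β̂ can be written out explicitly. The symbols below are introduced here for illustration (the report specifies the model only as `post_intent ~ arm_coded + pre_intent`), with $\text{arm}_i = 1$ for Treatment and $0$ for Control:

$$
\text{post}_i = \beta_0 + \beta_{\text{arm}}\,\text{arm}_i + \beta_{\text{pre}}\,\text{pre}_i + \varepsilon_i
$$

Under this coding, $\hat\beta_{\text{arm}}$ is the estimated difference in post‑intervention intention between arms for two parents with the same baseline intention, which is why it can differ slightly from the raw difference in within-arm changes (1.07 − 0.03 = 1.04).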

Durability

Using delayed post‑intent collected on Prolific (N = 66; median delay 1.9 days, range 0.3–7.1), the adjusted arm effect remains positive (β̂ ≈ 1.09; 95% CI 0.52, 1.66). While exploratory and subject to caveats, these results are suggestive of effect persistence.

Prior Context

In a prior RCT (analysis write‑up, preregistration), an LLM conversation about MMR did not significantly outperform static CDC‑style materials (β̂ ≈ 0.14; 95% CI −0.11, 0.40), though both arms improved pre→post. Here, the control group was not exposed to vaccine‑related content, so a larger between‑arm contrast was both expected and observed. The absolute effect is approximately double the effect size observed in either arm in RCT‑1. Possible explanations include:

an additive effect from combining static content with LLM dialogue
more persuasive static content, including social norm statements and anticipated regret cues
improved prompt engineering, including motivational interviewing style
framing the intervention around an imagined appointment

Disentangling these mechanisms requires further research, but overall the trials suggest that both static content and LLM conversation can raise intention, and that combining them against a non‑MMR control yields a substantial effect.

Limitations

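Outcomes are self‑reported intentions rather than vaccination behavior; the evidence for durability covers a short window (median 1.9 days, maximum 7.1) in a separate, smaller sample (N = 66) and is subject to follow‑up attrition; and the experiment was conducted online in a U.S.-only sample.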

Implications

A brief, appointment‑framed MMR content review and conversation can shift intention meaningfully relative to a non‑vaccine control, with encouraging signs of durability over several days.

Further research

Additional pre-clinical work could explore and disentangle the mechanisms of action, and a clinical trial to assess whether an intervention of this kind can impact real-world vaccination rates seems clearly warranted.

Data & Code Availability

The Quarto source for this report and a self-contained HTML render will be published on the investigator’s website and linked to from the OSF project containing the preregistration. Participant‑level data include potentially identifying Prolific IDs and cannot be shared publicly; they will be provided upon reasonable request under a data‑use agreement.

Provenance

This document was rendered with R 4.5 and renv‑pinned packages. Batch analysis files resolved at render time:

Main RCT‑2 analysis files resolved at render time
batch	file
2B	/Users/scott/Projects/verum-analysis/experiments/45ff14/data_freezes/experiment_analysis_view_exp-45ff14_20250910-153854.csv
2B2	/Users/scott/Projects/verum-analysis/experiments/5bb2fd/data_freezes/experiment_analysis_view_exp-5bb2fd_20250910-153904.csv
2B3	/Users/scott/Projects/verum-analysis/experiments/8bb589/data_freezes/experiment_analysis_view_exp-8bb589_20250910-095801.csv

Build: 2025-09-19 14:53 PDT; Git: b3c8cbd

--- title: "Discussing MMR vaccines with an LLM after a brief content review increases vaccination intent among hesitant parents" subtitle: "Analysis of a Preregistered Two‑Arm Active Control Online RCT, with a Durability Check" author: "Scott J. Forman" date: "`r format(Sys.Date(), '%B %d, %Y')`" prefer-html: true format: html: theme: cosmo toc: true toc-depth: 3 toc-location: left code-fold: true code-summary: "Show code" code-tools: true self-contained: false format-links: false link-external-newwindow: true mermaid: theme: default gfm: standalone: true # Produce both HTML and GFM by default execute: freeze: auto message: false warning: false eval: true params: main_freeze_dirs: - "experiments/45ff14/data_freezes" # 2B (first pass) - "experiments/5bb2fd/data_freezes" # 2B2 (increased comp to $3.50) - "experiments/8bb589/data_freezes" # 2B3 (increased comp to $4.50) durability_freeze_dir: "experiments/ab8086/data_freezes" durability_freeze_csv: "experiment_analysis_view_exp-ab8086_nopost.csv" durability_prolific_csv: "prolific_export_68a63717859761977be4e572.csv" --- ```{r abstract-values, include=FALSE} # Single source of truth for Abstract numbers --------------------------------- abs_n_main <- 180 abs_est_main <- 1.03 abs_ci_main_low <- 0.72 abs_ci_main_high <- 1.34 abs_n_rescue <- 66 abs_est_rescue <- 1.09 abs_ci_rescue_low <- 0.52 abs_ci_rescue_high <- 1.66 ``` ## Abstract We conducted a preregistered two‑arm online RCT (N = `r abs_n_main` MMR hesitant U.S. parents of young children) comparing a timer‑gated four‑panel MMR content carousel followed by an LLM‑guided conversation against a structure-matched, non‑vaccine-related active control (car seat safety). An ANCOVA on post‑intent controlling for baseline intention estimates an adjusted arm effect of **β̂ ≈ `r sprintf('%.2f', abs_est_main)`** points (95% CI **`r sprintf('%.2f', abs_ci_main_low)`–`r sprintf('%.2f', abs_ci_main_high)`**) on a 1–7 vaccination intention scale. 
A post‑intervention delayed follow‑up on a separate sample (N = `r abs_n_rescue`) shows a consistent effect size of **β̂ ≈ `r sprintf('%.2f', abs_est_rescue)`** (95% CI **`r sprintf('%.2f', abs_ci_rescue_low)`, `r sprintf('%.2f', abs_ci_rescue_high)`**). Conclusion: a brief intervention combining persuasive content with an LLM conversation significantly increases MMR intention relative to control, with signs of durability over a period of several days. ## Preregistration - OSF preregistration: [Using Conversational AI to Support Parental MMR Decision‑Making: An Active‑Control Randomized Trial](https://osf.io/qx46h) - Deviations: (i) An initial outcome‑page failure prevented immediate post‑intent collection for a subset; those participants were later contacted in a rescue follow‑up. Confirmatory inference uses only the clean re‑run per preregistration; the rescue durability analysis is exploratory. (ii) Compensation increased across batches to maintain enrollment ($2.50 → $3.50 → $4.50) without changing the protocol or analysis plan. - Confirmatory set and exclusions: in preregistration, hard exclusions for obvious bots or exposure failures were planned. In practice no participant could be unambiguously classified as such, so the confirmatory set includes all randomized participants who completed the study. We report a sensitivity excluding two borderline cases; results are unchanged. 
```{r setup, include=FALSE} # Activate the local mmr-2 renv explicitly (isolated env for this analysis) if (requireNamespace("renv", quietly = TRUE)) { renv::activate(project = here::here("analysis/mmr-2")) } # Global chunk options ------------------------------------------------- knitr::opts_chunk$set(echo = knitr::is_html_output(), comment = NA, fig.align = "center", code_folding = TRUE) # Libraries ------------------------------------------------------------ library(tidyverse) library(here) library(fs) library(digest) # for hashing Prolific IDs library(lmtest) # for HC3 robust SEs library(sandwich) # bread for HC3 library(jsonlite) library(kableExtra) library(lubridate) library(broom) # Table formatting helpers -------------------------------------------- tbl_fmt <- if (knitr::is_html_output()) "html" else "pipe" style_tbl <- function(kbl_obj) { if (knitr::is_html_output()) { kableExtra::kable_styling( kbl_obj, bootstrap_options = c("striped", "condensed"), full_width = FALSE ) } else { kbl_obj # return as-is for GFM/pipe tables } } # Fixed color palette -------------------------------------------------- control_color <- "#1F78B4" # blue treatment_color <- "#E69F00" # orange # Helper: hash_pid ----------------------------------------------------- hash_pid <- function(pid, salt = Sys.getenv("PID_SALT")) { if (salt == "xFV2Mm") salt <- "static-study-salt" vapply(pid, function(x) digest::digest(paste0(salt, x), algo = "sha256"), FUN.VALUE = character(1)) } # Helper: robust SE wrapper -------------------------------------------- hc3 <- function(m) lmtest::coeftest(m, vcov = sandwich::vcovHC(m, type = "HC3")) # JSON helpers --------------------------------------------------------- safe_parse <- function(x) { tryCatch(jsonlite::fromJSON(x), error = function(e) NULL) } pluck_chr <- function(obj, name) { val <- tryCatch(obj[[name]], error = function(e) NULL); if (is.null(val)) NA_character_ else as.character(val) } pluck_num <- function(obj, name) { val <- 
tryCatch(obj[[name]], error = function(e) NULL); if (is.null(val)) NA_real_ else suppressWarnings(as.numeric(val)) } # Resolve latest analysis-view CSV, avoid pulling Prolific exports ----- resolve_latest_csv <- function(dir_path) { full_dir <- here::here(dir_path) stopifnot(fs::dir_exists(full_dir)) files_pref <- fs::dir_ls(full_dir, regexp = "experiment_analysis_view_exp-.*\\.csv$", type = "file") files_all <- fs::dir_ls(full_dir, regexp = "\\.csv$", type = "file") files <- if (length(files_pref) > 0) files_pref else files_all files <- files[!grepl("prolific_export_", basename(files))] if (length(files) == 0) stop(paste("No suitable analysis-view CSV files in", full_dir)) files[order(fs::file_info(files)$modification_time, decreasing = TRUE)][1] } resolve_latest_prolific <- function(dir_path) { full_dir <- here::here(dir_path) stopifnot(fs::dir_exists(full_dir)) files <- fs::dir_ls(full_dir, regexp = "prolific_export_.*\\.csv$", type = "file") if (length(files) == 0) NA_character_ else files[order(fs::file_info(files)$modification_time, decreasing = TRUE)][1] } infer_batch_label <- function(path_str) { if (grepl("/45ff14/", path_str)) return("2B") if (grepl("/5bb2fd/", path_str)) return("2B2") if (grepl("/8bb589/", path_str)) return("2B3") if (grepl("/ab8086/", path_str)) return("Rescue") basename(dirname(dirname(path_str))) } ``` ```{r fig-export-helpers, include=FALSE} # Figure export helpers -------------------------------------------------- ensure_dir <- function(path) { if (!fs::dir_exists(path)) fs::dir_create(path) } save_pdf <- function(plot_obj, filename_base, width_in, height_in, subdir = here::here("analysis/mmr-2/manuscript/figs")) { ensure_dir(subdir) device_fun <- if (capabilities("cairo")) grDevices::cairo_pdf else grDevices::pdf fpath <- file.path(subdir, paste0(filename_base, ".pdf")) ggplot2::ggsave(filename = fpath, plot = plot_obj, device = device_fun, width = width_in, height = height_in, units = "in") } ``` ## Methods - Recruitment: 
U.S.-resident parents of at least one child born in 2019 or later, who indicated less than complete confidence in vaccine safety, and who had not participated in one of our previous studies, were recruited via [Prolific](https://www.prolific.com/). - Flow: After consenting, participants were shown a mock‑appointment page with two buttons: "I have questions or concerns about MMR" and "No concerns about the MMR vaccine." Participants who clicked "No concerns about the MMR vaccine" were screened out and awarded a small payment. Participants who clicked "I have questions or concerns about MMR" were asked to imagine an upcoming appointment with a pediatric medical provider, and indicate on a scale of 1-7 the likelihood that they would have their child receive a dose of the MMR vaccine at that visit. Those at the ceiling (7) exited and were granted a small payment. Those with baseline ≤ 6 were randomized 1:1 to Experimental or Control, and completed a short pre-intervention demographic survey. All randomized participants saw the matched carousel and interactive chat segment, then answered the vaccination intent question a second time. - Structure: Both study arms used the same interface and engagement rules. Participants first saw a brief, scrollable information carousel, followed by an interactive conversation segment. The only difference between arms was the content topic: MMR vaccine (Experimental) versus car‑seat safety (Active Control). - Outcome: Post‑intervention MMR intention (1–7). The primary analysis adjusted for baseline intention. - Primary model: ANCOVA `post_intent ~ arm_coded + pre_intent` with HC3 robust SEs on participants with baseline head‑room (≤ 6) who met preregistered engagement rules. We retained all randomized participants; two cases appeared borderline low‑quality and were excluded in a sensitivity check, which did not materially change results. - A separate exploratory "durability" analysis used delayed post‑intent from the "rescue" dataset. 
- Batches: Recruitment proceeded in three batches with rising compensation to maintain enrollment when it slowed: Batch 2B at $2.50, Batch 2B2 at $3.50, and Batch 2B3 at $4.50. Each relaunch followed the same protocol and analysis plan. ### LLM Settings All LLM conversations were powered by Claude 4.0 Sonnet via the Anthropic API (model: `claude-sonnet-4-20250514`). Generation settings: `temperature = 1`, `max_tokens = 4096`, `thinking_enabled = TRUE`, `thinking_budget = 1024`. Prompts followed an identical motivational‑interviewing style; only the topic‑specific elements of the prompts differed by arm. ## Data Import Here we load all data sources used in the analysis. ### Main RCT‑2 Analysis Views ```{r import-main} freeze_files <- vapply(params$main_freeze_dirs, resolve_latest_csv, FUN.VALUE = character(1)) load_tbl_main <- tibble(batch = vapply(freeze_files, infer_batch_label, FUN.VALUE = character(1)), file = freeze_files) knitr::kable(load_tbl_main, tbl_fmt, caption = "Selected analysis‑view files (Main RCT‑2)") %>% style_tbl() read_main <- function(fpath) { df <- readr::read_csv(fpath, show_col_types = FALSE) df$batch <- infer_batch_label(fpath) if ("prolific_pid" %in% names(df)) { df <- df %>% mutate(pid_hash = hash_pid(prolific_pid)) %>% select(-prolific_pid) } else if (!"pid_hash" %in% names(df)) { df$pid_hash <- NA_character_ } df %>% mutate( pre_intent = suppressWarnings(as.numeric(pre_intent)), post_intent = suppressWarnings(as.numeric(post_intent)), arm_fct = factor(assigned_condition, levels = c("control", "mmr"), labels = c("Control", "Treatment")), arm_coded = dplyr::if_else(assigned_condition == "mmr", 1, 0, missing = NA_real_), time_seconds = suppressWarnings(as.numeric(time_seconds)), chat_turns = suppressWarnings(as.numeric(chat_turns)), chat_user_chars = suppressWarnings(as.numeric(chat_user_chars)) ) } trial_main_raw <- purrr::map_dfr(freeze_files, read_main) trial_main <- trial_main_raw %>% select(pid_hash, user_id, batch, arm_fct, arm_coded, 
pre_intent, post_intent, time_seconds, chat_turns, chat_user_chars) coverage_main <- tibble( N_rows = nrow(trial_main), N_with_pre = sum(!is.na(trial_main$pre_intent)), N_with_post = sum(!is.na(trial_main$post_intent)) ) knitr::kable(coverage_main, tbl_fmt, caption = "Main freezes: coverage of pre/post intention") %>% style_tbl() ``` These denormalized analysis views are the batch‑level source used for the confirmatory models (one row per randomized participant, with assigned condition, pre/post intention, engagement metrics, and batch labels). ```{r} ``` ### Durability (Rescue) Files ```{r import-durability} freeze_file_rescue <- here::here(params$durability_freeze_dir, params$durability_freeze_csv) prolific_file_rescue <- here::here(params$durability_freeze_dir, params$durability_prolific_csv) knitr::kable(tibble(file_role=c("rescue_analysis_view","rescue_prolific_csv"), file=c(freeze_file_rescue, prolific_file_rescue)), tbl_fmt, caption = "Durability input files") %>% style_tbl() stopifnot(fs::file_exists(freeze_file_rescue)) stopifnot(fs::file_exists(prolific_file_rescue)) freeze_df <- readr::read_csv(freeze_file_rescue, show_col_types = FALSE) if ("prolific_pid" %in% names(freeze_df)) { freeze_df <- freeze_df %>% mutate(pid_hash = hash_pid(prolific_pid)) } else { freeze_df$pid_hash <- NA_character_ } freeze_df <- freeze_df %>% mutate( pre_intent = suppressWarnings(as.numeric(pre_intent)), arm_fct = factor(assigned_condition, levels = c("control", "mmr"), labels = c("Control", "Treatment")), arm_coded = dplyr::if_else(assigned_condition == "mmr", 1, 0, missing = NA_real_) ) prol_raw_rescue <- readr::read_csv(prolific_file_rescue, show_col_types = FALSE) intent_col <- names(prol_raw_rescue)[length(names(prol_raw_rescue))] prol_rescue <- prol_raw_rescue %>% rename( prolific_pid = `Participant id`, prolific_completed_at = `Completed at` ) %>% mutate( rescue_intent_raw = .data[[intent_col]], prolific_completed_at = lubridate::ymd_hms(prolific_completed_at, quiet = 
TRUE), pid_hash = hash_pid(prolific_pid), intent_post_rescue = suppressWarnings(as.numeric(stringr::str_extract(as.character(rescue_intent_raw), "[1-7]"))) ) %>% select(pid_hash, intent_post_rescue, prolific_completed_at) trial_rescue <- freeze_df %>% left_join(prol_rescue, by = "pid_hash") %>% mutate( intervention_completed_at_parsed = lubridate::ymd_hms(intervention_completed_at, quiet = TRUE), days_delay = as.numeric(difftime(prolific_completed_at, intervention_completed_at_parsed, units = "days")) ) %>% transmute( pid_hash, user_id, arm_fct, arm_coded, intent_pre = pre_intent, intent_post_rescue, time_seconds, chat_turns, chat_user_chars, prolific_completed_at, intervention_completed_at = intervention_completed_at_parsed, days_delay ) coverage_rescue <- tibble( N_rows = nrow(trial_rescue), N_with_pre = sum(!is.na(trial_rescue$intent_pre)), N_with_post_rescue = sum(!is.na(trial_rescue$intent_post_rescue)) ) knitr::kable(coverage_rescue, tbl_fmt, caption = "Rescue set: coverage of pre and delayed post‑intent") %>% style_tbl() ``` ```{r rescue-response-rate, echo=FALSE} resp_tbl <- trial_rescue %>% mutate(responded = !is.na(intent_post_rescue)) %>% group_by(arm_fct) %>% summarise(N = n(), Responded = sum(responded), Rate = Responded / N, .groups = "drop") %>% mutate(Rate = scales::percent(Rate, accuracy = 0.1)) knitr::kable(resp_tbl, tbl_fmt, caption = "Rescue follow-up response by assigned arm") %>% style_tbl() ``` The rescue analysis_view contains baseline and process fields from the affected run; the Prolific export holds delayed post‑intent collected later. We link them by hashed Prolific IDs and compute the delay in days between intervention completion and the post-intervention intent data collection. 
```{r} ``` ### Prolific Exports ```{r prolific-import} prolific_files <- vapply(params$main_freeze_dirs, resolve_latest_prolific, FUN.VALUE = character(1)) prolific_tbl <- tibble(batch = vapply(prolific_files, infer_batch_label, FUN.VALUE = character(1)), file = prolific_files) knitr::kable(prolific_tbl, tbl_fmt, caption = "Resolved Prolific exports (latest by dir)") %>% style_tbl() read_prolific <- function(fpath) { if (is.na(fpath) || !fs::file_exists(fpath)) return(NULL) df <- readr::read_csv(fpath, show_col_types = FALSE) df$batch <- infer_batch_label(fpath) df %>% transmute( batch, submission_id = `Submission id`, prolific_pid = `Participant id`, status = Status, started_at = `Started at`, completed_at = `Completed at`, time_taken_s = suppressWarnings(as.numeric(`Time taken`)), completion_code = `Completion code` ) } prolific_raw <- purrr::map(prolific_files, read_prolific) %>% purrr::compact() %>% bind_rows() status_levels <- c("RETURNED","SCREENED OUT","TIMED OUT","REJECTED","APPROVED","MISSING") prolific_clean <- prolific_raw %>% mutate(status = ifelse(is.na(status) | status == "", "MISSING", status), status = factor(status, levels = status_levels), batch = factor(batch, levels = c("2B","2B2","2B3"))) counts_long <- prolific_clean %>% count(status, batch, name = "N") batch_totals <- counts_long %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop") # Compute row-wise percentages in long form, then pivot to labels counts_long_labeled <- counts_long %>% left_join(batch_totals, by = "batch") %>% mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_), label = sprintf("%d (%.1f%%)", N, pct)) %>% select(status, batch, label) # Numeric wide for totals, labeled wide for display counts_wide_num <- counts_long %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(status) counts_wide_lbl <- counts_long_labeled %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") 
%>% arrange(status) counts_wide_lbl$Total <- rowSums(counts_wide_num %>% select(any_of(levels(prolific_clean$batch)))) # Total row grand <- tibble(status = "Total") for (b in levels(prolific_clean$batch)) { bt <- batch_totals$batch_total[batch_totals$batch == b] grand[[b]] <- sprintf("%d (100.0%%)", bt) } grand$Total <- sum(counts_wide_lbl$Total) fmt_tbl <- bind_rows(counts_wide_lbl, grand) knitr::kable(fmt_tbl, tbl_fmt, caption = "Prolific statuses by batch with totals (N and % of batch total)") %>% style_tbl() ``` These Prolific exports summarize recruitment statuses by batch (e.g., APPROVED, RETURNED). They are used for flow and recruitment descriptives only, not for outcome modeling. ### Exit Paths ```{r exit-paths} resolve_exit_paths <- function(dir_path) { fp <- here::here(dir_path, "exit_paths.csv") if (fs::file_exists(fp)) fp else NA_character_ } exit_files <- vapply(params$main_freeze_dirs, resolve_exit_paths, FUN.VALUE = character(1)) knitr::kable(tibble(batch = vapply(exit_files, infer_batch_label, FUN.VALUE = character(1)), file = exit_files), tbl_fmt, caption = "Resolved exit_paths.csv per batch") %>% style_tbl() read_exit <- function(fpath) { if (is.na(fpath) || !fs::file_exists(fpath)) return(NULL) readr::read_csv(fpath, show_col_types = FALSE) %>% mutate(batch = infer_batch_label(fpath)) } exit_paths <- purrr::map(exit_files, read_exit) %>% purrr::compact() %>% bind_rows() if (nrow(exit_paths) > 0) { exit_paths <- exit_paths %>% mutate(pid_hash = hash_pid(prolific_pid), path = factor(completion_pathway, levels = c("confirm","ceiling","complete"))) %>% filter(!is.na(path)) exit_counts <- exit_paths %>% count(path, batch, name = "N") exit_totals <- exit_counts %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop") # Compute percentages in long form then pivot to labels exit_long_labeled <- exit_counts %>% left_join(exit_totals, by = "batch") %>% mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_), label = 
sprintf("%d (%.1f%%)", N, pct)) %>% select(path, batch, label) exit_wide_num <- exit_counts %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(path) exit_wide_lbl <- exit_long_labeled %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") %>% arrange(path) batches_fac <- levels(factor(exit_paths$batch)) exit_wide_lbl$Total <- rowSums(exit_wide_num %>% select(any_of(batches_fac))) grand_e <- tibble(path = "Total") for (b in batches_fac) { bt <- exit_totals$batch_total[exit_totals$batch == b] grand_e[[b]] <- sprintf("%d (100.0%%)", bt) } grand_e$Total <- sum(exit_wide_lbl$Total) fmt_exit <- bind_rows(exit_wide_lbl, grand_e) knitr::kable(fmt_exit, tbl_fmt, caption = "Exit pathways by batch (DB): N and % of batch total (confirm → ceiling → complete)") %>% style_tbl() } ``` Exit paths provide the database source‑of‑truth for participant flow through the mock‑appointment step and pre-intervention intent survey ("confirm" [no concerns about MMR], "ceiling" [pre-intervention intent = 7], "complete"). We use these counts for the reach metric and in the CONSORT‑style flow diagram. ## Sample Composition We summarize randomized participants from the analysis view, as counts and percentages within each batch. 
```{r main-counts-pivot} by_batch_arm <- trial_main %>% count(batch, arm_fct, name = "N") by_batch_tot <- by_batch_arm %>% group_by(batch) %>% summarise(batch_total = sum(N), .groups = "drop") # Percentages in long form, then pivot to labels by_long_lbl <- by_batch_arm %>% left_join(by_batch_tot, by = "batch") %>% mutate(pct = ifelse(batch_total > 0, 100 * N / batch_total, NA_real_), label = sprintf("%d (%.1f%%)", N, pct)) %>% select(arm_fct, batch, label) main_wide_num <- by_batch_arm %>% tidyr::pivot_wider(names_from = batch, values_from = N, values_fill = 0) %>% arrange(arm_fct) main_wide_lbl <- by_long_lbl %>% tidyr::pivot_wider(names_from = batch, values_from = label, values_fill = "0 (NA%)") %>% arrange(arm_fct) main_wide_lbl$Total <- rowSums(main_wide_num %>% select(-arm_fct)) # Totals row grand_row <- tibble(arm_fct = "Total") for (b in unique(by_batch_arm$batch)) { denom <- by_batch_tot$batch_total[by_batch_tot$batch == b] grand_row[[b]] <- sprintf("%d (100.0%%)", denom) } grand_row$Total <- sum(main_wide_lbl$Total) fmt_main <- bind_rows(main_wide_lbl, grand_row) knitr::kable(fmt_main, tbl_fmt, caption = "Randomized sample by arm and batch (N and % of batch total)") %>% style_tbl() ``` ## Demographics We summarize basic demographics using fields captured in the pre‑survey JSON (age, gender, political ideology). 
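As an illustration of pulling fields out of a per-participant JSON string, here is a minimal sketch. It assumes the `jsonlite` package and uses two hypothetical helpers, `parse_or_null()` and `field_chr()`. These are illustrative stand-ins and not the report's own `safe_parse()` and `pluck_chr()`, whose implementations are not shown in this section. The example JSON strings are invented.

```r
# Hypothetical helpers for illustration; the report's safe_parse()/pluck_chr() may differ.
library(jsonlite)

parse_or_null <- function(txt) {
  # Return a parsed list, or NULL when the JSON is missing or malformed
  if (is.na(txt) || !nzchar(txt)) return(NULL)
  tryCatch(jsonlite::fromJSON(txt), error = function(e) NULL)
}

field_chr <- function(x, name) {
  # Return a single field as character, or NA when it is absent
  val <- if (is.list(x)) x[[name]] else NULL
  if (is.null(val) || length(val) != 1) NA_character_ else as.character(val)
}

# Invented example rows: one complete, one malformed, one missing a field
pre_json <- c(
  '{"age": "25-34", "sex": "Female", "political_leaning": "Moderate"}',
  "not valid json",
  '{"age": "35-44", "sex": "Male"}'
)

parsed <- lapply(pre_json, parse_or_null)
age <- vapply(parsed, field_chr, FUN.VALUE = character(1), name = "age")
ideology <- vapply(parsed, field_chr, FUN.VALUE = character(1), name = "political_leaning")
```

Returning `NA` rather than raising an error keeps one row per participant, so malformed or partial pre-survey records drop out of the demographic tables only for the fields they lack.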
```{r demographics} # Prepare demographics --------------------------------------------------- trial_demo <- trial_main_raw if (all(c("pre_json") %in% names(trial_demo))) { parsed <- purrr::map(trial_demo$pre_json, safe_parse) trial_demo$age <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "age") trial_demo$gender <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "sex") trial_demo$ideology <- vapply(parsed, pluck_chr, FUN.VALUE = character(1), name = "political_leaning") } else { if (!"age" %in% names(trial_demo) && "Age" %in% names(trial_demo)) trial_demo$age <- trial_demo$Age if (!"gender" %in% names(trial_demo) && "sex" %in% names(trial_demo)) trial_demo$gender <- trial_demo$sex if (!"ideology" %in% names(trial_demo) && "political_ideology" %in% names(trial_demo)) trial_demo$ideology <- trial_demo$political_ideology } democat <- function(df, var) { if (!var %in% names(df)) return(tibble()) df %>% filter(!is.na(.data[[var]]), .data[[var]] != "") %>% count(.data[[var]], name = "N") %>% arrange(desc(N)) %>% mutate(Percent = 100 * N / sum(N), Variable = var) %>% rename(Level = !!var) } age_tbl <- democat(trial_demo, "age") gend_tbl <- democat(trial_demo, "gender") ideo_tbl <- democat(trial_demo, "ideology") ``` ```{r demographics-interpretation, echo=FALSE, results='asis'} # Brief, human-style summary of observed distributions -------------------- library(glue) top_row <- function(tbl, k = 1) { if (nrow(tbl) == 0 || k > nrow(tbl)) return(list(level = NA_character_, n = NA_real_, pct = NA_real_)) row <- tbl %>% arrange(desc(N)) %>% slice_head(n = k) %>% slice_tail(n = 1) list(level = as.character(row$Level), n = as.numeric(row$N), pct = as.numeric(row$Percent)) } age1 <- top_row(age_tbl, 1) age2 <- top_row(age_tbl, 2) gend <- top_row(gend_tbl, 1) ideo <- top_row(ideo_tbl, 1) fmt <- function(x, d = 0) ifelse(is.na(x), "NA", sprintf(paste0("%.", d, "f"), x)) txt <- glue("The sample skews {tolower(ifelse(is.na(gend$level), 'unknown', 
gend$level))} ({fmt(gend$pct,1)}%). Ideology leans {tolower(ifelse(is.na(ideo$level), 'unknown', ideo$level))} ({fmt(ideo$pct,1)}%). Ages cluster in {ifelse(is.na(age1$level),'unknown',age1$level)} ({fmt(age1$pct,1)}%) and {ifelse(is.na(age2$level),'unknown',age2$level)} ({fmt(age2$pct,1)}%).") cat(txt) ``` ### Age ```{r demographics-age-table, echo=FALSE} if (nrow(age_tbl) > 0) { knitr::kable(age_tbl %>% transmute(Level, N, Percent = sprintf("%.1f%%", Percent)), tbl_fmt, caption = "Age distribution (counts and %)") %>% style_tbl() } else { knitr::kable(tibble(Note = "Age not available"), tbl_fmt, caption = "Age") %>% style_tbl() } ``` ```{r demographics-age-plot, fig.width=6, fig.height=3, fig.cap="Age distribution (counts)", fig.alt="Bar chart of participant age categories with counts; the sample skews to 25–34 and 35–44."} if (nrow(age_tbl) > 0) { ggplot(age_tbl, aes(x = reorder(Level, -N), y = N)) + geom_col(fill = "#5B8E7D") + labs(x = NULL, y = "N") + theme_minimal() + theme(axis.text.x = element_text(angle = 20, hjust = 1)) } ``` ### Gender ```{r demographics-gender-table, echo=FALSE} if (nrow(gend_tbl) > 0) { knitr::kable(gend_tbl %>% transmute(Level, N, Percent = sprintf("%.1f%%", Percent)), tbl_fmt, caption = "Gender distribution (counts and %)") %>% style_tbl() } else { knitr::kable(tibble(Note = "Gender not available"), tbl_fmt, caption = "Gender") %>% style_tbl() } ``` ```{r demographics-gender-plot, fig.width=6, fig.height=3, fig.cap="Gender distribution (counts)", fig.alt="Bar chart of participant gender categories with counts."} if (nrow(gend_tbl) > 0) { ggplot(gend_tbl, aes(x = reorder(Level, -N), y = N)) + geom_col(fill = "#7F7F7F") + labs(x = NULL, y = "N") + theme_minimal() } ``` ### Political ideology ```{r demographics-ideology-table, echo=FALSE} if (nrow(ideo_tbl) > 0) { knitr::kable(ideo_tbl %>% transmute(Level, N, Percent = sprintf("%.1f%%", Percent)), tbl_fmt, caption = "Political ideology distribution (counts and %)") %>% style_tbl() } 
else { knitr::kable(tibble(Note = "Political ideology not available"), tbl_fmt, caption = "Political ideology") %>% style_tbl() } ``` ```{r demographics-ideology-plot, fig.width=6, fig.height=3, fig.cap="Political ideology distribution (counts)", fig.alt="Bar chart of participant political ideology categories with counts."} if (nrow(ideo_tbl) > 0) { ggplot(ideo_tbl, aes(x = reorder(Level, -N), y = N)) + geom_col(fill = "#6A3D9A") + labs(x = NULL, y = "N") + theme_minimal() + theme(axis.text.x = element_text(angle = 20, hjust = 1)) } ``` ## Overview of Participant Flow ```{r flow-data, include=FALSE} N_started <- nrow(prolific_clean) consented_ids <- unique(c(exit_paths$prolific_pid, trial_main_raw$prolific_pid)) N_consented <- sum(!is.na(consented_ids)) N_noqualms <- exit_paths %>% filter(completion_pathway == "confirm") %>% nrow() N_ceiling <- exit_paths %>% filter(completion_pathway == "ceiling") %>% nrow() N_randomised <- nrow(trial_main %>% distinct(user_id)) N_other <- exit_paths %>% filter(completion_pathway == "NA") %>% nrow() # Summary view: set Consented as the sum that flows forward; Abandoned = Started - Consented N_consented_summary <- N_noqualms + N_ceiling + N_randomised N_not_consented <- max(N_started - N_consented_summary, 0) N_screened <- N_noqualms + N_ceiling # Arm counts among all randomised arm_counts <- trial_main %>% count(arm_fct, name = "N") N_control <- arm_counts$N[arm_counts$arm_fct == "Control"] %>% ifelse(length(.)==0, 0, .) N_treat <- arm_counts$N[arm_counts$arm_fct == "Treatment"] %>% ifelse(length(.)==0, 0, .) 
consort_text <- paste( "flowchart TD", # Enrollment sprintf("A[Started study: %d] --> A0[Abandoned: %d]", N_started, N_not_consented), sprintf("A --> B[Consented: %d]", N_consented_summary), sprintf("B --> S[Screened out: %d]", N_screened), sprintf("S --> C[No concerns about the MMR vaccine: %d]", N_noqualms), sprintf("S --> D[Ceiling: %d]", N_ceiling), # No additional minor buckets in summary view # Allocation sprintf("B --> E[Randomized: %d]", N_randomised), sprintf("E --> A1[Allocated to Control: %d]", N_control), sprintf("E --> A2[Allocated to Treatment: %d]", N_treat), # (No lost-to-follow-up nodes; all randomized completed post in this dataset) sep = "\n" ) ``` ```{r consort-diagram, results='asis', echo=FALSE} if (knitr::is_html_output()) { cat('<pre class="mermaid mermaid-js">') cat(consort_text) cat('</pre>\n') } ``` ```{r mermaid-assets, results='asis', echo=FALSE} if (knitr::is_html_output()) { cat('<div style="display:none">\n') cat('```{mermaid}\nflowchart LR\nx([.]) --> y([.])\n```\n') cat('</div>\n') } ``` ```{r reach-summary, echo=FALSE, results='asis'} library(glue) # Qualms vs no-qualms among consented qualms_n <- N_ceiling + N_randomised noqualms_n <- N_noqualms qualms_pct <- ifelse(N_consented_summary > 0, 100 * qualms_n / N_consented_summary, NA_real_) noqualms_pct <- ifelse(N_consented_summary > 0, 100 * noqualms_n / N_consented_summary, NA_real_) # 95% binomial CIs helper fmt_ci <- function(k, n) { if (is.na(n) || n <= 0 || is.na(k)) return("") ci <- stats::binom.test(k, n)$conf.int paste0(" [", sprintf("%.1f", 100*ci[1]), ", ", sprintf("%.1f", 100*ci[2]), "]%") } # Within the qualms branch: ceiling (7) vs <7 (randomized) ceiling_in_qualms_pct <- ifelse(qualms_n > 0, 100 * N_ceiling / qualms_n, NA_real_) lt7_in_qualms_pct <- ifelse(qualms_n > 0, 100 * N_randomised / qualms_n, NA_real_) # Overall reach to intervention reach_overall_pct <- ifelse(N_consented_summary > 0, 100 * N_randomised / N_consented_summary, NA_real_) html <- glue( '<div 
class="mt-2">Among those who reached the mock‑appointment screen (N = {N_consented_summary}), ', '{round(qualms_pct,1)}% (95% CI{fmt_ci(qualms_n, N_consented_summary)}) clicked "I have questions or concerns about MMR" (N = {qualms_n}) and ', '{round(noqualms_pct,1)}% (95% CI{fmt_ci(noqualms_n, N_consented_summary)}) clicked "No concerns about the MMR vaccine" (N = {noqualms_n}). ', 'Within the questions/concerns branch, {round(ceiling_in_qualms_pct,1)}% (95% CI{fmt_ci(N_ceiling, qualms_n)}) selected a baseline of 7 (N = {N_ceiling}) and ', '{round(lt7_in_qualms_pct,1)}% (95% CI{fmt_ci(N_randomised, qualms_n)}) had baseline ≤ 6 and proceeded to the intervention (N = {N_randomised}). ', 'Overall, reach to the intervention was {round(reach_overall_pct,1)}% (95% CI{fmt_ci(N_randomised, N_consented_summary)}) of those at the mock‑appointment step.</div>' ) md <- glue( 'Among those who reached the mock‑appointment screen (N = {N_consented_summary}), ', '{round(qualms_pct,1)}% (95% CI{fmt_ci(qualms_n, N_consented_summary)}) clicked "I have questions or concerns about MMR" (N = {qualms_n}) and ', '{round(noqualms_pct,1)}% (95% CI{fmt_ci(noqualms_n, N_consented_summary)}) clicked "No concerns about the MMR vaccine" (N = {noqualms_n}). ', 'Within the questions/concerns branch, {round(ceiling_in_qualms_pct,1)}% (95% CI{fmt_ci(N_ceiling, qualms_n)}) selected a baseline of 7 (N = {N_ceiling}) and ', '{round(lt7_in_qualms_pct,1)}% (95% CI{fmt_ci(N_randomised, qualms_n)}) had baseline ≤ 6 and proceeded to the intervention (N = {N_randomised}). ', 'Overall, reach to the intervention was {round(reach_overall_pct,1)}% (95% CI{fmt_ci(N_randomised, N_consented_summary)}) of those at the mock‑appointment step.' 
)
if (knitr::is_html_output()) cat(html) else cat(md)
```

## Batch Comparability

```{r batch-interpretation, echo=FALSE, results='asis'}
# Brief summary at top of section ----------------------------------------
library(glue)
trial_bc_i <- trial_main %>%
  mutate(batch_fct = factor(batch, levels = sort(unique(batch))))
tab_i <- trial_bc_i %>%
  count(batch_fct, arm_fct) %>%
  tidyr::pivot_wider(names_from = arm_fct, values_from = n, values_fill = 0) %>%
  select(-batch_fct) %>%
  as.matrix()
if (all(tab_i >= 5)) {
  test_arm_i <- chisq.test(tab_i)
} else {
  test_arm_i <- fisher.test(tab_i)
}
aov_pre_i <- aov(pre_intent ~ batch_fct, data = trial_bc_i)
p_pre <- tryCatch(summary(aov_pre_i)[[1]][["Pr(>F)"]][1], error = function(e) NA_real_)
analysis_bc_i <- trial_bc_i %>%
  filter(!is.na(post_intent), !is.na(pre_intent), pre_intent <= 6) %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5,
         engagement_met = (coalesce(chat_turns, 0) >= 3 & coalesce(chat_points, 0) >= 100)) %>%
  filter(engagement_met)
mod_int_i <- lm(post_intent ~ pre_intent + arm_coded * batch_fct, data = analysis_bc_i)
ct_i <- lmtest::coeftest(mod_int_i, vcov = sandwich::vcovHC(mod_int_i, type = "HC3"))
rows <- grepl("^arm_coded:batch_fct", rownames(ct_i))
# Use the smallest interaction p-value so that "p-values ≥ x" holds for every Arm×Batch term
p_int_min <- if (any(rows)) min(ct_i[rows, 4], na.rm = TRUE) else NA_real_
txt <- glue(
  "We test arm balance across batches and baseline comparability. ",
  "Arms are balanced (p = {sprintf('%.3g', test_arm_i$p.value)}); ",
  "baseline intention is similar across batches (ANOVA p = {ifelse(is.na(p_pre), 'NA', sprintf('%.3g', p_pre))}); ",
  "the estimated arm effect is stable across batches (Arm×Batch interaction p-values {ifelse(is.na(p_int_min), 'NA', paste0('≥ ', sprintf('%.3g', p_int_min)))})."
)
cat(txt)
```

```{r batch-comparability, fig.width=8, fig.height=4}
trial_bc <- trial_main %>%
  mutate(batch_fct = factor(batch, levels = sort(unique(batch))))

# 1) Design balance: Arm × Batch counts and test ------------------------
arm_by_batch <- trial_bc %>%
  count(batch_fct, arm_fct, name = "N") %>%
  tidyr::pivot_wider(names_from = arm_fct, values_from = N, values_fill = 0)
knitr::kable(arm_by_batch, tbl_fmt, caption = "Arm counts by batch") %>% style_tbl()

tab <- trial_bc %>%
  count(batch_fct, arm_fct) %>%
  tidyr::pivot_wider(names_from = arm_fct, values_from = n, values_fill = 0) %>%
  select(-batch_fct) %>%
  as.matrix()
if (all(tab >= 5)) {
  test_arm <- chisq.test(tab)
  test_name <- "Chi-squared"
} else {
  test_arm <- fisher.test(tab)
  test_name <- "Fisher exact"
}
cat(sprintf("Arm~Batch %s p-value: %.3g\n\n", test_name, test_arm$p.value))

# 2) Baseline comparability across batches ------------------------------
base_by_batch <- trial_bc %>%
  summarise(N = sum(!is.na(pre_intent)),
            mean_pre = mean(pre_intent, na.rm=TRUE),
            sd_pre = sd(pre_intent, na.rm=TRUE),
            .by = batch_fct)
knitr::kable(base_by_batch, tbl_fmt, digits = 2, caption = "Baseline intention by batch") %>% style_tbl()
aov_pre <- aov(pre_intent ~ batch_fct, data = trial_bc)
cat("ANOVA for baseline intention across batches:\n"); print(summary(aov_pre))

# 3) Treatment effect stability (Arm × Batch in ANCOVA) -----------------
analysis_bc <- trial_bc %>%
  filter(!is.na(post_intent), !is.na(pre_intent), pre_intent <= 6) %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5,
         engagement_met = (coalesce(chat_turns, 0) >= 3 & coalesce(chat_points, 0) >= 100)) %>%
  filter(engagement_met)
mod_int <- lm(post_intent ~ pre_intent + arm_coded * batch_fct, data = analysis_bc)
cat("\nANCOVA with Arm × Batch interaction (HC3):\n")
print(hc3(mod_int))
```

## Confirmatory Analysis

### H1 – Arm effect on post‑intervention intention

We estimate the adjusted arm difference using a simple ANCOVA on participants with baseline head‑room (≤ 6) who met the preregistered engagement rules. The treatment increased MMR vaccination intention by about a full point on the 1–7 scale, adjusted for baseline. All participants in this set are retained; excluding two borderline low‑quality cases (sensitivity analysis below) does not change the results.

post_intent = β0 + β1·arm_coded + β2·pre_intent + ε

- arm_coded = 1 for Treatment (MMR chat), 0 for Control.
- We report HC3 robust SEs, a 95% CI for β1, and partial η² as an effect size.

```{r ancova}
# Construct the single confirmatory analysis set -------------------------
analysis_confirm <- trial_main %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5,
         engagement_met = (coalesce(chat_turns, 0) >= 3 & coalesce(chat_points, 0) >= 100)) %>%
  filter(!is.na(post_intent), !is.na(pre_intent), pre_intent <= 6, engagement_met)
cat(sprintf("Confirmatory sample size: %d\n", nrow(analysis_confirm)))
stopifnot(nrow(analysis_confirm) > 0)

mod_h1 <- lm(post_intent ~ arm_coded + pre_intent, data = analysis_confirm)
cat("\nHC3 robust coefficients (ANCOVA):\n")
print(hc3(mod_h1))

# Robust 95% CI for arm effect -----------------------------------------
vc <- sandwich::vcovHC(mod_h1, type = "HC3")
co <- coef(mod_h1)
se <- sqrt(diag(vc))
df <- mod_h1$df.residual
crit <- qt(0.975, df)
ci_tbl <- tibble(
  term = names(co),
  estimate = unname(co),
  std.error = unname(se),
  conf.low = estimate - crit * std.error,
  conf.high = estimate + crit * std.error
) %>%
  dplyr::filter(term == "arm_coded")
knitr::kable(ci_tbl, tbl_fmt, caption = "Arm effect (Treatment vs Control) with HC3 SEs and 95% CI") %>% style_tbl()

if (requireNamespace("effectsize", quietly = TRUE)) {
  library(effectsize)
  cat("\nPartial Eta^2 (ANCOVA):\n")
  print(effectsize::eta_squared(mod_h1, partial = TRUE))
}
```

```{r ancova-coef-plot, fig.width=5, fig.height=2.5, fig.cap="Adjusted arm effect (Treatment vs Control) with 95% CI from the primary ANCOVA; vertical dashed line denotes no effect.", fig.alt="Horizontal coefficient plot showing the adjusted Treatment vs Control effect with 95% confidence interval; the interval does not cross zero."}
coef_df <- ci_tbl %>%
  transmute(term = "Treatment vs Control", estimate, conf.low, conf.high)
# After coord_flip() the estimate runs along the horizontal axis, so the zero line is drawn vertically
p_main_effect <- ggplot(coef_df, aes(x = term, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.15, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  coord_flip() +
  labs(x = NULL, y = "Arm effect (β̂, 95% CI)") +
  theme_minimal()
save_pdf(p_main_effect, "main-effect", width_in = 5, height_in = 2.5)
p_main_effect
```

```{r ancova-cohens-d, echo=FALSE}
# Cohen's d via ANCOVA residual SD (within-group SD adjusted for covariate)
n_treat <- sum(analysis_confirm$arm_coded == 1, na.rm = TRUE)
n_ctrl <- sum(analysis_confirm$arm_coded == 0, na.rm = TRUE)
df_res <- mod_h1$df.residual
s_resid <- sqrt(sum(residuals(mod_h1)^2) / df_res)
d_ancova <- unname(coef(mod_h1)["arm_coded"]) / s_resid
J <- 1 - (3 / (4*df_res - 1))
g_ancova <- J * d_ancova
knitr::kable(tibble(
  metric = c("Cohen's d (ANCOVA, residual SD)", "Hedges' g (small-sample corrected)"),
  value = c(d_ancova, g_ancova)
), tbl_fmt, digits = 3, caption = "Standardized arm effect from ANCOVA") %>% style_tbl()
```

```{r ancova-cohens-note, echo=FALSE, results='asis'}
cat("Method: unstandardized arm coefficient divided by model residual SD (ANCOVA), with Hedges' correction J. This standardizes the adjusted mean difference.")
```

### Residual Diagnostics

We summarize how far model predictions are from the observed post‑intent (after accounting for baseline).
Smaller, centered residuals indicate the model is fitting sensibly. ```{r ancova-residual-summary, echo=FALSE} resid_vals <- residuals(mod_h1) summ <- summary(resid_vals) resid_tbl <- tibble( Metric = c("Min", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max", "SD"), Value = c(summ[[1]], summ[[2]], summ[[3]], mean(resid_vals, na.rm=TRUE), summ[[5]], summ[[6]], sd(resid_vals, na.rm=TRUE)) ) knitr::kable(resid_tbl, tbl_fmt, digits = 2, caption = "Residual summary for primary ANCOVA") %>% style_tbl() ``` ### Sensitivity: Rank-Inverse-Normal (RIN) transform As a robustness check, we apply a RIN transform to both intention variables and re-estimate the model. ```{r ancova-rin, results='asis'} rin <- function(x) { # Rank-Inverse-Normal transform with Blom adjustment r <- rank(x, ties.method = "average", na.last = "keep") n <- sum(!is.na(x)) qnorm((r - 3/8) / (n + 1/4)) } rin_data <- analysis_confirm %>% mutate(pre_rin = rin(pre_intent), post_rin = rin(post_intent)) mod_rin <- lm(post_rin ~ arm_coded + pre_rin, data = rin_data) co_rin <- lmtest::coeftest(mod_rin, vcov = sandwich::vcovHC(mod_rin, type = "HC3")) arm_est_rin <- unname(coef(mod_rin)["arm_coded"]) arm_se_rin <- sqrt(diag(sandwich::vcovHC(mod_rin, type = "HC3")))["arm_coded"] df_rin <- mod_rin$df.residual crit_rin <- qt(0.975, df_rin) ci_low_rin <- arm_est_rin - crit_rin * arm_se_rin ci_high_rin <- arm_est_rin + crit_rin * arm_se_rin knitr::kable(tibble( term = "arm_coded", estimate = arm_est_rin, std.error = arm_se_rin, conf.low = ci_low_rin, conf.high = ci_high_rin ), tbl_fmt, digits = 3, caption = "RIN-transform sensitivity: arm effect (HC3) with 95% CI") %>% style_tbl() cat(paste0("RIN sensitivity estimates the adjusted arm effect on the standardized z-scale (", sprintf("%.3f", arm_est_rin), "; 95% CI ", sprintf("%.3f", ci_low_rin), ", ", sprintf("%.3f", ci_high_rin), "), with direction and inference consistent with the primary ANCOVA.")) ``` ```{r ancova-rin-note, echo=FALSE, results='asis'} cat("Note: RIN 
estimates are in standardized z-units (not 1–7 scale).") ``` ### Sensitivity: Excluding two borderline low‑quality cases ```{r ancova-bot-sensitivity, results='asis', echo=FALSE} qc_paths <- c(here::here("analysis/mmr-2/qc_manual.csv"), vapply(params$main_freeze_dirs, function(d) here::here(d, "qc_manual.csv"), FUN.VALUE = character(1))) qc_files <- qc_paths[fs::file_exists(qc_paths)] qc_manual <- purrr::map(qc_files, ~ readr::read_csv(.x, show_col_types = FALSE)) %>% purrr::compact() qc_manual <- if (length(qc_manual) == 0) tibble() else bind_rows(qc_manual) if (nrow(qc_manual) > 0 && !"pid_hash" %in% names(qc_manual) && "prolific_pid" %in% names(qc_manual)) { qc_manual <- qc_manual %>% mutate(pid_hash = hash_pid(prolific_pid)) } flagged <- character(0) if (nrow(qc_manual) > 0 && "flag_bot" %in% names(qc_manual)) { flagged <- qc_manual %>% filter(coalesce(flag_bot, 0) == 1) %>% pull(pid_hash) %>% unique() flagged <- flagged[!is.na(flagged)] } sens_data <- analysis_confirm %>% filter(!pid_hash %in% flagged) mod_sens <- lm(post_intent ~ arm_coded + pre_intent, data = sens_data) cat(sprintf('We re-estimate the primary model after excluding %d borderline low‑quality case(s); results are materially unchanged.\n', length(flagged))) # Render robust coefficients as a proper table to avoid Markdown mangling of underscores co_sens <- lmtest::coeftest(mod_sens, vcov = sandwich::vcovHC(mod_sens, type = "HC3")) coef_tbl_sens <- tibble( term = rownames(co_sens), estimate = unname(co_sens[, 1]), std.error = unname(co_sens[, 2]), t.value = unname(co_sens[, 3]), p.value = unname(co_sens[, 4]) ) coef_tbl_sens$term <- gsub("_", " ", coef_tbl_sens$term) knitr::kable(coef_tbl_sens, tbl_fmt, digits = 3, caption = "HC3 robust coefficients (sensitivity)") %>% style_tbl() ``` ### Descriptive Outcomes ```{r descriptives-interpret, echo=FALSE, results='asis'} # Brief summary for descriptives ------------------------------------------ sumI <- analysis_confirm %>% mutate(delta = post_intent - 
pre_intent) %>% group_by(arm_fct) %>% summarise(n = n(), pre = mean(pre_intent, na.rm = TRUE), post = mean(post_intent, na.rm = TRUE), d = mean(delta, na.rm = TRUE), .groups = 'drop') ctrl_d <- sumI$d[sumI$arm_fct == "Control"] %>% ifelse(length(.) == 0, NA_real_, .) treat_d <- sumI$d[sumI$arm_fct == "Treatment"] %>% ifelse(length(.) == 0, NA_real_, .) cat(glue("We summarize within-arm change. Control shows essentially no change (Δ ≈ {ifelse(is.na(ctrl_d), 'NA', sprintf('%.2f', ctrl_d))}), while Treatment increases on average (Δ ≈ {ifelse(is.na(treat_d), 'NA', sprintf('%.2f', treat_d))}).")) ``` ```{r descriptives-setA} # Per-arm descriptive means and within-arm changes with SDs sumA <- analysis_confirm %>% mutate(delta = post_intent - pre_intent) %>% group_by(arm_fct) %>% summarise( n = n(), pre_mean = mean(pre_intent, na.rm = TRUE), pre_sd = sd(pre_intent, na.rm = TRUE), post_mean = mean(post_intent, na.rm = TRUE), post_sd = sd(post_intent, na.rm = TRUE), delta_mean = mean(delta, na.rm = TRUE), delta_sd = sd(delta, na.rm = TRUE), .groups = 'drop' ) %>% rowwise() %>% mutate( crit = qt(0.975, df = n - 1), pre_ci_low = pre_mean - crit * pre_sd / sqrt(n), pre_ci_high = pre_mean + crit * pre_sd / sqrt(n), post_ci_low = post_mean - crit * post_sd / sqrt(n), post_ci_high = post_mean + crit * post_sd / sqrt(n), delta_ci_low = delta_mean - crit * delta_sd / sqrt(n), delta_ci_high = delta_mean + crit * delta_sd / sqrt(n) ) %>% ungroup() fmt_tbl <- sumA %>% transmute( Arm = as.character(arm_fct), `Pre mean (SD)` = sprintf("%.2f (%.2f)", pre_mean, pre_sd), `Post mean (SD)` = sprintf("%.2f (%.2f)", post_mean, post_sd), `Δ Post–Pre (SD)` = sprintf("%.2f (%.2f)", delta_mean, delta_sd) ) knitr::kable(fmt_tbl, tbl_fmt, caption = "Per-arm descriptive means and within-arm changes (SDs)") %>% style_tbl() ``` ### Responder Rates (Δ ≥ +1 point) ```{r responder-rates} # Define responders within confirmatory analysis set ------------------------- resp_df <- analysis_confirm %>% 
mutate(delta = post_intent - pre_intent, responder = delta >= 1) by_arm_resp <- resp_df %>% count(arm_fct, responder, name = "N") %>% group_by(arm_fct) %>% mutate(total = sum(N), pct = 100 * N / total) %>% ungroup() # Simple proportions by arm --------------------------------------------------- prop_by_arm <- resp_df %>% group_by(arm_fct) %>% summarise(Responders = sum(responder, na.rm = TRUE), N = n(), Percent = 100 * Responders / N, .groups = 'drop') %>% mutate(Arm = arm_fct) %>% select(Arm, Responders, N, Percent) prop_tbl_fmt <- prop_by_arm %>% transmute(Arm, `Responders / N` = sprintf("%d / %d", Responders, N), `Percent` = sprintf("%.1f%%", Percent)) knitr::kable(prop_tbl_fmt, tbl_fmt, caption = "Responder rates (Δ ≥ +1) by arm") %>% style_tbl() # Compact bar (no CIs) ------------------------------------------------------- plot_prop <- prop_by_arm %>% mutate(Arm = factor(Arm, levels = c("Control","Treatment"))) p_responders <- ggplot(plot_prop, aes(x = Arm, y = Percent, fill = Arm)) + geom_col(width = 0.6) + scale_fill_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) + labs(x = NULL, y = "Responders (Δ ≥ +1) %") + theme_minimal() + theme(legend.position = "none") save_pdf(p_responders, "responder-rates", width_in = 6, height_in = 3) p_responders ``` ```{r prepost-plot, fig.cap="Pre vs Post means by arm (full 1–7 scale)", fig.alt="Two side-by-side line plots showing mean pre and post intention for Control and Treatment arms on the 1–7 scale; Treatment increases more than Control.", fig.width=6, fig.height=3} plot_df <- sumA %>% transmute(arm_fct, Pre = pre_mean, Post = post_mean) %>% tidyr::pivot_longer(cols = c(Pre, Post), names_to = "time", values_to = "mean") %>% mutate(time = factor(time, levels = c("Pre","Post"))) p_prepost_means <- ggplot(plot_df, aes(x = time, y = mean, group = arm_fct, color = arm_fct)) + geom_line(linewidth = 0.7) + geom_point(size = 2) + facet_wrap(~ arm_fct, nrow = 1) + scale_color_manual(values = 
c("Control" = control_color, "Treatment" = treatment_color)) +
  scale_y_continuous(limits = c(1, 7), breaks = 1:7) +
  labs(x = NULL, y = "Mean intent (1–7)") +
  theme_minimal() +
  theme(legend.position = "none")
save_pdf(p_prepost_means, "prepost-means", width_in = 6, height_in = 3)
p_prepost_means
```

```{r delta-violin, fig.cap="Distribution of within-participant changes by arm (violin with boxplot)", fig.alt="Violin and box plots of change in intention (Post - Pre) by arm; dashed line at zero.", fig.width=6, fig.height=3}
delta_df <- analysis_confirm %>%
  transmute(arm_fct, delta = post_intent - pre_intent)
p_delta_violin <- ggplot(delta_df, aes(x = arm_fct, y = delta, fill = arm_fct)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_violin(trim = TRUE, alpha = 0.45, color = NA) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.9, color = "#444444") +
  scale_fill_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  labs(x = NULL, y = "Change in intent (Post - Pre)") +
  theme_minimal() +
  theme(legend.position = "none")
save_pdf(p_delta_violin, "delta-violin", width_in = 6, height_in = 3)
p_delta_violin
```

## Durability Analysis

We encountered an outcome‑page implementation failure in the first run that prevented collection of immediate post‑intent. To recover the primary outcome, we contacted those participants later via Prolific for a short follow‑up survey that included the same 1–7 intention item. We linked respondents deterministically by hashed Prolific IDs and computed the delay between their original intervention completion and the follow‑up response.

This section estimates the arm effect using the delayed post‑intent and examines whether the effect appears durable over the observed follow‑up window. The analysis is exploratory only, and the participants in this sample are excluded from the confirmatory analysis.
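The durability estimate below reuses the baseline-adjusted ANCOVA with HC3 standard errors from the confirmatory analysis. As a rough illustration of what the HC3 covariance computes, the sketch here fits the same model form on simulated data and builds the sandwich estimator by hand in base R. The data, effect size, noise level, and sample size are arbitrary assumptions and not study values. The variable names are borrowed from the rescue frame only for readability.

```r
# Simulated data only; effect size, noise, and sample size are arbitrary assumptions.
set.seed(1)
n <- 60
arm_coded <- rep(c(0, 1), each = n / 2)
intent_pre <- sample(1:6, n, replace = TRUE)
intent_post <- pmin(7, pmax(1, round(intent_pre + 1 * arm_coded + rnorm(n))))

fit <- lm(intent_post ~ arm_coded + intent_pre)

# HC3 sandwich: (X'X)^-1 X' diag(e_i^2 / (1 - h_i)^2) X (X'X)^-1
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)
bread <- solve(crossprod(X))
meat <- crossprod(X * (e / (1 - h)))  # each row of X scaled by e_i / (1 - h_i)
vc_hc3 <- bread %*% meat %*% bread

se <- sqrt(diag(vc_hc3))
names(se) <- colnames(X)

# t-based 95% CI for the arm coefficient, as in the report's CI tables
est <- unname(coef(fit)["arm_coded"])
crit <- qt(0.975, df.residual(fit))
ci <- est + c(-1, 1) * crit * unname(se["arm_coded"])

# Optional cross-check against the sandwich package, if it is installed
if (requireNamespace("sandwich", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(vc_hc3, sandwich::vcovHC(fit, type = "HC3"),
                             check.attributes = FALSE)))
}
```

HC3 inflates each squared residual by 1 / (1 − h_i)², which gives more weight to high-leverage observations. That makes it a conservative choice for a small follow-up sample such as the rescue set.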
```{r durability-interpret, echo=FALSE, results='asis'}
# Prepare rescue analysis frame -------------------------------------------
dur <- trial_rescue %>%
  filter(!is.na(intent_post_rescue), !is.na(intent_pre)) %>%
  mutate(
    days_delay = as.numeric(days_delay)
  )

# Overall ANCOVA with HC3 -------------------------------------------------
mod_D <- lm(intent_post_rescue ~ arm_coded + intent_pre, data = dur)
vc_D <- sandwich::vcovHC(mod_D, type = "HC3")
co_D <- lmtest::coeftest(mod_D, vcov = vc_D)
arm_est_D <- unname(coef(mod_D)["arm_coded"])
arm_se_D <- sqrt(diag(vc_D))["arm_coded"]
df_D <- mod_D$df.residual
crit_D <- qt(0.975, df_D)
ci_low_D <- arm_est_D - crit_D * arm_se_D
ci_high_D <- arm_est_D + crit_D * arm_se_D
p_D <- unname(co_D["arm_coded", 4])
N_D <- nrow(dur)

fmt2 <- function(x, d=3) ifelse(is.na(x), "NA", sprintf(paste0("%.", d, "f"), x))
cat(glue::glue(
  "We estimate the adjusted arm effect using the delayed post‑intent (N = {N_D}). The estimate is {fmt2(arm_est_D)} (95% CI {fmt2(ci_low_D)}, {fmt2(ci_high_D)}; p = {fmt2(p_D)}).")
)
```

```{r durability-table}
ci_D_tbl <- tibble(
  term = "arm_coded",
  estimate = arm_est_D,
  std.error = arm_se_D,
  conf.low = ci_low_D,
  conf.high = ci_high_D,
  p.value = p_D
)
knitr::kable(ci_D_tbl, tbl_fmt, digits = 3, caption = "Rescue arm effect (HC3) with 95% CI") %>%
  style_tbl()
```

### Follow‑up Timing

```{r durability-delay-summary, fig.cap="Histogram of days between intervention and rescue follow‑up", fig.alt="Histogram of follow-up delays in days with most responses clustered at shorter delays."}
# Delay distribution summary and plot -------------------------------------
delay_summary <- dur %>%
  summarise(
    N = n(),
    mean_days = mean(days_delay, na.rm = TRUE),
    median_days = median(days_delay, na.rm = TRUE),
    min_days = min(days_delay, na.rm = TRUE),
    max_days = max(days_delay, na.rm = TRUE)
  )
knitr::kable(delay_summary, tbl_fmt, digits = 1, caption = "Follow‑up delay (days): N, mean, median, min, max") %>%
  style_tbl()

# Narrower bins without annotations for clarity ---------------------------
binw <- max(0.25, diff(range(dur$days_delay, na.rm = TRUE)) / 20)
ggplot(dur, aes(x = days_delay)) +
  geom_histogram(binwidth = binw, fill = "#7DA0B1", color = "white", boundary = 0) +
  labs(x = "Days between intervention completion and follow-up", y = "N") +
  theme_minimal()
```

### Effect by Follow‑up Window

```{r durability-by-bin, fig.cap="Adjusted arm effect by follow-up window using quantile-based bins; points show estimates and bars show 95% CIs; dashed line at zero.", fig.alt="Three bin plot of adjusted arm effects by follow-up delay window with 95% confidence intervals; all estimates positive with wide intervals in longer delays."}
# Data-driven binning by quantiles with minimum size ----------------------
q <- quantile(dur$days_delay, probs = c(0, 1/3, 2/3, 1), na.rm = TRUE)
q <- unique(as.numeric(q))
if (length(q) < 4) {
  q <- unique(as.numeric(quantile(dur$days_delay, probs = c(0, 0.5, 1), na.rm = TRUE)))
}
if (length(q) <= 2) {
  q <- c(min(dur$days_delay, na.rm = TRUE), max(dur$days_delay, na.rm = TRUE))
}

# Ensure strictly increasing breaks
eps <- 1e-6
for (i in 2:length(q)) if (q[i] <= q[i-1]) q[i] <- q[i-1] + eps

# Construct labels
fmtd <- function(x) sprintf("%.1f", x)
labels <- NULL
if (length(q) >= 3) {
  labels <- c(
    paste0("≤", fmtd(q[2]), "d"),
    if (length(q) == 4) paste0(fmtd(q[2]), "–", fmtd(q[3]), "d") else NULL,
    paste0(">", fmtd(q[length(q)-1]), "d")
  )
} else {
  labels <- c(paste0("≤", fmtd(q[2]), "d"))
}

# Cut into bins
brks <- q
if (length(q) == 3) {
  dur$delay_bin2 <- cut(dur$days_delay, breaks = brks, include.lowest = TRUE, right = TRUE, labels = labels)
} else if (length(q) >= 4) {
  dur$delay_bin2 <- cut(dur$days_delay, breaks = brks, include.lowest = TRUE, right = TRUE, labels = c(labels[1], labels[2], labels[3]))
} else {
  dur$delay_bin2 <- factor(rep(labels[1], nrow(dur)), levels = labels)
}

# Bin-wise ANCOVA estimates
bin_levels <- levels(dur$delay_bin2)
bin_tbl <-
  purrr::map_dfr(bin_levels, function(lb) {
    dfb <- dur %>% filter(delay_bin2 == lb)
    # Skip bins that are too small or contain only one arm
    if (nrow(dfb) < 10 || length(unique(dfb$arm_coded)) < 2) {
      return(tibble(bin = lb, estimate = NA_real_, se = NA_real_,
                    conf.low = NA_real_, conf.high = NA_real_,
                    p.value = NA_real_, n = nrow(dfb)))
    }
    m <- lm(intent_post_rescue ~ arm_coded + intent_pre, data = dfb)
    V <- sandwich::vcovHC(m, type = "HC3")
    est <- unname(coef(m)["arm_coded"])
    se <- unname(sqrt(diag(V))["arm_coded"])
    df <- m$df.residual
    crit <- qt(0.975, df)
    tibble(
      bin = lb,
      estimate = est,
      se = se,
      conf.low = est - crit * se,
      conf.high = est + crit * se,
      p.value = lmtest::coeftest(m, vcov = V)["arm_coded", 4],
      n = nrow(dfb)
    )
  })

knitr::kable(bin_tbl, tbl_fmt, digits = 3, caption = "Arm effect by follow-up window (ANCOVA with HC3)") %>%
  style_tbl()

# Keep windows in chronological order on the x-axis; a character column
# would otherwise be sorted alphabetically by ggplot
bin_plot_df <- bin_tbl %>% mutate(bin = factor(bin, levels = bin_levels))

ggplot(bin_plot_df, aes(x = bin, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  labs(x = "Follow-up window", y = "Adjusted arm effect on post-intent") +
  theme_minimal()
```

```{r durability-summary-text, echo=FALSE, results='asis'}
# Emphasize persistence over the observed window --------------------------
max_days <- max(dur$days_delay, na.rm = TRUE)
non_na <- bin_tbl %>% filter(!is.na(estimate))
lead_bin <- non_na %>% arrange(desc(n)) %>% slice_head(n = 1)
second_bin <- non_na %>% filter(bin != lead_bin$bin) %>% arrange(desc(n)) %>% slice_head(n = 1)
fmt2 <- function(x) ifelse(is.na(x), "NA", sprintf("%.2f", x))
txt <- glue::glue(
  "Taken together, the positive arm effect is evident within the most common follow-up window ({lead_bin$bin}) at {fmt2(lead_bin$estimate)}, and remains positive in the next window ({second_bin$bin}) at {fmt2(second_bin$estimate)}. 
This pattern indicates that the effect persists over the observed follow-up period (up to {sprintf('%.1f', max_days)} days), with wider confidence intervals at longer delays due to smaller sample sizes."
)
cat(txt)
```

```{r durability-compare-figure, fig.width=6, fig.height=3, fig.cap="Comparison of adjusted arm effects: immediate (primary) and rescue overall, alongside rescue window-specific estimates with 95% CIs.", fig.alt="Horizontal dot-and-errorbar chart comparing the immediate primary effect, the overall rescue effect, and the rescue effects within delay windows; all estimates are positive."}
# Comparison figure: immediate vs rescue overall and windows --------------
imm_vc <- sandwich::vcovHC(mod_h1, type = "HC3")
imm_est <- unname(coef(mod_h1)["arm_coded"])
imm_se <- sqrt(diag(imm_vc))["arm_coded"]
imm_df <- mod_h1$df.residual
imm_crit <- qt(0.975, imm_df)

# Use the same t-based 95% CIs as the tables (not a normal 1.96 approximation),
# so the plotted intervals match the tabulated ones, including in small bins
comp_tbl <- tibble(
  group = c("Immediate (primary)", "Rescue overall", paste0("Rescue ", bin_tbl$bin)),
  estimate = c(imm_est, arm_est_D, bin_tbl$estimate),
  se = c(imm_se, arm_se_D, bin_tbl$se),
  conf.low = c(imm_est - imm_crit * imm_se, ci_low_D, bin_tbl$conf.low),
  conf.high = c(imm_est + imm_crit * imm_se, ci_high_D, bin_tbl$conf.high)
) %>%
  mutate(
    group = factor(group, levels = rev(group))
  )

p_durability_compare <- ggplot(comp_tbl, aes(x = group, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2, color = treatment_color) +
  geom_point(size = 2.2, color = treatment_color) +
  coord_flip() +
  labs(x = NULL, y = "Adjusted arm effect (95% CI)") +
  theme_minimal()

save_pdf(p_durability_compare, "durability-compare", width_in = 6, height_in = 3)
p_durability_compare
```

## Engagement in the Experimental Arm

We explore whether greater chat engagement within the Experimental arm is associated with higher post-intent or change in intent (post minus pre), adjusting for baseline intention. These are descriptive associations and do not imply causality.

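
For reference, the engagement model fitted in the chunk that follows can be written out explicitly. This is a sketch read off the analysis code rather than a formula stated in the preregistration. The symbols are introduced here for illustration: $i$ indexes Treatment-arm participants with complete pre and post responses, $T_i$ is the number of chat turns, $C_i$ is the number of characters the participant typed, and $n$ is the number of such participants.

$$
\begin{aligned}
\text{chat\_points}_i &= 10\,T_i + 0.5\,C_i \\
\text{post}_i &= \beta_0 + \beta_1\,\text{pre}_i + \beta_2\,\text{chat\_points}_i + \varepsilon_i \\
\text{95\% CI for } \beta_2 &:\quad \hat\beta_2 \pm t_{0.975,\;n-3}\,\widehat{\mathrm{SE}}_{\mathrm{HC3}}(\hat\beta_2)
\end{aligned}
$$

Because one chat turn contributes 10 points to the composite, the implied change in post-intent per additional turn, holding typed characters and baseline intention fixed, is $10\,\hat\beta_2$.
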
```{r engagement-experimental-fit, include=FALSE}
exp_only <- trial_main %>%
  filter(arm_fct == "Treatment", !is.na(post_intent), !is.na(pre_intent)) %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5)

mod_eng <- lm(post_intent ~ pre_intent + chat_points, data = exp_only)
vc_eng <- sandwich::vcovHC(mod_eng, type = "HC3")
co_eng <- lmtest::coeftest(mod_eng, vcov = vc_eng)
b_points <- unname(coef(mod_eng)["chat_points"])
se_points <- sqrt(diag(vc_eng))["chat_points"]
df_e <- mod_eng$df.residual
crit_e <- qt(0.975, df_e)
ci_lo <- b_points - crit_e * se_points
ci_hi <- b_points + crit_e * se_points
sig_txt <- ifelse(ci_lo <= 0 & ci_hi >= 0, "not statistically significant", "statistically significant")

eng_tbl <- tibble(
  term = c("(Intercept)", "pre_intent", "chat_points"),
  estimate = unname(coef(mod_eng)),
  std.error = unname(sqrt(diag(vc_eng))),
  p.value = c(NA_real_, unname(co_eng["pre_intent", 4]), unname(co_eng["chat_points", 4]))
)
```

In the Experimental arm, chat engagement shows a `r ifelse(b_points >= 0, 'positive', 'negative')` association with post-intent that is `r sig_txt`: β = `r sprintf('%.4f', b_points)` per chat-point (95% CI `r sprintf('%.4f', ci_lo)`, `r sprintf('%.4f', ci_hi)`).

```{r engagement-experimental-table, echo=FALSE}
knitr::kable(eng_tbl, tbl_fmt, digits = 4, caption = "Experimental arm: ANCOVA with chat_points (HC3)") %>%
  style_tbl()
```

```{r engagement-delta-plot, echo=FALSE, fig.cap="Experimental arm: chat points vs change in intent (with linear fit)", fig.alt="Scatter of chat engagement versus change in intention within the Treatment arm with a fitted regression line."}
# Scatter of chat points vs change in intent with linear fit --------------
exp_only <- exp_only %>%
  mutate(delta_intent = post_intent - pre_intent, cp_c = chat_points)

ggplot(exp_only, aes(x = cp_c, y = delta_intent)) +
  geom_point(alpha = 0.25, size = 1, color = treatment_color) +
  geom_smooth(method = "lm", formula = y ~ x, color = "#333333", linewidth = 0.8, se = TRUE) +
  labs(x = "Chat points", y = "Change in intent (Post - Pre)") +
  theme_minimal()
```

## Individual Trajectories

```{r trajectories, fig.cap="Individual participant trajectories from Pre to Post by arm", fig.alt="Slopegraph showing each participant's pre and post intention connected by a line, faceted by Control and Treatment; most Treatment lines slope upward."}
traj_points <- analysis_confirm %>%
  transmute(arm_fct, pre = pre_intent, post = post_intent, pid = pid_hash) %>%
  tidyr::pivot_longer(c(pre, post), names_to = "time", values_to = "intent") %>%
  mutate(time = factor(time, levels = c("pre","post"), labels = c("Pre","Post")))

# To avoid overplotting on integer scale
set.seed(123)
traj_points$intent_j <- traj_points$intent + runif(nrow(traj_points), -0.03, 0.03)

ggplot(traj_points, aes(x = time, y = intent_j, group = pid, color = arm_fct)) +
  geom_line(alpha = 0.2) +
  scale_color_manual(values = c("Control" = control_color, "Treatment" = treatment_color)) +
  # Zoom with coord_cartesian() instead of scale limits: scale limits would drop
  # jittered values that fall just outside 1–7 and erase those participants' lines
  scale_y_continuous(breaks = 1:7) +
  coord_cartesian(ylim = c(1, 7)) +
  facet_wrap(~ arm_fct, nrow = 1) +
  labs(x = NULL, y = "Intention (1–7)") +
  theme_minimal() +
  theme(legend.position = "none")
```

## Discussion

A combination of persuasive
informational content and a focused motivational-interviewing-style engagement with an LLM produced a clear and significant increase in vaccination intention among MMR-hesitant U.S. parents, relative to a structure-matched active control that presented non-vaccine child safety information. The increase in intent appears to persist over several days.

### Intent Increase

Among participants with baseline head‑room (pre ≤ 6), the Treatment arm increased by **Δ ≈ `r round(sumA$delta_mean[sumA$arm_fct=='Treatment'],2)` points**, while Control was essentially unchanged (**Δ ≈ `r round(sumA$delta_mean[sumA$arm_fct=='Control'],2)`**). The ANCOVA arm effect (Treatment vs Control) is **β̂ ≈ `r sprintf('%.2f', ci_tbl$estimate)`** with a 95% CI of **`r sprintf('%.2f', ci_tbl$conf.low)`–`r sprintf('%.2f', ci_tbl$conf.high)`**, excluding 0.

Parents in the Treatment arm increased their intention from pre to post by slightly more than a full point on the seven‑point scale. **`r sprintf('%.1f', prop_by_arm$Percent[prop_by_arm$Arm=='Treatment'])`%** of Treatment participants increased their vaccination intention by at least one point vs **`r sprintf('%.1f', prop_by_arm$Percent[prop_by_arm$Arm=='Control'])`%** of Control participants.

### Durability

Using delayed post‑intent collected on Prolific, the adjusted arm effect remains positive (**β̂ ≈ `r sprintf('%.2f', arm_est_D)`**) over several days. Although exploratory and subject to caveats, these results suggest that the effect persists.

### Prior Context

In a prior RCT ([analysis write‑up](https://sjforman.me/mmr-persuasion-analysis.html), [preregistration](https://osf.io/7upk5)), an LLM conversation about MMR did not significantly outperform static CDC‑style materials (β̂ ≈ 0.14; 95% CI −0.11, 0.40), though both arms improved pre→post. Here, the control group was not exposed to vaccine‑related content, so a larger between‑arm contrast was both expected and observed.
The absolute effect is approximately double the effect size observed in either arm in RCT‑1. Possible explanations include:

- an additive effect from combining static content with LLM dialogue
- more persuasive static content, including social norm statements and anticipated regret cues
- improved prompt engineering, including motivational interviewing style
- framing the intervention around an imagined appointment

Disentangling these mechanisms requires further research, but overall the trials suggest that both static content and LLM conversation can raise intention, and that combining them against a non‑MMR control yields a substantial effect.

### Limitations

Outcomes are self‑reported intentions, the durability evidence covers only a short time window and is subject to attrition, and the experiment was conducted online in a U.S.-only sample.

### Implications

A brief, appointment‑framed MMR content review and conversation can shift intention meaningfully relative to a non‑vaccine control, with encouraging signs of durability over several days.

### Further research

Additional pre-clinical work could explore and disentangle the mechanisms of action. A clinical trial assessing whether an intervention of this kind can affect real-world vaccination rates seems clearly warranted.

<details>
<summary>Data & Code Availability</summary>

The Quarto source for this report and a self-contained HTML render will be published on the investigator's website and linked to from the OSF project containing the preregistration. Participant‑level data include potentially identifying Prolific IDs and cannot be shared publicly; they will be provided upon reasonable request under a data‑use agreement.

</details>

<details>
<summary>Provenance</summary>

This document was rendered with R 4.5 and renv‑pinned packages.
Batch analysis files resolved at render time:

```{r provenance, echo=FALSE}
knitr::kable(load_tbl_main, tbl_fmt, caption = "Main RCT‑2 analysis files resolved at render time") %>%
  style_tbl()
```

```{r provenance-meta, echo=FALSE}
# Git short SHA placeholder (set before publishing)
git_sha <- Sys.getenv("GIT_SHORT_SHA", unset = "<set GIT_SHORT_SHA before publish>")
build_time <- format(Sys.time(), "%Y-%m-%d %H:%M %Z")
cat(paste0("Build: ", build_time, "; Git: ", git_sha))
```

</details>

```{r abstract-sanity-check, echo=FALSE, warning=TRUE, message=FALSE}
# Sanity check: Abstract variables vs computed results ------------------------
# Computed values from earlier chunks
calc_n_main <- tryCatch(nrow(analysis_confirm), error = function(e) NA_integer_)
calc_est_main <- tryCatch(as.numeric(ci_tbl$estimate), error = function(e) NA_real_)
calc_ci_main_low <- tryCatch(as.numeric(ci_tbl$conf.low), error = function(e) NA_real_)
calc_ci_main_high <- tryCatch(as.numeric(ci_tbl$conf.high), error = function(e) NA_real_)

calc_n_rescue <- tryCatch(N_D, error = function(e) NA_integer_)
if (is.na(calc_n_rescue)) {
  calc_n_rescue <- tryCatch(nrow(dur), error = function(e) NA_integer_)
}
calc_est_rescue <- tryCatch(as.numeric(arm_est_D), error = function(e) NA_real_)
calc_ci_rescue_low <- tryCatch(as.numeric(ci_low_D), error = function(e) NA_real_)
calc_ci_rescue_high <- tryCatch(as.numeric(ci_high_D), error = function(e) NA_real_)

# Exact equality after rounding to 2 decimals (or integer for N)
fmt2 <- function(x) sprintf("%.2f", x)

if (!is.na(calc_n_main) && as.integer(calc_n_main) != as.integer(abs_n_main)) {
  stop(sprintf("Abstract N (main) = %d, computed = %d", abs_n_main, calc_n_main))
}
if (!is.na(calc_est_main) && !identical(fmt2(calc_est_main), fmt2(abs_est_main))) {
  stop(sprintf("Abstract β (main) = %s, computed = %s", fmt2(abs_est_main), fmt2(calc_est_main)))
}
if (!is.na(calc_ci_main_low) && !identical(fmt2(calc_ci_main_low), fmt2(abs_ci_main_low))) {
  stop(sprintf("Abstract CI low (main) = %s, computed = %s", fmt2(abs_ci_main_low), fmt2(calc_ci_main_low)))
}
if (!is.na(calc_ci_main_high) && !identical(fmt2(calc_ci_main_high), fmt2(abs_ci_main_high))) {
  stop(sprintf("Abstract CI high (main) = %s, computed = %s", fmt2(abs_ci_main_high), fmt2(calc_ci_main_high)))
}
if (!is.na(calc_n_rescue) && as.integer(calc_n_rescue) != as.integer(abs_n_rescue)) {
  stop(sprintf("Abstract N (rescue) = %d, computed = %d", abs_n_rescue, calc_n_rescue))
}
if (!is.na(calc_est_rescue) && !identical(fmt2(calc_est_rescue), fmt2(abs_est_rescue))) {
  stop(sprintf("Abstract β (rescue) = %s, computed = %s", fmt2(abs_est_rescue), fmt2(calc_est_rescue)))
}
if (!is.na(calc_ci_rescue_low) && !identical(fmt2(calc_ci_rescue_low), fmt2(abs_ci_rescue_low))) {
  stop(sprintf("Abstract CI low (rescue) = %s, computed = %s", fmt2(abs_ci_rescue_low), fmt2(calc_ci_rescue_low)))
}
if (!is.na(calc_ci_rescue_high) && !identical(fmt2(calc_ci_rescue_high), fmt2(abs_ci_rescue_high))) {
  stop(sprintf("Abstract CI high (rescue) = %s, computed = %s", fmt2(abs_ci_rescue_high), fmt2(calc_ci_rescue_high)))
}
```

```{r link-targets, echo=FALSE, results='asis'}
# Open external links in a new tab when rendering to HTML
if (knitr::is_html_output()) {
  cat('<script>\n','document.addEventListener("DOMContentLoaded", function() {\n',
      ' var anchors = document.querySelectorAll("a[href^=\\"http\\"]");\n',
      ' anchors.forEach(function(a) {\n',
      ' a.setAttribute("target","_blank");\n',
      ' a.setAttribute("rel","noopener");\n',
      ' });\n',
      '});\n','</script>')
}
```