Can Conversations with LLMs Reduce MMR Vaccine Hesitancy?

Analysis of a Preregistered Two-Arm Online RCT

Author

Scott J. Forman

Published

June 24, 2025

Abstract

We conducted a preregistered online experiment (N = 216 U.S. parents) comparing an AI-driven “Interactive Conversation Tool” (ICT) with standard educational materials (SEM) for reducing measles-mumps-rubella (MMR) vaccine hesitancy. Parents completed (1) a Prolific prescreen, (2) a baseline survey immediately before the intervention, and (3) a post-intervention survey. Primary outcome was post-intervention intention to vaccinate (1–7 Likert), controlling for baseline intention. Both arms produced medium within-subject gains (SEM Δ ≈ +0.4, ICT Δ ≈ +0.54; Cohen’s d ≈ 0.45–0.60) relative to baseline, but ICT was not superior to SEM (β̂ = 0.14, 95 % CI [-0.11, 0.40]). Engagement metrics (time-on-task, chat activity) and preregistered moderators (concerns, ideology, trust, prior behavior) did not alter the null ICT–SEM difference. Results suggest that thoughtfully crafted materials, whether static or conversational, can shift parental MMR intention upward, but an AI chatbot did not offer meaningful incremental benefit under these conditions.

Preregistration Snapshot

OSF preregistration: https://osf.io/7upk5

Methods

Participants & Design

We recruited N = 216 U.S.-based parents via the Prolific online panel. Eligibility required (a) at least one child under 10 and (b) completion of a pre-screening survey that included the MMR-intention item. The experiment used a two-arm between-subjects pre-post design:

  • ICT = Interactive Conversation Tool (AI-driven chatbot)
  • SEM = Standard Educational Materials (static NIH‐style content)

Randomization was implemented via a stratified minimisation routine within three baseline-intention strata (1–2, 3–4, 5–6); participants at ceiling (7) were tracked in a separate stratum.
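One way a stratified minimisation routine of this kind could work is sketched below. It is an illustration only, not the study's allocation code: the function names `intent_stratum()` and `assign_arm()`, the random tie-break, and the fully deterministic assignment to the smaller arm are all assumptions (practical minimisation schemes often add a random element).

```r
# Illustrative sketch only; function names and tie-breaking rule are assumptions.

# Map a baseline intention score (1-7) to its randomisation stratum.
intent_stratum <- function(pre_intent) {
  cut(pre_intent,
      breaks = c(0, 2, 4, 6, 7),
      labels = c("1-2", "3-4", "5-6", "7"))
}

# Assign a new participant to the arm that is currently smaller within
# their stratum; break ties at random. Returns the updated allocation log.
assign_arm <- function(pre_intent, allocations) {
  s <- as.character(intent_stratum(pre_intent))
  counts <- c(
    SEM = sum(allocations$stratum == s & allocations$arm == "SEM"),
    ICT = sum(allocations$stratum == s & allocations$arm == "ICT")
  )
  candidates <- names(counts)[counts == min(counts)]
  arm <- if (length(candidates) > 1) sample(candidates, 1) else candidates
  rbind(allocations,
        data.frame(stratum = s, arm = arm, stringsAsFactors = FALSE))
}

# Usage: allocate a stream of simulated baseline scores.
set.seed(1)
allocations <- data.frame(stratum = character(), arm = character(),
                          stringsAsFactors = FALSE)
for (score in sample(1:7, 60, replace = TRUE)) {
  allocations <- assign_arm(score, allocations)
}
table(allocations$stratum, allocations$arm)
```

Under this rule the two arms can never differ by more than one participant within any stratum, which is the balance property that minimisation is meant to provide.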

The study ran entirely online in a single ~7 min session: pre-survey → intervention → post-survey. Minimum-engagement rules were enforced automatically (≥ 150 s exposure in SEM; ≥ 100 chat-points in ICT).

No blinding of participants was possible due to obvious format differences.

Measures

  1. MMR-vaccination intention – single 7-point Likert item (1 = definitely would not, 7 = definitely would). Asked three times:
    • Pre-screen (several days before the RCT)
    • Pre-intervention intention (immediately before the intervention)
    • Post-intervention intention (immediately after the intervention)
  2. Engagement metrics
    • time_seconds – total seconds between intervention load and post-survey start (all participants).
    • chat_turns, chat_user_chars – interaction counts in ICT. Combined into preregistered chat_points = chat_turns × 10 + chat_user_chars × 0.5.
  3. Covariates / potential moderators – pre-survey captured demographics, political ideology, trust in providers, prior vaccination behavior, and concern check-boxes (side-effects, effectiveness, etc.).
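As a small worked illustration of the chat_points formula and the ICT minimum-engagement rule (≥ 100 chat-points), consider the sketch below. The three participants and their values are invented for illustration; only the formula and the threshold come from the text above.

```r
# Preregistered engagement score: 10 points per chat turn plus
# 0.5 points per character typed by the participant.
chat_points <- function(chat_turns, chat_user_chars) {
  chat_turns * 10 + chat_user_chars * 0.5
}

# Hypothetical participants (values invented for illustration).
example <- data.frame(
  chat_turns      = c(3, 6, 10),
  chat_user_chars = c(80, 200, 450)
)
example$chat_points   <- chat_points(example$chat_turns, example$chat_user_chars)
example$meets_minimum <- example$chat_points >= 100  # ICT minimum-engagement rule
example
```

The first hypothetical participant scores 70 points and would fall short of the minimum, while the other two (160 and 325 points) would pass.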

Analytic Approach

Analyses follow the OSF preregistration (v 2025-05-19).

  • Confirmatory set = participants with pre_intent ≤ 6 (i.e., not already at ceiling). Hypotheses H1, H2a, H2b are tested here.
  • Exploratory set = full sample with additional interaction and moderation models.

Primary model for H1 is an ANCOVA:

\[ \operatorname{post\_intent}_{i} = \beta_{0} + \beta_{1}\,\operatorname{condition}_{i} + \beta_{2}\,\operatorname{pre\_intent}_{i} + \varepsilon_{i} \]

where \(\beta_{1}\) estimates the ICT–SEM difference after adjusting for baseline intention. We report OLS point estimates along with HC3 robust standard errors (Long & Ervin 2000) to guard against heteroscedasticity. Effect sizes include partial \(\eta^{2}\) and Cohen’s d.
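The `hc3()` helper called in the Results section is not defined in the text shown here. The sketch below illustrates what an HC3 correction computes, using base R only. The function name `hc3_se()` and the simulated data are assumptions for illustration and do not reproduce the study's data or helper.

```r
# Sketch of HC3 robust standard errors in base R (illustrative; the report's
# own hc3() helper is not shown and may be implemented differently).
hc3_se <- function(mod) {
  X <- model.matrix(mod)
  e <- residuals(mod)
  h <- hatvalues(mod)          # leverage of each observation
  w <- (e / (1 - h))^2         # HC3 weights: squared leverage-inflated residuals
  XtX_inv <- solve(crossprod(X))
  vc <- XtX_inv %*% crossprod(X * w, X) %*% XtX_inv  # sandwich estimator
  sqrt(diag(vc))
}

# Simulated data with the same variable names as the ANCOVA (values invented).
set.seed(42)
n <- 200
sim <- data.frame(
  condition_coded = rep(0:1, each = n / 2),
  pre_intent      = sample(1:6, n, replace = TRUE)
)
sim$post_intent <- 0.5 + 0.1 * sim$condition_coded + 1.0 * sim$pre_intent +
  rnorm(n, sd = 0.9)

mod <- lm(post_intent ~ condition_coded + pre_intent, data = sim)

# Compare conventional and HC3 standard errors side by side.
cbind(estimate = coef(mod),
      ols_se   = sqrt(diag(vcov(mod))),
      hc3_se   = hc3_se(mod))
```

In practice the same quantity is available from `sandwich::vcovHC(mod, type = "HC3")`, typically combined with `lmtest::coeftest()`. With homoscedastic errors, as simulated here, the OLS and HC3 standard errors should be close to each other.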

Missingness is minimal (<2%); per preregistration we use list-wise deletion in each model.

Software Environment

We used R 4.5.0 with tidyverse 2.0.0 (versions frozen via renv).

Participant Characteristics and Survey Descriptives

We first describe the sample’s background characteristics and the distribution of key survey responses collected from the survey administered immediately before the intervention.

Bar charts show the distributions of age, sex, income, education, political ideology, and prior vaccination behavior; a histogram shows provider-trust ratings. Further bar charts show the most common vaccine concerns and the primary sources of vaccine information.

Results

Confirmatory Analyses

H1 – Effect of ICT vs SEM on Post-Intervention Intention

# Create analysis set (pre_intent < 7)
analysis_set <- trial_merged %>%
  mutate(
    pre_intent  = as.numeric(pre_intent),
    post_intent = as.numeric(post_intent),
    condition_coded = if_else(assigned_condition == "chat", 1, 0),
    condition_fct   = factor(assigned_condition,
                              levels = c("standard", "chat"),
                              labels = c("SEM", "ICT"))
  ) %>%
  filter(pre_intent < 7) %>%
  drop_na(post_intent, pre_intent)

cat(sprintf("Sample size for H1 (pre_intent < 7): %d\n", nrow(analysis_set)))
Sample size for H1 (pre_intent < 7): 197
# ANCOVA
mod_h1 <- lm(post_intent ~ condition_coded + pre_intent, data = analysis_set)

cat("\nOLS summary:\n")

OLS summary:
print(summary(mod_h1))

Call:
lm(formula = post_intent ~ condition_coded + pre_intent, data = analysis_set)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4655 -0.5852 -0.4476  0.4986  3.3431 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.55520    0.18755   2.960  0.00346 ** 
condition_coded  0.13759    0.12520   1.099  0.27317    
pre_intent       0.98206    0.03689  26.620  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8786 on 194 degrees of freedom
Multiple R-squared:  0.7854,    Adjusted R-squared:  0.7832 
F-statistic: 354.9 on 2 and 194 DF,  p-value: < 2.2e-16
cat("\nHC3 robust SEs:\n")

HC3 robust SEs:
print(hc3(mod_h1))

t test of coefficients:

                Estimate Std. Error t value  Pr(>|t|)    
(Intercept)     0.555204   0.185525  2.9926  0.003126 ** 
condition_coded 0.137586   0.126115  1.0910  0.276645    
pre_intent      0.982060   0.034818 28.2057 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Effect size
if (requireNamespace("effectsize", quietly = TRUE)) {
  library(effectsize)
  cat("\nPartial Eta^2:\n")
  print(effectsize::eta_squared(mod_h1, partial = TRUE))
}

Partial Eta^2:
# Effect Size for ANOVA (Type I)

Parameter       | Eta2 (partial) |       95% CI
-----------------------------------------------
condition_coded |       6.63e-03 | [0.00, 1.00]
pre_intent      |           0.79 | [0.75, 1.00]

- One-sided CIs: upper bound fixed at [1.00].

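Interpretation: a positive \(\beta_{1}\) indicates higher post-intervention intention in ICT than in SEM after adjusting for baseline intention. Here \(\hat{\beta}_{1} = 0.14\) (HC3 SE = 0.13, p = .28; partial \(\eta^{2} \approx 0.007\)), so the ICT–SEM difference is small and not statistically significant.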

H2a – Engagement (Time Spent)

# Prepare data: ensure time_seconds numeric and non-missing
analysis_set_h2a <- analysis_set %>%
  mutate(time_seconds = as.numeric(time_seconds)) %>%
  drop_na(time_seconds)

cat(sprintf("Sample size for H2a: %d\n", nrow(analysis_set_h2a)))
Sample size for H2a: 197
# Model: post_intent ~ condition + pre_intent + time_seconds
mod_h2a <- lm(post_intent ~ condition_coded + pre_intent + time_seconds,
              data = analysis_set_h2a)

cat("\nHC3 robust SEs (H2a):\n")

HC3 robust SEs (H2a):
print(hc3(mod_h2a))

t test of coefficients:

                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.50795429 0.20675690  2.4568   0.0149 *  
condition_coded 0.11102877 0.13550058  0.8194   0.4136    
pre_intent      0.98246226 0.03496945 28.0949   <2e-16 ***
time_seconds    0.00016159 0.00034697  0.4657   0.6419    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Partial Eta² for time_seconds
if (requireNamespace("effectsize", quietly = TRUE)) {
  library(effectsize)
  cat("\nPartial Eta^2 for time_seconds:\n")
  print(effectsize::eta_squared(mod_h2a, partial = TRUE) %>%
          dplyr::filter(Parameter == "time_seconds"))
}

Partial Eta^2 for time_seconds:
# Effect Size for ANOVA (Type I)

Parameter    | Eta2 (partial) |       95% CI
--------------------------------------------
time_seconds |       1.67e-03 | [0.00, 1.00]

- One-sided CIs: upper bound fixed at [1.00].

Exploratory interaction: does the slope differ by arm?

mod_h2a_int <- lm(post_intent ~ condition_coded * time_seconds + pre_intent,
                  data = analysis_set_h2a)

cat("\nHC3 robust SEs with interaction (H2a exploratory):\n")

HC3 robust SEs with interaction (H2a exploratory):
print(hc3(mod_h2a_int))

t test of coefficients:

                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  5.1173e-01 2.5344e-01  2.0191  0.04486 *  
condition_coded              1.0499e-01 2.7502e-01  0.3817  0.70308    
time_seconds                 1.4834e-04 5.9811e-04  0.2480  0.80438    
pre_intent                   9.8245e-01 3.5257e-02 27.8657  < 2e-16 ***
condition_coded:time_seconds 1.8443e-05 7.5747e-04  0.0243  0.98060    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

H2b – Chat Engagement (ICT Arm)

# ICT-only subset and chat_points calculation
ict_set <- analysis_set %>%
  filter(condition_coded == 1) %>%
  mutate(chat_points = chat_turns * 10 + chat_user_chars * 0.5) %>%
  drop_na(chat_points)

cat(sprintf("Sample size for H2b (ICT arm): %d\n", nrow(ict_set)))
Sample size for H2b (ICT arm): 98
mod_h2b <- lm(post_intent ~ pre_intent + chat_points, data = ict_set)

cat("\nHC3 robust SEs (H2b):\n")

HC3 robust SEs (H2b):
print(hc3(mod_h2b))

t test of coefficients:

               Estimate  Std. Error t value Pr(>|t|)    
(Intercept)  0.77302235  0.36115090  2.1404  0.03488 *  
pre_intent   1.00546500  0.05263928 19.1010  < 2e-16 ***
chat_points -0.00085751  0.00069441 -1.2349  0.21992    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Effect size for chat_points
if (requireNamespace("effectsize", quietly = TRUE)) {
  library(effectsize)
  cat("\nPartial Eta^2 for chat_points:\n")
  print(effectsize::eta_squared(mod_h2b, partial = TRUE) %>%
          dplyr::filter(Parameter == "chat_points"))
}

Partial Eta^2 for chat_points:
# Effect Size for ANOVA (Type I)

Parameter   | Eta2 (partial) |       95% CI
-------------------------------------------
chat_points |           0.01 | [0.00, 1.00]

- One-sided CIs: upper bound fixed at [1.00].

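Interpretation: a significant positive coefficient on time_seconds (H2a) or chat_points (H2b) would indicate that greater engagement predicts higher post-intention after adjusting for baseline intention. Neither coefficient is significant (time_seconds: b = 0.00016, p = .64; chat_points: b = −0.00086, p = .22), and the exploratory time-by-arm interaction is essentially zero (p = .98).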

Pre-Registered Exploratory Analyses

(Moderation, logistic regression, and engagement plots.)

H3 – Does the ICT effect vary by baseline characteristics?

moderators <- c(
  paste0("concern_", concern_ids),   # (a) concern dummies
  "political_ideology",             # (b)
  "trust_provider",                 # (c)
  "prev_vaccination"                # (d)
)

# Container for results
oh3_results <- list()

for (m in moderators) {
  if (!m %in% names(processed)) next
  dat <- processed %>% drop_na(post_intent, pre_intent, !!sym(m))
  if (n_distinct(dat[[m]]) < 2) next   # needs variance

  formula_h3 <- as.formula(paste0("post_intent ~ condition_coded * ", m, " + pre_intent"))
  mod <- lm(formula_h3, data = dat)
  res <- hc3(mod)
  cat("\n--- Moderator:", m, "---\n")
  print(res)
  oh3_results[[m]] <- res
}

--- Moderator: concern_side_effects ---

t test of coefficients:

                                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)                           0.512065   0.299534  1.7095  0.08882 .  
condition_coded                       0.191838   0.317220  0.6047  0.54600    
concern_side_effects                  0.224600   0.219901  1.0214  0.30825    
pre_intent                            0.937110   0.035825 26.1584  < 2e-16 ***
condition_coded:concern_side_effects -0.078581   0.345548 -0.2274  0.82032    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_effectiveness ---

t test of coefficients:

                                       Estimate Std. Error t value  Pr(>|t|)
(Intercept)                            0.574138   0.194080  2.9583  0.003447
condition_coded                        0.156414   0.137105  1.1408  0.255233
concern_effectiveness                  0.314832   0.189406  1.6622  0.097956
pre_intent                             0.943293   0.034154 27.6190 < 2.2e-16
condition_coded:concern_effectiveness -0.112504   0.278748 -0.4036  0.686912
                                         
(Intercept)                           ** 
condition_coded                          
concern_effectiveness                 .  
pre_intent                            ***
condition_coded:concern_effectiveness    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_ingredients ---

t test of coefficients:

                                     Estimate Std. Error t value  Pr(>|t|)    
(Intercept)                          0.732906   0.259445  2.8249  0.005183 ** 
condition_coded                      0.211583   0.186355  1.1354  0.257506    
concern_ingredients                  0.035914   0.176128  0.2039  0.838621    
pre_intent                           0.925853   0.039536 23.4181 < 2.2e-16 ***
condition_coded:concern_ingredients -0.155511   0.246109 -0.6319  0.528150    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_necessity ---

t test of coefficients:

                                   Estimate Std. Error t value  Pr(>|t|)    
(Intercept)                        0.946430   0.227371  4.1625 4.585e-05 ***
condition_coded                    0.198572   0.130795  1.5182    0.1305    
concern_necessity                 -0.204352   0.225187 -0.9075    0.3652    
pre_intent                         0.893260   0.037936 23.5466 < 2.2e-16 ***
condition_coded:concern_necessity -0.339799   0.330952 -1.0267    0.3057    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_too_many ---

t test of coefficients:

                                  Estimate Std. Error t value  Pr(>|t|)    
(Intercept)                       0.841035   0.211499  3.9765 9.606e-05 ***
condition_coded                   0.227420   0.176264  1.2902    0.1984    
concern_too_many                 -0.116598   0.167253 -0.6971    0.4865    
pre_intent                        0.918744   0.036131 25.4284 < 2.2e-16 ***
condition_coded:concern_too_many -0.215766   0.249656 -0.8643    0.3884    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_trust_pharma ---

t test of coefficients:

                                      Estimate Std. Error t value  Pr(>|t|)    
(Intercept)                           0.898118   0.257908  3.4823 0.0006042 ***
condition_coded                       0.042083   0.174413  0.2413 0.8095718    
concern_trust_pharma                 -0.236804   0.168026 -1.4093 0.1602087    
pre_intent                            0.917557   0.038983 23.5372 < 2.2e-16 ***
condition_coded:concern_trust_pharma  0.193644   0.245960  0.7873 0.4319907    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: concern_trust_doctors ---

t test of coefficients:

                                       Estimate Std. Error t value  Pr(>|t|)
(Intercept)                            0.897419   0.236408  3.7961 0.0001921
condition_coded                        0.076497   0.136268  0.5614 0.5751406
concern_trust_doctors                 -0.364605   0.168643 -2.1620 0.0317438
pre_intent                             0.910375   0.039705 22.9283 < 2.2e-16
condition_coded:concern_trust_doctors  0.237396   0.333045  0.7128 0.4767548
                                         
(Intercept)                           ***
condition_coded                          
concern_trust_doctors                 *  
pre_intent                            ***
condition_coded:concern_trust_doctors    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: political_ideology ---

t test of coefficients:

                                             Estimate Std. Error t value
(Intercept)                                 0.7747075  0.2031113  3.8142
condition_coded                            -0.0052412  0.1756150 -0.0298
political_ideologyliberal                   0.0189580  0.2753308  0.0689
political_ideologymoderate                 -0.0608048  0.1676415 -0.3627
pre_intent                                  0.9237045  0.0376523 24.5325
condition_coded:political_ideologyliberal   0.2908032  0.3561183  0.8166
condition_coded:political_ideologymoderate  0.2363614  0.2607482  0.9065
                                            Pr(>|t|)    
(Intercept)                                0.0001798 ***
condition_coded                            0.9762193    
political_ideologyliberal                  0.9451706    
political_ideologymoderate                 0.7171900    
pre_intent                                 < 2.2e-16 ***
condition_coded:political_ideologyliberal  0.4150917    
condition_coded:political_ideologymoderate 0.3657290    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: trust_provider ---

t test of coefficients:

                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    0.548680   0.229746  2.3882  0.01781 *  
condition_coded                0.043650   0.363833  0.1200  0.90462    
trust_provider                 0.133241   0.062752  2.1233  0.03489 *  
pre_intent                     0.846365   0.056894 14.8762  < 2e-16 ***
condition_coded:trust_provider 0.011351   0.075034  0.1513  0.87990    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


--- Moderator: prev_vaccination ---

t test of coefficients:

                                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)                          0.624969   0.277114  2.2553  0.02514 *  
condition_coded                      0.133468   0.175218  0.7617  0.44707    
prev_vaccinationsome                 0.079528   0.188149  0.4227  0.67295    
pre_intent                           0.942732   0.041297 22.8280  < 2e-16 ***
condition_coded:prev_vaccinationsome 0.006858   0.244933  0.0280  0.97769    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

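Interpretation: an interaction term (condition_coded:moderator) with p < .05 would merit follow-up simple-slopes analysis or visual exploration. None of the interactions above reaches that threshold (smallest p = .31, for concern_necessity), so we find no evidence that the ICT effect varies with these baseline characteristics.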


Additional Exploratory Analyses

Change in Vaccination Intention

We visualise how mean MMR-vaccination intention moved across the three measurement points and test the within-arm gain from Pre-survey → Post-survey (the primary intervention window).

# prepare long & delta datasets ---------------------------------------
traj_dat <- trial_merged %>%
  mutate(
    prescreen = as.numeric(prescreen_intent),
    pre       = as.numeric(pre_intent),
    post      = as.numeric(post_intent)
  ) %>%
  select(pid_hash, assigned_condition, prescreen, pre, post) %>%
  drop_na(prescreen, pre, post)

traj_long <- traj_dat %>%
  pivot_longer(c(prescreen, pre, post),
               names_to = "time", values_to = "intent") %>%
  mutate(time = factor(time,
                       levels = c("prescreen", "pre", "post"),
                       labels = c("Prescreen", "Pre", "Post")),
         condition = factor(assigned_condition,
                            levels = c("standard", "chat"),
                            labels = c("SEM", "ICT")))

traj_delta <- traj_dat %>%
  mutate(delta_pre_post = post - pre,
         condition = factor(assigned_condition,
                            levels = c("standard", "chat"),
                            labels = c("SEM", "ICT")))
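
Figure 1: Mean MMR intention (1–7) across measurement points; whiskers = 95% CI.

Within-arm improvement from Pre- to Post-survey (paired t-test).
Arm                | N   | Δ (Post–Pre) | SD    | d     | t     | p
----------------------------------------------------------------------
SEM                | 110 |        0.400 | 0.880 | 0.455 | 4.768 | <0.001
ICT                | 106 |        0.538 | 0.907 | 0.593 | 6.105 | <0.001
Combined (pre ≤ 6) | 197 |        0.543 | 0.877 | 0.619 | 8.689 | <0.001
Note: the Combined row pools both arms within the confirmatory set (pre_intent ≤ 6, N = 197); the SEM and ICT rows use the full sample.

Figure 2: Change in MMR intention (Post – Pre) by study arm; whiskers = 95% CI around the mean change. Dashed line marks no change.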

Key take-aways
• Both arms registered medium within‐subject gains (SEM Δ ≈ +0.4, ICT Δ ≈ +0.54; d ≈ 0.45–0.60).

• The gain was statistically significant inside each arm (p < .001).

• Consistent with the ANCOVA above, the between-arm difference in gains (≈ 0.14 points) is small and not statistically significant.
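
The effect sizes and test statistics in the within-arm table can be approximately recovered from its rounded summary values, since for a paired design d = mean change / SD of change and t = d × √N. The sketch below does this for the two arms; small discrepancies in the third decimal are expected because the inputs are rounded.

```r
# Recompute Cohen's d and the paired t statistic from the rounded
# summary values reported in the within-arm table.
gains <- data.frame(
  arm        = c("SEM", "ICT"),
  n          = c(110, 106),
  mean_delta = c(0.400, 0.538),
  sd_delta   = c(0.880, 0.907)
)
gains$d <- gains$mean_delta / gains$sd_delta        # standardized mean change
gains$t <- gains$d * sqrt(gains$n)                  # paired t statistic
gains$p <- 2 * pt(-abs(gains$t), df = gains$n - 1)  # two-sided p-value
gains
```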

Engagement Within the ICT Arm

We examine whether greater chatbot interaction (measured by chat_points = chat_turns × 10 + chat_user_chars × 0.5) is associated with higher post-intervention intention.

Chat engagement metrics in ICT arm
N   | Median Chat Points | % engagement ≥ 100
---------------------------------------------
106 |              176.5 |               100%

Distribution of chat engagement in ICT arm; dashed line = preregistered minimum (100 chat-points)

t test of coefficients:

               Estimate  Std. Error t value Pr(>|t|)    
(Intercept)  0.89901948  0.36905355  2.4360  0.01657 *  
pre          0.94935430  0.05420846 17.5130  < 2e-16 ***
chat_points -0.00058530  0.00072049 -0.8124  0.41845    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

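Interpretation
A positive coefficient for chat_points would indicate that, within the chatbot arm, more engagement predicts higher post-intention after accounting for baseline intention. The estimate here is slightly negative and not significant (b = −0.0006, p = .42), so we find no such association.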

Overall Change Across All Participants

Collapsing across study arms clarifies the two-step rise in mean intention: first from Prescreen → Pre (before any intervention) and then from Pre → Post (during the intervention).

Mean intention change across both arms.
N   | Δ Prescreen→Pre | Δ Pre→Post | Total Δ
--------------------------------------------
216 |             0.2 |       0.47 |    0.67

The average participant gained 0.2 points before the intervention and 0.47 additional points during the intervention, yielding a total increase of 0.67 points in vaccination intention.
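
Using the rounded values from the table, the decomposition and the relative size of the two steps work out as follows:

\[ \Delta_{\text{total}} = \Delta_{\text{prescreen} \to \text{pre}} + \Delta_{\text{pre} \to \text{post}} = 0.20 + 0.47 = 0.67 \]

\[ \frac{\Delta_{\text{pre} \to \text{post}}}{\Delta_{\text{prescreen} \to \text{pre}}} = \frac{0.47}{0.20} \approx 2.35, \qquad \frac{\Delta_{\text{pre} \to \text{post}}}{\Delta_{\text{total}}} = \frac{0.47}{0.67} \approx 0.70 \]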

Trust in Providers

We asked about vaccine trust in two forms: a continuous 1–7 rating (“How much do you trust healthcare providers for vaccine information?”, stored as trust_provider) and a binary checkbox (“I don’t trust medical professionals to have my best interests at heart.” → concern_trust_doctors). Both showed small but statistically significant main effects in the moderation models above (p ≈ .03 each). Figure 3 isolates the incremental effect of provider trust after holding baseline intention constant. Each one-point rise on the scale adds roughly 0.13 post-intention points (95 % CI ≈ 0.01–0.25). The partial-residual display removes the confounding influence of baseline intention, revealing a modest, approximately linear association.

Figure 3: Incremental effect of provider trust after controlling for baseline intention (partial-residual plot).

Behavioral Proxy: Vaccines.gov Click-through

We embedded a behavioral proxy link in the post-survey that directed parents to the CDC’s vaccines.gov page. Clicking this link indicates a concrete step toward seeking vaccination information. Below we compare the proportion of participants in each arm who clicked.

Click‐through to vaccines.gov by study arm.
Arm | N   | # Clicked | % Clicked
---------------------------------
SEM | 110 |         7 |      6.4%
ICT | 106 |        10 |      9.4%

Chi-squared test p-value: 0.559

Interpretation: a higher click-through rate suggests stronger behavioral engagement. ICT participants clicked somewhat more often (9.4% vs 6.4%), but the difference is not statistically significant (p = .56).
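
The click-through comparison can be rebuilt from the reported counts alone. The sketch below assumes a Pearson chi-squared test with Yates' continuity correction (R's default for 2 × 2 tables); under that assumption the p-value should come out close to the reported 0.559. The object names are illustrative.

```r
# Rebuild the 2 x 2 click-through table from the reported counts.
clicks <- matrix(
  c(7,  110 - 7,    # SEM: clicked, did not click
    10, 106 - 10),  # ICT: clicked, did not click
  nrow = 2, byrow = TRUE,
  dimnames = list(arm = c("SEM", "ICT"),
                  clicked = c("yes", "no"))
)

# Row percentages: share of each arm that clicked.
round(100 * prop.table(clicks, margin = 1), 1)

# Pearson chi-squared test; R applies Yates' continuity correction
# to 2 x 2 tables by default.
res <- chisq.test(clicks)
res$p.value
```

With only 17 clicks in total, a test of this kind has little power, so the non-significant result should not be read as evidence that the two arms are equivalent.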

Discussion

In short: exposure to either resource significantly increased stated parental intention to vaccinate, but the chatbot did not meaningfully outperform the static material.

  1. Immediate gain (confirmatory sample). Among all participants with baseline head-room (pre-survey ≤ 6), intention increased by Δ ≈ 0.54 points (d ≈ 0.62), or about 9 % of the 1–7 scale range.

  2. Trajectory across the whole sample. Average intention rose by Δ ≈ 0.2 points between the prescreen and the immediate pre-survey (a “mere-measurement” effect). It then climbed a further Δ ≈ 0.47 points during the intervention (pre → post). The intervention-driven bump is about 2.3× the earlier drift and accounts for roughly 70 % of the total Δ ≈ 0.67-point gain from prescreen to post.

  3. Arm comparison. After adjusting for baseline intention, the ICT–SEM gap was not significant (β̂ = 0.14, p = .28; partial η² ≈ 0.007).

  4. Provider trust. Each one-point rise on the 1–7 trust scale predicted an extra ≈ 0.13 post-intention points (See Figure 3).

  5. Behavioral proxy. ICT participants were slightly more likely to click through to vaccines.gov, but not to a statistically meaningful degree (SEM 6.4 %, ICT 9.4 %).

  6. Engagement metrics. Neither time-on-task (both arms) nor chat activity (ICT) predicted additional gains, nor did any subgroup show a hidden ICT advantage.

  7. Limitations. Findings rely on self-report intention, face ceiling effects (~10 % at 7), capture only short-term change, and derive from a U.S. Prolific sample.

  8. Implications. A short chatbot exchange can increase parental vaccination intent at least as much as well-crafted static content, but not significantly more so. Chat interfaces may aid usability or reach but, in this setting, did not enhance persuasiveness.

Possible avenues for future work include testing:

  • Refined conversational designs, including visual and narrative elements
  • Integration with practical clinical workflows
  • Impact on actual vaccine uptake

Data & Code Availability

This Quarto source file and the rendered self-contained HTML report are openly available on the Open Science Framework. The raw participant-level data include potentially identifying Prolific IDs and therefore cannot be shared publicly. They will be provided to qualified researchers upon reasonable request, subject to a data-use agreement.