How reactiphi works
This page is the technical reader's view: peer-reviewed methods, formulas, and citations. Everywhere else in the app, we use plain-English equivalents so non-marketers can use the tool without learning new vocabulary.
Glossary: what we call things vs what they're technically called
Same concepts, two vocabularies. We say the plain version everywhere except this page.
| In the app | Technically known as |
|---|---|
| Audience score | PIDES (PR/Marketing) · SCREEN (Studio) |
| Simulated audience | Personas |
| Version (of an ad) | Stimulus |
| Alternative version | Distractor |
| Emotional driver | Lever / expected lever |
| Balanced audience mix | Stratified Latin hypercube sampling |
| US Census audience data | US ACS marginals |
| Personality mix | Big Five (OCEAN) |
| Audience reliability check | Sycophancy adversarial battery |
| Audience balance check | WEIRD-bias audit (TVD) |
| Audience segment | Cohort |
| Case study | Validation backtest |
| Use of proven persuasion moves | Cialdini density |
| What the audience felt | Plutchik-8 emotion distribution |
| How fresh it felt | Predictability (SCREEN) |
| Did the audience care about them? | Character resonance (SCREEN) |
| Right-audience match | Demographic fit (SCREEN) |
The pipeline
Each stage is persisted end-to-end; runs survive restarts and can be re-opened from History.
- 1. Demographic spec
Free-text label + optional filters across 8 axes (age, income, gender, education, region, urbanicity, household, political lean).
- 2. Persona generation
Stratified Latin hypercube sampling from US ACS marginals; Big Five personality priors from Open Psychometrics.
- 3. Agent simulation
Each persona reads each stimulus with a JSON-structured system prompt and an explicit anti-sycophancy instruction.
- 4. Scoring
An LLM judge scores every response on the PIDES rubric (nine dimensions) or the SCREEN rubric (seven per-viewer dimensions), both psychologically grounded.
- 5. Brief
Themes, top phrases by lift, cohort deltas, and a client-ready executive brief in PDF / PPTX / CSV.
Persona engine
Each axis is sampled from its own independent marginal; joint distributions (copulas) are on the v0.2 roadmap.
Sampling
A DemographicSpec defines optional filters per axis. reactiphi draws n stratified random coordinates from a Latin hypercube on [0,1]^k, then maps each coordinate to a bucket via the inverse CDF of the (filtered and renormalized) marginal for that axis. Because a Latin hypercube places exactly one point in each of n equal-width strata per axis, each bucket receives coverage proportional to its weight (to within about one sample), without the clumping of uniform random draws.
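The sampling step can be sketched in a few lines of standard-library Python. This is a minimal illustration, not reactiphi's implementation: the function names (`filter_marginal`, `latin_hypercube`, `inverse_cdf`, `sample_personas`) are hypothetical, and the bucket weights are made up for the example rather than taken from ACS.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def filter_marginal(marginal, allowed=None):
    """Drop buckets outside the filter and renormalize the rest to sum to 1."""
    kept = {b: w for b, w in marginal.items() if allowed is None or b in allowed}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("filter removed every bucket")
    return {b: w / total for b, w in kept.items()}


def latin_hypercube(n, k, rng):
    """Return n points in [0, 1)^k with exactly one point per 1/n stratum on each axis."""
    columns = []
    for _ in range(k):
        column = [(i + rng.random()) / n for i in range(n)]  # one draw per stratum
        rng.shuffle(column)  # pair strata randomly across axes
        columns.append(column)
    return list(zip(*columns))


def inverse_cdf(u, marginal):
    """Map u in [0, 1) to the bucket whose cumulative-weight interval contains u."""
    buckets = list(marginal)
    cdf = list(accumulate(marginal.values()))
    return buckets[min(bisect_right(cdf, u), len(buckets) - 1)]


def sample_personas(n, marginals, filters=None, seed=0):
    """Draw n demographic profiles; `marginals` maps axis -> {bucket: weight}."""
    rng = random.Random(seed)
    filters = filters or {}
    axes = {a: filter_marginal(m, filters.get(a)) for a, m in marginals.items()}
    points = latin_hypercube(n, len(axes), rng)
    return [
        {axis: inverse_cdf(u, axes[axis]) for axis, u in zip(axes, point)}
        for point in points
    ]


# Illustrative weights only; these are NOT real ACS figures.
MARGINALS = {
    "urbanicity": {"urban": 0.5, "suburban": 0.25, "rural": 0.25},
    "age": {"18-34": 0.3, "35-54": 0.3, "55+": 0.4},
}
personas = sample_personas(8, MARGINALS, filters={"age": {"18-34", "35-54"}})
```

With 8 draws and these weights, the stratification gives urban, suburban, and rural exactly 4, 2, and 2 personas, whereas plain uniform draws would only match those shares on average.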
Personality
Each persona receives a Big Five vector (OCEAN) drawn from trait-wise truncated normal priors (μ, σ from Open Psychometrics adult sample), a Schwartz-value ranking, and a Jobs-to-be-Done specification. The full persona is serialized to JSON and delivered as the agent's system prompt.
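As a rough sketch of the personality step, the snippet below draws each trait from a truncated normal by rejection sampling and serializes the persona to JSON. Everything specific here is an assumption made for illustration: the 1 to 5 trait scale, the placeholder (μ, σ) values (not the Open Psychometrics figures), the JSON field names, and the helper names.

```python
import json
import random

TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")

# Placeholder (mu, sigma) per trait on an assumed 1-5 scale;
# these are NOT the Open Psychometrics values.
PLACEHOLDER_PRIORS = {trait: (3.0, 0.7) for trait in TRAITS}


def truncated_normal(mu, sigma, lo, hi, rng):
    """Rejection-sample normal(mu, sigma) restricted to [lo, hi]."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x


def draw_big_five(priors, rng, lo=1.0, hi=5.0):
    """Draw each trait independently from its own truncated normal prior."""
    return {t: round(truncated_normal(*priors[t], lo, hi, rng), 2) for t in TRAITS}


def build_system_prompt(demographics, big_five, values_ranking, jtbd):
    """Serialize the full persona to JSON (field names are hypothetical)."""
    persona = {
        "demographics": demographics,
        "big_five": big_five,
        "schwartz_values_ranked": values_ranking,
        "jobs_to_be_done": jtbd,
    }
    return json.dumps(persona, indent=2)


rng = random.Random(1)
big_five = draw_big_five(PLACEHOLDER_PRIORS, rng)
prompt = build_system_prompt(
    {"age": "35-54", "urbanicity": "suburban"},
    big_five,
    ["security", "benevolence", "achievement"],
    "Find a weeknight dinner the whole family will eat.",
)
```

Rejection sampling is chosen here only because it needs no third-party library; it is efficient as long as the truncation bounds sit a couple of standard deviations from the mean.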
Political lean (opt-in)
For copy or scripts where partisanship matters, you can sample personas across a six-point lean scale (progressive · liberal · moderate · conservative · libertarian · apolitical) based on Pew political-typology marginals. When you don't opt in, lean is left unspecified and never appears in the persona prompt, so the engine never injects a partisan frame by accident. Per-lean guidance is kept mild to inform reactions without producing caricatures.
PIDES: how we score marketing copy
Nine dimensions: eight additive, plus one multiplicative modifier. Grounded in peer-reviewed instruments.
| Dimension | Measures | Weight | Source |
|---|---|---|---|
| Arousal | Emotional intensity | 25% | Mehrabian & Russell (1974); AdSAM |
| Valence | Affect polarity (negative to positive) | 20% | Plutchik (1980); Osgood et al. (1957) |
| Cialdini density | # persuasion principles triggered | 20% | Cialdini (2009) |
| Behavior intent | Stated/implied intent to act | 15% | Lavidge & Steiner (1961) |
| Personal relevance | Values + jobs-to-be-done alignment | 10% | Schwartz (2012); JTBD |
| Personality fit | Big Five × message-frame match | 5% | Haugtvedt et al. (1992) |
| Social proof signal | Peer / expert / majority citation | 3% | Cialdini (2009) |
| Elaboration depth | Cognitive processing depth | 2% | Cacioppo & Petty (1982) |
| Congruence | Emotional fit to message framing | ×[0.5 to 1.0] | Ortony, Clore & Collins (1988) |
Weights are theory-driven; emotion (arousal + valence) receives 45% of the weight because it is the strongest predictor of ad recall and attitude shift in the peer-reviewed creative-effectiveness literature. PIDES deliberately excludes the "triune brain" / limbic primal-instinct framing, a discredited neuroscientific model (Cesario et al., 2020).
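One way to read the table is as a weighted sum of the eight additive dimensions, scaled by the congruence modifier. The sketch below uses the published weights, but it assumes each dimension has already been normalized to [0, 1] and that the final score is reported on a 0 to 100 scale; the dictionary keys and the `pides_score` function are hypothetical names.

```python
# Weights copied from the table above (as fractions of 1).
PIDES_WEIGHTS = {
    "arousal": 0.25,
    "valence": 0.20,
    "cialdini_density": 0.20,
    "behavior_intent": 0.15,
    "personal_relevance": 0.10,
    "personality_fit": 0.05,
    "social_proof": 0.03,
    "elaboration_depth": 0.02,
}


def pides_score(dims, congruence):
    """Weighted sum of additive dimensions (each assumed in [0, 1]),
    multiplied by the congruence modifier in [0.5, 1.0], scaled to 0-100."""
    if not 0.5 <= congruence <= 1.0:
        raise ValueError("congruence must lie in [0.5, 1.0]")
    for name in PIDES_WEIGHTS:
        if not 0.0 <= dims[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    additive = sum(weight * dims[name] for name, weight in PIDES_WEIGHTS.items())
    return 100.0 * additive * congruence


# Emotion-only response: arousal and valence maxed, everything else zero.
emotion_only = {name: 0.0 for name in PIDES_WEIGHTS}
emotion_only.update(arousal=1.0, valence=1.0)
score = pides_score(emotion_only, congruence=1.0)
```

Under these assumptions, a response that maxes only arousal and valence reaches 45 of 100 points, which matches the 45% emotion share described above, and a congruence of 0.5 halves any score.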
SCREEN: how we score scenes and ad beats
Applied to Studio audits (film/TV scripts and ad concepts: 30s, 60s, long-form, vertical social). Seven per-viewer dimensions aggregate to a 0 to 100 score; a Plutchik-8 emotion distribution captures texture.
Dimensions
- Engagement: How engaged the viewer stayed (30%)
- Emotional intensity: Strength of felt emotion (25%)
- Character resonance: Care / empathy with characters (20%)
- Narrative tension: Story tension experienced (15%)
- Demographic fit: Would this viewer choose to watch? (10%)
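A minimal sketch of the aggregation, using only the five weighted dimensions listed above. It assumes each per-viewer dimension is normalized to [0, 1] and that the audience score is the plain mean of per-viewer scores; both are assumptions for illustration, as are the key and function names.

```python
from statistics import mean

# Weights from the list above (as fractions of 1).
SCREEN_WEIGHTS = {
    "engagement": 0.30,
    "emotional_intensity": 0.25,
    "character_resonance": 0.20,
    "narrative_tension": 0.15,
    "demographic_fit": 0.10,
}


def viewer_score(dims):
    """Per-viewer 0-100 score: weighted sum of dimensions assumed in [0, 1]."""
    return 100.0 * sum(w * dims[name] for name, w in SCREEN_WEIGHTS.items())


def audience_score(viewers):
    """Audience-level score, assumed here to be the mean of per-viewer scores."""
    return mean(viewer_score(v) for v in viewers)


viewers = [
    {name: 1.0 for name in SCREEN_WEIGHTS},  # fully engaged viewer
    {name: 0.5 for name in SCREEN_WEIGHTS},  # lukewarm viewer
]
overall = audience_score(viewers)
```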
Plutchik emotion distribution
Each viewer reports a primary felt emotion from Plutchik's 8-category wheel plus optional neutral. Distribution is rendered as a radial wheel where sector radius scales with share.
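The distribution itself is a simple share-per-category count. The sketch below assumes one reported emotion per viewer and a linear mapping from share to sector radius; the linear mapping, `max_radius`, and the function names are illustrative assumptions rather than the app's rendering logic.

```python
from collections import Counter

PLUTCHIK_8 = ("joy", "trust", "fear", "surprise",
              "sadness", "disgust", "anger", "anticipation")
CATEGORIES = PLUTCHIK_8 + ("neutral",)


def emotion_distribution(reports):
    """Return the share of viewers reporting each category (zeros included)."""
    if not reports:
        raise ValueError("need at least one viewer report")
    unknown = set(reports) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown emotion labels: {sorted(unknown)}")
    counts = Counter(reports)
    return {category: counts[category] / len(reports) for category in CATEGORIES}


def sector_radii(shares, max_radius=100.0):
    """Scale each sector radius linearly with its share (an assumed mapping)."""
    return {category: max_radius * share for category, share in shares.items()}


reports = ["joy", "joy", "anticipation", "fear", "joy", "neutral", "fear", "joy"]
shares = emotion_distribution(reports)
radii = sector_radii(shares)
```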
After scoring, a synthesis pass produces scene-level edit suggestions tagged critical, major, minor, or polish, with a script-level verdict (greenlight / rework / shelve) at the top of the report.
Audits: how we check our own work
Defensibility checks that run alongside every campaign.
WEIRD-bias balance score
For each axis, we compute the total variation distance (TVD: half the sum of absolute differences in bucket shares) between the observed persona distribution and the baseline US-adult marginal. The platform-level balance score is 100 × (1 − mean TVD) across six axes. A score of 100 means the audience mirrors US-adult marginals; low scores on intentionally targeted runs are flagged rather than hidden.
Balance = 100 × (1 − mean TVD)
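The balance formula can be computed directly from per-axis distributions. This sketch uses the standard definition of total variation distance for discrete distributions (half the L1 distance); the function names and the example shares are hypothetical and not real census figures.

```python
def total_variation_distance(observed, baseline):
    """TVD between two discrete distributions given as {bucket: share} dicts."""
    buckets = set(observed) | set(baseline)
    return 0.5 * sum(
        abs(observed.get(b, 0.0) - baseline.get(b, 0.0)) for b in buckets
    )


def balance_score(observed_by_axis, baseline_by_axis):
    """Balance = 100 * (1 - mean TVD) over the audited axes."""
    tvds = [
        total_variation_distance(observed_by_axis[axis], baseline_by_axis[axis])
        for axis in baseline_by_axis
    ]
    return 100.0 * (1.0 - sum(tvds) / len(tvds))


# Made-up shares for two axes; not real US-adult marginals.
baseline = {
    "gender": {"female": 0.5, "male": 0.5},
    "urbanicity": {"urban": 0.75, "rural": 0.25},
}
observed = {
    "gender": {"female": 0.5, "male": 0.5},       # TVD = 0
    "urbanicity": {"urban": 0.5, "rural": 0.5},   # TVD = 0.25
}
balance = balance_score(observed, baseline)
```

Here the mean TVD is 0.125, so the balance score is 87.5; an audience that matches the baseline on every axis scores 100.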
Sycophancy adversarial battery
For each stimulus we generate a counter-frame (same tone and length, opposite argument), then run a sample of personas against both. The sycophancy rate is the fraction of (persona, stimulus) pairs whose score swing Δ between the original and its counter-frame is smaller than the threshold τ (default 8 PIDES points), out of the N pairs tested. A high rate means the audience agrees with whatever it is shown; it's a reliability flag, not a failure.
SycRate = |{(p,s) : Δ < τ}| / N
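A minimal sketch of the rate calculation, assuming Δ is the absolute difference between a persona's score on the original stimulus and on its counter-frame. The function name and the example scores are hypothetical.

```python
def sycophancy_rate(score_pairs, tau=8.0):
    """Fraction of (persona, stimulus) pairs whose score swing is below tau.

    `score_pairs` holds (score_on_original, score_on_counter_frame) tuples,
    one per (persona, stimulus) pair tested.
    """
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("need at least one (persona, stimulus) pair")
    # A small swing means the persona liked both the argument and its opposite.
    flagged = sum(1 for original, counter in pairs if abs(original - counter) < tau)
    return flagged / len(pairs)


# Swings are 2, 30, 5, and 8 points; only 2 and 5 fall strictly below tau = 8.
pairs = [(70.0, 68.0), (70.0, 40.0), (55.0, 60.0), (80.0, 72.0)]
rate = sycophancy_rate(pairs)
```

Note the strict inequality: a swing of exactly 8 points is not counted, so the example yields a rate of 0.5.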
Historical validation
reactiphi is backtested against known historical brand campaigns and famous trailers.
Nine historical cases are wired as validation backtests, spanning marketing copy, screen marketing, and political campaigns:
- Domino's Pizza Turnaround (2009 to 2010)
- Old Spice "The Man Your Man Could Smell Like" (2010)
- KFC UK "FCK" chicken-shortage apology (2018)
- Always "#LikeAGirl" (2014)
- Apple "1984" Macintosh launch (1984)
- Nike "Just Do It" launch (1988)
- Stranger Things Season 1 trailer (2016)
- Hillary 2016 vs Trump MAGA, upper-Midwest swing voters
- Virginia Governor 2021, Youngkin upset
Each case exposes reactiphi to the real winning concept plus 2 alternative or synthesized distractor concepts, then grades whether the platform ranks the actual winner in its top-2 and whether its extracted themes surface the right emotional levers.
Failed cases generate a structured failure reason (rank gap, lever miss). The platform owns its mistakes rather than hiding them.
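The grading rule described above could look roughly like the following. This is a sketch under stated assumptions: levers are matched by exact string, the "rank gap" is reported as the score difference to the top-ranked concept, and the `BacktestResult` structure, field names, and example scores are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BacktestResult:
    passed: bool
    winner_rank: int
    failure_reasons: list = field(default_factory=list)


def grade_case(scores, winner, expected_levers, extracted_themes, top_k=2):
    """Grade one backtest: is the real winner in the top-k, and were its levers found?"""
    ranking = sorted(scores, key=scores.get, reverse=True)
    rank = ranking.index(winner) + 1
    missed = sorted(set(expected_levers) - set(extracted_themes))

    reasons = []
    if rank > top_k:
        gap = scores[ranking[0]] - scores[winner]
        reasons.append({"type": "rank_gap", "rank": rank, "score_gap": gap})
    if missed:
        reasons.append({"type": "lever_miss", "missed": missed})
    return BacktestResult(passed=not reasons, winner_rank=rank, failure_reasons=reasons)


# Made-up scores for one real concept and two distractors.
result = grade_case(
    scores={"real_winner": 61.0, "distractor_a": 74.0, "distractor_b": 66.0},
    winner="real_winner",
    expected_levers=["humility", "transparency"],
    extracted_themes=["transparency", "humor"],
)
```

In this made-up case the real winner ranks third, 13 points behind the top concept, and the "humility" lever is missing, so the result carries both a rank-gap and a lever-miss reason.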