Bertrand, Duflo, and Mullainathan’s 2004 article “How Much Should We Trust Differences-in-Differences Estimates?” (published in the Quarterly Journal of Economics) outlines several tests for assessing the robustness of difference-in-differences estimates when false positives are a concern.

One recommendation is to run a **placebo simulation**. In a first step, the treatment indicator is randomly assigned to observations in the data set. In a second step, the regressions are run again so that the main estimates can be compared with those from the placebo regressions.
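To make the two steps concrete, here is a minimal sketch of a single placebo draw on simulated data. All variable names (`id`, `year`, `post`, `y`, `placebo`) and the sizes (50 units, 10 placebo units, 6 years) are assumptions chosen for illustration. The outcome is pure noise, so the placebo interaction should be close to zero in most draws.

```stata
* Toy illustration of one placebo draw (all names and sizes are hypothetical)
clear
set seed 110
set obs 50                          // 50 units
gen id = _n
gen u = runiform()                  // one random draw per unit
sort u
gen byte placebo = _n <= 10         // the 10 units with the smallest draws form the placebo group
drop u
expand 6                            // 6 yearly observations per unit
bysort id: gen year = 2000 + _n
gen byte post = year >= 2003        // post-period indicator
gen y = rnormal()                   // outcome is noise, so the true effect is zero
xtset id year
* step 2: rerun the difference-in-differences regression with the placebo indicator
xtreg y i.placebo##i.post, fe vce(robust)
display _b[1.placebo#1.post]
```

Repeating the draw many times and collecting the interaction coefficient gives the placebo distribution against which the main estimate can be compared.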

I have written a little **Stata script** that runs such a placebo simulation and **compiles an Excel spreadsheet** which gives the placebo coefficient estimates along with the confidence interval bounds.

Here’s that script. It assumes that a panel dataset is in memory whose observations take the form of unit-years (e.g., firm-years). The only adjustment you need to make for your purposes is to set the parameters at the top.

global project_folder = `"C:\Users\path to project"'
global depvar = "dependent variable"
global treatment = "treatment binary"
global post = "time binary which is 1 for observations after the treatment"
global idvar = "unit identifier variable (e.g., id)"
global timevar = "time identifier variable (e.g., years)"
global controls = "list of control variables (e.g., age)"
global seed = "110" //sets the random-number seed so that the random draws are reproducible
global treatment_groupsize = "number of units to draw into the placebo treatment group (e.g., 100)"
global numruns = "#runs of the simulation (e.g., 60)"
*set Excel headers (the path is quoted because it may contain spaces)
putexcel set "$project_folder", replace
putexcel A1=("DV Coefficient")
putexcel B1=("DV Lower CI")
putexcel C1=("DV Upper CI")
local cellcounter = 3
set seed $seed
*estimate "true" regression
xtset $idvar $timevar
xtreg $depvar i.$treatment##i.$post $controls $timevar, fe robust
putexcel A2=(_b[1.$treatment#1.$post])
putexcel B2=(_b[1.$treatment#1.$post] - invttail(e(df_r),0.025)*_se[1.$treatment#1.$post])
putexcel C2=(_b[1.$treatment#1.$post] + invttail(e(df_r),0.025)*_se[1.$treatment#1.$post])
forvalues i=1/$numruns {
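*tag one observation per unit so that the random draw is made over units
*(this replaces a condition on awardm, a variable that is not among the parameters above)
egen byte oneperunit = tag($idvar)
randomtag if oneperunit, count($treatment_groupsize) gen(r) //randomtag is user-written: ssc install randomtag
bys $idvar: egen placebo = max(r)
drop r oneperunit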
tab placebo
capture xtreg $depvar i.placebo##i.$post $controls $timevar, fe robust
*write results only if the regression ran, so that a failed run does not export stale estimates
if _rc!=0 {
display "Error on run `i'"
}
else {
putexcel A`cellcounter'=(_b[1.placebo#1.$post])
putexcel B`cellcounter'=(_b[1.placebo#1.$post] - invttail(e(df_r),0.025)*_se[1.placebo#1.$post])
putexcel C`cellcounter'=(_b[1.placebo#1.$post] + invttail(e(df_r),0.025)*_se[1.placebo#1.$post])
estimates store result`i'
}
drop placebo
local cellcounter=`cellcounter'+1
}
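
Since the script assumes unit-period data with binary treatment and post indicators, it can help to verify those assumptions before running it. The following is only a sketch on a toy panel; the names `id`, `year`, `treated`, and `post` are hypothetical stand-ins for whatever you put into `$idvar`, `$timevar`, `$treatment`, and `$post`.

```stata
* Toy panel standing in for the real data (all names are hypothetical)
clear
set obs 30
gen id = ceil(_n/3)                  // 10 units, 3 rows each
bysort id: gen year = 2015 + _n      // years 2016 to 2018 within each unit
gen byte treated = id <= 4
gen byte post = year >= 2018

* each id-year pair must identify exactly one row
isid id year
* both indicators must be coded 0/1
assert inlist(treated, 0, 1) & inlist(post, 0, 1)
* treatment status must not change within a unit
bysort id (treated): assert treated[1] == treated[_N]
xtset id year
```

If any of these checks stops with an error, the placebo draw and the fixed-effects regression would not be comparing like with like, so it is worth fixing the data first.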

In one of the next blog posts, I will show how to use this generated spreadsheet for plots of the placebo confidence intervals or simple tabulation summaries for your papers.