---
title: |-
  Dynamic Documentation for "The Effects of a Minimum-Wage Increase on Employment and Family Income"
author: "Download/Modify/Contribute the analysis [here](https://github.com/fhoces/dd_cbo_mw)"
date: '`r paste("Last edit:", Sys.Date(), sep=" ")`'
output:
  html_document:
    number_sections: yes
    theme: united
    toc: yes
    toc_depth: 2
    toc_float:
      collapsed: no
      smooth_scroll: no
  pdf_document:
    header-includes:
      - \usepackage{caption}
      - \captionsetup{labelformat=empty}
    toc: yes
    toc_depth: '2'
---
<script language="javascript">
function toggle(num) {
var ele = document.getElementById("toggleText" + num);
var text = document.getElementById("displayText" + num);
if(ele.style.display == "block") {
ele.style.display = "none";
text.innerHTML = "show";
}
else {
ele.style.display = "block";
text.innerHTML = "hide";
}
}
</script>
<!--
resource_files:
- apps/app1/app.R
runtime: shiny
-->
# Introduction
The role of policy analysis is to connect research with policy. Because of heavy time constraints, policy analyses are typically ambiguous about how the analysis was carried out. This creates three problems: (i) it is hard to understand the connection between research and policy, (ii) it allows policy makers to cherry-pick policy reports, and (iii) it hinders systematic improvement and/or automation of parts of the analysis. In this document we demonstrate the use of a reproducible workflow to reduce the ambiguity in policy analysis.
Here we attempt to contribute to the policy discussion of the minimum wage. The minimum wage is a contentious policy issue in the US. Increasing it has positive and negative effects that different policymakers value differently. We aim to add clarity on what those effects are, how much we know about them, and how they vary when elements of the analysis change. We select the most up-to-date, non-partisan policy analysis of the effects of raising the minimum wage and build an open-source reproducible analysis on top of it.
In 2014 the Congressional Budget Office published the report titled ["The Effects of a Minimum-Wage Increase on Employment and Family Income"](https://www.cbo.gov/publication/44995). The report received wide attention from key stakeholders and has been used extensively as an input in the debate around the minimum wage[^3]. To date we consider the CBO report the best non-partisan estimate of the effects of raising the minimum wage at the federal level. Although there was disagreement among experts on some technical issues, this disagreement has been mainly confined to one of the many inputs used in the analysis, and we can fit the opposing positions into our framework.
Our purpose is twofold. First, to promote the technical discussion around a recurrent policy issue (the minimum wage) by making explicit and visible all the components and key assumptions of its most up-to-date official policy analysis. Second, to demonstrate how new scientific practices of transparency and reproducibility (T & R) can be applied to policy analysis. We encourage the reader to collaborate on this document and help develop an ever-improving version of the important policy estimates[^3] (re)produced here.
To achieve our goal we reviewed the CBO report and extracted the key components of its analysis. We adapted new guidelines proposed by the scientific community ([TOP](https://cos.io/top/)) to policy analysis. Under these guidelines, an analysis achieves the highest standards of transparency and reproducibility (T & R) when the data, methods, and workflow are completely reproducible and every part of the analysis, including its assumptions, is easily readable. We also benefit from hindsight and structure this document around the costs and benefits most discussed in the policy debate.
CBO's report, in its original form, already represents a significant improvement in T & R relative to standard practice in policy analysis. The report contains most of the components required for a full reproduction. We add the missing components, make assumptions explicit when needed, and complement the narrative explanations with mathematical formulae, visualizations, and the analytical code used throughout the replication.
**Important Note:**
Although our aim is to translate T & R practices from science to policy analysis, we need to highlight an important difference between the two regarding reproducibility. A scientific report takes the form of a peer-reviewed publication that represents several months or years of research, followed by a review process that can be as lengthy as the research itself. For this reason, when a scientific publication is subjected to replication, it is expected to succeed. Policy analysis is usually performed under tight deadlines, and it is not unusual for it to rely on arbitrary assumptions and/or irreproducible calculations. For these reasons we do not attempt to replicate the CBO report as a way of testing the veracity of the analysis. We use reproducibility, paired with full transparency, to generate a living document that represents the best policy analysis to date. Our expectation is that this living document will serve as a building block to discuss and incorporate incremental improvements to the policy analysis used to inform the debate around the minimum wage.
The CBO report describes three policy estimates: the effects of raising the minimum wage on the income of families with members that receive a raise, the effects on the income of families with members that lose their jobs, and the distribution of losses across the economy used to pay for the increase in the minimum wage. All the policy estimates to replicate are presented in the following tables.
**Note on the code languages (`R` and `Stata`):** The analysis can be replicated using either language, but only `R` provides the one-click workflow. For `Stata` the reader has to copy and paste the scripts sequentially or execute [this do file](link to final do file).
Also add link to video on how to install R.
```{r setup, include=FALSE}
if (TRUE) {
if (Sys.info()["sysname"] == "Darwin") {
setwd("/Users/fhoces/dissertation/Replication")
}else{
setwd("C:/Users/fhocesde/Documents/replication")
}
}
# Loading required libraries
list.of.packages <- c("knitr","foreign", "dplyr", "weights", "survey", "Hmisc",
"openxlsx", "rio", "highr", "XML", "RCurl", "treemap",
"reshape2", "tidyr", "xtable")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, repos= "http://cran.cnr.berkeley.edu/")
lapply(list.of.packages, require, character.only = TRUE)
# Setting knitr chunk options
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(cache = FALSE)
statapath <- "/Applications/Stata/Stata.app/Contents/MacOS/Stata"
display_code <- TRUE # set to FALSE when outputting to pdf (maybe?)
```
```{r table with output, eval=TRUE,echo=FALSE, warning=FALSE, message=FALSE, cache=TRUE}
# This is summary of all the policy estimates presented in the report
# Aggregate effects
output.template1 <- matrix(" ???",nrow = 6, ncol = 1)
rownames(output.template1) <- c("wage gains (billions of $)", "wage losses (bns of $)", "Balance losses (bns of $)",
                                "Net effect (bns of $)", "# of Wage gainers (millions)", "# of Wage losers (millions)")
colnames(output.template1) <- "Effects/Policy Estimates"
output.template1[,"Effects/Policy Estimates" ] <- c("31", "~5", "~24" , "2", "16.5", "0.5")
#Some distributional effects
output.template2 <- matrix(" ???",nrow = 2, ncol = 4)
rownames(output.template2) <- c("Balance losses (bns of $)", "Net effect (bns of $)")
colnames(output.template2) <- c("<1PL", "[1PL, 3PL)", "[3PL, 6PL)", ">6PL")
output.template2["Balance losses (bns of $)", ] <- c("~0.3", "~3.4", "~3.4", "~17")
output.template2["Net effect (bns of $)", ] <- c("5", "12", "2", "-17")
knitr::kable(
list(
output.template1 ),
caption = 'Policy estimates in CBO report: Overall effects', booktabs = TRUE,
align = 'c'
)
knitr::kable(
list(
output.template2
),
caption = 'Policy estimates in CBO report: Distributional effects across poverty lines (PL)', booktabs = TRUE,
align = 'c'
)
#This is the first suggestion: change how policy estimates are reported and add information.
mod.output.template <- matrix(" ???",nrow = 7, ncol = 5)
rownames(mod.output.template) <- c("wage gains", "wage losses", "Balance losses", "Net effect", "# of Wage gainers", "# of Wage losers", "Population")
colnames(mod.output.template) <- c("<1PL", "[1PL, 3PL)", "[3PL, 6PL)", ">6PL", "Total")
mod.output.template[,"Total" ] <- c("31", "~5", "~24" , "2", "16.5", "0.5", "330/140")
mod.output.template["Balance losses", ] <- c("~0.3", "~3.4", "~3.4", "~17", "~24")
mod.output.template["Net effect", ] <- c("5", "12", "2", "-17", "2")
#knitr::kable(mod.output.template, caption="Template for final results to replicate", digits = 1)
```
```{r SA params, eval=TRUE,echo=FALSE, warning=FALSE, message=FALSE, cache=TRUE}
# The base growth rate of wages determines the dispersion of wage growth; the mean wage growth rate is given by the 10-year economic forecast.
#data inputs:
param.wage.gr <- 1
param.worker.gr <- 1
param.eta.lit <- 1 #
param.factor.extrap <- 1.0
param.base.growth <- 0.024 * 1.0
param.N <- 1.0
param.fract.minwage <- 1.0
param.noncomp <- 1.0
param.F.adj <- 1.0
param.av.wage.var <- 1.0
param.wages <- 1.0 ####
param.nonwage.gr <- 1.0
param.hours <- 1.0
param.weeks <- 1.0
param.factor.1 <- 1
param.net.benef <- 2e9*1.0
param.ripple <- c("scope_below" = 8.7*1, "scope_above" = 11.5*1.0, "intensity" = 0.5*1.0)
param.dist.loss <- c(0.01, 0.29, 0.70) #c(0.2, 0.4, 0.40) #c(0.39567218, 0.53851506, 0.06581275)
param.jobcut <- 1
param.states.raise <- 1.0
```
```{r tables with notes, results='asis', eval=FALSE, echo=FALSE}
mod <- lm(mpg ~ wt, data=mtcars) #my linear model
print(xtable(mod,
caption = "Estimates of linear model for father Muro CB ",
digits = c(0,2, 2, 2,3)),
type="html",
table.placement = "h!",
caption.placement = "top",
add.to.row = list(list(2),
'<tr><td colspan="5"><b>Note: </b>
This is a description, blah, blah, blah, blah, blah, blah,
blah, blah, blah, blah, blah, blah, blah, blah, blah, blah,
blah, blah, blah, blah, blah, blah, blah, blah, blah, blah,
blah, blah, blah, blah, blah, blah</td></tr>'))
```
In this companion we attempt to reproduce all the policy estimates of tables 1 and 2, and walk the reader through all the details behind them.
# Employment effects
At a general level the effects on employment ($\widehat{\Delta E}$) will be calculated using a more detailed version of the following equation:
$$
\begin{aligned}
\widehat{\Delta E} &= N \times \eta \times \% \Delta w + \text{Other factors}
\end{aligned}
$$
Where $N$ represents the relevant population, $\eta$ the elasticity of labor demand, $\% \Delta w$ the relevant percentage change in wages, and *Other factors* encapsulates effects on employment through an increase in aggregate demand.
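As a rough numerical illustration of the equation above (sketched here in Python; all input values are hypothetical placeholders, not the CBO inputs):

```python
# Back-of-the-envelope sketch of the employment-effect formula:
#   Delta E = N * eta * %dw + other factors
# The numbers below are purely illustrative, NOT the report's values.
def employment_effect(n, eta, pct_dw, other_factors=0.0):
    """Change in employment from a wage increase of pct_dw (as a fraction)."""
    return n * eta * pct_dw + other_factors

# hypothetical inputs: 17 million affected workers, demand elasticity -0.1,
# a 15% average wage increase, and no aggregate-demand offset
delta_e = employment_effect(n=17e6, eta=-0.1, pct_dw=0.15)
print(round(delta_e))  # -255000
```

A negative $\eta$ makes the direct effect a job loss; the *Other factors* term can partially offset it.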
To describe the methodology behind each of these four components, we first describe the data used, the wage variable chosen, and the procedure used to forecast the 2016 wage and population distribution using data from 2013.
## Data, wages, and forecast
To simulate the policy effects we need the distribution of wages and employment under the status quo. From the perspective of 2013, this implies forecasting to 2016 data on employment and wages.
### Data
The Current Population Survey (CPS) was used to compute the effects on employment. From the analysis in the section on distributional effects we can deduce that the data correspond to the Outgoing Rotation Group (ORG). The CPS is a monthly cross-sectional survey. Each individual is interviewed eight times over a 16-month period: in the first and last 4 months of that period. In their 4th and 8th interviews, individuals are asked detailed information on earnings. The CPS ORG file contains the information from these interviews for a given year. We analyze the data for 2013.
Currently three versions of these data sets can be found online: [CPS raw files](http://thedataweb.rm.census.gov/ftp/cps_ftp.html#cpsbasic), [ORG NBER](http://www.nber.org/data/morg.html) and [ORG CEPR](http://ceprdata.org/cps-uniform-data-extracts/cps-outgoing-rotation-group/). The analysis will be performed using the CEPR ORG database.
The weights used in our analysis are `orgwgt/12`.
#### Code to load the data
<a id="displayText" href="javascript:toggle(1);">`R`</a>
<div id="toggleText1" style="display: none">
```{r loading data R, eval=TRUE,echo=display_code, warning=FALSE, message=FALSE}
call.cps.org.data <- function(){
  data_use <- "CEPR_ORG"
  # Using CEPR ORG data
  if (data_use == "CEPR_ORG") {
    # Check if the working directory contains the data; download it if not.
    if ( !("cepr_org_2013.dta" %in% dir()) ) {
      # name of the file that stores the zipped data
      tf <- "cepr_org_2013.zip"
      # download the zipped CEPR ORG file to the local computer
      download.file(url = "http://ceprdata.org/wp-content/cps/data/cepr_org_2013.zip", tf, mode = "wb")
      # unzip the file's contents into the working directory
      fn <- unzip( zipfile = tf, overwrite = TRUE )
    }
    df <- read.dta("cepr_org_2013.dta")
  }
  # Using NBER ORG data
  if (data_use == "NBER_ORG") {
    # Check if the working directory contains the data; download it if not (~53 MB).
    if ( !("morg13.dta" %in% dir()) ) {
      download.file(url = "http://www.nber.org/morg/annual/morg13.dta",
                    destfile = "morg13.dta", mode = "wb")
    }
    df <- read.dta("morg13.dta")
  }
  df <- tbl_df(df)
  # There are 1,293 cases with missing values for the weights. We delete them from the data.
  df <- df %>% filter(!is.na(orgwgt))
  df$final_weights <- df$orgwgt/12
  return(df)
}
df <- call.cps.org.data()
```
</div>
<a id="displayText" href="javascript:toggle(2);">`Stata`</a>
<div id="toggleText2" style="display: none">
```{r loading data Stata SHC, eval=FALSE,echo=display_code, engine="stata", engine.path=statapath, comment=""}
* How to get the data:
use "/Users/fhocesde/Documents/data/CPS/cepr_org_2013.dta", clear
*Following the notes here (https://cps.ipums.org/cps/outgoing_rotation_notes.shtml) I generate the weights as orgwgt/12
cap drop *_weight
gen final_weight = orgwgt/12
gen round_weight = round(orgwgt/12, 1)
* There are 1,293 cases with missing values for the weights. We delete them from the data.
drop if orgwgt == .
sum(orgwgt)
```
</div>
### Wage variable
We assume no further adjustments such as imputation for top coding, trimming, excluding overtime/commissions, or imputation of usual hours for ''hours vary'' respondents. The CEPR ORG data include several wage variables ([described here](http://ceprdata.org/cps-uniform-data-extracts/cps-outgoing-rotation-group/)). The wage variable that best matches the description above is `wage3`. This variable measures earnings for hourly workers (excluding overtime, tips, commissions and bonuses -otc-) and non-hourly workers (including otc). According to CEPR, this variable "...attempts to match the NBER’s recommendation for the most consistent hourly wage series from 1979 to the present".
### Wage adjustment
An adjustment was made to the wage of all the workers that did not report an hourly wage (`wage3` is estimated as usual salary per self-reported pay-period over usual hours per pay-period). In order to reduce the measurement error in those wages, we follow the methodology proposed in [this paper](https://www.aeaweb.org/articles?id=10.1257/aer.96.3.461) and compute the adjusted wage as a weighted average of the original wage and the average wage of workers with similar characteristics.
$$
\begin{aligned}
w_{ig} &= \alpha w^{raw}_{ig} + (1 - \alpha) \overline{w^{raw}_{g}} \\
\text{with: } \quad \overline{w^{raw}_{g}} &= \frac{\sum_{i \in g} w^{raw}_{ig} }{N_{g}}
\end{aligned}
$$
TO BE COMPLETED: Ask CBO about $\alpha$ and $G$.
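A minimal sketch of this shrinkage adjustment (in Python for illustration): each non-hourly worker's wage is pulled toward the mean raw wage of workers in the same group $g$. Since $\alpha$ and the grouping $G$ are unknown (per the question for CBO above), the values below are hypothetical placeholders.

```python
from collections import defaultdict

def shrink_wages(raw_wages, groups, alpha=0.7):
    """Weighted average of each raw wage and its group's mean raw wage.

    alpha and the group definitions are hypothetical placeholders.
    """
    # group means: w_bar_g = sum_i w_raw_ig / N_g
    totals, counts = defaultdict(float), defaultdict(int)
    for w, g in zip(raw_wages, groups):
        totals[g] += w
        counts[g] += 1
    means = {g: totals[g] / counts[g] for g in totals}
    # w_ig = alpha * w_raw_ig + (1 - alpha) * w_bar_g
    return [alpha * w + (1 - alpha) * means[g] for w, g in zip(raw_wages, groups)]

wages = shrink_wages([10.0, 14.0, 20.0, 24.0], ["a", "a", "b", "b"], alpha=0.5)
print(wages)  # [11.0, 13.0, 21.0, 23.0]
```

With $\alpha = 0.5$ each wage moves halfway toward its group mean, which shrinks the dispersion introduced by reporting error in the salary-over-hours calculation.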
### Wage forecast
Using this combination of data, variable, and adjustment, we forecast the wage distribution from 2013 to 2016 in the following way:
#### Growth adjustments
We assume that the growth forecasts were taken from CBO's 10-Year Economic Projections ([this website](https://www.cbo.gov/about/products/budget_economic_data)). Annualized growth rates for the number of workers, $g_{workers}$, and for the nominal wage per worker, $g_{wages}$, were computed as follows:
$$
\begin{aligned}
\widehat{ g_{workers} } &= \left[ \frac{\widehat{ N_{workers}^{2016} } }{N_{workers}^{2013}} \right]^{1/3}- 1 \\
\widehat{ g_{wages} } &= \left[ \frac{\widehat{ Wages^{2016} } / \widehat{ N_{workers}^{2016} } }{Wages^{2013} / N_{workers}^{2013}} \right]^{1/3} - 1
\end{aligned}
$$
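The annualization above can be sketched as follows (in Python for illustration); the forecast levels used here are made-up numbers, not CBO's projections:

```python
# Annualized growth rate over a 3-year horizon: (end/start)^(1/3) - 1,
# matching the formulas for g_workers and g_wages above.
def annualized_growth(end, start, years=3):
    return (end / start) ** (1 / years) - 1

# hypothetical forecast levels: workers in millions, total wages in billions
n_2013, n_2016_hat = 144.0, 150.0
w_2013, w_2016_hat = 7100.0, 8000.0

g_workers = annualized_growth(n_2016_hat, n_2013)
# wage growth is per worker: total wages divided by the number of workers
g_wages = annualized_growth(w_2016_hat / n_2016_hat, w_2013 / n_2013)
print(round(g_workers, 4), round(g_wages, 4))
```

Compounding `g_workers` over three years recovers the 2016 forecast, which is the sanity check used below for the Stata scalars.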
The report assumes higher wage growth for high wages than for low wages. To create different rates of growth, we compute a separate wage growth rate for each decile of the wage distribution. The increments across deciles were constant and were set so that the lowest decile ends up with a final yearly growth rate of 2.9%.
The adjustment to the number of workers was made through the weight variable `final_weights` (multiplying it by the growth rate), whereas the `wage3` variable was multiplied by the forecast growth rate of per-worker wages.
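The evenly spaced decile rates can be sketched as follows (Python for illustration). The construction assumed here is a linear sequence whose lowest value is the 2.4% base rate and whose mean equals the economy-wide per-worker wage growth (the 0.04538147 scalar used in this document's code):

```python
# Ten evenly spaced decile growth rates: start at a base rate, keep the
# mean equal to the aggregate per-worker wage growth, constant step.
def decile_growth_rates(base, mean, n=10):
    half_gap = mean - base          # distance from base to the mean
    top = mean + half_gap           # endpoint symmetric around the mean
    step = (top - base) / (n - 1)
    return [base + i * step for i in range(n)]

rates = decile_growth_rates(base=0.024, mean=0.04538147)
print([round(r, 4) for r in rates])
```

Because the sequence is arithmetic, its mean is exactly the midpoint of the first and last rates, so the aggregate wage bill still grows at the forecast rate while low-wage deciles grow more slowly.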
##### Code to get economic growth forecasts
<a id="displayText" href="javascript:toggle(3);">`R`</a>
<div id="toggleText3" style="display: none">
```{r Getting forecast data, eval=TRUE,echo=display_code, warning=FALSE, message=FALSE, results = "hide"}
get.gr.data <- function() {
# All projections data comes from this website: https://www.cbo.gov/about/products/budget_economic_data
# name of the files that contain projections from CBO
early.2016 <- "51135-2016-01-Economic%20Projections.xlsx"
late.2015 <- "51135-2015-08-EconomicProjections.xlsx"
early.2015 <- "51135-2015-01-EconomicProjections.xlsx"
late.2014 <- "51135-2014-08-EconomicProjections.xlsx"
early.2014 <- "51135-2014-02-EconomicProjections.xlsx"
early.2013 <- "51135-2013-02-EconomicProjections.xls" #there is no late 2013 report
# This function loads the data for a given report
get.growth.data <- function(x){
# Checking if working directory contains data, download if not.
if ( !(x %in% dir()) ) {
download.file(url = paste("https://www.cbo.gov/sites/default/files/",
x , sep = ""),
destfile = x, mode="wb")
}
if (x == early.2013) {
if ( !(require(XLConnect)) ) install.packages("XLConnect", repos= "http://cran.cnr.berkeley.edu/")
out.df <- rio::import( x , sheet= "2. Calendar Year")
} else {
out.df <- read.xlsx( x , sheet = "2. Calendar Year")
}
return(out.df)
}
# Working with projections from 2013
trends.df <- get.growth.data( early.2013 )
# Get column of all projections for 2013: get data from 2012 up to 2019)
sel.col <- which(trends.df==2012, arr.ind = TRUE)[2]
# Get row with all projections for wages and salaries in billions of (nominal) dollars
# Note: the excel file always contains two rows with the words "wage[s]" and "salar[ies|y]",
# we are looking for the second one (corresponding to Wages and Salaries under Income)
sel.row1 <- unique(
apply(trends.df,
2, function(x) grep("Wage.*Salar.*",x ) )
)[[2]]
sel.row1 <- sel.row1[2]
# Get row with all projections for number people employed (in millions)
sel.row2 <- which(trends.df=="Employment, Civilian, 16 Years or Older (Household Survey)", arr.ind = TRUE)[1]
# FH: I would use the following. But CBO uses the Price Index, Personal Consumption Expenditures (PCE)
# sel.row3 <- unique(
# apply(trends.df,
# 2, function(x) grep("Nonwage Income", x ) )
# )[[2]]
#
#
sel.row3 <- unique(
apply(trends.df,
2, function(x) grep("Price Index, Personal Consumption", x ) )
)[[2]]
#Keep only rows and colums identified above
trends.df <- trends.df[c(sel.row1, sel.row2, sel.row3) , sel.col:(sel.col+7)]
#Labeling and formating
colnames(trends.df) <- 2012:(2012+7)
trends.df <- apply(trends.df, 2, as.numeric)
row.names(trends.df) <- c( "wages(total)", "workers", "Price Index, Personal Consumption")
#Generate wage and non-wage income per worker
trends.df <- rbind( trends.df ,
(trends.df["wages(total)", ] * 1e9 ) / ( trends.df["workers", ] * 1e6) )
row.names(trends.df) <- c( "wages(total)", "workers", "Price Index, Personal Consumption",
"wages per worker")
#Transpose the data
trends.df <- t(trends.df)
# Define a new data set with the anual growth rate of each variable over time
growth.df <- trends.df/lag(trends.df,1) - 1
return(growth.df)
}
# Compute the compounded growth factor for a given variable in a time interval
# For example growth factor between years 1,2 and 3 will be:
# (1+growth_rate_yr1) * (1+growth_rate_yr2) * (1+growth_rate_yr3)
growth.df <- get.gr.data()
gr.factor <- function(var1, init.year, last.year) {
if (init.year == 2012) {init.year <- 2013}
prod((growth.df[, var1 ] + 1)[as.character(init.year:last.year)])
}
```
</div>
<a id="displayText" href="javascript:toggle(4);">`Stata`</a>
<div id="toggleText4" style="display: none">
```{r Getting forecast data - Stata HC, eval=FALSE,echo=display_code, engine="stata", engine.path=statapath, comment=""}
* Annual growth rates (R code used to compute the rates shown in comments):
* ( gr.factor("wages per worker", 2014, 2016) )^(1/3) - 1
scalar wage_gr = 0.04538147
*( gr.factor("workers", 2014, 2016) )^(1/3) - 1
scalar workers_gr = 0.01550989
```
</div>
#### ACA adjustments
[Not done yet]
#### State level minimum wage adjustments
CBO had to predict future changes in state-level minimum wages. We use the actual values implemented by each state. The data come from the Department of Labor ([here](https://www.dol.gov/whd/state/stateMinWageHis.htm)).
Whenever the predicted wages were below the 2016 state minimum wage, they were replaced by it.
**Important assumption:** when imputing state-level minimum wages, we assume that no effects on employment were incorporated.
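The floor-replacement step amounts to taking the maximum of each forecast wage and the worker's 2016 state minimum; a minimal sketch (Python for illustration, with hypothetical wages and floors):

```python
# Replace any forecast wage below the worker's 2016 state minimum
# by that minimum. All values below are hypothetical placeholders.
def apply_state_floor(forecast_wages, states, state_min_2016):
    return [max(w, state_min_2016[s]) for w, s in zip(forecast_wages, states)]

mins = {"CA": 10.0, "TX": 7.25}   # hypothetical 2016 state floors
wages = apply_state_floor([9.5, 12.0, 7.0], ["CA", "CA", "TX"], mins)
print(wages)  # [10.0, 12.0, 7.25]
```

Note that, per the assumption above, this imputation raises wages without feeding back into the employment calculation.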
##### Code to get minimum wage values by state
<a id="displayText" href="javascript:toggle(5);">`R`</a>
<div id="toggleText5" style="display: none">
```{r Getting min wage data , eval=TRUE,echo=display_code, warning=FALSE, message=FALSE, results = "hide"}
# Minimum wage by state:
# Check if the data are on this machine and download them if not.
# To execute the following piece of code you cannot be behind a firewall
if ( !("minwage" %in% dir()) ) {
fileURL <- "https://www.dol.gov/whd/state/stateMinWageHis.htm"
xData <- getURL(fileURL)
aux.1 <- readHTMLTable(xData, header = TRUE)
min.wage.data <- cbind(aux.1[[1]], aux.1[[2]][,-1],
aux.1[[3]][,-1], aux.1[[4]][,-1],
aux.1[[5]][1:55,-1])
min.wage.data <- min.wage.data[, - (32:37)]
colnames(min.wage.data) <- c(gsub("(.*)([0-9]{4})(.*)","\\2",
names(min.wage.data))[-c(30, 31)],
"2014", "2015")
rownames(min.wage.data) <- min.wage.data[,1]
min.wage.data <- min.wage.data[,-1]
# This part was hard coded, important to check over and over.
rownames(min.wage.data) <- c("Federal","AK","AL","AR","AZ","CA","CO","CT",
"DE","FL","GA","HI","ID","IL","IN","IA","KS","KY",
"LA","ME","MD","MA","MI","MN","MS","MO",
"MT","NE","NV","NH","NJ","NM","NY","NC",
"ND","OH","OK","OR","PA","RI","SC","SD",
"TN","TX","UT","VT","VA","WA","WV","WI",
"WY","DC", "Guam", "PR", "USVI")
#Save all min wage data in a single csv file
saveRDS(min.wage.data, "minwage")
}
min.wage.data <- readRDS("minwage")
# Function that extracts (in numeric format) the minimum wage for a specific year for each state
state.minw <- function(char.year) {
options(warn=-1)
if ( !( char.year %in% colnames(min.wage.data) ) ) {
res1 <- as.data.frame( rep(NA, dim(min.wage.data)[1]) )
} else {
aux.1 <- as.numeric(gsub("(.*)([0-9]{1,2}\\.[0-9]{1,2})(.*)",
"\\2", min.wage.data[, char.year]) )
# If no state min wage, assign federal.
res1 <- as.data.frame(ifelse(is.na(aux.1), aux.1[1] , aux.1))
options(warn=0)
}
rownames(res1) <- rownames(min.wage.data)
colnames(res1) <- char.year
return(res1)
}
st.minw <- state.minw("2013")
#To get all min wages in one data frame use:
# as.data.frame(lapply(1980:2015, function(x) state.minw(as.character(x))) )
#CBO made a forecast of future min wages. We can look at the actual min wages that took effect.
#If CBO provides their forecast, we could check its forecast accuracy.
#Min wages for 2016 are not available on the website above. I am hard coding any changes
#found in Wikipedia (https://en.wikipedia.org/wiki/Minimum_wage_in_the_United_States accessed 5/16/2016):
st.minw.2016 <- state.minw("2015")
st.minw.2016[c("AK" , "AZ", "CA", "CT" , "HI" , "IL", "MA", "MI" ,
"MN" , "MT" , "NV" , "NE" , "NY" , "OH" , "RI", "VT"), ] <-
c( 9.75 , 8.05, 10 , 9.6 , 8.5 , 8.25, 10 , 8.5 ,
9 , 8.05 , 8.25 , 9 , 9 , 8.1 , 9.6 , 9.6)
colnames(st.minw.2016) <- "2016"
# Export MW data to Stata
aux.data <- data.frame("states" = rownames(st.minw),
"minwage_2013" = st.minw,
"minwage_2016" = st.minw.2016)
names(aux.data) <- c("states","minwage_2013", "minwage_2016")
write.dta(aux.data, "state_min_w.dta")
```
</div>
<a id="displayText" href="javascript:toggle(6);">`Stata`</a>
<div id="toggleText6" style="display: none">
```{r Getting min wage data - Stata , eval=FALSE,echo=display_code, engine="stata", engine.path=statapath, comment=""}
preserve
use "/Users/fhocesde/Documents/dissertation/Replication/state_min_w.dta", clear
sort states
tempfile min_wage
save `min_wage'
restore
```
</div>
#### Code to forecast wages and workers
<a id="displayText" href="javascript:toggle(7);">`R`</a>
<div id="toggleText7" style="display: none">
```{r Adjusting wages to 2016 , eval=TRUE,echo=display_code, warning=FALSE, message=FALSE, results = "hide"}
#GENERAL NOTE FOR THIS SECTION: the analysis performed here is the same as the one with CPS ASEC, so it would be better to wrap it in a function that is called twice. This way I can make sure that the sensitivity analysis works everywhere.
#Wage adjustment
#CBO mentions that the lowest 10th percent gets a 2.9% growth in annual wage.
#I compute the annualized growth rate of wages and create 10 bins of wage growth
#starting at 2.4%, then adjust by minimum wages of 2016 and get an annualized
#growth of 2.9% for the lowest decile.
#THESE TWO LINES OF CODE ARE DIFFERENT BETWEEN ASEC AND ORG
wage.gr.f <- function(SA.wage.gr = param.wage.gr) {
( ( gr.factor("wages per worker", 2014, 2016) )^(1/3) - 1 ) * SA.wage.gr
}
wage.gr <- wage.gr.f()
workers.gr.f <- function(SA.worker.gr = param.worker.gr) {
( ( gr.factor("workers", 2014, 2016) )^(1/3) - 1 ) * SA.worker.gr
}
workers.gr <- workers.gr.f()
#SAME
half.gap.f <- function(SA.wage.gr = param.wage.gr,
SA.base.growth = param.base.growth) {
wage.gr.f(SA.wage.gr) - SA.base.growth
}
half.gap <- half.gap.f()
wage.gr.bins.f <- function(SA.base.growth = param.base.growth,
SA.wage.gr = param.wage.gr) {
seq(SA.base.growth, wage.gr.f(SA.wage.gr) +
half.gap.f(SA.wage.gr,SA.base.growth), length.out = 10)
}
wage.gr.bins <- wage.gr.bins.f()
# CAUTION: DO NOT apply the 'ntile()' fn from dplyr as it will split ties differently than 'cut()', and results will not
# be comparable to Stata.
#NOT THE SAME (power of 3 instead of 4)
# Here we adjust min wages
# SAME
wages.final.cps.org.f <- function(SA.states.raise = param.states.raise,
SA.wages = param.wages) {
aux.var <- wtd.quantile(x = df$wage3, probs = 1:9/10,weights = df$final_weights)
df %>%
mutate( "w3.deciles" = cut(wage3, c(0, aux.var, Inf),
right = TRUE, include.lowest = TRUE) ,
"w3.adj1" = wage3 * ( 1 + wage.gr.bins[w3.deciles] )^3,
"wages.final" = ifelse(w3.adj1> st.minw.2016[state,] * SA.states.raise,
w3.adj1,
st.minw.2016[state,] * SA.states.raise)* SA.wages )
}
df <- wages.final.cps.org.f()
# to be done: adjust some states by inflation.
```
</div>
<a id="displayText" href="javascript:toggle(8);">`Stata`</a>
<div id="toggleText8" style="display: none">
```{r Adjusting wages to 2016 - Stata, eval=FALSE,echo=display_code, engine="stata", engine.path=statapath, comment=""}
* Forecast wages to 2016 : apply diff growth rates per decile (deciles of growth gen in R)
cap drop w3_*
xtile w3_deciles = wage3 [w =final_weight], nq(10)
gen w3_adj1 = wage3 * (1 + 0.02400000)^3 if w3_deciles == 1
replace w3_adj1 = wage3 * (1 + 0.02875144)^3 if w3_deciles == 2
replace w3_adj1 = wage3 * (1 + 0.03350288)^3 if w3_deciles == 3
replace w3_adj1 = wage3 * (1 + 0.03825432)^3 if w3_deciles == 4
replace w3_adj1 = wage3 * (1 + 0.04300575)^3 if w3_deciles == 5
replace w3_adj1 = wage3 * (1 + 0.04775719)^3 if w3_deciles == 6
replace w3_adj1 = wage3 * (1 + 0.05250863)^3 if w3_deciles == 7
replace w3_adj1 = wage3 * (1 + 0.05726007)^3 if w3_deciles == 8
replace w3_adj1 = wage3 * (1 + 0.06201151)^3 if w3_deciles == 9
replace w3_adj1 = wage3 * (1 + 0.06676295)^3 if w3_deciles == 10
* Merge with state minimum-wage data and replace wages below the 2016 state minimum with that minimum.
decode state, g(state_s)
sort state_s
merge state_s using `min_wage'
* Drop Guam, PRVI, Federal
drop if _m == 2
drop _m
gen w3_adj_min = w3_adj1
replace w3_adj_min = minwage_2016 if w3_adj1 < minwage_2016
```
</div>
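The tie-handling caveat flagged in the R code (why `cut()` at fixed quantile cutpoints is used rather than `dplyr::ntile()`) can be seen in a small illustrative example: when many observations share the same wage at a bin boundary, `ntile()` forces equal bin sizes and splits the ties across bins, while `cut()` keeps all tied values in the same bin.

```r
library(dplyr)
# Six workers tied at $9 and four at $10; a two-bin split puts the
# boundary in the middle of the ties.
x <- c(rep(9, 6), rep(10, 4))
ntile(x, 2)
# -> 1 1 1 1 1 2 2 2 2 2   (the sixth $9 worker is pushed into bin 2)
as.integer(cut(x, breaks = c(0, 9, Inf), include.lowest = TRUE))
# -> 1 1 1 1 1 1 2 2 2 2   (all ties at $9 stay in bin 1)
```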
## Get the $N$
### Identify the relevant universe
```{r temp nums for text below, echo=FALSE}
temp.working.age.pop <- round( sum(df$final_weights, na.rm = TRUE)/1e6, 1)
temp.employed.pop <- round( sum( df$final_weights *
(df$lfstat == "Employed"),
na.rm = TRUE)/1e6, 1)
temp.unemployed.pop <- round( sum( df$final_weights *
(df$lfstat == "Unemployed"),
na.rm = TRUE)/1e6, 1)
temp.nilf.pop <- round( sum( df$final_weights *
(df$lfstat == "NILF"),
na.rm = TRUE)/1e6, 1)
temp.salary <- round( with(df, sum(final_weights *
(empl == 1 &
(selfinc == 0 & selfemp == 0)),
na.rm = TRUE))/1e6 , 1)
temp.salary.g1 <- round(with(df, sum(final_weights *
( (empl == 1 &
selfinc == 0 & selfemp == 0) &
(paidhre == 1 | hrsvary != 1 |
is.na(hrsvary) ) &
(wage3==0 | is.na(wage3) ) ) ,
na.rm = TRUE))/1e6, 1)
temp.nohour <- round(with(df, sum(final_weights *
(empl == 1 &
(selfinc == 0 & selfemp == 0) &
(paidhre == 0 |is.na(paidhre))),
na.rm = TRUE))/1e6, 1)
temp.nohour.hours.vary <- round(with(df, sum( final_weights *
(empl == 1 &
(selfinc == 0 & selfemp == 0) &
(paidhre == 0 | is.na(paidhre) ) &
hrsvary == 1),
na.rm = TRUE))/1e6, 1)
temp.pop.of.interest <- round(with(df, sum(final_weights *
(empl == 1 &
(selfinc ==0 & selfemp == 0) &
( (paidhre == 0 & hrsvary != 1) |
paidhre ==1 ) & wage3 != 0),
na.rm = TRUE))/1e6, 1)
```
According to the CPS data, the working-age population in 2013 was `r temp.working.age.pop` million*. Of those, `r temp.employed.pop` million were employed, `r temp.unemployed.pop` million were unemployed, and `r temp.nilf.pop` million were not in the labor force (NILF).
Among the employed, `r temp.salary` million workers were salaried (neither self-employed nor self-incorporated). A small number of salaried workers (`r temp.salary.g1` million) did not report any wages and were excluded from the sample. Of the employed salaried workers, `r temp.nohour` million did not report an hourly wage; for them an hourly wage was computed as the reported pay per pay period divided by the reported hours in that period. However, `r temp.nohour.hours.vary` million workers in this group reported varying hours; their wages could not be computed and they were also excluded from the sample. As a result, the final number of workers on whom a rise in the minimum wage can have a direct effect is `r temp.pop.of.interest` million (= `r temp.salary` - `r temp.salary.g1` - `r temp.nohour.hours.vary`); this is our universe of interest. Figure 1 presents a visual representation of these populations.
*[FH: why are some individuals with age > 65 in the sample?]
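The hourly-wage imputation described above can be sketched as follows; the function and argument names are illustrative placeholders, not the actual CPS ORG variables.

```r
# Sketch: for salaried workers not paid by the hour, the hourly wage is
# the usual pay per pay period divided by the usual hours in that period.
# (Names are hypothetical; workers with varying hours are excluded instead.)
impute_hourly <- function(pay_per_period, hours_per_period) {
  ifelse(hours_per_period > 0,
         pay_per_period / hours_per_period,
         NA_real_)
}
impute_hourly(600, 40)   # $600 per week over 40 hours -> $15/hour
```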
We now compute some descriptive statistics of the labor force in 2013 and of the wage distribution in the universe of interest, both for 2013 and for the predicted 2016 values.
We first define a variable that tags the population of interest.
#### Statistics and code behind figure 1
<a id="displayText" href="javascript:toggle(9);">`R`</a>
<div id="toggleText9" style="display: none">
```{r desc stats2, eval=TRUE,echo=display_code, warning=FALSE, message=FALSE, error=FALSE, collapse=TRUE}
# Tag population of interest
get.pop.int <- function() {
df %>% mutate("pop_of_int" = (empl == 1 &
(selfinc == 0 & selfemp == 0) &
( (paidhre == 0 & ( hrsvary != 1 | is.na(hrsvary) ) ) |
paidhre == 1 ) &
!(wage3 == 0 | is.na(wage3) ) ) )
}
df <- get.pop.int()
# Tables 1 - 4 were constructed to look at the data. Only table 4 is shown in the final output.
# To compute the new total of workers we multiply the original weights by the growth rate.
table_1 <- df %>%
summarise("(1) Total" =
sum(final_weights, na.rm = TRUE),
"(2) Employed" =
sum( final_weights * (empl == 1), na.rm = TRUE),
"(3) Salary (among employed)" =
sum(final_weights * (empl == 1 & #Salary worker if
(selfinc == 0 & selfemp == 0)) #not self employed or
, na.rm = TRUE), #self incorp.
"(4) Not Paid hourly (among salary)" =
sum(final_weights * (empl == 1 & # Not paid hourly if salary and
(selfinc == 0 & selfemp == 0) & # not paid hourly
(paidhre == 0 | is.na(paidhre) )), na.rm = TRUE),
"(5) Hours Vary (among not paid hourly)" =
sum(final_weights * (empl == 1 & #Hours vary if not paid hourly and
(selfinc == 0 & selfemp == 0) & #hours vary
(paidhre == 0 | is.na(paidhre) ) & hrsvary == 1), na.rm = TRUE),
"(6) No wage (in (3) but not in (5))" =
sum(final_weights * ( (empl == 1 & selfinc == 0 & selfemp == 0) &
(paidhre == 1 | hrsvary != 1 | is.na(hrsvary) ) &
(wage3==0 | is.na(wage3) ) ) , na.rm = TRUE),
"Population of Interest = (3) - (5) - (6)" =
sum(final_weights * (empl == 1 & (selfinc ==0 & selfemp == 0) &
( (paidhre == 0 & hrsvary != 1) | paidhre ==1 ) &
wage3 != 0) , na.rm = TRUE)
)
table_1 <- t(table_1)
colnames(table_1) <- "N"
table_1 <- format(table_1, big.mark = ",", digits = 0, scientific = FALSE)
table_1_uw <- df %>%
summarise("(1) Total" =
sum(!is.na(final_weights), na.rm = TRUE),
"(2) Employed" =
sum( 1 * (empl == 1), na.rm = TRUE),
"(3) Salary (among employed)" =
sum( 1 * (empl == 1 & #Salary worker if
(selfinc == 0 & selfemp == 0)) #not self employed or
, na.rm = TRUE), #self incorp.
"(4) Not Paid hourly (among salary)" =
sum( 1 * (empl == 1 & #Not paid hourly if salary and
(selfinc == 0 & selfemp == 0) & # not paid hourly
(paidhre == 0 | is.na(paidhre) )), na.rm = TRUE),
"(5) Hours Vary (among not paid hourly)" =
sum( 1 * (empl == 1 & #Hours vary if not paid hourly and
(selfinc == 0 & selfemp == 0) & #hours vary
(paidhre == 0 | is.na(paidhre) ) & hrsvary == 1), na.rm = TRUE),
"(6) No wage (in (3) but not in (5))" =
sum( 1 * ( (empl == 1 & selfinc == 0 & selfemp == 0) &
(paidhre == 1 | hrsvary != 1 | is.na(hrsvary) ) &
(wage3==0 | is.na(wage3) ) ) , na.rm = TRUE),
"Population of Interest = (3) - (5) - (6)" =
sum( 1 * (empl == 1 & (selfinc ==0 & selfemp == 0) &
( (paidhre == 0 & hrsvary != 1) | paidhre ==1 ) &
wage3 != 0) , na.rm = TRUE)
)
table_1_uw <- t(table_1_uw)
colnames(table_1_uw) <- "N_unweighted"
table_1_uw <- format(table_1_uw, big.mark = ",", digits = 0, scientific = FALSE)
table_1 <- cbind(table_1, table_1_uw)
new_total_n <- format(sum(df$final_weights[df$pop_of_int==1] *
(1 + workers.gr)^3,
na.rm = TRUE), big.mark=",")
#Summary stats of wage
sum.stas1 <- function(x, wt) {
c( "mean" = weighted.mean(x,w = wt, na.rm = TRUE),
"sd" = sqrt( wtd.var(x, weights = wt) ) ,
"median" = wtd.quantile( x, weights = wt, prob = c(.5)) ,
wtd.quantile( x, weights = wt, prob = c(.1, .9) ) )
}
table_2 <- df %>%
filter(pop_of_int == 1 & !is.na(wage3)) %>%
with(sum.stas1(wage3, final_weights))
table_2 <- cbind(table_2)
colnames(table_2) <- "Wage"
table_3 <- df %>%
filter(pop_of_int == 1 & !is.na(wage3)) %>%
  summarise("< $7.50" = weighted.mean(wage3 < 7.5, w = final_weights),
            "< $9" = weighted.mean(wage3 < 9, w = final_weights),
            "< $10.10" = weighted.mean(wage3 < 10.10, w = final_weights),
            "< $13" = weighted.mean(wage3 < 13, w = final_weights),
            "< $15" = weighted.mean(wage3 < 15, w = final_weights)
)
table_3 <- t(table_3)
colnames(table_3) <- "Perc"
table_4 <- matrix(NA, 7, 2)
colnames(table_4) <- c("2013", "2016: status quo")
rownames(table_4) <- c("Salary workers",
"Median wage",
"% < 7.5","% < 9",
"% < 10.10", "% < 13",
"% < 15" )
table_4[1,1] <- table_1[7]
table_4[1,2] <- new_total_n
table_4[2,1] <- table_2[3]
table_4[2,2] <- round( with(df[df$pop_of_int == 1 & !is.na(df$wages.final), ],
wtd.quantile( wages.final, weights = final_weights *
(1 + workers.gr)^3, prob = c(.5) ) ), digits = 2 )
table_4[3:7,1] <- round(as.matrix(table_3), digits = 2)
aux.1 <- df %>%
filter(pop_of_int == 1 & !is.na(wages.final)) %>%
  summarise("< $7.50" = weighted.mean(wages.final < 7.5,
                                      w = final_weights * (1 + workers.gr)^3),
            "< $9" = weighted.mean(wages.final < 9,
                                   w = final_weights * (1 + workers.gr)^3),
            "< $10.10" = weighted.mean(wages.final < 10.10,
                                       w = final_weights * (1 + workers.gr)^3),
            "< $13" = weighted.mean(wages.final < 13,
                                    w = final_weights * (1 + workers.gr)^3),
            "< $15" = weighted.mean(wages.final < 15,
                                    w = final_weights * (1 + workers.gr)^3)
)
table_4[3:7,2] <- round( as.matrix(aux.1), digits = 2 )
###Build first treemap
#if (!(length(dev.list()) == 0)) { dev.off() }
#x11()
universe.1 <- df %>%
mutate("teen" = ifelse(age<20, "teen", "adult"),
"selfemp_inc" = 1 * (selfemp == 1 | selfinc == 1),
"pop_of_int" = 1 * pop_of_int) %>%
group_by(lfstat,selfemp_inc, teen, pop_of_int) %>%
summarise("total" = sum(final_weights, na.rm = TRUE))
universe.1$selfemp_inc[universe.1$lfstat!="Employed"] = NA
universe.1[universe.1$lfstat!="Employed" | universe.1$selfemp_inc==1, c("teen", "pop_of_int")] = NA
universe.1$selfemp_inc[universe.1$selfemp_inc==0] <- "salary"
universe.1$selfemp_inc[universe.1$selfemp_inc==1] <- "self employed or self incorporated"
#universe.1$pop_of_int <- with(universe.1, ifelse(pop_of_int==1,"included", "excluded"))
treemap.1 <- function(){
invisible(
treemap(universe.1,
index=c("lfstat", "selfemp_inc", "teen"),
vSize=c("total"),
range = c(7, 15),
type="index",
algorithm="pivotSize",
fontsize.labels = c(12:8),
border.col = c("#FFFFFF", "#000000","#000000"),
aspRatio= 1.5,
palette = c("#D3D3D3"),
title.legend="number of employees",
fontface.labels = c(3,2,1),
align.labels=list(c("left", "top"), c("right", "top"), c("right", "bottom") ),
bg.labels = 1,
title = "Figure 1: Distribution of population of working age in 2013"
))
}
```
</div>
<a id="displayText" href="javascript:toggle(10);">`Stata`</a>
<div id="toggleText10" style="display: none">
```{r desc stats2 STATA, eval=FALSE,echo=display_code, engine="stata", engine.path=statapath, comment=""}
*Population of interest
*Employment categories:
global employed "empl == 1"
global salary "empl == 1 & selfinc == 0 & selfemp == 0"
global nhourly "empl == 1 & selfinc == 0 & selfemp == 0 & (paidhre == 0 | paidhre ==.)"
global hrs_vary "empl == 1 & selfinc == 0 & selfemp == 0 & (paidhre == 0 | paidhre ==.) & hrsvary ==1"
*Tag population of interest: salaried workers who are either paid hourly, or not paid by the hour but whose hours do not vary, and who have a non-zero, non-missing wage
cap drop pop_of
gen pop_of_int = (empl == 1 & (selfinc ==0 & selfemp ==0) & (paidhre ==1 | (paidhre == 0 & hrsvary != 1)) & (wage3 != 0 & wage3 != .) )
matrix table_1 = J(7,2,99)
*1 -Total
sum final_weight
noi di "Total sample in CPS ORG"
noi di %14.2f r(sum)
mat table_1[1,1] = r(sum)
count if final_weight!=.
noi di "Total sample in CPS ORG: unweighted"
noi di %14.2f r(N)
mat table_1[1,2] = r(N)
*2 -Employed
sum final_weight if $employed
noi di "Population Employed"
noi di %14.2f r(sum)
mat table_1[2,1] = r(sum)
count if $employed
noi di "Population Employed: unweighted"
noi di %14.2f r(N)
mat table_1[2,2] = r(N)
*3 -Salaried worker
sum final_weight if $salary
noi di "Salaried workers"
noi di %14.2f r(sum)
local c = r(sum)
mat table_1[3,1] = r(sum)
count if $salary
noi di "Salaried workers: unweighted"
noi di %14.2f r(N)
local c_uw = r(N)
mat table_1[3,2] = r(N)
*4 -Not paid by the hour
sum final_weight if $nhourly
noi di "Salaried workers who are not paid by the hour"
noi di %14.2f r(sum)
mat table_1[4,1] = r(sum)
count if $nhourly
noi di "Salaried workers who are not paid by the hour: unweighted"
noi di %14.2f r(N)
mat table_1[4,2] = r(N)
*5 -Among those who are not paid by the hour: hours vary
sum final_weight if $hrs_vary
noi di "Salaried workers who are not paid by the hour and hours vary"
noi di %14.2f r(sum)
local a = r(sum)
mat table_1[5,1] = r(sum)
count if $hrs_vary
noi di "Salaried workers who are not paid by the hour and hours vary: unweighted"
noi di %14.2f r(N)
local a_uw = r(N)
mat table_1[5,2] = r(N)
*Among those in group 3 but not in group 5, how many have no wage
sum final_weight if (empl == 1 & selfinc == 0 & selfemp == 0) & (paidhre == 1 | hrsvary != 1) & (wage3==0 | wage3==.)
noi di "Among those in group 3 but not in group 5, how many have no wage"
noi di %14.2f r(sum)
local b = r(sum) + `a'
mat table_1[6,1] = r(sum)
count if (empl == 1 & selfinc == 0 & selfemp == 0) & (paidhre == 1 | hrsvary != 1) & (wage3==0 | wage3==.)
noi di "Among those in group 3 but not in group 5, how many have no wage: unweighted"