Think Stats Chapter 2 Exercise 4 (Cohen's d)

Exercise 2.4 Using the variable totalwgt_lb, investigate whether first babies are lighter or heavier than others. Compute Cohen’s d to quantify the difference between the groups. How does it compare to the difference in pregnancy length?

Problem: Are first babies lighter or heavier than others?
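
For reference, the code below computes Cohen's d as the difference in group means divided by a pooled standard deviation, with each group's variance weighted by its size. The symbols are introduced here for illustration only: $\bar{x}_i$, $s_i^2$ and $n_i$ stand for the mean, variance and size of group $i$.

$$
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}
$$

A negative d means the first group has the smaller mean.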

Code:

import nsfg
import math

preg = nsfg.ReadFemPreg()
live = preg[preg.outcome == 1]

firsts = live[live.birthord == 1]
others = live[live.birthord != 1]

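# nsfg.ReadFemPreg() already provides a totalwgt_lb column
# (birthwgt_lb + birthwgt_oz / 16.0), which firsts and others inherit
# from live, so it does not need to be recomputed here.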

first_wgt = firsts.totalwgt_lb
other_wgt = others.totalwgt_lb

first_prglngth = firsts.prglngth
other_prglngth = others.prglngth

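def CohenEffectSize(group1, group2):
    """Return Cohen's d: the difference in means in pooled standard deviations."""
    diff = group1.mean() - group2.mean()
    var1 = group1.var()
    var2 = group2.var()
    n1, n2 = len(group1), len(group2)
    # Weight each group's variance by its size to get the pooled variance.
    pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2)
    d = diff / math.sqrt(pooled_var)
    return d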

def SummaryStats(group1, group2, value):
    """Print means, standard deviations, mean difference and Cohen's d."""
    difference = group1.mean() - group2.mean()

    # Parenthesized print with a single argument runs on Python 2 and 3.
    print("Firsts average %s is: %f" % (value, group1.mean()))
    print("Others average %s is: %f" % (value, group2.mean()))

    print("Firsts %s standard dev is: %f" % (value, group1.std()))
    print("Others %s standard dev is: %f" % (value, group2.std()))

    if difference < 0:
        print("Firsts %s is on average %f less than others" % (value, abs(difference)))
    else:
        print("Firsts %s is on average %f more than others" % (value, abs(difference)))
    print("Cohen's d is: %f standard deviations" % CohenEffectSize(group1, group2))
    
SummaryStats(first_wgt, other_wgt, "weight")

Output:

Firsts average weight is: 7.201094
Others average weight is: 7.325856
Firsts weight standard dev is: 1.420573
Others weight standard dev is: 1.394195
Firsts weight is on average 0.124761 less than others
Cohen's d is: -0.088673 standard deviations

Solution: On average, first babies are 0.125 pounds lighter than other babies, which corresponds to a Cohen's d of -0.089 standard deviations. For comparison, pregnancies for first babies are on average 0.078 weeks longer than for other babies, with a Cohen's d of 0.029 standard deviations. In absolute terms the weight effect is about three times the pregnancy-length effect, and it points in the opposite direction: first babies are slightly lighter but carried slightly longer. Both effects are small, however, since each is well under 0.2 standard deviations, the conventional threshold for a "small" effect. Cohen's d measures the size of a difference, not its statistical significance, so these numbers alone do not show whether either difference is significant.
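
To make the effect-size calculation easier to check without the NSFG data files, here is a minimal standalone sketch of the same pooled-variance formula using only the Python 3 standard library. The function name `cohen_effect_size` and the two small lists are made up for illustration and are not part of the original solution. The final lines only redo the arithmetic on the two d values reported above.

```python
import math
import statistics


def cohen_effect_size(group1, group2):
    """Difference in means divided by the pooled standard deviation."""
    diff = statistics.mean(group1) - statistics.mean(group2)
    # statistics.variance is the sample variance (n - 1 denominator),
    # the same default that pandas Series.var() uses.
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    n1, n2 = len(group1), len(group2)
    # Weight each group's variance by its size, as in CohenEffectSize.
    pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2)
    return diff / math.sqrt(pooled_var)


# Made-up values for illustration only; these are not NSFG data.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

d_ab = cohen_effect_size(group_a, group_b)
d_ba = cohen_effect_size(group_b, group_a)
print("d(a, b) = %.3f" % d_ab)  # negative: group_a has the smaller mean
print("d(b, a) = %.3f" % d_ba)  # same magnitude, opposite sign

# Ratio of the magnitudes reported above for weight and pregnancy length.
ratio = abs(-0.089) / abs(0.029)
print("weight effect is about %.1f times the pregnancy-length effect" % ratio)
```

Swapping the argument order flips the sign of d but not its magnitude, which is why the solution compares absolute values when setting the weight effect against the pregnancy-length effect.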