Define Dataset: State Incarceration (Crime and Justice) #38

Open
emily878 opened this issue Mar 6, 2015 · 7 comments

Comments

@emily878
Contributor

emily878 commented Mar 6, 2015

Define the essential substantive elements of the core State Incarceration dataset. What are the components that it must minimally include? Do we have a dataset that we could hold up as a model?

@emily878
Contributor Author

Hey, @beccasjames - could you help us out with a list of minimal necessary data elements?

@beccasjames

For one particular state incarceration dataset, inmate population data, the core requirements are that it should:

  • be regularly updated and archived, daily or weekly preferred
  • include the number of inmates in each facility
  • include the percentage of capacity at which each facility is operating
  • include the numerical and percent change in population from the same time period of the previous year
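
To make the last two items concrete, here is a minimal Python sketch of how those derived figures could be computed from raw counts. The field names (`population`, `design_capacity`, `population_prior_year`) are hypothetical placeholders rather than anything a state is known to publish, and the sketch assumes "capacity" means a single design-capacity number per facility.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FacilityCount:
    """One facility's count for one reporting period (hypothetical layout)."""
    facility: str
    population: int             # inmates held on the report date
    design_capacity: int        # assumed meaning of "capacity"
    population_prior_year: int  # count for the same period one year earlier


def percent_of_capacity(row: FacilityCount) -> Optional[float]:
    """Population as a percentage of capacity, or None if capacity is zero."""
    if row.design_capacity == 0:
        return None
    return 100.0 * row.population / row.design_capacity


def year_over_year_change(row: FacilityCount) -> Tuple[int, Optional[float]]:
    """Numerical and percent change versus the same period of the prior year."""
    diff = row.population - row.population_prior_year
    if row.population_prior_year == 0:
        return diff, None  # percent change is undefined without a baseline
    return diff, 100.0 * diff / row.population_prior_year
```

Publishing the raw counts and letting users derive the percentages this way would also satisfy the requirement, as long as capacity and the prior-year count are both in the file.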

A model (or "token") dataset can be found at the California Department of Corrections and Rehabilitation (CDCR). They produce weekly and monthly population reports for both inmate and parole populations, including an extensive archive: http://www.cdcr.ca.gov/Reports_Research/Offender_Information_Services_Branch/Population_Reports.html

Further, an ideal inmate population dataset would:

  • be in a machine-readable format (.csv)
  • include a breakdown of how many inmates are in maximum, medium and minimum security units, as well as how many are in solitary confinement
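
As a rough illustration of what a machine-readable file along these lines might look like, the sketch below writes and re-reads one record with Python's standard `csv` module. The column names and the sample values are invented for illustration only; no state is known to publish this exact layout.

```python
import csv
import io

# Hypothetical column layout; not taken from any state's published data.
FIELDS = [
    "report_date",
    "facility",
    "population",
    "design_capacity",
    "maximum_security",
    "medium_security",
    "minimum_security",
    "solitary_confinement",
]
INT_FIELDS = FIELDS[2:]  # every column after the date and facility name


def write_population_csv(rows, out):
    """Write population records (dicts keyed by FIELDS) as CSV to `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_population_csv(src):
    """Read records back, converting the count columns to integers."""
    records = []
    for row in csv.DictReader(src):
        for name in INT_FIELDS:
            row[name] = int(row[name])
        records.append(row)
    return records


# Invented sample record, for illustration only.
SAMPLE = {
    "report_date": "2015-03-04",
    "facility": "Example Facility",
    "population": 1500,
    "design_capacity": 1000,
    "maximum_security": 300,
    "medium_security": 700,
    "minimum_security": 500,
    "solitary_confinement": 40,
}

buffer = io.StringIO()
write_population_csv([SAMPLE], buffer)
buffer.seek(0)
records = read_population_csv(buffer)
```

Solitary confinement is kept as its own column rather than a fourth security level, since those inmates may also be counted under one of the three security classifications.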

So far I have not found a state that fulfills all of these requirements. If I find one, I will post an update here.

@waldoj
Contributor

waldoj commented Mar 20, 2015

Is it desirable, or even possible, to have identifiable, per-prisoner granularity?

@emily878
Contributor Author

Becca and I talked about that and I personally don't think we want that as our first cut at a dataset. It will increase the visibility of people's PII in a way that I think will be problematic for the project.

Emily Shaw
National Policy Manager | Sunlight Foundation
@emilydshaw http://twitter.com/emilydshaw

@beccasjames

Echoing Emily here, the PII shared with inmate-level micro-data is potentially problematic. A few states actually do produce extensive, machine-readable datasets with inmate-level micro-data. If you're interested in what those look like, see examples below:

@waldoj
Contributor

waldoj commented Mar 20, 2015

Got it—thank you!

@waldoj
Contributor

waldoj commented Mar 20, 2015

That Nebraska data is the weirdest thing. It's an Excel spreadsheet with two worksheets—one with 60,000 records, one with a suspicion-inducing 65,535—that contain just one row, with one number in each row. I feel a bit like I just bought a hard drive at Best Buy, got it home, opened the box, and found only a brick inside.
