First sensor data loads #1

Open
3 tasks done
jaketbouma opened this issue Sep 27, 2018 · 4 comments
@jaketbouma
Contributor

jaketbouma commented Sep 27, 2018

First data loads and investigation

As a data engineer, I would like to investigate and consolidate the initial raw CSV files provided on Google Drive, so that it is easier for others to get started with exploration and analysis of the data.

Definition of done

  • Generate an inventory of files
  • Document work and make a "getting started" notebook / sandbox directory
  • Generate summary statistics (number of measurements, sample rate)
  • [ ] Chase down metadata for available fields (There is no metadata, but the fields are mostly well formed)
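
A file inventory like the one in the first task could be sketched roughly as below. This is only an illustration using the standard library, not the code from the notebooks; the function name `inventory_csv_files` and the assumption that every file has a single header row are hypothetical.

```python
import csv
from pathlib import Path


def inventory_csv_files(root):
    """Return one summary dict per CSV file found under `root`."""
    rows = []
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            # Assumption: the first row of each file is the header.
            header = next(reader, [])
            # Every remaining row is counted as one measurement record.
            n_records = sum(1 for _ in reader)
        rows.append({
            "file": str(path.relative_to(root)),
            "size_bytes": path.stat().st_size,
            "columns": header,
            "n_records": n_records,
        })
    return rows
```

Calling `inventory_csv_files("path/to/synced/data")` on a local copy of the files would give a list that can be printed or loaded into a DataFrame for a quick overview.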

Next steps

  • Data quality (check for gaps in sensors)
  • Data exploration
  • Baseline modelling
@jaketbouma jaketbouma self-assigned this Sep 27, 2018
This was referenced Sep 28, 2018
@jaketbouma
Contributor Author

[Screenshot, 2018-10-02 10:42:04]

Wrote the data out in multiple formats. The notebook 20180928 - Merge local data.ipynb syncs down the data and merges all the different formats.
Some issues may still have crept through; we'll hopefully spot them as we go.

To get started analyzing the data, you can pull up 20181002 - Getting started with prepared datasets.ipynb

@jaketbouma
Contributor Author

Summary statistics

from 20181002 - Getting started with prepared datasets.ipynb

Column fill

Number of nulls per column

Air temp           0
Humidity           0
Water level        0
Water temp         0
EC                 0
pH                 1
CO2             8589
DO             37185
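
For anyone wanting to reproduce a per-column null count on a raw file, here is a minimal sketch using only the standard library. It is not the notebook's code (with pandas the same idea is `df.isna().sum()`). The `NULL_MARKERS` set and the sample rows are made up for illustration; the real files may encode missing values differently.

```python
import csv
import io

# Hypothetical markers treated as "null"; adjust to match the real files.
NULL_MARKERS = {"", "nan", "null", "na"}


def count_nulls(csv_file):
    """Count empty or null-like cells per column in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            value = row.get(name)
            # Short rows yield None; blank or marker strings also count as null.
            if value is None or value.strip().lower() in NULL_MARKERS:
                counts[name] += 1
    return counts


# Made-up sample rows, not the project data.
sample = io.StringIO(
    "Air temp,pH,CO2\n"
    "21.5,6.1,\n"
    "21.7,,410\n"
    "21.6,6.0,\n"
)
null_counts = count_nulls(sample)
```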

Number of records

image
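
The record counts and sample rate mentioned in the definition of done could be computed along these lines. This is a sketch under the assumption that each record carries a parsed timestamp; the helper names `records_per_day` and `median_interval` are hypothetical and the sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import median


def records_per_day(timestamps):
    """Count how many records fall on each calendar day."""
    return Counter(ts.date() for ts in timestamps)


def median_interval(timestamps):
    """Median spacing between consecutive records (needs at least two)."""
    ordered = sorted(timestamps)
    deltas = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return timedelta(seconds=median(deltas))


# Made-up readings, one per minute across midnight.
start = datetime(2018, 9, 1, 23, 58)
readings = [start + timedelta(minutes=m) for m in range(4)]
per_day = records_per_day(readings)
typical_step = median_interval(readings)
```

The median is used rather than the mean so that a few long outages do not distort the estimated sampling interval.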

Dead times

image
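
One possible way to find dead times is to flag every pair of consecutive readings that are further apart than some threshold. The sketch below assumes that definition; the function name `find_dead_times`, the 5-minute threshold, and the sample timestamps are all hypothetical and may not match how the plot above was produced.

```python
from datetime import datetime, timedelta


def find_dead_times(timestamps, max_gap):
    """Return (start, end, duration) for each gap longer than `max_gap`."""
    ordered = sorted(timestamps)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        delta = current - previous
        if delta > max_gap:
            gaps.append((previous, current, delta))
    return gaps


# Made-up readings: one per minute, with a 30-minute hole.
start = datetime(2018, 9, 1, 0, 0)
readings = [start + timedelta(minutes=m) for m in (0, 1, 2, 32, 33)]
dead = find_dead_times(readings, max_gap=timedelta(minutes=5))
```

Running the same check per sensor column (after dropping that column's nulls) would also cover the "check for gaps in sensors" item under next steps.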

Conclusion

Data looks more than good enough to get going.

jaketbouma added a commit that referenced this issue Oct 9, 2018
@Patechoc
Member

Patechoc commented Oct 9, 2018

There are actually many more months of measurements than I thought ;)

@jaketbouma
Contributor Author

☝️ See the corrected plots above
