playground

Miscellaneous scripts and notebooks for data scraping, visualisation, statistics, machine learning.

Statistics and mathematics

algebra_fitting_polynomials - ordinary and generalised least squares with a full data covariance matrix, and the covariance of the best-fit result
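As an illustration of the generalised least-squares case, here is a minimal sketch (not the notebook's actual code). It assumes the data covariance matrix `cov` is known and positive definite; the function name `gls_polyfit` and the toy data are made up for this example.

```python
import numpy as np

def gls_polyfit(x, y, cov, degree):
    """Generalised least-squares polynomial fit with a full data covariance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: columns are x**0, x**1, ..., x**degree.
    A = np.vander(x, degree + 1, increasing=True)
    # Apply C^-1 by solving linear systems rather than inverting C explicitly.
    Cinv_A = np.linalg.solve(cov, A)
    Cinv_y = np.linalg.solve(cov, y)
    # Covariance of the best-fit coefficients: (A^T C^-1 A)^-1.
    coeff_cov = np.linalg.inv(A.T @ Cinv_A)
    # Best-fit coefficients: (A^T C^-1 A)^-1 A^T C^-1 y.
    coeffs = coeff_cov @ (A.T @ Cinv_y)
    return coeffs, coeff_cov

# Toy data (hypothetical): an exact line y = 1 + 2x with correlated errors.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
cov = 0.04 * np.eye(4) + 0.01  # 0.05 on the diagonal, 0.01 off the diagonal
coeffs, coeff_cov = gls_polyfit(x, y, cov, degree=1)
print(coeffs)
print(coeff_cov)
```

Setting `cov` to a diagonal matrix reduces this to weighted least squares, and to ordinary least squares when all diagonal entries are equal.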

algebra_mean_with_correlations - spelled-out math to compute the mean (and the associated covariance matrix describing its uncertainty) for data with correlated errors
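For the simplest case of scalar measurements of one quantity, the mean and its variance follow from the data covariance matrix C as mean = (1ᵀ C⁻¹ x) / (1ᵀ C⁻¹ 1) and variance = 1 / (1ᵀ C⁻¹ 1). The sketch below assumes that scalar setting only (the notebook may treat the general vector case); `correlated_mean` is a name chosen for this example.

```python
import numpy as np

def correlated_mean(x, cov):
    """Mean of measurements x with full covariance cov, and its variance."""
    x = np.asarray(x, dtype=float)
    ones = np.ones_like(x)
    # w = C^-1 1, obtained by solving a linear system.
    w = np.linalg.solve(cov, ones)
    variance = 1.0 / (ones @ w)
    # C is symmetric, so w @ x equals 1^T C^-1 x.
    mean = variance * (w @ x)
    return mean, variance

x = [1.0, 3.0]
# Same two measurements, without and with a positive correlation.
mean_ind, var_ind = correlated_mean(x, np.array([[1.0, 0.0], [0.0, 1.0]]))
mean_cor, var_cor = correlated_mean(x, np.array([[1.0, 0.5], [0.5, 1.0]]))
print(mean_ind, var_ind)
print(mean_cor, var_cor)
```

With a positive correlation the two measurements carry less independent information, so the variance of the mean is larger than in the uncorrelated case.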

prior_likelihood_conflict - illustration of the combination of Gaussians and t-distributions (as likelihoods, or prior-likelihood), showing the importance of the behaviour of the tails.

statistical_copulae - intuitive introduction to empirical copulae as joint cumulative distributions, examples of Gaussian, Student-t, and actual APOGEE data.

maximum_likelihood_parallax - brute-force illustration of estimating the mean and intrinsic dispersion of a 1D distribution of points with individual measurement errors
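A brute-force version of this estimate can be sketched as a grid search over the mean and intrinsic dispersion. The example assumes Gaussian intrinsic scatter and Gaussian measurement errors, so each point has total variance sigma² + err_i². The simulated data, grid ranges, and variable names are all hypothetical.

```python
import numpy as np

def log_likelihood(mu, sigma, x, err):
    """Gaussian log-likelihood with intrinsic dispersion plus per-point errors."""
    var = sigma**2 + err**2
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

# Simulated data (hypothetical): true mean 5, true intrinsic dispersion 1.
rng = np.random.default_rng(42)
true_mu, true_sigma = 5.0, 1.0
err = rng.uniform(0.2, 0.8, size=500)
x = rng.normal(true_mu, np.sqrt(true_sigma**2 + err**2))

# Brute force: evaluate the log-likelihood on a regular (mu, sigma) grid.
mu_grid = np.linspace(4.0, 6.0, 101)
sigma_grid = np.linspace(0.02, 2.0, 100)
logL = np.array(
    [[log_likelihood(m, s, x, err) for s in sigma_grid] for m in mu_grid]
)

# The maximum-likelihood estimate is the grid point with the highest value.
i, j = np.unravel_index(np.argmax(logL), logL.shape)
mu_hat, sigma_hat = mu_grid[i], sigma_grid[j]
print(mu_hat, sigma_hat)
```

The grid spacing limits the precision of the estimate; an optimiser would be faster, but the grid makes the shape of the likelihood surface easy to plot.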

gaussian_processes_for_dummies

Astronomy

composite_Gaia_cmd - astroquery for Vizier, twinx and twiny axes, rcParams options

cube_galaxy_sky - maximum likelihood, forward modelling using a selection function

gmm_ngc_2506 - query Gaia data, use sklearn to fit a GMM to the proper motions, pick cluster members

dbscan_ngc_2506 - clustering of NGC 2506 in proper motion space using DBSCAN and HDBSCAN

gaia_archive_queries - examples of simple and advanced ADQL queries and programmatic access to the Gaia archive

Data science and visualisation

Examples using Python libraries for data scraping, cleaning, visualisation, processing.

geotagged_photos_venezia - FlickrAPI, convert date+time strings to datetime objects, get dayofyear, dayofweek etc.
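The date handling can be sketched with the standard library alone. This is an assumption-laden example: the notebook may well use pandas instead, and the timestamp format `"%Y-%m-%d %H:%M:%S"` and the helper name `describe_timestamp` are assumed here, not taken from the notebook.

```python
from datetime import datetime

def describe_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a date+time string and extract a few calendar fields."""
    dt = datetime.strptime(text, fmt)
    return {
        "datetime": dt,
        "dayofyear": dt.timetuple().tm_yday,  # 1 to 366
        "dayofweek": dt.weekday(),            # Monday = 0 ... Sunday = 6
        "hour": dt.hour,
    }

# Hypothetical timestamp in the assumed format.
info = describe_timestamp("2019-07-14 18:30:05")
print(info)
```

The `dayofweek` convention (Monday = 0) matches that of pandas' `dt.dayofweek`, so the fields can be compared directly if the data are later loaded into a DataFrame.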

google_trends - use pytrends package to plot search volumes

ads_citations_to_gaia_papers - use the ADS API and pandas to display monthly citations to Gaia data release papers. Updated daily at this repo, including a standalone HTML page generated with plotly.

hierarchical_clustering_sp500 - pandas, pct_change, pivot, correlations. scipy, hierarchical clustering, dendrograms.

nba_shot_charts_hexbin - manipulate matplotlib hexbin plots, tweak bin size and color

rookies_bball_ref_2024_race - urllib and BeautifulSoup to get html tags from a table, pandas to read tables from multiple pages

scrape_all_rookie_stats - pandas to read tables, flatten nested columns
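Tables read from HTML pages with two header rows usually arrive with two-level (MultiIndex) columns. A minimal sketch of flattening them follows; the table contents, the `Unnamed` placeholder label, and the helper `flatten_columns` are hypothetical stand-ins, not the notebook's actual data or code.

```python
import pandas as pd

# Hypothetical two-level header, similar to what a scraped stats table may have.
columns = pd.MultiIndex.from_tuples(
    [
        ("Unnamed: 0_level_0", "Player"),
        ("Per Game", "PTS"),
        ("Per Game", "AST"),
    ]
)
df = pd.DataFrame([["A. Rookie", 12.3, 4.5]], columns=columns)

def flatten_columns(frame):
    """Return a copy of frame with two-level columns joined into single names."""
    flat = []
    for top, bottom in frame.columns:
        # Drop placeholder top-level labels, otherwise join the two levels.
        if top.startswith("Unnamed"):
            flat.append(bottom)
        else:
            flat.append(f"{top}_{bottom}")
    out = frame.copy()
    out.columns = flat
    return out

flat_df = flatten_columns(df)
print(list(flat_df.columns))
```

Working on a copy keeps the original nested frame available, which helps when the top-level labels are still needed for grouping.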

google_trends_eclipse_2024 - plotly choropleth maps, with export to a standalone HTML page (interactive version HERE)

max_temp_three_cities - package meteostat for historical weather, cmasher for the cool colour map, colorbar outside the subplots.

bar_chart_color_rain - package meteostat for historical weather, glow, font, legend.