forked from Jman4190/nba-sql
Home
Matthew Pope edited this page Mar 6, 2021 · 3 revisions
This is the nba-sql database.
The project grew out of the desire for a free NBA dataset that is queryable using SQL. Existing solutions, like nba_api, looked interesting and feature rich but had several issues. Existing databases are built from similar data but are hidden behind paywalls. Providing the code (and potentially only the code) to build such a database is preferable to a centrally hosted database with paywalled access. Existing websites are rich in features, but doing analysis on them is extremely cumbersome. Their data may go back further, but it is impossible to use that data with tools like Apache Superset or Tableau.
- Reduce data duplication.
  - The NBA APIs return some data items excessively. I can only assume this is to reduce the number of requests required to populate their webpage. Things like `player_name`, `age`, `team_name`, etc. are returned with most API requests. If included in the database this would require extra space. So these values are abstracted away into general `player` and `team` tables.
- Efficient indexing.
  - We want to be able to query this data fast and, as a side effect of the first goal, to include only unique data. We use composite primary keys in several places, which places strict uniqueness constraints on the data.
- Ease of use.
  - If our current schema poses issues, please file an issue. An open discussion of how this data is organized is welcome.
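
The first two goals can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module and an in-memory database purely for demonstration. The `player_game_log` table, its columns, and the sample rows are hypothetical and are not taken from the actual nba-sql schema; only the idea of general `player` and `team` tables and of composite primary keys comes from the goals above.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Names are stored once, in general lookup tables.
CREATE TABLE player (
    player_id   INTEGER PRIMARY KEY,
    player_name TEXT NOT NULL
);
CREATE TABLE team (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL
);
-- Hypothetical stats table: one row per player per game.
-- It stores only ids, and the composite primary key
-- rejects a second row for the same (player, game) pair.
CREATE TABLE player_game_log (
    player_id INTEGER NOT NULL REFERENCES player(player_id),
    game_id   INTEGER NOT NULL,
    team_id   INTEGER NOT NULL REFERENCES team(team_id),
    pts       INTEGER,
    PRIMARY KEY (player_id, game_id)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO player VALUES (1, 'Example Player')")
conn.execute("INSERT INTO team VALUES (10, 'Example Team')")
conn.execute("INSERT INTO player_game_log VALUES (1, 100, 10, 25)")
conn.execute("INSERT INTO player_game_log VALUES (1, 101, 10, 31)")

# A duplicate (player_id, game_id) violates the composite key.
try:
    conn.execute("INSERT INTO player_game_log VALUES (1, 100, 10, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Names are recovered with joins instead of being repeated per row.
rows = conn.execute("""
    SELECT p.player_name, t.team_name, SUM(g.pts)
    FROM player_game_log g
    JOIN player p ON p.player_id = g.player_id
    JOIN team t ON t.team_id = g.team_id
    GROUP BY p.player_name, t.team_name
""").fetchall()

print(duplicate_rejected, rows)
```

The trade-off in this kind of layout is that readable names require a join, in exchange for smaller stats tables and uniqueness enforced by the database itself.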
I'm not very good at organizing wikis, so check the sidebar for available pages.