* datasette
This repository contains everything you need to deploy SQLite databases to dokku via https://datasette.io/.
- All files matching `dbs/*.db` are exposed in the interface. If all you want is to make a database accessible through the web, simply add it there.
- You can edit `metadata.yml` to add descriptions, link to sources, and define canned queries.
- `requirements.txt` lists additional packages such as plugins. Add them manually and without a version string, because Python package management is kind of messy.
- The `CHECKS` file contains a URL and a string expected in that URL's HTTP response. It is used as a post-deployment sanity check, and you shouldn't need to change it.
- Optional: if anything should happen after deployment, edit the `bin/post_compile` script. You can use this, for example, to fetch data from other sources.
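For illustration, a hypothetical `metadata.yml` adding a description, a source link, and a canned query might look like this (the database, table, and query names here are made up, not taken from this repository):

#+begin_src yaml
title: Climate datasets
databases:
  climate:
    description: Cumulative CO2 emissions and related data
    source_url: https://ourworldindata.org/
    queries:
      emissions_in_2000:
        sql: SELECT * FROM cumulative_co2_emissions WHERE year = 2000
#+end_src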
The databases are assumed to be immutable or read-only. This allows us to cache efficiently by configuring nginx as a caching reverse proxy and serving content from a static cache. Effectively, queries often only have to be run once after a database has changed and are afterwards served from a file-system cache.
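One way to realize this is nginx's `proxy_cache` machinery. The following is a minimal sketch of the idea, not this repository's actual configuration; the cache path and the upstream port are assumptions:

#+begin_src conf
# Cache responses on disk; 10 MB key zone, up to 1 GB of cached data.
proxy_cache_path /var/cache/nginx/datasette levels=1:2 keys_zone=datasette:10m max_size=1g;

server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:8001;  # datasette upstream (port assumed)
        proxy_cache datasette;
        # Databases only change on redeploy, so cache hits can be
        # served for a long time without revalidation.
        proxy_cache_valid 200 7d;
    }
}
#+end_src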
** Database Setup
This section aims to contain all the information needed to convert the data from their respective source files into SQLite databases, which should make updating easier when the sources change.
*** Use snake_case for all file names
#+begin_src bash
cd sources
for file in $(fd --type f); do
  # Keep the file in its directory; basename alone would move it to the cwd.
  mv "$file" "$(dirname "$file")/$(basename "$file" | sed 's/-/_/g')"
done
#+end_src
#+RESULTS:
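To preview the renames before actually running the loop above, you can print the planned moves instead of executing them (same assumptions: run from the repository root, with `fd` installed):

#+begin_src bash
cd sources
for file in $(fd --type f); do
  # Print the planned rename instead of performing it.
  echo "$file -> $(dirname "$file")/$(basename "$file" | sed 's/-/_/g')"
done
#+end_src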
*** Cumulative CO2 emissions
Source: https://ourworldindata.org/grapher/cumulative-co-emissions
#+begin_src bash :results output replace
csvs-to-sqlite \
  --shape 'Entity:entity(text),Code:code(text),Year:year(integer),Cumulative CO2 emissions:cumulative_co2_emissions(real)' \
  --extract-column entity,code \
  --index year \
  --replace-tables \
  sources/cumulative_co2_emissions.csv dbs/climate.db
#+end_src
#+RESULTS:
: extract_columns=('entity,code',)
: Loaded 1 dataframes
: Added 1 CSV file to dbs/climate.db
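As a quick sanity check after the import, you can query the freshly built table from the command line. The table name `cumulative_co2_emissions` is assumed here, following csvs-to-sqlite's default of naming the table after the CSV file:

#+begin_src bash
# Should print a non-zero row count if the import succeeded.
sqlite3 dbs/climate.db "SELECT COUNT(*) FROM cumulative_co2_emissions;"
#+end_src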
** Pitfalls
*** My queries are slow, what do I do?
Make sure SQLite uses the correct indices. You can debug this by writing
#+begin_src sql
EXPLAIN QUERY PLAN SELECT …
#+end_src
and then continuing your `SELECT` query like you normally would.
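As a self-contained sketch (throwaway database in =/tmp=, illustrative table and column names), this shows how adding an index changes the reported plan from a full table scan to an index search:

#+begin_src bash
rm -f /tmp/plan_demo.db
sqlite3 /tmp/plan_demo.db "
CREATE TABLE emissions (year INTEGER, value REAL);
EXPLAIN QUERY PLAN SELECT value FROM emissions WHERE year = 2000;  -- expect a SCAN here
CREATE INDEX idx_emissions_year ON emissions (year);
EXPLAIN QUERY PLAN SELECT value FROM emissions WHERE year = 2000;  -- expect SEARCH ... USING INDEX
"
#+end_src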