Initial commit (commit 44ca5ebccf)
8 changed files with 23649 additions and 0 deletions
README.org: 54 lines, new file
* datasette

This repository contains everything you need to deploy SQLite databases to Dokku via https://datasette.io/.
- All files matching `db/*.db` will get exposed in the interface. Simply add your new database if all you want to do is make it accessible through the web.
- You can edit `metadata.yml` to add descriptions, link to sources, and add predefined queries.
- The `requirements.txt` lists additional packages such as plugins. Please add them manually and without a version string, because Python package management is kind of messy.
- The `CHECKS` file contains a URL and a string expected in that URL's HTTP response. This is used as a post-deployment sanity check, and you shouldn't need to change it.
- Optional: If anything should happen after deployment, edit the `bin/post_compile` script. You can use this to fetch data from other sources, for example.
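As a rough illustration, a `metadata.yml` with a description, a source link, and a predefined query might look like the sketch below. The database name, query name, and SQL are made-up placeholders, not the actual contents of this repository; see the Datasette metadata documentation for the full schema.

#+begin_src yaml
# Hypothetical example — all names are placeholders.
title: datasette on dokku
databases:
  climate:
    description: CO2 emissions data
    source_url: https://ourworldindata.org/
    queries:
      top_emitters:
        sql: |-
          select entity, max(cumulative_co2_emissions) as total
          from cumulative_co2_emissions
          group by entity
          order by total desc
          limit 10
#+end_src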
The databases are assumed to be immutable or read-only. This allows efficient caching: nginx is configured as a caching reverse proxy and serves content from a static cache. Effectively, queries often only have to be run the first time after the database has changed and are afterwards served from a file-system cache.
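The caching setup described here could look roughly like the following nginx snippet. This is a minimal sketch, not the exact configuration this repository deploys; the cache path, zone name, upstream port, and cache lifetime are all assumptions.

#+begin_src nginx
# Hypothetical sketch: cache responses from the datasette upstream on disk.
proxy_cache_path /var/cache/nginx/datasette levels=1:2
                 keys_zone=datasette_cache:10m max_size=1g inactive=7d;

server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:8001;   # assumed datasette port
        proxy_cache datasette_cache;
        proxy_cache_valid 200 7d;           # serve cached 200s for up to a week
        add_header X-Cache-Status $upstream_cache_status;
    }
}
#+end_src

Because the databases never change between deployments, a long `proxy_cache_valid` is safe: each distinct query URL is computed at most once per deploy.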
** Database Setup
This section aims to contain all the information needed to convert data from their respective source files to SQLite databases. This should make the updating process easier when the sources change.
*** Use snake_case for all file names
#+begin_src bash
cd sources
for file in $(fd --type f); do
  mv "$file" "$(basename "$file" | sed 's/-/_/g')"
done
#+end_src
#+RESULTS:
*** Cumulative CO2
Source: https://ourworldindata.org/grapher/cumulative-co-emissions
#+begin_src bash :results output replace
csvs-to-sqlite \
  --shape 'Entity:entity(text),Code:code(text),Year:year(integer),"Cumulative CO2 emissions":cumulative_co2_emissions(real)' \
  -c entity -c code -i year \
  sources/cumulative_co2_emissions.csv dbs/climate.db
#+end_src
#+RESULTS:
: extract_columns=('entity', 'code')
: Loaded 0 dataframes
: Added 1 CSV file to dbs/climate.db
** Pitfalls
*** My Queries are slow, what do I do?
Make sure SQLite uses the correct indices. You can debug this by writing
#+begin_src sql
EXPLAIN QUERY PLAN SELECT …
#+end_src
and then continuing your `SELECT` query like you normally would.
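For example, on a throwaway in-memory database (the table and index names below are invented purely for illustration, not taken from this repository's data), the plan shows whether a query hits an index:

#+begin_src bash
# Hypothetical table and index, purely for illustration.
sqlite3 ":memory:" "
CREATE TABLE co2 (entity TEXT, year INTEGER, value REAL);
CREATE INDEX idx_co2_year ON co2 (year);
EXPLAIN QUERY PLAN SELECT * FROM co2 WHERE year = 2000;
"
#+end_src

A plan line containing `USING INDEX idx_co2_year` means the lookup uses the index; a `SCAN` over the table instead suggests a missing or unusable index.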