Introduction
Using the Django ORM to solve the Hanukkah of Data Challenge turned out to be a great way to practice Django queries. You can view my repo on GitHub.
Setting up Initial Database
We want to use the existing SQLite file in the data folder.
The SQLite file does not contain any constraints or indices. It may have been generated from the CSV by an automated tool. As we will see, we need to add these ourselves to get performant queries.
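To confirm the missing constraints and indices before touching Django, the file can be inspected with Python's built-in sqlite3 module. The following is a minimal sketch, not part of the original repo: the describe_tables helper is hypothetical, and an in-memory database with an assumed table name (customers) stands in for the real file in the data folder.

```python
import sqlite3


def describe_tables(conn):
    """Return {table: {"columns": [...], "indexes": [...]}} for every user table."""
    report = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # PRAGMA index_list rows: (seq, name, unique, origin, partial)
        indexes = conn.execute(f'PRAGMA index_list("{table}")').fetchall()
        report[table] = {
            # (column name, declared type, NOT NULL?, part of primary key?)
            "columns": [(c[1], c[2], bool(c[3]), bool(c[5])) for c in columns],
            "indexes": [i[1] for i in indexes],
        }
    return report


# Stand-in for the real file: a table created without any constraints.
# The table name "customers" is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customerid INTEGER, name TEXT, address TEXT, birthdate TEXT)"
)

report = describe_tables(conn)
for table, info in report.items():
    print(table, info)
```

To run it against the real data, replace ":memory:" with the path to the SQLite file and drop the CREATE TABLE statement. An empty indexes list and no column flagged as NOT NULL or primary key would match the description above.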
Django needs model definitions before we can use its ORM. We can use the inspectdb management command to generate them from the shape of the existing data:
python manage.py inspectdb > models.py
By default it will have a very raw representation e.g.
class Customer(models.Model):
    customerid = models.IntegerField(blank=True, null=True)
    name = models.TextField(blank=True, null=True)
    address = models.TextField(blank=True, null=True)
    birthdate = models.TextField(blank=True, null=True)
We can tidy this up:
- Remove blank / null options where we know data exists
- Set customerid as the primary key to avoid an automatic id column (note the original DB does not have a PK, but Django requires one)
- Use a CharField as we know the length of the fields
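class Customer(models.Model):
    customerid = models.IntegerField(primary_key=True)
    # max_length is omitted here; whether Django accepts a CharField without it
    # depends on the Django version and database backend.
    name = models.CharField()
    address = models.CharField()
    birthdate = models.CharField()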
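Each of those changes rests on an assumption about the data: no NULLs, unique customerid values, and bounded text lengths. A rough way to check them with plain sqlite3 is sketched below. The helper names are hypothetical, the table name customers is assumed, and the rows are made-up sample data rather than values from the challenge.

```python
import sqlite3


def null_counts(conn, table, columns):
    """Count NULLs per column; zero suggests null=True can be dropped."""
    return {
        col: conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        ).fetchone()[0]
        for col in columns
    }


def duplicate_values(conn, table, column):
    """Return values that occur more than once; must be empty for a primary key."""
    rows = conn.execute(
        f'SELECT "{column}" FROM "{table}" '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1'
    ).fetchall()
    return [row[0] for row in rows]


def max_lengths(conn, table, columns):
    """Longest stored value per text column, as a guide for CharField sizes."""
    return {
        col: conn.execute(
            f'SELECT MAX(LENGTH("{col}")) FROM "{table}"'
        ).fetchone()[0]
        for col in columns
    }


# Made-up sample rows standing in for the real file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customerid INTEGER, name TEXT, address TEXT, birthdate TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Example One", "1 Sample St", "1980-01-01"),
        (2, "Example Two", "2 Sample St", "1990-02-02"),
    ],
)

columns = ["customerid", "name", "address", "birthdate"]
nulls = null_counts(conn, "customers", columns)
dupes = duplicate_values(conn, "customers", "customerid")
lengths = max_lengths(conn, "customers", ["name", "address", "birthdate"])
print(nulls, dupes, lengths)
```

If any NULL count is non-zero or duplicates come back, the corresponding model change would need rethinking before generating a migration.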
inspectdb creates unmanaged models by default (see the Django docs). That is, it sets managed = False in each model’s Meta class. This tells Django not to manage each table’s creation, modification, and deletion. We will want to modify the tables in future, so we remove this line.
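Now we need to generate a migration, but we don't want to delete the existing database. We can pass the --fake-initial flag to migrate, which marks an app's initial migration as applied without running it when the tables it would create already exist. Note that the schema in the database will now not exactly match the models, because the faked migration never applied the changes we declared (such as the primary key).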
python manage.py makemigrations
python manage.py migrate --fake-initial
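One practical consequence of the mismatch is that the database itself still does not enforce what the model declares. The sketch below illustrates this with plain sqlite3 under stated assumptions: customers mimics the original unconstrained table, while customers_managed is a simplified, hypothetical version of what a table created with a real primary key would look like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table as it might exist in the original file: no PRIMARY KEY constraint.
conn.execute(
    "CREATE TABLE customers "
    "(customerid INTEGER, name TEXT, address TEXT, birthdate TEXT)"
)
# Simplified, hypothetical table with the constraints actually applied.
conn.execute(
    "CREATE TABLE customers_managed "
    "(customerid INTEGER NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
)


def try_duplicate_insert(table):
    """Insert the same customerid twice and report whether SQLite rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} (customerid, name) VALUES (1, 'a')")
        conn.execute(f"INSERT INTO {table} (customerid, name) VALUES (1, 'b')")
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"


original_result = try_duplicate_insert("customers")
managed_result = try_duplicate_insert("customers_managed")
print(original_result, managed_result)  # prints "accepted rejected"
```

Until a later migration rebuilds the table, uniqueness of customerid in the original file relies on the data already being clean rather than on the database.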