Express with Knex
Pre-reqs
- Install PostgreSQL with
brew install postgresql
- Download Postico and get it installed.
- Given what you know about the structure of relational databases, what is happening in these lines of code? What about these lines? _Diagram it out in your journal_
Goals
By the end of this lesson, you will:
- Understand how to create database migrations and seed files using knex
- Understand how to create and retrieve data within a database using knex
What is Knex?
Straight from the docs, Knex.js is a “batteries included” SQL query builder for Postgres, MSSQL, MySQL, MariaDB, SQLite3, and Oracle designed to be flexible, portable, and fun to use. It features both traditional node style callbacks as well as a promise interface for cleaner async flow control, a stream interface, full featured query and schema builders, transaction support (with savepoints), connection pooling and standardized responses between different query clients and dialects.
In practice, Knex lets us write JavaScript instead of raw SQL.
Getting Started
Make sure Postgres is installed and running. We will prepare our app by creating a single database in Postgres. Don’t forget the semicolon at the end of the CREATE DATABASE command! The name of our database will be “publications”.
$ psql
CREATE DATABASE publications;
To create a new directory and navigate into it, run the following command:
mkdir my-project && cd my-project
Next, initialize a new npm project with the default settings using the command:
npm init --yes
Install knex globally (so the knex CLI is available across projects) and as a project dependency, along with the pg (PostgreSQL) driver as a project dependency:
npm i -g knex
npm i knex --save
npm i pg --save
We will use a knexfile to configure our database for all of our environments. Create that file using the below command with some default values:
$ knex init
Created ./knexfile.js
Configuring the Database
This generated a really nice configuration file that we are going to tweak. Take a look at what it stubbed in for us:
Multiple Environments
You’ll notice at the top level of this file we have three keys: development, staging, and production. These each represent different environments we may be running our app in. Our database setup is going to vary slightly between environments. We’ll see how in just a bit.
Let’s get rid of the staging and production environment configurations for now and just focus on the development one.
Client
First we will change our development environment to use Postgres instead of the default sqlite3. This is considered the ‘client’, and because Knex supports multiple clients, we want to explicitly tell it to use postgres:
client: 'pg'
Connection
Next we’ll update this connection value. Every database system has a slightly different way of establishing a connection. We have postgres running on our machines, and we need to connect to it by using what’s called a connection string. (You’ll notice the stubbed-in sqlite3 required a connection file.) The format for this string is just:
postgres://localhost/<name of database>
Let’s name our database ‘publications’ and update that configuration value.
Note: think back to when we had that staging and production environment configuration in here. Do you think this connection string would work when we deploy our apps to production? Which part would have to change?
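To make the note above concrete, here is a minimal sketch (plain JavaScript, no knex required) of how the connection string might differ between environments. The production URL and the DATABASE_URL variable are assumptions about a typical deploy, not part of this lesson’s setup:

```javascript
// Hypothetical helper: pick a connection string per environment.
// Locally only the database name matters; in production the host,
// credentials, and database name usually all change, and the full URL
// is typically supplied by the platform as an environment variable.
const connectionFor = (environment) => {
  if (environment === 'production') {
    return process.env.DATABASE_URL ||
      'postgres://user:password@db.example.com:5432/publications'; // made-up URL
  }
  return 'postgres://localhost/publications';
};

console.log(connectionFor('development')); // postgres://localhost/publications
```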
useNullAsDefault
This is just an option that will take any undefined keys or values in our database and set them equal to null, rather than having to specify a default value for each one. So if you have an array of research papers in one of your postgres tables, and each paper has an optional subtitle field, we don’t want to set a default subtitle on every paper - we just want it to be blank by using null.
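A rough illustration of the idea in plain JavaScript. This only mimics the behavior conceptually; knex applies it for you at the SQL level:

```javascript
// Conceptual sketch of useNullAsDefault: any column a row doesn't
// specify is stored as null instead of needing a declared default.
const columns = ['title', 'subtitle'];

const withNullDefaults = (row) =>
  Object.fromEntries(columns.map(column => [column, row[column] ?? null]));

console.log(withNullDefaults({ title: 'Lorem Ipsum' }));
// { title: 'Lorem Ipsum', subtitle: null }
```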
Your configuration file should now currently look like this:
module.exports = {
  development: {
    client: 'pg',
    connection: 'postgres://localhost/publications',
    useNullAsDefault: true
  }
};
We’re not quite done here, but let’s switch gears for a bit and talk about migrations.
Migrations
Migrations are kind of like version control for databases. They are single, timestamped files that each represent a change to your database schema. If you browse the migrations directory of almost any open-source project that uses them, you’ll notice every file is simply a timestamped file. We’ll talk more about the structure of these files in a bit.
Migrations attempt to have as little effect on pre-existing data as possible, but sometimes it’s hard to avoid. For example if you have a migration that deletes a column from a table, any data in that column might be destroyed along with it.
Note: You might notice when working with APIs that endpoints are sometimes formatted like api/v1/some-data. The v1 is a way to version an API - sometimes changes to your database are going to affect your API. Any applications relying on that imaginary column you just deleted are going to be broken if you don’t provide a versioned API for your users.
Configuring a Migrations Directory
Let’s hop back into our knex configuration file and tell knex where we’re going to store our migrations:
module.exports = {
  development: {
    client: 'pg',
    connection: 'postgres://localhost/publications',
    migrations: {
      directory: './db/migrations'
    },
    useNullAsDefault: true
  }
};
Making a Migration
We can create a migration by running the following command:
knex migrate:make initial
This created the db/migrations directory we just configured and added a timestamped file with the name of the migration at the end.
The initial word here is just a descriptor of what the migration is doing. (Think of git commit messages: the first one in your repo is Initial Commit, and the rest should describe exactly what’s happening.) So if we wanted to drop a column from our table we might run knex migrate:make drop-foo-column
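The timestamp prefix is what determines run order: because it is formatted year-month-day-time, sorting filenames alphabetically also sorts them chronologically. A quick sketch with made-up filenames:

```javascript
// Hypothetical migration filenames: the timestamp prefix (YYYYMMDDHHMMSS)
// means plain alphabetical order is also chronological order, which is
// how the run sequence of migrations is determined.
const migrationFiles = [
  '20191104090000_drop-foo-column.js',
  '20191020103407_initial.js',
  '20191025141500_add-publisher.js'
];

const runOrder = [...migrationFiles].sort();
console.log(runOrder[0]); // 20191020103407_initial.js
```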
The file created should look something like this:
exports.up = function(knex) {
};
exports.down = function(knex) {
};
up defines what should happen when we run the migration (i.e. what changes do we want to make to our database?).
down is the reverse: if we want to roll back to a previous version, down undoes whatever up did.
For every up there must be an equal and opposite down that will allow us to roll back those changes.
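Here is a toy model of that pairing in plain JavaScript, using an object as a stand-in for a schema: applying every up in order and then every down in reverse should land you back where you started. The "migrations" below are invented for illustration:

```javascript
// Each toy migration edits a plain object standing in for a schema,
// and its `down` exactly reverses its `up`.
const migrations = [
  {
    up: schema => ({ ...schema, papers: ['id', 'title', 'author'] }),
    down: ({ papers, ...rest }) => rest
  },
  {
    up: schema => ({ ...schema, papers: [...schema.papers, 'publisher'] }),
    down: schema => ({ ...schema, papers: schema.papers.filter(c => c !== 'publisher') })
  }
];

// Run every `up` in chronological order (like migrate:latest)...
const migrateLatest = schema =>
  migrations.reduce((current, m) => m.up(current), schema);

// ...and every `down` in reverse order to undo them all.
const rollbackAll = schema =>
  [...migrations].reverse().reduce((current, m) => m.down(current), schema);

const migrated = migrateLatest({});
console.log(migrated.papers); // ['id', 'title', 'author', 'publisher']
console.log(rollbackAll(migrated)); // {}
```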
To see all the different options on building and modifying a table, review the Knex docs.
Let’s edit the migration to create a papers table and a footnotes table. We will work with a one-to-many relationship here, where one paper can have many footnotes, but a footnote can only belong to one paper:
exports.up = function (knex) {
  return knex.schema
    .createTable('papers', function (table) {
      table.increments('id').primary();
      table.string('title');
      table.string('author');
      table.timestamps(true, true);
    })
    .createTable('footnotes', function (table) {
      table.increments('id').primary();
      table.string('note');
      table.integer('paper_id').unsigned();
      table.foreign('paper_id').references('papers.id');
      table.timestamps(true, true);
    });
};
exports.down = function (knex) {
  return knex.schema
    .dropTable('footnotes')
    .dropTable('papers');
};
Note
The return keyword is necessary because your migration functions must return a promise in order to work correctly.
Also note that in some examples online, you are likely to see a second argument of Promise passed in. Before version 0.18.0, Knex used the bluebird promise library for promise functionality instead of native promises. As of newer versions, Knex uses native promises and no longer supports versions of Node.js older than 8.
Running Your Migrations
You can run all of your migrations to the latest point with the following command:
knex migrate:latest
Updating Your Schema with Migrations
Now what if we realized we made a mistake in our schema and we wanted to add a column to the papers table for a publisher? We can’t go directly into our initial migration file and edit it because it’s already been made and run - it’s already set in stone. So what do we have to do? Create another migration:
knex migrate:make add-publisher
And in this particular migration we’ll want to add a column on the up, and drop it on the down:
exports.up = function(knex) {
  return knex.schema.table('papers', function(table) {
    table.string('publisher');
  });
};

exports.down = function(knex) {
  return knex.schema.table('papers', function(table) {
    table.dropColumn('publisher');
  });
};
Go ahead and run this migration and then we’ll add some seed data to our tables that we’ve just set up.
Seeds
Seeds are some default data. Sometimes it’s nice to populate your database with some fake data just so you can work on other functionality while you wait for everything else to come together.
Configuring a Seeds Directory
Similar to migrations, we need to tell knex where to look for our seed data. Back in our knex configuration file, let’s add the following:
module.exports = {
  development: {
    client: 'pg',
    connection: 'postgres://localhost/publications',
    migrations: {
      directory: './db/migrations'
    },
    seeds: {
      directory: './db/seeds/dev'
    },
    useNullAsDefault: true
  }
};
Making a Seed File
We can make a seed file with the following command:
knex seed:make papers
Again, this creates a default file for you that we’ll want to configure with our own tables and data:
exports.seed = function(knex) {
  // Deletes ALL existing entries
  return knex('table_name').del()
    .then(function () {
      // Inserts seed entries
      return knex('table_name').insert([
        {id: 1, colName: 'rowValue1'},
        {id: 2, colName: 'rowValue2'},
        {id: 3, colName: 'rowValue3'}
      ]);
    });
};
You’ll notice the seed file starts off by deleting any records in the database. This is just so we can ensure we’re starting with a clean slate any time we run our seed file.
Because we are working with papers and footnotes, we need to:
- Clear out both tables (footnotes first, as they depend on papers existing)
- Add a paper => return the paper’s id => add footnotes with that paper id
The logic here gets a little hairy, but ultimately will end up looking like this:
exports.seed = async function (knex) {
  try {
    await knex('footnotes').del(); // delete all footnotes first
    await knex('papers').del(); // delete all papers

    // Now that we have a clean slate, we can re-insert our paper data
    // Insert a single paper, return the paper ID, insert 2 footnotes
    const paperId = await knex('papers').insert({
      title: 'Fooo', author: 'Bob', publisher: 'Minnesota'
    }, 'id');

    return knex('footnotes').insert([
      { note: 'Lorem', paper_id: paperId[0].id },
      { note: 'Dolor', paper_id: paperId[0].id }
    ]);
  } catch (error) {
    console.log(`Error seeding data: ${error}`);
  }
};
Note
In our seed files, we often have to return Promises rather than just calling them. Without returning the Promise, the asynchronous code in our seed file will be kicked off, but knex will not necessarily know to wait for it to resolve before it says ‘I’m done seeding your data’. Because knex().insert() returns a Promise, we can use async/await on it. If your seeding doesn’t seem to be working, but you’re not receiving any error messages, double-check whether you’re missing any return statements for the asynchronous operations you’re writing.
Running Your Seeds
You can run your seeds (again, similar to migrations) with:
knex seed:run
Seeding Large Datasets
When you have a large dataset that needs to be seeded, you’ll often want to simplify your code by iterating over your dataset and inserting each record and any of its dependents, rather than having to manually write an insert for each one. This can get a little hairy when we’re using Promises. We can’t simply nest Promises within a forEach loop, because our code will run through the loop without recognizing that it needs to wait for each insertion promise to resolve before ending the seed execution.
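You can see the problem in plain JavaScript with a fake insert that resolves on a timer, standing in for a real database call:

```javascript
// A fake async insert: pushes the record only after a 10ms "round trip".
const inserted = [];
const insert = (record) =>
  new Promise(resolve => setTimeout(() => {
    inserted.push(record);
    resolve(record);
  }, 10));

// BROKEN: forEach kicks off every insert but never waits for any of them,
// so this function "finishes" before a single record has landed.
const seedWithForEach = async (records) => {
  records.forEach(record => insert(record));
};

// CORRECT: map collects the insertion promises, and Promise.all resolves
// only after every insert has completed.
const seedWithPromiseAll = (records) =>
  Promise.all(records.map(record => insert(record)));

seedWithForEach(['a', 'b']).then(() => {
  console.log(inserted.length); // 0, "done" before any insert finished
});
```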
To get around this, we can break our insertion logic out into a separate function. For example, given the following dataset:
// papersData.js
let papersData = [
  {
    author: 'Brittany',
    title: 'Lorem Ipsum',
    footnotes: ['one', 'two', 'three']
  },
  {
    author: 'Robbie',
    title: 'Dolor Set Amet',
    footnotes: ['four', 'five', 'six']
  }
];

module.exports = papersData;
We could write a function that appropriately seeds a paper into the papers table and all of its footnotes into the footnotes table:
// paper.js
const papersData = require('../../../papersData');

const createPaper = async (knex, paper) => {
  const paperId = await knex('papers').insert({
    title: paper.title,
    author: paper.author
  }, 'id');

  let footnotePromises = paper.footnotes.map(footnote => {
    return createFootnote(knex, {
      note: footnote,
      paper_id: paperId[0].id
    });
  });

  return Promise.all(footnotePromises);
};

const createFootnote = (knex, footnote) => {
  return knex('footnotes').insert(footnote);
};

exports.seed = async (knex) => {
  try {
    await knex('footnotes').del(); // delete footnotes first
    await knex('papers').del(); // delete all papers

    let paperPromises = papersData.map(paper => {
      return createPaper(knex, paper);
    });

    return Promise.all(paperPromises);
  } catch (error) {
    console.log(`Error seeding data: ${error}`);
  }
};
Fetching From the Database
Let’s write some express code to interact with our newly seeded database. Set up a simple express server and we’ll add some configuration to work with the knex database:
const environment = process.env.NODE_ENV || 'development';
const configuration = require('./knexfile')[environment];
const database = require('knex')(configuration);
There are a few things going on in the code above:
- We want to know if we’re in a development, testing, or production environment. If we don’t know, we’ll assume we’re in development.
- Based on that environment, we’ll fetch the database configuration from knexfile.js for whatever environment we’re in, and now our express app will be able to connect to it.
- NODE_ENV is an environment variable that Node.js applications commonly use to determine the runtime environment and differentiate between development, testing, and production.
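Here is that selection pattern as a self-contained sketch, with a stand-in for knexfile.js inlined so it runs without a database (the production connection value is hypothetical):

```javascript
// Inlined stand-in for knexfile.js, just for illustration.
const configurations = {
  development: { client: 'pg', connection: 'postgres://localhost/publications' },
  production: { client: 'pg', connection: process.env.DATABASE_URL }
};

// Fall back to the development configuration when no environment is set,
// mirroring `process.env.NODE_ENV || 'development'` above.
const selectConfiguration = (env) => configurations[env || 'development'];

console.log(selectConfiguration(undefined).connection);
// postgres://localhost/publications
```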
Retrieving Data from the Database
To make a selection for all the papers in the database, we can use database('papers').select(). This will return an array of all the papers we’ve added to the papers table:
app.get('/api/v1/papers', async (request, response) => {
  try {
    const papers = await database('papers').select();
    response.status(200).json(papers);
  } catch(error) {
    response.status(500).json({ error });
  }
});
If we check this in Postman, we should get back an array of all our papers that looks something like this:
[
  {
    id: 1,
    author: 'Brittany',
    title: 'Lorem Ipsum',
    publisher: 'University of Minnesota'
  },
  {
    id: 2,
    author: 'Robbie',
    title: 'Dolor Set Amet',
    publisher: 'University of Michigan'
  }
]
Now let’s say we decided we didn’t need that publisher column, and we wanted to get rid of it. We could rollback that change to our schema by running:
knex migrate:rollback
Our GET request would now return the same array without the publisher column:
[
  {
    id: 1,
    author: 'Brittany',
    title: 'Lorem Ipsum'
  },
  {
    id: 2,
    author: 'Robbie',
    title: 'Dolor Set Amet'
  }
]
On Your Own
Write a GET request to retrieve all footnotes. Verify it works using Postman.
Adding Data to the Database
Now let’s add a new paper to the database. We can do this with a POST request and our database insert method:
app.post('/api/v1/papers', async (request, response) => {
  const paper = request.body;

  for (let requiredParameter of ['title', 'author']) {
    if (!paper[requiredParameter]) {
      return response
        .status(422)
        .send({ error: `Expected format: { title: <String>, author: <String> }. You're missing a "${requiredParameter}" property.` });
    }
  }

  try {
    const id = await database('papers').insert(paper, 'id');
    response.status(201).json({ id });
  } catch (error) {
    response.status(500).json({ error });
  }
});
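The validation loop above can also be extracted into a small, easily testable helper that returns the first missing property name, or null when the body is valid. This is a refactoring sketch, not part of the original lesson code:

```javascript
// Hypothetical helper: find the first required property missing from a
// request body, so the route handler can build its 422 message from it.
const findMissingParameter = (body, requiredParameters) => {
  for (const parameter of requiredParameters) {
    if (!body[parameter]) {
      return parameter;
    }
  }
  return null;
};

console.log(findMissingParameter({ title: 'Lorem Ipsum' }, ['title', 'author'])); // author
console.log(findMissingParameter({ title: 'Lorem Ipsum', author: 'Brittany' }, ['title', 'author'])); // null
```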
On Your Own
Write a POST request to add a new footnote that belongs to a pre-existing paper. Verify it works with Postman.
Querying Data for a Specific Resource
What if we want to only retrieve a single, specific paper? We can do this by passing in an id through our request params. With our database selection, we need to limit our select() with a where clause that matches on the id field:
// GET a specific paper
app.get('/api/v1/papers/:id', async (request, response) => {
  try {
    const papers = await database('papers').where('id', request.params.id).select();

    if (papers.length) {
      response.status(200).json(papers);
    } else {
      response.status(404).json({
        error: `Could not find paper with id ${request.params.id}`
      });
    }
  } catch (error) {
    response.status(500).json({ error });
  }
});
On Your Own
Write a GET request to retrieve all footnotes for a pre-existing paper. Verify it works with Postman.
Checks for Understanding
- Why would we use a library like knex?
- In as much detail as possible, describe what migrations are and what they do.
- If we accidentally fat-finger a column name when setting up a table in our database, how can we fix that?
- What’s the difference between creating a migration and running a migration?
- What is seed data and what do we use it for?