Building a Blog Part 2: Creating an HTTP API with Scotty and Beam

In part 2 of Building a Blog I will talk about how the small HTTP API used by this blog was implemented. Once again, because I like Haskell, I’ll use the Scotty web framework to listen to HTTP requests and run SQLite queries with the Beam library.

You don’t really need to write a backend for a blog, but I wanted to do it just for fun. Haskell has a great runtime for web apps so I thought it would be cool to try it out in a “real world” application. Besides, I wanted a way to collect data about the site’s usage without using third party JavaScript. Previously I talked about how privacy matters and that this sort of thing should be opt-in. To gather this data I decided to add a like button, which users can voluntarily click to send some anonymous data to my server so that I can count how many people liked my posts. To achieve this, I need some endpoints that can update and read the likes counter with POST and GET methods respectively. The Scotty framework is simple enough to let us do this with just a few lines of code.

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Web.Scotty (get, post, scotty)
import Web.Actions (getLikes, postLike)

main :: IO ()
main = scotty 8008 $ do
  get "/likes/:postId" getLikes
  post "/likes/:postId" postLike

This web app listens for HTTP requests on port 8008 and exposes two endpoints GET /likes and POST /like that are handled by the functions getLikes and postLike respectively. They both take the post’s ID as a URL parameter. The basic idea is that when a POST /like request is received, the likes counter for that post is incremented by one unit in the database. On the other hand, when a GET /likes request is received the number of likes for that post is queried in the database. This sounds fairly easy right? SQLite is my favorite database, and I’m pretty sure that it is good enough for this kind of website, so I just went with that. All that was missing was a way to talk to the database from Haskell.

Managing SQLite with Beam

I’ve always disliked ORMs and any other attempts to abstract SQL from application code. Yes, I understand that SQL injection is a thing, but I like SQL and, although it is a standard, every database that implements it is different. ORMs can be very problematic. You might end up with a simple system if you just embrace SQL, even if it doesn’t fit with the way your programming language does things.

I know that it’s probably not the most rational thing to do given my stance on database abstractions but I decided to give Beam a try. The reason is that I don’t claim to be right about absolutely everything and I was very curious about what would be the “Haskell way” of working with databases. One cool thing about Beam, and other similar Haskell libraries, is that it makes the user responsible for opening and managing the database instead of magically doing it for you under the hood. This is actually good because it means that I can always fallback to sending raw queries with sqlite-simple if I don’t like the queries generated by Beam, so there is nothing to worry about.

Learning Beam wasn’t easy for me. It uses lots of language extensions, and needs a lot of boilerplate that I can’t really understand. Nevertheless, I am happy with the result. I tried reading a few tutorials, but I always ended up reading the official docs because the things that I needed to do weren’t as simple as the ones that tutorials usually do, plus Beam’s documentation site is pretty neat.

The first thing that I had to do was to create the database schema. I wanted to be in full control of it so I generated it with SQL instead of letting Beam do it. I ended up with this:

create table posts(
  id integer primary key, 
  string_id text
);

create table readers(
  id integer primary key, 
  ip_address text,
  user_agent text 
);

create table likes(
  id integer primary key, 
  reader_key__id integer, 
  post_key__id integer,
  foreign key(reader_key__id) 
    references readers(id) 
    on delete cascade,
  foreign key(post_key__id) 
    references posts(id) 
    on delete cascade
);

I only have three tables. The first one is for posts, and excluding the primary key, it only has one field: the ID of the post, which is just a string with the last segment in its path.

The second table is for readers. It stores IP addresses and user agent strings. The reader table also has an index that ensure that each ip_address and user_agent pair must be unique, which means that for this system a reader is a unique combination of IP address and user agent. This is totally not real, but it is a good approximation for me.

The third table, “likes,” is just a relation between many readers and many posts. I let users like the same post as many times as they want because the ip_address and user_agent combination of the “readers” table is not unique at all. A like is merely a combination of a user and a post, somebody liked something. The “readers” table’s only purpose is to provide a little more data about the person who liked the post.

Then I needed to implement this exact same schema with Beam. This has a lot of boilerplate, and I’m definitely not the best to explain it, but you can see it here:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE MultiParamTypeClasses #-}

module Database.Schema (
  BlogDB(_blogLikes, _blogPosts, _blogReaders),
  Like,
  LikeT(Like, _likeReaderKey, _likePostKey),
  Post,
  PostT(Post, _postStringId),
  Reader,
  ReaderT(Reader, _readerId, _readerIpAddress, 
          _readerUserAgent),
  blogDb) where

import qualified Data.Text.Lazy as Text
import Database.Beam

-- specify types for the tables
data PostT f = Post
  { _postId :: Columnar f Int
  , _postStringId :: Columnar f Text.Text
  } deriving (Generic)

data ReaderT f = Reader
  { _readerId :: Columnar f Int
  , _readerIpAddress :: Columnar f Text.Text
  , _readerUserAgent :: Columnar f Text.Text
  } deriving (Generic)

data LikeT f = Like
  { _likeId :: Columnar f Int
  , _likeReaderKey :: PrimaryKey ReaderT f
  , _likePostKey :: PrimaryKey PostT f
  } deriving (Generic)

-- table types that I can actually use
type Post = PostT Identity
type Reader = ReaderT Identity
type Like = LikeT Identity


-- derive instances 
deriving instance Show Post
deriving instance Show (PrimaryKey PostT Identity)
instance Beamable PostT
instance Beamable (PrimaryKey PostT)

deriving instance Show Reader
deriving instance Show (PrimaryKey ReaderT Identity)
instance Beamable ReaderT
instance Beamable (PrimaryKey ReaderT)

deriving instance Show Like
instance Beamable LikeT
instance Beamable (PrimaryKey LikeT)

instance Table PostT where
  data PrimaryKey PostT f = PostId (Columnar f Int) 
                            deriving Generic
  primaryKey = PostId . (_postId :: PostT f 
                                 -> Columnar f Int)

instance Table ReaderT where
  data PrimaryKey ReaderT f = ReaderId (Columnar f Int) 
                              deriving Generic
  primaryKey = ReaderId . (_readerId :: ReaderT f 
                                     -> Columnar f Int)

instance Table LikeT where
  data PrimaryKey LikeT f = LikeId (Columnar f Int) 
                            deriving Generic
  primaryKey = LikeId . (_likeId :: LikeT f 
                                 -> Columnar f Int)

-- types for the database
data BlogDB f = BlogDB
  { _blogPosts :: f (TableEntity PostT)
  , _blogReaders :: f (TableEntity ReaderT)
  , _blogLikes :: f (TableEntity LikeT)
  } deriving (Generic)

instance Database be BlogDB

blogDb :: DatabaseSettings be BlogDB
blogDb = defaultDbSettings

The important thing here is that the names of the record fields for table types must be prefixed with an underscore followed by the table’s name, otherwise Beam won’t be able to work with the existing schema. For example, the reader table has an ip_address column, so the equivalent record field is _readerIpAddress. Foreign keys in the likes table also need special names like post_key__id. Having to use underscores for column names may be a bit too opinionated, but I can live with it.

Writing queries with Beam

Once the schema is ready I can start working on the queries. The first thing to do is to insert data into the database. Here’s a function that inserts a new post row:

import Database.Beam
import Database.Beam.Sqlite
import qualified Database.Beam.Backend.SQL.BeamExtensions as BeamExt
import qualified Data.Text.Lazy as Text

insertNewPost :: Text.Text 
              -> SqliteM (PrimaryKey PostT Identity)
insertNewPost stringId = do
  [newPost] <- BeamExt.runInsertReturningList (_blogPosts blogDb)
                 $ insertExpressions [ Post default_ (val_ stringId) ]
return $ pk newPost

The post table only holds the string ID of the post, so that’s the only argument for this function. Here, the expression Post default_ (val_ stringId) is inserted, which just means “A new post row with an autoincrement ID, and the provided string ID.” This whole expression thing might be a little hard to understand, but it’s a really great way of storing rows with auto increment fields. In popular ORMs found in imperative languages, you usually have to set a 0 or -1 to an auto increment field during insertion, and it feels like a hack when compared to Beam’s approach. Finally, return $ pk newPost returns the new post’s primary key, which will be needed to insert a row to the likes table.

After inserting a few posts, they can be looked up with this function:

findPostByStringId :: Text.Text -> SqliteM (Maybe Post)
findPostByStringId targetStringId =
  runSelectReturningOne
    $ select
    $ filter_ (\row -> _postStringId row ==. val_ targetStringId)
    $ all_ (_blogPosts blogDb)

This function takes a string ID as argument and returns a record with that exact same ID, or Nothing if there’s none. Here you can see how Beam’s DSL starts to shine. The three main functions are select, filter_, and all_. Together they basically tell SQLite “I want you to select all the rows from the posts table that satisfy this filter.” The function runSelectReturningOne wraps the result in a Maybe instead of the typical List, which is exactly what I need here because string IDs are meant to be unique.

What about looking up the number of likes for a particular post? That’s a bit more complicated. The “likes” table is a many to many relation between unique posts and unique readers. This implies the need for the aggregate function COUNT.

getPostLikesCount :: Text.Text -> SqliteM (Maybe Int)
getPostLikesCount postStringId = runSelectReturningOne $ select $
  aggregate_ (\_ -> countAll_) $ do
    posts <- all_ (_blogPosts blogDb)
    likes <- oneToMany_ (_blogLikes blogDb) _likePostKey posts    
    guard_ (_postStringId posts ==. val_ postStringId)
    pure likes

The do block here is basically a query for all the rows to count. First take all posts. Then, for every post, find all the likes that reference that post. Finally only keep the rows whose string_id column is the one that we are interested in. Feeding that do block to aggregate_ (\_ -> countAll_) counts all the retrieved rows. This actually gets compiled to something like:

SELECT COUNT(*) AS "res0" 
  FROM "posts" AS "t0" 
  INNER JOIN "likes" AS "t1" 
  ON ("t1"."post_key__id")=("t0"."id") 
  WHERE ("t0"."string_id")=(?);

This is the same query that I would have written if I had been using SQL directly. Awesome!

With these “basic” blocks, I can build two higher abstractions: incrementLikesCount, which inserts a like to a post and then returns the updated likes count, and getLikesCount, which counts the number of likes for a particular post. These two functions are the ones that I expose so that they can be used in my Scotty actions.

Using Beam queries in Scotty actions

Now that the queries are ready it’s time to use them in the Scotty actions that handle the HTTP requests. Beam queries run on IO, but Scotty actions run on ActionM, so we are going to need some lifting. Too much IO can be harmful, so I created a new typeclass for my queries, and implemented an instance for ActionM that uses Scotty’s liftAndCatchIO to run the queries.

class DBClient m where
  incrementLikesCount :: Connection 
                      -> LikeInfo -> m (Maybe Int)
  getLikesCount :: Connection 
                -> Text -> m (Maybe Int)

instance DBClient ActionM where
  incrementLikesCount conn likeInfo = 
    liftAndCatchIO $ Q.incrementLikesCount conn likeInfo
  getLikesCount conn postStringId = 
  liftAndCatchIO $ Q.getLikesCount conn postStringId

getLikesCount is fairly easy to understand. It takes a SQLite connection and the string ID of the post whose likes count we are interested in. incrementLikesCount is not so obvious because its second argument is LikeInfo, which I haven’t talked about yet. LikeInfo is mere record containing all the data needed to store that like: IP address, user agent and post ID. So, incrementLikesCount takes a SQLite connection, and the necessary data to insert before the likes count can be incremented.

Now that the queries can be easily called from Scotty actions we can go back to the route handlers postLike and getLikes. Remember those? They handle POST /like and GET /like respectively. This is the final implementation postLike:

getNewLikeInfoFromRequest :: ActionM (Either APIError LikeInfo)
getNewLikeInfoFromRequest = do
  ipAddress <- getIpAddress
  userAgent <- getUserAgent
  postStringId <- param "stringId"
  let readerInfo = ReaderInfo <$> ipAddress <*> userAgent
  return $ LikeInfo postStringId <$> readerInfo

respondWithLikesCount :: Int -> ActionM ()
respondWithLikesCount likesCount = text $ pack $ show likesCount

postLike :: SQLite.Connection -> ActionM ()
postLike conn = do
  likeInfo <- getNewLikeInfoFromRequest
  whenValid likeInfo $ \validLikeInfo -> do
    Just likesCount <- incrementLikesCount conn validLikeInfo
    respondWithLikesCount likesCount

First, a LikeInfo value is built from the request. getNewLikeInfoFromRequest is an action that attempts to read all the necessary data from a request, and it can either return an error, or the correct data. Then, I use this helper function whenValid that calls the passed lambda function with valid data when likeInfo has a Right constructor. Otherwise it sends the HTTP response with the APIError’s bad status code. When the request data is valid, incrementLikesCount is invoked and an HTTP response with plain text containing the updated likes count is sent back to the client.

getLikes is way simpler, here’s the implementation:

getLikes :: SQLite.Connection -> ActionM ()
getLikes conn = do
  postStringId <- param "stringId"
  Just likesCount <- getLikesCount conn postStringId
  respondWithLikesCount likesCount

It simply reads the string ID from the URL params, and then runs the getLikesCount query and responds with the likes count. I don’t need to check the value of postStringId, because if it is not present, then Scotty immediately throws an error and ends up returning a bad status code.

That’s all I need from this API for now. In the future I might want to add more endpoints for whatever new needs I might have. The next thing to do is to create a widget that lets the user interact with the server using these endpoints. The next post will be about how the like button at the bottom of this page was implemented with JavaScript using Mithril.