When writing a web service, I often lean towards using tools that are as minimal as possible. One pretty obvious reason for this is the avoidance of dependencies you either don't want or don't need in your project. Whilst I'm not someone who goes out of their way to avoid dependencies, this is a pain point particularly in Rust because of the cost of building them repeatedly rather than shipping extra interpreted files around (especially if you use "pure" build environments).

Another reason (admittedly down to personal preference) is that I find code typically looks less mystical when you have fewer dependencies involved. If you take some of the larger frameworks in popular languages, it can sometimes be hard to figure out how certain code gets called, and even harder to work out what it's being called with and the contracts which come along with it. This is not to say that everyone should write everything from scratch, but rather that you should avoid misdirection in code when possible. Having someone contribute to your codebase should not require them to learn about 25 projects they're also not familiar with, in my opinion.

As a combination of these reasons, and the fact that many of the services I write in Rust tend to be smaller web services, it's unsurprising that I rarely find myself using a full web framework. Instead I tend to lean more towards using the "building block" libraries which are so common in Rust. I find that they help teach principles a little better, whilst providing just enough to be productive. If you do want a framework though, you should check out Gotham (a project I am currently a member of), Rocket and Actix-Web.

An Introduction to Hyper

When writing a web service or client in Rust, Hyper is pretty much the default choice. Of course there are many other options out there, but Hyper tends to be the one that people opt for because it's safe and (relatively) straightforward, whilst staying extremely efficient and scalable. Although the README states that the API is still likely to change prior to v1.0, it's used by many engineers in production code, and is actually the foundation for a number of web frameworks floating around (Gotham included).

The library itself is pretty low-level in that the APIs are based more around the HTTP specification itself, rather than providing more "friendly" APIs. It tends to provide a solid foundation for those wishing to write their own systems around it, perhaps a little similar to something like Express in the Node.js space (but with a lot less sugar!). It's based on Tokio, a strong foundational library providing asynchronous abstractions in the Rust ecosystem, making it very efficient and lightweight.

Hyper is pretty much my go-to for a web service at this point; although it's low-level, many of those working with scalable services/applications in the Rust space will probably come across working with Hyper at some point. This makes it pretty likely that anyone working on the code will either already be familiar with it, or find value in becoming familiar with it - and of course the same applies to interacting with both Tokio and Futures (one of the abstractions involved in asynchronous execution).

There are many examples in the Hyper repository, but one of particular note is the example of a basic web API. I find that this is a good example of how you can easily trace the execution flow from request receipt through to the actual code handling the request. As is perhaps evident though, routing your requests to the code handling them can be a bit messy. Everything is kinda in one (large) block, and you have to manually match on the parts and (in the case of path parameters) slice them up to get values back out. So I decided that this warranted writing a library named Usher.
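To make that pain a little more concrete, here's a minimal sketch of the manual style using only the standard library. The Route enum, the route function and the /users paths are all made up for illustration; none of them come from the Hyper example itself.

```rust
/// The outcome of routing a request by hand (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
enum Route<'a> {
    ListUsers,
    ShowUser { id: &'a str },
    NotFound,
}

/// Routes a method and path by matching on the split path segments.
fn route<'a>(method: &str, path: &'a str) -> Route<'a> {
    // Slice the path into segments, skipping the empty ones
    // produced by leading, trailing or doubled slashes.
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    // Every new path means another arm in this one block, and every
    // path parameter has to be pulled out of the slice by position.
    match (method, segments.as_slice()) {
        ("GET", ["users"]) => Route::ListUsers,
        ("GET", ["users", id]) => Route::ShowUser { id: *id },
        _ => Route::NotFound,
    }
}
```

Calling route("GET", "/users/42") yields Route::ShowUser with an id of "42". That's fine for two routes, but it's easy to imagine how the match grows once a dozen paths are involved.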

Why Write a Library?

My biggest reason for building this library was the missing ability to easily fetch parameters out of paths. Although it's easy to do this manually for any given path, it's non-trivial to do it in a common way without basically doing what I did and writing a router. Added to this was the snowballing amount of code that resulted for any project with more than "a few" configured paths. Rather than write a router in every project I work on with over a dozen paths, I decided to simply open source the one I did write in case it's helpful for anyone else.

However, although written with Hyper in mind, Usher was written in a way intended to be generic to many use cases. There's actually no Hyper-specific code in the library itself, and the only code based around HTTP itself is provided as an "extension" over the main routing mechanism. It can be toggled off completely via a Cargo flag at compile time. It provides a general routing tree which can be used to look up a path, and retrieve a potential value associated with the path. It's actually really straightforward, but makes a bunch of stuff easy. Of course, there are other libraries out there which provide similar behaviour, so why write another one?

One of the issues with parameterized routing in particular is that everyone is familiar with different styles of definition. If you come from Node.js, you might want /:id to represent an "id" parameter. If you come from Java, you might want {id}. You might want to support some sort of custom syntax, perhaps to accept paths which only adhere to some sort of Regex constraint. The list goes on. The way that Usher tries to resolve this is by providing two traits, which are used to control how you compare against paths. They allow you to use any syntax you wish, and have a relatively low cost as they're pre-calculated before any routing takes place. These traits are called Parser and Matcher.

Trait Based Routing

It is these two traits that provide the flexibility you receive when using Usher. Together they provide a pretty simple way of defining how you want to process routes, and provide control over performance characteristics for your typical input. As the values implementing these traits are going to (usually) be part of your codebase, it's also pretty simple to see exactly why a certain path is routed the way it is.

The major difference in the two is the context in which they're used; the Parser values are used at insertion time (i.e. creation time) to determine the most appropriate Matcher value to use at runtime. A Matcher value is the type which compares incoming paths to determine matches. Interestingly, we could technically use a single trait to do the job of both Parser and Matcher, but the separation makes the API a little more robust to changes that might happen in future, and provides a nice separation of concerns (such as the fact we can throw away all Parser types at runtime). These two traits are pretty simple:

pub trait Parser: Send + Sync {
    /// Attempts to parse a `Matcher` out of a segment.
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>>;
}

pub trait Matcher: Send + Sync {
    /// Retrieves a potential capture from a segment.
    fn capture<'a>(&'a self, _segment: &str) -> Option<Capture<'a>> {
        None
    }

    /// Determines whether an incoming segment is a match for a base segment.
    fn is_match(&self, segment: &str) -> bool;
}

The Parser has a single method which is used to try and parse a Matcher value out of an incoming path segment. The return type is an Option because you might not be able to find an associated matcher, for example if the provided segment doesn't match some syntax you expect. At router creation time, you provide a list of Parser types which are tested in order to determine the matchers to use at runtime. This allows for multiple Parser types to keep the logic of any given type small, whilst staying easy to extend. The first parse call to return a Matcher type will be the Matcher used at runtime for the segment. If no calls to parse return a value, an error will be thrown (as it's something technically broken in your code, rather than something which can sometimes be successful).
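As a rough sketch of that "first parser wins" behaviour, the snippet below uses cut-down local copies of the two traits (no Send + Sync bounds and no capture method) so that it compiles on its own. The ColonParser, BraceParser, LiteralParser, resolve and default_parsers names are all hypothetical and aren't part of Usher. One other difference to be aware of: this sketch simply returns None when nothing matches, rather than treating it as an error.

```rust
/// Cut-down local copy of the `Matcher` trait (assumption: no captures).
trait Matcher {
    fn is_match(&self, segment: &str) -> bool;
}

/// Cut-down local copy of the `Parser` trait.
trait Parser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>>;
}

/// Hypothetical matcher accepting any non-empty segment.
struct AnyMatcher;

impl Matcher for AnyMatcher {
    fn is_match(&self, segment: &str) -> bool {
        !segment.is_empty()
    }
}

/// Hypothetical matcher accepting exactly one literal segment.
struct ExactMatcher(String);

impl Matcher for ExactMatcher {
    fn is_match(&self, segment: &str) -> bool {
        self.0 == segment
    }
}

/// Hypothetical parser for Node.js style `:name` segments.
struct ColonParser;

impl Parser for ColonParser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>> {
        if segment.len() > 1 && segment.starts_with(':') {
            Some(Box::new(AnyMatcher))
        } else {
            None
        }
    }
}

/// Hypothetical parser for `{name}` style segments.
struct BraceParser;

impl Parser for BraceParser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>> {
        if segment.len() > 2 && segment.starts_with('{') && segment.ends_with('}') {
            Some(Box::new(AnyMatcher))
        } else {
            None
        }
    }
}

/// Hypothetical catch-all parser; it accepts everything, so it goes last.
struct LiteralParser;

impl Parser for LiteralParser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>> {
        Some(Box::new(ExactMatcher(segment.to_owned())))
    }
}

/// Returns the matcher from the first parser which accepts the segment.
fn resolve(parsers: &[Box<dyn Parser>], segment: &str) -> Option<Box<dyn Matcher>> {
    parsers.iter().find_map(|parser| parser.parse(segment))
}

/// Builds the parser list in priority order, with the catch-all at the end.
fn default_parsers() -> Vec<Box<dyn Parser>> {
    vec![
        Box::new(ColonParser),
        Box::new(BraceParser),
        Box::new(LiteralParser),
    ]
}
```

With this ordering both :id and {id} resolve to a matcher which accepts any value, whilst a plain segment such as users falls through to the literal comparison. Swapping the order so that LiteralParser comes first would silently turn every template into a literal, which is why the ordering matters.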

The Matcher types are used to compare against incoming segments at runtime, so they receive the paths you have access to in your application. This is in contrast to Parser types which might have some form of templated syntax. The is_match method is called to determine if the segment matches the rules defined, returning true if so. To allow for parameters, capture will be called on a successful match to fetch a potential parameter from the segment. Naturally, because a capture is optional, there's a default implementation to represent this (which just returns None). Below is an example of types implementing these traits for simple static path segments:

/// A `Parser` struct for static values.
///
/// There's no state required here; we just need the explicit type
/// to allow us to implement the `Parser` trait and pass it around.
struct StaticParser;

impl Parser for StaticParser {
  /// Because any static value can be compared directly, we can always
  /// construct a `StaticMatcher` from the segment. It would be important
  /// to place this parser last in priority as it would match everything!
  fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>> {
    Some(Box::new(StaticMatcher {
      inner: segment.to_owned()
    }))
  }
}

/// A `Matcher` struct for static values.
///
/// The inner value contains the static `String` we're going to
/// compare against at runtime. This could probably be a reference
/// if we wanted to save a few bytes of memory in production.
struct StaticMatcher {
  inner: String,
}

impl Matcher for StaticMatcher {
  /// Here we just have to compare the value given to us by the
  /// `StaticParser` against the incoming value at routing time.
  fn is_match(&self, segment: &str) -> bool {
    self.inner == segment
  }
}

There's very little to this beyond a direct string comparison. Note that this parser will accept any segment it's given, which is why it needs to be placed last in the list of parsers. Don't worry though; as matching in this way is so fundamental, a static matcher ships with Usher itself. There's no need to implement this logic in every project you work on.
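For contrast, here's a rough sketch of what a parameterized pair might look like for :name style segments. The traits are local copies (without the Send + Sync bounds) so the snippet stands alone, and the ParamParser and ParamMatcher names are hypothetical. The shape of Capture here, a parameter name plus the byte bounds of the value inside the incoming segment, is an assumption made for the sake of the example rather than a statement of Usher's actual definition.

```rust
/// Assumed capture shape: the parameter name, plus the byte bounds
/// of the captured value inside the incoming segment.
type Capture<'a> = (&'a str, (usize, usize));

/// Local copy of the `Matcher` trait, without the thread-safety bounds.
trait Matcher {
    fn capture<'a>(&'a self, _segment: &str) -> Option<Capture<'a>> {
        None
    }

    fn is_match(&self, segment: &str) -> bool;
}

/// Local copy of the `Parser` trait, without the thread-safety bounds.
trait Parser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>>;
}

/// Hypothetical parser recognising `:name` segments.
struct ParamParser;

impl Parser for ParamParser {
    fn parse(&self, segment: &str) -> Option<Box<dyn Matcher>> {
        // Anything without the `:` prefix is left for other parsers.
        let name = segment.strip_prefix(':')?;

        // A lone `:` has no name to capture under, so reject it.
        if name.is_empty() {
            return None;
        }

        Some(Box::new(ParamMatcher {
            name: name.to_owned(),
        }))
    }
}

/// Hypothetical matcher storing the parameter name found at parse time.
struct ParamMatcher {
    name: String,
}

impl Matcher for ParamMatcher {
    /// The whole segment is the value, so the bounds cover all of it.
    fn capture<'a>(&'a self, segment: &str) -> Option<Capture<'a>> {
        Some((self.name.as_str(), (0, segment.len())))
    }

    /// Any non-empty segment is a valid parameter value.
    fn is_match(&self, segment: &str) -> bool {
        !segment.is_empty()
    }
}
```

Parsing ":id" and then capturing against "42" gives back the name "id" with bounds covering the whole segment. Unlike the static parser, this one returns None for segments it doesn't recognise, so it can safely sit earlier in the list.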

Extending Usher for HTTP

To allow for additional use cases, the core of Usher doesn't actually involve anything related to HTTP. Instead, the generic design allows you to store a custom type (T) as a leaf in the tree. In the case of a web service, routing is usually a function of both the path and the HTTP request method. Usher provides a type that operates with this in mind: HttpRouter<T>. This type is locked behind the web Cargo feature as it requires extra dependencies that the core does not.

The HTTP router is a very thin layer of sugar around the base Router<T>, to provide an API that's friendlier to the concept of HTTP requests. The leaves are still generic, as we don't want to enforce a specific API handler signature on users. Under the hood this type is actually a small binding around a Router<HashMap<Method, T>>, which allows us to carry out lookups which are based on both a path and an HTTP method. Looking for a leaf in the tree requires both of these values, and initially routing is done as usual using the path. If the path resolves to a value (in this case HashMap<Method, T>), the value is then checked for another value associated with the provided method. Only then will a value be returned to the caller.
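The two-stage lookup can be sketched with nothing more than nested maps. This is only an approximation: a plain HashMap keyed on the exact path stands in for the routing tree, the Method enum is a local stand-in for the one from the http crate, and the HttpLookup type and its methods are hypothetical names rather than Usher's API.

```rust
use std::collections::HashMap;

/// Local stand-in for the `Method` type from the `http` crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Method {
    Get,
    Post,
}

/// Hypothetical two-stage lookup table: path first, then method.
struct HttpLookup<T> {
    routes: HashMap<String, HashMap<Method, T>>,
}

impl<T> HttpLookup<T> {
    fn new() -> Self {
        HttpLookup {
            routes: HashMap::new(),
        }
    }

    /// Stores a value beneath a path, keyed again by method.
    fn insert(&mut self, method: Method, path: &str, value: T) {
        self.routes
            .entry(path.to_owned())
            .or_default()
            .insert(method, value);
    }

    /// Resolves the path first, and only then checks the method.
    fn handler(&self, method: Method, path: &str) -> Option<&T> {
        self.routes.get(path)?.get(&method)
    }
}
```

A path which exists but has nothing registered for the requested method comes back as None, exactly like a path which doesn't exist at all. Whether you'd want to distinguish those cases (a 405 versus a 404, say) is a design decision this sketch doesn't try to answer.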

In addition to supporting the extra context of a lookup, the HttpRouter<T> offers a more familiar API for adding routes. As each leaf in the tree requires a method as well as a path, the "default" way to do this would be via tree.insert(Method::GET, "/some/path"). As there's a finite set of usual HTTP methods though, we can do better than this, and so Usher offers the following syntax:

router.get("/", get_root);
router.post("/", post_root);
router.put("/", put_root);
router.delete("/", delete_root);

Each defined HTTP method from the http crate has an associated convenience function, to make it quicker to define routes for a given method. This doesn't save that much over the insert format above, but it's a little more ergonomically pleasing. It's also a little more familiar for those coming from other languages which have libraries which follow similar syntax.
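If you're curious how little is involved in offering this kind of shorthand, each convenience function can be a one-line wrapper around the general insert. The sketch below is a toy under stated assumptions: RouteTable, its flat map keyed on method and exact path, the lookup method and the local Method enum are all made up for illustration and say nothing about how Usher implements it.

```rust
use std::collections::HashMap;

/// Local stand-in for the `Method` type from the `http` crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Method {
    Get,
    Post,
    Put,
    Delete,
}

/// Hypothetical route table keyed on a method and an exact path.
struct RouteTable<T> {
    routes: HashMap<(Method, String), T>,
}

impl<T> RouteTable<T> {
    fn new() -> Self {
        RouteTable {
            routes: HashMap::new(),
        }
    }

    /// The general form: every route names its method explicitly.
    fn insert(&mut self, method: Method, path: &str, value: T) {
        self.routes.insert((method, path.to_owned()), value);
    }

    /// Shorthand for `insert(Method::Get, ..)`.
    fn get(&mut self, path: &str, value: T) {
        self.insert(Method::Get, path, value);
    }

    /// Shorthand for `insert(Method::Post, ..)`.
    fn post(&mut self, path: &str, value: T) {
        self.insert(Method::Post, path, value);
    }

    /// Shorthand for `insert(Method::Put, ..)`.
    fn put(&mut self, path: &str, value: T) {
        self.insert(Method::Put, path, value);
    }

    /// Shorthand for `insert(Method::Delete, ..)`.
    fn delete(&mut self, path: &str, value: T) {
        self.insert(Method::Delete, path, value);
    }

    /// Finds the value registered for a method and path, if any.
    fn lookup(&self, method: Method, path: &str) -> Option<&T> {
        self.routes.get(&(method, path.to_owned()))
    }
}
```

Since the wrappers only fix the first argument, anything registered through get or post ends up in exactly the same place as it would via insert.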

Combining Usher and Hyper

Putting both Hyper and Usher together is actually very simple, because there's no real "magic" to how it works. For demonstration purposes, we're going to deal with synchronous API handlers which simply return a Response<Body>, as they provide signatures that are a little more beginner friendly. (For those who want to see an asynchronous example using Future types, there is one in the Usher repository.) Before we integrate Usher, we'll create a very simple Hyper server which accepts all requests and simply responds with a 404 ("Not Found") response.

use hyper::rt::{self, Future};
use hyper::service::service_fn_ok;
use hyper::{Body, Request, Response, Server, StatusCode};

fn main() {
    // Create our address to bind to, localhost:3000
    let addr = ([127, 0, 0, 1], 3000).into();

    // Construct our Hyper server.
    let server = Server::bind(&addr)
        .serve(|| {
            // Create our service function to return a 404.
            service_fn_ok(move |_req: Request<Body>| {
                let mut response = Response::new(Body::empty());
                *response.status_mut() = StatusCode::NOT_FOUND;
                response
            })
        })
        .map_err(|e| eprintln!("server error: {}", e));

    // Log the port we're listening on so we don't forget!
    println!("Listening on http://{}", addr);

    // Initialize the actual service.
    rt::run(server);
}

This does nothing more than return a 404 on all requests made to the server, which is perfect as a default behaviour to check that everything is working. You can see that the actual code overhead of using Hyper is extremely small, and that it's relatively easy to understand what's happening.

To extend this code using Usher, we need to construct a router inside the serve closure and populate it with our routes. For this example we'll register a single route on the root path to return a message about the path/method used in the request, to demonstrate that the routing code is working correctly.

// ... omitted (see above)

.serve(|| {
    use usher::http::HttpRouter;
    use usher::prelude::*;

    // Construct our HTTP router using only a static parser.
    let mut router = HttpRouter::new(vec![
        Box::new(StaticParser)
    ]);

    // Register a handler on the "/" path, for GET requests.
    router.get("/", Box::new(|req: Request<Body>| {
        let message = format!(
            "Received {} request on path {}",
            req.method(),
            req.uri().path()
        );

        Response::new(message.into())
    }));

    // ... omitted (see above)
})

// ... omitted (see above)

This creates a new HTTP router using static matching as the only matching type. As we're using simple functions as our handlers, the compiler is able to infer the types of the values stored in the router. You could provide a more complicated type to allow for different types of handlers on the same router though.

Once we have our tree created and configured (however you wish to go about it), all that remains is to bake it into the actual service function, rather than just returning a 404 all the time. We will need to keep the 404 code around though, because that's what we'll use in the case the request cannot be routed to a handler:

// Create our service function to route requests.
service_fn_ok(move |req: Request<Body>| {
    // First we have to extract the method and path.
    let method = req.method();
    let path = req.uri().path();

    // Then we look for our handler in the router.
    match router.handler(method, path) {
        // If a handler is found, we just have to call it!
        //
        // In our example we don't care about any captures,
        // but this is where handling for those would also occur.
        Some((handler, _captures)) => handler(req),

        // If no handler matches, keep the 404 response!
        //
        // This acts as a default response to pass back when we
        // can't find a handler to deal with a request.
        None => {
            let mut response = Response::new(Body::empty());
            *response.status_mut() = StatusCode::NOT_FOUND;
            response
        }
    }
})

At this point the Hyper server will route requests to your configured route handlers, and return the responses created by them. All that's required to add extra routes is to attach them to the router in the same way we did with the example route. A full code example is available in the repository, so make sure to check it out and play around with it. The one in the repository also includes a demonstration of working with path parameters, which we skipped over in the example above.

Request for Feedback

And... that's about it! We're using Usher at work in two web services to great effect; we have yet to observe any obvious issues or overhead (even if it is early days). That said, if you find any issues or have any suggestions, please file issues or PRs on the GitHub repository. As this is my first "real" library written in Rust, I'm particularly interested in any feedback related to the API and whether it's idiomatic or not. Additionally, if you have any suggestions on how we can lower the footprint of Usher, or increase the performance, please do let me know. Finally, if you do use Usher in your projects, please share them!