Max Polun's blog

Everything I know about tests in rust

I see testing questions pop up quite a bit on /r/rust. Rust’s testing is rather different from testing in other languages, mostly because the language is low-level and highly static. I’m going to cover a lot of material that you can also find in the official rust documentation; sometimes just having a different description of the same thing makes it easier to understand. I also have some experience writing web services in rust, so I’ll have solutions to some of the problems specific to that perspective. I’m also going to go beyond how to test, and talk about higher-level program structuring. I’ve set up a table of contents so you can skip to whatever part you’re interested in:

  • The basics
  • Integration tests
  • Doctests
  • Organizing tests
  • Writing testable code
  • What do I personally do?

The basics

Writing tests in rust is easy:

fn add(a: i32, b: i32) -> i32 {
  a + b
}

#[cfg(test)]
mod tests {
  use super::*;

  #[test]
  fn adding_works() {
    assert_eq!(add(2, 2), 4)
  }
}

Even if you don’t know rust you can probably figure out what this is doing at a high level, but let’s break down some of the boilerplate and why it exists:

The module declaration

#[cfg(test)]
mod tests {
  // ...
}

This creates an inline module: it’s the same as having mod tests and a file named tests.rs (or tests/mod.rs). You don’t need to use a module at all (you can have your tests in the same module where you define your code), but it’s convenient to define a module like this for several reasons. It doesn’t have to be named tests either, and you can have several test modules if you want.

The #[cfg(test)] is conditional compilation — it’s really just there to suppress warnings about unused code when building. Rust will warn you of any unused code, and anything in the tests module shouldn’t be used in your runtime code. This module isn’t some special pattern to rust, so to prevent rust from warning you, we don’t compile it unless it’s a test. I’m not going to go into the details, but #[cfg(some condition)] is how you do conditional compilation in rust. For now all we need to know is #[cfg(test)] means compile only when running tests. You can use this on other code — I’ll often define testing helper functions that are #[cfg(test)]. One last little thing I might as well mention is you can write this as

mod tests {
  #![cfg(test)] // <- notice the #! -- that means apply this attribute not to the next item, but to the parent: mod tests in this case
  // ...
}

We also have use super::*;. This isn’t required; it pulls everything from the parent module into the tests module, so we can refer to add instead of super::add. You can bring in individual items too (e.g. use super::add;), but I usually do use super::* to bring in everything. This is pretty much my only use of a glob import, except for a specifically designed prelude module from some library (I’ll occasionally do a use diesel::prelude::*). You can access non-pub items this way as well, but I never do that: if there’s something you want to test it should probably be in a module and export a public API (at least pub(crate)), and reaching into the private innards of some data is a bad practice.

The test itself

OK, now on to the test proper. As a reminder, here’s what it looks like:

#[test]
fn adding_works() {
  assert_eq!(add(2, 2), 4)
}

We’ve got the #[test] attribute that tells cargo test that this function should be run when testing. The function should take no arguments and return no value. If the function panics the test is a failure, otherwise it’s a success. Rust comes with a few assert! macros that will panic, depending on your defined conditions:

  • assert!(boolean, optional message) will panic if the boolean is false, and will print the message if you gave one. Technically this is all you need.
  • assert_eq!(param1, param2, optional message) will panic if param1 != param2 using PartialEq (so you can use it with e.g. floats, though that’s probably not what you want). It’ll print out each value on failure, but won’t do any fancy diffing; I’ve started using the pretty_assertions crate to get better error messages.
  • assert_ne!(param1, param2, optional message) works the same as assert_eq!, but panics if the two params are equal.
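
Here’s a quick sketch of all three in use:

#[test]
fn assert_macro_examples() {
  let sum = 2 + 2;
  assert!(sum > 0, "sum should be positive, got {}", sum);
  assert_eq!(sum, 4, "2 + 2 should equal 4");
  assert_ne!(sum, 5);
}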

Test failure on panic is good because then you can use .unwrap() as an assertion. A test like

let unit_under_test = create();
unit_under_test.do_stuff().unwrap()

is a totally valid test. In fact, tests are the only time I wish .unwrap() were a little more convenient. They’re also the only time I wish Strings were a bit more convenient to create.

You can also have a test that should panic; just add a #[should_panic] attribute to it, optionally with an expected substring of the panic message:
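
#[test]
#[should_panic(expected = "index out of bounds")]
fn out_of_bounds_panics() {
  let v: Vec<i32> = vec![];
  let _ = v[0]; // indexing an empty Vec panics, so this test passes
}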

Integration tests

Compilation of integration tests

So far we’ve just been dealing with what rust calls unit tests (though they don’t need to be true unit tests). These are all compiled into one big crate together with the non-test code.

 Unit Test binary crate
┌────────────────┐
│                │
│  Non-test code │
│                │
├────────────────┤
│                │
│ Tests          │
│                │
└────────────────┘

For integration tests, cargo compiles your crate not as a test (#[cfg(test)] code will not be compiled in), and each file in the tests/ folder is compiled as a separate binary test crate. It’s common to share some code between the integration test files with a common.rs file that gets included in each test binary via mod common;.

┌────────────┐
│            │
│ tests/a.rs │   link
│            ├────────────────────┐
├────────────┤                    │
│ common.rs  │                    │
└────────────┘                    │


┌────────────┐               ┌───────────┐
│            │               │           │
│ tests/b.rs │    link       │ Production│
│            ├──────────────►│  code     │
├────────────┤               │           │
│ common.rs  │               └───────────┘
└────────────┘                    ▲

┌────────────┐                    │
│            │    link            │
│ tests/c.rs ├────────────────────┘
│            │
├────────────┤
│ common.rs  │
└────────────┘
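
In code, the shared-helpers pattern looks roughly like this (the helper and the TEST_DATABASE_URL variable are hypothetical):

// tests/common.rs -- compiled into each test binary that declares `mod common;`
pub fn test_db_url() -> String {
  std::env::var("TEST_DATABASE_URL")
    .unwrap_or_else(|_| "postgres://localhost/test".to_string())
}

// tests/a.rs -- its own test crate, using only the library's public API
mod common;

#[test]
fn some_integration_test() {
  let url = common::test_db_url();
  assert!(url.starts_with("postgres://"));
}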

Compiling this many crates can be slow if you have a lot of tests. For a small, single project it probably doesn’t matter too much, but when it gets big, there are a few things you can do to speed up compilation:

  1. Any generic code exposed via your library crate will be instantiated once per type (which is normal) and once per test crate (which you might not be aware of), so anything that reduces the amount of code instantiated for each generic usage helps. For example, you can have a non-generic function do most of the heavy lifting, with a thin generic wrapper that does type conversions; see the sketch after this list. This way most of your code is generated once for the lib crate, and just the wrapper is generated for each test crate.
  2. Keep your common module small since it’ll go into each test crate.
  3. As a last resort, just have a single tests crate: one file directly under tests/, with all other files in a subdirectory, included in the test crate via mod.
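
Here’s a minimal sketch of the wrapper pattern from point 1, modeled on how functions like std::fs::read are written (load_config is a hypothetical example):

use std::path::Path;

// Generic wrapper: only this thin shim is instantiated per call site (and per test crate)
pub fn load_config<P: AsRef<Path>>(path: P) -> String {
  load_config_inner(path.as_ref())
}

// Non-generic worker: compiled once, in the library crate
fn load_config_inner(path: &Path) -> String {
  std::fs::read_to_string(path).unwrap_or_default()
}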

How to integration test

In these integration tests you can either

  1. use your library crate just as your library consumers would
  2. run your binary crates.

Option 1 is always preferred — calling library functions is way faster and easier compared to starting a new process for every test. In fact, I almost never test a binary directly. I’ll define all functionality in a library crate and the binary will just pass the arguments into that library crate.
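
A minimal sketch of that split (mylib and run are hypothetical names):

// in lib.rs -- all real functionality lives here, where integration tests can call it
pub fn run(args: &[String]) -> Result<(), String> {
  // ... actual program logic ...
  Ok(())
}

// in main.rs -- the binary is just a thin shim over the library
fn main() {
  let args: Vec<String> = std::env::args().collect();
  if let Err(err) = mylib::run(&args) {
    eprintln!("{}", err);
    std::process::exit(1);
  }
}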

An example

I’ve written a few actix-web web servers, and a pattern I’ve settled on for integration tests is to have my lib crate define a function that sets up all of the routes via configure. This way I have minimal boilerplate in my binary crate for the server, and my tests can pass in parameters for e.g. the database. You can also return an http server, but I’ve found getting the types right for that to be tricky and somewhat fragile, so I use configure, which has no return value, so you don’t have to worry about the types :-).

// in lib.rs
pub struct StaticConfig {
  database_url: String
}
pub fn config_api(cfg: &mut web::ServiceConfig, static_config: StaticConfig) {
  let db_pool = get_pool_from_url(static_config.database_url);
  cfg.service(
    web::scope("/")
      .data(db_pool)
      .service(
        web::scope("/users")
          .route("", web::get().to(users_index))
          .route("", web::post().to(users_create))
          .route("/{user_id}", web::get().to(users_get_by_id))
          .route("/{user_id}", web::get().to(users_get_by_id))
      )
  )
}

// in main.rs
#[actix_rt::main]
async fn main() {
  HttpServer::new(move || {
    App::new()
      .wrap(Logger::default())
      .configure(|cfg| {
        let database_url = std::env::var("DATABASE_URL").expect("DATABASE_URL");
        config_api(cfg, StaticConfig {
            database_url
          })
      })
  }).bind("0.0.0.0:8000")
    .unwrap().run().await.unwrap()
}

// in tests/users_test.rs
#[actix_rt::test]
async fn test_can_create_user() {
  let db = common::test_db();
  let mut app = test::init_service(
    App::new().configure(|cfg| {
      config_api(cfg, StaticConfig {
        database_url: db.database_url.clone()
      })
    })
  ).await;
  let req = test::TestRequest::post()
    .uri("/users")
    .set_json(&json!({
      "username": "MaxPolun"
    }))
    .to_request();

  let user = test::read_response_json::<User>(&mut app, req).await;
  assert_eq!(user.username, "MaxPolun");
}

It might be possible to have a setup like this using the #[get(route)]/#[post(route)] macros, but I haven’t tried. The boilerplate is slightly annoying, but I’d rather have the boilerplate and be able to set up integration tests the way I want. I haven’t looked at every rust web framework, but you should be able to do an equivalent setup in any of them, and something similar for CLIs or GUIs.

Doctests

I’ll keep the section on doctests short. The basic idea is that any code examples in your documentation are compiled and executed as tests, in order to make sure they all stay valid. It’s a good idea, but nontrivial doctests are annoying and awkward to write. They serve their purpose of making sure examples stay working, but I don’t think they’re useful as a testing strategy, except for the smallest libraries.

Here’s an example:

pub struct Example {}

impl Example {
  /// Make a new Example
  ///```
  ///assert_eq!(Example::new(), Example {})
  ///```
  pub fn new() -> Self { Self {} }

  /// Here's when they get annoying
  ///```
  /// # #[whatever_runtime::async_main]
  /// # async fn main() -> Result<(), Box<dyn Error>> {
  /// # let example = Example::new();
  /// assert_eq!(example.more_complex().await?, 5);
  /// # Ok(())
  /// # }
  ///```
  pub async fn more_complex(&self) -> Result<usize, SomeError> { todo!() }
}

Notice the triple layer of quoting (///, ```, and #) in the more complex example. It’s not a problem per se, but it adds too much overhead to really be useful for tests.

Consider doctests as a way to ensure your docs have working examples, not as part of your tests.

Organizing tests

Setup and teardown

One thing rust tests do not have is setup and teardown. You see these in most testing frameworks, but there’s no feature for this in rust. Now there are tools that will add this feature to rust tests, but you don’t really need it. All you need is regular rust functions, structs, and traits.

Setup is the easy part. Just define a function:


fn setup() {}

#[test]
fn test_with_setup() {
  setup();
  assert!(true)
}

You can write some reusable setup helpers. They’re just regular functions, so you can pass in params, return values, etc. Pattern matching is helpful here. I’ll often have a module named test_util for test functions that are used all over my project (e.g. getting a test db connection).
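
For example, a setup helper can return whatever the test needs, and destructuring keeps the test body tidy (TestContext and the URL here are hypothetical):

struct TestContext {
  db_url: String,
  user: String
}

fn setup_with_user(name: &str) -> TestContext {
  TestContext {
    db_url: "postgres://localhost/test".into(),
    user: name.into()
  }
}

#[test]
fn test_with_context() {
  let TestContext { db_url, user } = setup_with_user("alice");
  assert!(db_url.contains("test"));
  assert_eq!(user, "alice");
}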

Now you can’t just use functions for teardown. If the test fails there will be a panic and it won’t be executed. Most of the time this is fine actually — since you’re not mutating a shared variable in your setup, and generally making new variables for each test — teardown is something you don’t usually need. Sometimes, however, you need to manage an external resource (e.g. database connection, tempfile) and you don’t want to either use up that resource when running your tests, or accidentally couple your tests together. The solution is to just use standard rust idiom again, but here you need RAII implemented with a struct with a Drop impl.

struct TestDb {
  database_url: String
}

impl Drop for TestDb {
  fn drop(&mut self) {
    Database::connect(&self.database_url).truncate_all_tables() // or whatever your db cleaning method is
  }
}

fn get_db() -> TestDb {
  TestDb { database_url: get_the_url_somehow() }
}

#[test]
fn integration_test_using_db() {
  let db = get_db();
  let server = start_server(&db.database_url);

  request_to_server().unwrap()
  // database will be truncated here, making it ready for the next test
}

Drop will be called even if there’s a panic. If you panic again during the unwind (a double panic), the process aborts and cleanup is skipped, but that’s a possibility in most testing frameworks: if you abort early, cleanup won’t run. You should still try to avoid needing teardown if you reasonably can. For unit tests hitting a database, you can often just create a new connection and start a test transaction instead; however, that doesn’t work when you’re writing an integration test that might make multiple requests.

Dealing with big test modules

When you’ve got a large module and a lot of tests, it can be pretty annoying to switch back and forth between the top and the bottom of a file. IDE features can help, but I still find that it can get overwhelming. There are several options for splitting it up:

  • Multiple test modules

Here you have multiple things you are testing in the parent module, and you put a test module after each one. Even if you just have one struct in the file, if it has impls for multiple traits you can split it up this way, which keeps your tests and non-test code relatively close:

struct User {
  username: String
}

impl User {
  fn new() -> Self {...}
}

#[cfg(test)]
mod tests_inherent {
  use super::*;

  #[test]
  fn new_returns_a_valid_user() {
    ...
  }
}

impl Display for User {
  ...
}

#[cfg(test)]
mod tests_display {
  use super::*;

  #[test]
  fn display_stringifies_a_user_just_right() {
    ...
  }
}

This works, but it doesn’t scale up that much better than just putting one big mod tests {} at the bottom of the file. I only use it when there’s a bunch of traits I’m implementing for a single struct, and each trait is mostly independent and each trait has a non-trivial implementation.

  • Multiple non-test modules

You can also split your code up into multiple modules on the production code side of things, and have a mod tests {} for each one of those. This works best when each thing you’re testing is logically, but not physically related (e.g. each route handler for the same resource makes more sense than multiple methods on a single struct).

If you have a struct with several methods, and you’d like separate files to test each method, you can have multiple impl blocks like so:

mod method_a;
mod method_b;

struct User {
  username: String
}

// in method_a.rs
use super::User;

impl User {
  pub fn a(&self) -> u32 {
    12345
  }
}

#[cfg(test)]
mod tests {
  use super::*;

  #[test]
  fn test_a() {
    let user = User { username: "testuser".to_string() };
    assert_eq!(user.a(), 12345)
  }
}

// in method_b.rs
use super::User;

impl User {
  pub fn b(&mut self) {
    self.username += "method b"
  }
}

#[cfg(test)]
mod tests {
  use super::*;

  #[test]
  fn test_b() {
    let mut user = User { username: "testuser".to_string() };
    user.b();
    assert_eq!(user.username, "testusermethod b");
  }
}

If you are using a trait as a testing seam, you may end up with traits that have too many methods, most of which you ignore in most tests. In this case, you can split up your trait by method, and further split it into files:

mod method_a;
mod method_b;

struct UserImpl {
  username: String
}

// in method_a.rs
use super::UserImpl;

pub trait MethodA {
  fn a(&self) -> u32;
}

impl MethodA for UserImpl {
  fn a(&self) -> u32 {
    12345
  }
}

// in method_b.rs
use super::UserImpl;

pub trait MethodB {
  fn b(&mut self);
}

impl MethodB for UserImpl {
  fn b(&mut self) {
    self.username += "method b"
  }
}

This adds a lot of boilerplate though: a trait definition for each method, and each consumer of these traits must say which methods it wants to use. It’s something to keep in mind if you really need it, but most of the time you won’t.

  • Separate test file(s)

Nothing forces you to have your test modules inline; you can define a test module in a separate file via

#[cfg(test)]
mod tests;

I like the default of tests in the same file as the production code, but separate files may sometimes be simpler. The flexibility of rust’s test system is one of its nice features: you can use all of the tools you have for structuring normal rust code in your tests. I’ve never done this, but I could see it when keeping everything in one implementation file makes sense (shared private functions, etc.) yet there are several disparate tests that are mostly unconnected. I could also see doing it for property-based testing (aka quickcheck).

Writing testable code

The goals of writing testable code are to write your code such that your unit tests:

  1. Are isolated to the single unit under test
  2. Are fast
  3. Are as close to reality as possible without compromising the first two items

Why are these the goals? So you can write many unit tests and get fast feedback when you change code.

  1. Isolation means you can quickly determine the cause of a failing test
  2. Fast tests mean you can run all of your tests often
  3. Reality (as much as possible) is to make sure your tests are useful for finding regressions.

Not all code (and not all projects) need to be testable in this sense. Sometimes it’s fine to have to test the whole system, but that is less true the larger your system is. If you’ve got a microservice with 2 routes, having to do all testing through the HTTP interface isn’t too much of a problem, but in a larger service with 8, 10, 12 routes, you’re going to start to hate firing up the test suite if everything goes through HTTP and hits the database. Better to have a minimal set of full-system integration tests, and do the detail testing of edge-cases in isolated unit tests. Of course the problem is that small systems often evolve into large systems without updating the test strategy.

Seams

In the book Working Effectively with Legacy Code, Michael Feathers defines a seam as a place where you can change the behavior of your code without editing it. This is important for writing tests because you want to have different behavior in tests than your production code in order to isolate your unit.

Parameter seams

The simplest seam is a parameter: you can send different parameters into a function or method and get different results. This is why pure code is easy to test; you can pass in whatever parameters you like and assert on the results:

fn add(a: i32, b: i32) -> i32 {
  a + b
}

#[test]
fn test_add() {
  assert_eq!(add(2, 2), 4)
}

However

  1. Sometimes your dependency is not a parameter, but is accessed globally (whether a global variable, or a global function)
  2. Just because you take a parameter doesn’t mean it can be swapped out.

So a plain parameter is fine for mostly-pure data: numbers, strings, simple enums, and Vecs/arrays/HashMaps/structs of such simple data with no methods, as long as it isn’t nested too deeply (you can test deeply nested pure data, it just becomes unwieldy). One testing strategy is to move most of the business logic that needs careful testing into code that acts on pure data like this, and mostly not unit test your impure code, covering the impure part with integration tests alone. This works well if your program mostly reads data at the start, processes it, and then returns output. However, many programs require a lot of back-and-forth interaction with state (either internal, like caches or in-memory datastores, or external, like a filesystem, database, network, or UI), and for those programs, testing stateful interactions only via integration tests starts to look like not doing unit testing at all.

I’d recommend this pure-data approach mainly for things like compilers: each step in the process takes something as input and produces an output, mostly purely (sometimes multiple steps take and output the same format, sometimes they output a different format). In a case like that you can test all of your difficult logic with pure data, and just verify that your I/O works with a much smaller, simpler set of integration tests.
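
A tiny sketch of the idea: a pure tokenizing step, tested entirely with plain data (Token and tokenize are hypothetical):

#[derive(Debug, PartialEq)]
enum Token {
  Number(i64),
  Plus,
}

// Pure function: plain string in, plain data out, so there's no I/O to mock
fn tokenize(input: &str) -> Vec<Token> {
  input.split_whitespace().map(|word| match word {
    "+" => Token::Plus,
    n => Token::Number(n.parse().expect("not a number")),
  }).collect()
}

#[test]
fn tokenizes_addition() {
  assert_eq!(tokenize("1 + 2"), vec![Token::Number(1), Token::Plus, Token::Number(2)]);
}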

Trait seams

So given we’re taking parameters, what’s the best way to swap out a production dependency for a test double? As usual with rust, the answer is traits.

Sometimes we can use traits that already exist. For example, let’s say that you’ve got your adder already, and you want to test that you can print what you’ve added. Your first code looks like:

fn add_and_print(a: i32, b: i32) {
  println!("{}", add(a, b))
}

This works ok, but you can’t really test it easily. println! takes an implicit dependency on stdout, so let’s make that explicit:

use std::io::{Stdout, Write};

fn add_and_print(mut stdout: Stdout, a: i32, b: i32) {
  write!(stdout, "{}\n", add(a, b)).unwrap()
}

Now our dependency is explicit, but we still can’t test it very well. How do we send a test double in place of stdout? Well, the write! macro works in terms of the std::io::Write trait, so we can use any type that impls Write:

fn add_and_print<Out: Write>(out: &mut Out, a: i32, b: i32) {
  write!(out, "{}\n", add(a, b)).unwrap()
}

#[test]
fn test_add_and_print() {
  let mut buf = Vec::new(); // Vec<u8> impls `std::io::Write`
  add_and_print(&mut buf, 2, 2);
  assert_eq!(buf, b"4\n");
}

Now we can easily test this code doing IO by swapping stdout out for a different implementation: we have our seam. This uses a compile-time trait seam (generics), but we can do it at runtime too:

fn add_and_print(out: &mut dyn Write, a: i32, b: i32) {
  write!(out, "{}\n", add(a, b)).unwrap()
}

#[test]
fn test_add_and_print() {
  let mut buf = Vec::new();
  add_and_print(&mut buf, 2, 2);
  assert_eq!(buf, b"4\n");
}

The code even looks the same, outside of the type signature.

For your own traits, you can write structs yourself to implement different scenarios you’re testing, or use a mocking library:

use std::cell::RefCell;
use std::error::Error;
use std::rc::Rc;

trait UserRepository {
  fn create(&self, username: String) -> Result<User, UserCreateError>;
  fn get_all(&self) -> Result<Vec<User>, GetUserError>;
}

// for use in cases where you're testing the success result:
struct FakeUserRepository {
  users: Rc<RefCell<Vec<User>>>
}

impl UserRepository for FakeUserRepository {
  fn create(&self, username: String) -> Result<User, UserCreateError> {
    let user = User::new(username);
    self.users.borrow_mut().push(user.clone());
    Ok(user)
  }
  fn get_all(&self) -> Result<Vec<User>, GetUserError> {
    Ok(self.users.borrow().clone())
  }
}

// when you're testing errors
struct ErrorUserRepository {
  // RefCell<Option<...>> because downcast consumes the Box, so each stored error is single-use
  err: RefCell<Option<Box<dyn Error>>>
}

impl ErrorUserRepository {
  pub fn set_err(&mut self, err: Box<dyn Error>) {
    *self.err.get_mut() = Some(err)
  }
}

impl UserRepository for ErrorUserRepository {
  fn create(&self, _username: String) -> Result<User, UserCreateError> {
    Err(*self.err.borrow_mut().take().unwrap().downcast().unwrap())
  }
  fn get_all(&self) -> Result<Vec<User>, GetUserError> {
    Err(*self.err.borrow_mut().take().unwrap().downcast().unwrap())
  }
}

// For mixed cases you'll need some sort of custom implementation that tracks calls
// or use a library -- here's an example using https://github.com/DavidDeSimone/mock_derive
#[test]
fn test_mixed() {
  let mock = MockUserRepository::new();
  let method = mock.method_get_all()
    .first_call()
      .set_result(Ok(vec![User::new(), User::new()]))
    .second_call()
      .set_result(Err(DatabaseError));
  mock.set_get_all(method);
  assert_eq!(do_something_with_users(mock), Err(DatabaseError))
}

Traits are the go-to seam for testing in rust; they are explicitly designed to let you swap out implementations. Generally in rust, compile-time polymorphism (generics/impl Trait) is preferred over runtime polymorphism (dyn Trait), though both have their place. There are a few things to keep in mind though:

  • For now, traits can’t be async (or use impl Trait)

You can still return a Future though, and there’s the async_trait crate that will translate async methods in a trait into methods returning a Pin<Box<dyn Future + Send + 'async_trait>>. This mostly works like you want, but it has a bit of runtime cost (a heap allocation and dynamic dispatch per call). The cost is pretty reasonable for the type of things I typically use rust for, but it might matter in performance-critical or highly constrained environments.
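
For illustration, here’s roughly what using async_trait looks like (UserStore and InMemoryStore are hypothetical names):

use async_trait::async_trait;

#[async_trait]
trait UserStore {
  // the macro rewrites this into a method returning a boxed future
  async fn count(&self) -> usize;
}

struct InMemoryStore;

#[async_trait]
impl UserStore for InMemoryStore {
  async fn count(&self) -> usize { 0 }
}

If you want to avoid the overhead of async_trait you can always use an associated type like so: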

trait MyAsyncTrait {
  type Future: Future<Output = i32>;

  fn async_method(&mut self, param: i32) -> Self::Future;
}

This adds some boilerplate, and it’s a bit of a PITA every time you need to implement it, but no runtime overhead. If a specific implementation (e.g. your test double) doesn’t care about the small performance cost you can even use async/await syntax with this:

struct Factorial;

impl MyAsyncTrait for Factorial {
  type Future = Pin<Box<dyn Future<Output = i32>>>;

  fn async_method(&mut self, param: i32) -> Self::Future {
    Box::pin(async move { // async block syntax; `move` so the block owns `param`
      async_function(param).await
    })
  }
}

One thing to keep in mind for these sorts of traits though, is that the implementor can do work before returning the future. e.g.

struct Factorial;

impl MyAsyncTrait for Factorial {
  type Future = Ready<i32>; // Ready is a future that immediately resolves to its value

  fn async_method(&mut self, param: i32) -> Self::Future {
    ready(factorial(param)) // <- computes the factorial before returning the future
  }
}

This is something to be aware of, but it’s mostly fine. I’ve only seen issues when interacting with tracing.

  • You still need to wire up your real impl somehow

If you’re using traits as testing seams pervasively, you’ll be passing around a lot of formerly implicit dependencies as explicit dependencies, and all of your code will be generic. Getting it all set up correctly in production can mean a lot of boilerplate, and it can sometimes be difficult to do correctly.

Dependency injection frameworks can help with this, though I haven’t used any in rust. For a typical web framework, you can set the concrete types at the point where you register a route handler, e.g.

      .route("", web::get().to(users_index::<RealUserRepository>))

However this can break down if the type needs to be constructed (if you can just call Default::default(), then this might be fine). Dependency injection is a huge topic and this post is long enough as is, so I’ll leave this for now.

Conditional compilation seams

We’ve seen conditional compilation before with #[cfg(test)]. We can swap in an alternate type in tests like so:

pub struct RealUserRepository {
  conn: DbConnection
}

pub struct FakeUserRepository {
  users: Rc<RefCell<Vec<User>>>
}

#[cfg(test)]
pub type UserRepository = FakeUserRepository;

#[cfg(not(test))]
pub type UserRepository = RealUserRepository;

Then you can still write tests against RealUserRepository directly, but all of the code that depends on UserRepository will always get a fake in unit tests (though notably not in integration tests).

This has some upsides and downsides. You don’t have to define a trait (or deal with the current limitations of traits), and it’s simple to wire up: just wire it up as normal. You can use functions instead of structs, and you don’t have to change your code in any way.

On the downside, now you can only have one test double for all of your tests. This means it has to be somewhat complex and full-featured so you can test all scenarios. There are libraries to help with this; faux is one, though I’ve never used it.

I think this method shines if you have a natural uniform interface in your code. What do I mean by a uniform interface? One example is sending messages to actix actors: you can send many types of messages via the addr.send(msg) interface, and get different results based on what message you sent. This means that if you use actix actors to do a lot of the heavy lifting in your app, and you have a conditional-compilation seam for actors, you can mock any communication with them using the same test double. Actix even provides this test double via Mocker. So you could have

#[cfg(not(test))]
type DbActor = RealDbActor;
#[cfg(test)]
type DbActor = Mocker<RealDbActor>;

Then you can write a test like

let mock_actor: Addr<DbActor> = Mocker::mock(Box::new(|msg, _ctx| {
  let _msg = msg.downcast_ref::<CreateUser>().unwrap();
  let result: Result<User, UserCreateError> = Ok(User::new());
  Box::new(Some(result))
})).start();
// create_user_route is the route handler and is the unit under test
let response = create_user_route(
  web::Data::new(mock_actor),
  web::Json(UserCreateRequestBody { username: "test".to_string() })
).await.unwrap();
assert_eq!(response.status(), http::StatusCode::OK);
assert_eq!(parse_http_response::<User>(&response).username, "test");

How does this work? Internally, the Mocker uses the Any type to allow dynamic type checking via the downcast family of methods. If your types don’t match up you’ll get a panic and a failed test. This works great for tests: it’s very flexible and lets you easily mock anything you send to an actor without messing with the types of your non-test code. It’s a bit boilerplate-y, but that can be fixed with a small helper. The same test above would be

// set up the actor, assert that the right message has been sent, and return the value your test needs
let mock_actor = simple_mock_actor::<RealDbActor, CreateUser, _>(|_msg| { Ok(User::new()) });
let response = create_user_route(
  web::Data::new(mock_actor),
  web::Json(UserCreateRequestBody { username: "test".to_string() })
).await.unwrap();
assert_eq!(response.status(), http::StatusCode::OK);
assert_eq!(parse_http_response::<User>(&response).username, "test");

What do I personally do?

I try to structure as much of my code as possible around pure data, so I don’t need mocking: I just have simple functions from input to output. However, web services are like 90% IO, so I usually still have to mock quite a bit. I tend to favor actix actors plus conditional compilation to swap the real actor for the mocker. This lets my database interaction code stay fairly simple: I usually use diesel, which is synchronous, and actix lets me make the database access async without having to worry about making my database layer mockable. My request handlers communicate with the mock actor for the database, but actix isn’t only good for database access; you can use it for any service you want to mock out and/or make async from synchronous. I’m sure other actor systems work well too, and threads plus a channel can do the same thing, but since I usually use actix_web, actix is a natural choice.

Comparing Event sourcing with redux

Background

I’ve long thought that event-sourcing (ES) was an interesting way to structure a back-end application, and I use redux very often in my day job as a front-end developer. Recently I’ve been looking more closely at event-sourcing and I think putting the information I’ve learned into front-end terms might be useful for others.

This is mostly for front-end engineers interested in event-sourcing. If you don’t know redux jargon, you might find this hard to follow.

What are these things?

Redux is a framework for managing state in javascript UI applications, generally used with react on the web, though it can work in other situations as well. Code outside of redux (your react app, for example) dispatches actions, which are sent to reducers, which return a new copy of the application state. How state is transferred out of redux depends on what you’re hooking it up to, but with React (the most common situation by far) connected components subscribe to the redux state and use selectors to choose what parts of the state they’re interested in. Workflows that require more than just dispatching an action can use middleware, which can observe all actions (and the state when actions are dispatched) and dispatch additional actions based on that information, or on external state. Common middleware includes the thunk middleware, which lets you write functions that can dispatch multiple actions asynchronously; sagas, which are coroutines that react to actions being dispatched by dispatching other actions or executing code; and observables, which allow similar things to sagas but work via rxjs observables.

ES is a bit less well defined; it’s a pattern, not a framework. But generally your ES system receives commands; the system determines which aggregate receives each command and reconstructs that aggregate’s state by replaying all events related to it (the aggregate can start from the beginning of its existence, or from a snapshot, which requires fewer events to be replayed). The aggregate can then validate the command and emit new events, which are saved to the event store and emitted on the event bus. Queries are done via a projection, which is a current model of the state in the form that lets the query respond most efficiently. Workflows that require coordination between multiple aggregates use sagas, which listen to events on the bus and can issue additional commands as required by the business logic, or react to external state.

Big picture

Both ES and redux can be thought of as ways of implementing the overall Command/Query Responsibility Segregation (CQRS) pattern — in this pattern you have totally different objects to make state mutations (Commands) and to read state (Queries). Making this separation gives you a lot of flexibility since often you want different guarantees from commands and queries. Both also have a similar strategy for making this separation — state changes take place using events. Events are nice for this purpose because they’re a pure data structure — you can easily serialize them, version them, record them, etc. Additionally both are fairly amenable to a purely functional approach, though neither strictly requires it.

There are some big differences though: redux mutates state in response to new events, but doesn’t store the events, while ES stores events as the single source of truth (there may be other stored state, but it’s disposable and can be rebuilt from the events). This difference translates fairly directly from their contexts, since redux is a framework for front-end UI apps and ES is generally a server-side pattern.

The biggest difference though is that ES is way more opinionated than redux. Redux is a general state-management system that can be used in a wide variety of situations — the only inherent downside is boilerplate. ES however is explicitly meant for distributed, eventually-consistent systems. You can still have an event stream (for e.g. auditing) without doing ES, but ES by design will lead to certain high-level trade offs which may or may not be reasonable. ES does bring advantages with that though — an ES system should be able to be distributed, and unlike keeping a separate event stream you know you can always process all events correctly since you need to be able to for your system to work.

Bad analogy time

If you want a short version of how to translate redux to ES, here’s a rough correspondence

Redux                                                ES
-----                                                --
Action                                               Event
Reducer                                              Aggregate
Selector                                             Projection
Any external code that can dispatch multiple
actions (thunks, hooks, or React callbacks)          Command
Sagas                                                Sagas

Let’s go into a bit more depth on why these are all bad correspondences:

Events: unlike actions, events are not the inputs to the system; they are the outputs. In redux, we dispatch actions in order to update the redux state. This is kind of like how an aggregate updates its internal state in response to events, but events are added to the stream by issuing commands, not by just any code in the application. We especially don’t want aggregates creating new events when applying events, because then we’d get new events added every time we replay the stream.

Aggregates: aggregates are kind of like reducers, but one big difference is that aggregates only receive events that are scoped to them. A reducer-like function is a totally reasonable way to implement an aggregate, but keep in mind that an aggregate determines the state of just a single instance of its type, while a reducer can manage many objects of that type (an aggregate doesn’t have to correspond to a database table or a single class, so an aggregate can have multiple children: all the comments on a blog post, for example).
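
To make the reducer comparison concrete, here’s a minimal sketch in rust (all names hypothetical): an aggregate’s state is just a fold over its events, exactly like a reducer folding over actions.

enum PostEvent {
  Created { title: String },
  CommentAdded { body: String },
}

#[derive(Default)]
struct PostAggregate {
  title: String,
  comments: Vec<String>,
}

impl PostAggregate {
  // a pure (state, event) -> state function, just like a reducer
  fn apply(mut self, event: &PostEvent) -> Self {
    match event {
      PostEvent::Created { title } => self.title = title.clone(),
      PostEvent::CommentAdded { body } => self.comments.push(body.clone()),
    }
    self
  }

  // rebuild current state by replaying the aggregate's event log
  fn replay(events: &[PostEvent]) -> Self {
    events.iter().fold(PostAggregate::default(), PostAggregate::apply)
  }
}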

Projections and selectors are pretty similar: they’re cached versions of the true data, in a form close to what the UI wants. The difference is just that, like everything in redux, the selector is in-memory (and memoized if using reselect), while a projection is usually stored in some sort of datastore (optimized to be fast to retrieve).

Commands are the biggest thing missing from redux: redux actions are a cross between events and commands, but since commands can produce multiple events they’re also like thunks (or, if you don’t use thunks, like callbacks that dispatch multiple actions). Much like thunks, commands are a natural place for side-effects.

Dealing with background tasks, or issuing commands in response to events, is a concern in both worlds, and sagas are a solution that can be used with both. Redux sagas are clearly influenced by sagas as used in ES, which is probably why the correspondence is so close.

Well, it's been a while

I haven’t updated this since 2016, and it’s 2021 now. Hmm… was there something in the intervening years that maybe stressed me too much to find the time to post? The world may never know. Anyway, I’ve been meaning to revamp this as a Gatsby site for a while. Nothing wrong with jekyll, but I like working with react.

On the future of computing

I’ve been listening to the accidental tech podcast recently, and one of John Siracusa’s hobbyhorses is that something like iOS has to be the way forward for computing, as opposed to classic desktop OSes. While I think everyone agrees that desktop OSes aren’t going to get much more popular than they are now, my problem with this statement is that it’s so vague it doesn’t mean anything. Being like iOS can mean many things, and the only really concrete thing he throws out is no exposed filesystem, which I doubt is going to be an essential part of the future, since iOS is the only OS that doesn’t expose the filesystem. I mean, is the future more like iOS? chromeOS? android? windows 10? All of these seem possible to me. And iOS being the most extreme of these in many ways makes me suspect it’s not what the future is going to look like.

So it annoys me when he says this (especially coming from someone usually so precise), but just posting a complaining blog is boring, so I’m going to break down what separates iOS from desktop OSes and how important I think each difference is to the future. My perspective is as a developer: how are you going to use these platforms to write code? Right now you really can’t do that in general on iOS, android, chromeOS, or windows store apps.

Touch

OK, so touch interfaces are definitely part of the future. But are a mouse and keyboard part of the future? I think keyboards probably are. They’re just so much more efficient than virtual keyboards for lots of text input that I can’t imagine them going away. Mice could go away, but if they do they’ll probably be replaced by pens or something else for precise input. Also keep in mind that touch can exist in a desktop OS; the only problem is legacy applications with touch targets that are too small. Touch is in some ways more accessible than mouse/keyboard, but also less accessible in other ways (compare gestures to keyboard shortcuts: how do you discover gesture shortcuts in the same way you can discover keyboard shortcuts now?).

I suspect that the platform of the future will come in all sorts of form factors, with different input devices: phones and tablets will use touch almost exclusively, laptops will have keyboards and possibly mice/touchpads, and convertible devices of various sorts will exist and support different devices at different times.

Sandboxing

This (like touch) can exist in desktop OSes; in fact a web browser is really a form of sandboxing, letting you run applications from the internet safely. Sandboxing is going to be part of the future since it’s part of the present. However, I do think having a way to escape the sandbox is important for lots of workflows (certainly developers need it). Maybe letting apps escape the sandbox should be hard enough that regular users won’t do it, but if you want developers on the platform of the future there needs to be a way to do various unsafe things. Maybe it even voids the warranty, but it should be possible.

Appstores

Most platforms have an appstore (at least one) at this point. They’re going to be with us for the foreseeable future. However, I’m not sure an iOS-style appstore, with people reviewing every version and banning whole classes of apps, is sustainable for the platform of the future. At the very least, for the platform to be suitable for doing real work, you need to be able to add custom apps. I believe iOS does have a provision for this for corporate apps, but I think long term there needs to be some option for non-corporate apps too, so that things like developer tools can be installed. I also think it’s totally inappropriate for the only source of software for a system to censor not just things that may be objectionable for legal reasons (porn, emulators, etc), but even applications centered around competing platforms. Remember that android magazines have been rejected from the appstore just for talking about a different platform. That is not really acceptable for the platform of the future.

Non user-visible filesystem

I’ll be short here: I think it was an interesting experiment not to have a user-visible filesystem on iOS, but I think it’s a failure. No other OS has replicated this aspect of iOS, and although the situation has greatly improved since the introduction of extensions in iOS 8, it’s still a major issue. How can you have a workflow where multiple apps access different files in the same project without being able to access the same shared files? You can copy between apps for small files, but that becomes a problem fast for large files, and putting all of the functionality in one big app is not feasible.

I don’t think the filesystem has to be exposed by default. Android has a user accessible filesystem, but you can use a lot of functionality without it. However if you need it, it’s there.

Fullscreen / tiled apps

This is as opposed to classic windowing systems. Now obviously you don’t want fully overlapping windows on a phone screen (or even a tablet screen, really; 2 apps tiled next to each other is about the limit for tablets), but I don’t think big-screen devices are going away. Once you get to a certain screen size you really do want to be able to overlap windows. Android recently showed off an experimental overlapping-windows mode, and I think that will get used and opens up the possibility of android laptops.

So I think that the platform of the future will be responsive: small screens will have one fullscreen app. Larger ones will be able to tile 2 or 3 apps, and then after a certain point overlapping windows.

No backwards compatibility

Apple has no problem killing off features and APIs it doesn’t want apps to use anymore. It’s possible some non-updated iOS appstore-launch apps still work, but there’s no guarantee. On the other hand, microsoft and google have done huge amounts of work to ensure backwards compatibility. This isn’t just a nice-to-have: to be the primary platform of the future, you can’t rely on apps getting updated, for better or worse. The stronger sandboxing of modern platforms hopefully makes changing things without breaking compatibility easier, but there’s still a level of responsibility a platform owner needs in order to avoid breaking all of their working (but maybe old) apps.

No peripherals

This is another area where I think future platforms will be more like current PCs than current mobile devices — you’ll need a lot of peripherals. However, hopefully at some point we get a better wireless protocol than bluetooth (or a future version of bluetooth that doesn’t suck at least), because that’s probably how most peripherals will connect.

Conclusion

Overall I think the main distinguishing feature of modern OSes is not their limitations, but how they’re written: they are sandboxed and have a more complex lifecycle than a classic Mac or Windows app — the OS can start or stop them at will. Over time mobile OSes will grow to have the functionality of classic OSes, but keep the security and power savings of their current incarnation.

The Two types of TDD

[This recent article](http://iansommerville.com/systems-software-and-technology/giving-up-on-test-first-development/) and its Hacker News comments have gotten me thinking about testing and TDD, specifically the different reasons why people do it. I think it helps to go back and look at the two “schools” of TDD to understand what each one was trying to get out of its tests and what the strengths of the two approaches are relative to each other (and relative to non-TDD methodologies).

First, a history lesson: TDD first originated in the [Chrysler Comprehensive Compensation project](https://en.wikipedia.org/wiki/Chrysler_Comprehensive_Compensation_System). In fact, many of the practices we now consider to be good agile practices came out of the “Extreme Programming” community that really found its footing on the C3 project. The TDD done by this project was focused on preventing regressions. This means the tests are fairly realistic and only test interfaces, not internal details. This is known as “Detroit school” TDD.

After TDD became more popular, a technique for doing TDD that relied heavily on test doubles (stubs, mocks and spies) was described in the book Growing Object Oriented Software Guided by Tests. In this methodology, tests were primarily a design tool. Since the authors of the book were both from London, this is known as the “London School”.

Most people doing TDD don’t consciously follow any particular school, but the Detroit school makes intuitive sense to most people: tests help prevent bugs, right? However, the Detroit school is prone to the types of problems described in the original article: you end up with lots of duplicate tests, tests need maintenance as much as code, and refactors that change the interface of components get harder and harder as there are more tests. Early design decisions get frozen into the codebase.

The Detroit school isn’t perfect, but it does shine at building reliable components. Since the tests are realistic and redundant, there’s a good chance you’ll know if something breaks. Internal refactors are also easy to do: only interfaces are tested, so anything that doesn’t change them is safe.

The London school avoids the problems of the Detroit school: you use mocks and dependency injection to avoid having any redundancy in your tests. The goal is that if you make one change, all breakage should be isolated to that one component. This makes your tests easy to maintain and encourages a certain type of design. However, because you have to specify the behavior of your dependencies via test doubles, refactors involve much more work. This is less of a problem than you’d expect: a London school practitioner would say that if you need to change your interface you should just throw your old code away and rewrite it. London school code is usually fast to write, so that’s a viable strategy. They do have end-to-end tests that check for regressions, but the unit tests aren’t really looking for them.

So these two schools of thought lead to a lot of the confusion about TDD. If you are doing Detroit school TDD you should avoid test doubles as much as possible. Likewise if you are doing London school TDD you shouldn’t be doing any refactors, just throwing away old code. Trying to combine these schools of thought within a single component usually leads to pain.

But what about using different types of tests for different components? Most applications have code that’s tied to the implementation (things like database code, networking, logging, etc.) and code tied to the problem domain (aka your application logic and/or business logic). Implementation code should be stable and reliable, which to me sounds like Detroit school TDD. The higher-level logic changes more often, doesn’t have many dependencies between high-level components, and can probably be thrown away and rewritten if there are significant changes, which sounds ideal for London school TDD. Not everything falls neatly into these two buckets, so you’ll have to decide how you’re going to test the rest, and you should probably choose either Detroit or London school testing for each one. I do think this way of looking at testing is a helpful way to think about it.

So why do TDD at all? TDD does always involve more work to build and maintain your test suite. But not having tests has well-known problems: fear of changes, since any change can break anything and you might not find out until a tester happens to discover it. Test-after seems better, but it has the same test-maintenance issues as TDD, and since you’re writing the tests after the fact you don’t have as much confidence that they’re working. TDD is not without issues, but I still think it’s the best way to write nontrivial software.

How to be concurrent

There are lots of ways that different languages do concurrency, and I want to talk about the general ways they do it, without getting bogged down in language details.

So what is concurrency? It’s not parallelism, that’s for sure. At its simplest, it’s the ability to do work in the background while not pausing work in the foreground. Some forms of concurrency can use parallel hardware resources (CPU cores, etc), but not all.

I’m going to classify low-level concurrency features (as opposed to high-level patterns that can use multiple features at once) along the following axes:

  1. Shared/separate memory

If two concurrent tasks share memory, then sending data between them is trivial, but it’s possible to corrupt data that isn’t protected somehow. The protection can be locking of some sort, transactions, or just explicit switching between tasks.

  2. Allows parallelism / no parallelism

Concurrency is not parallelism, but if you have parallel hardware (multiple cores, etc) it can often make sense to do parallel computation with the same abstraction you use for concurrency. The downside is the need for additional synchronization, which can wash out any advantage you get from parallelism.

  3. Implicit / explicit task switching

If your tasks switch implicitly, you have to protect any data that can be shared between tasks. Explicit task switching removes that need, but adds boilerplate and can cause global slowdowns if a single task does not yield.

Forms of concurrency

Processes

  • Separate memory
  • Allows parallelism
  • Implicit task switching

Processes are extremely safe to use. You can’t share data, and you can’t freeze the system through negligence (though deadlock is always an option). However, processes can be quite heavyweight in imperative programming (they can be lighter weight in a functional system, because immutable data can be sent between processes without copying).

Examples:

  • OS processes (heavyweight but general; literally any language can use separate OS processes)
  • Erlang processes (lightweight, but tied tightly to a particular system and language)

Threads

  • Shared memory
  • Can allow parallelism (depends on language/implementation)
  • Implicit task switching

Moving from the safest interface to the least safe: threads can extremely easily corrupt your memory. For this reason some languages reduce the risk with a global lock (python’s GIL or ruby’s GVL). I think threads work very badly with dynamically typed languages, because all writes are read/writes, which makes correct locking extremely difficult. You still need to lock any shared data.

However threads are extremely flexible. It’s what most other types of concurrency (including processes, inside the OS) are implemented with.

Examples:

  • OS threads (supported by most languages)
  • Goroutines

Async functions

  • Shared memory
  • No parallelism
  • Explicit task switching

This is what javascript uses. You schedule some task (usually some form of IO) and wait for it to complete or fail. No async tasks are completed until you either ask for them (in lower level languages), or all of your code has returned (in higher level languages, especially javascript).

Examples:

  • poll/select/epoll/kqueue
  • javascript
  • event machine/twisted/tornado/etc

Why do most forms of concurrency fit one of these groupings? Let’s look at the others:

  • Separate memory
  • No parallelism
  • Explicit task switching

This just seems to have no benefits: you can’t share data, you can’t do anything in parallel, and you have to explicitly switch tasks all the time. If you’ve got separate memory there’s no reason not to allow implicit task switching and parallelism.

  • Separate memory
  • No parallelism
  • Implicit task switching

This is a bit better. Erlang used to be like this (only one thread was multiplexed between processes), but it’s really just a matter of engineering to allow parallelism. Again, if you have separate memory you might as well allow parallelism. That said, this is a perfectly reasonable initial implementation.

  • Shared memory
  • No parallelism
  • Implicit task switching

Running go with GOMAXPROCS=1 is basically this. Same with greenlets. You still need to protect your data from access by multiple tasks, but in practice less protection is required and you can get away with being sloppy. It’s kind of like the old erlang processes: you don’t lose anything by being parallel, so you might as well allow it down the line, though it’s more of a tradeoff here than a pure win.

Variants

These general categories of concurrency features have different tradeoffs, but those can be changed somewhat by implementation choices. The fundamentals don’t really change, but what’s cheap or expensive can change:

Processes

  • Lightweight processes

If you multiplex many processes onto a small, fixed number of OS threads/processes, you can make processes more lightweight. The tradeoff between lightweight and full processes is that lightweight processes generally cannot call C code easily and directly, but they use less memory.

Threads

  • Lightweight threads

Lightweight threads are multiplexed onto a small number (usually equal to the number of CPUs) of hardware threads. They have similar tradeoffs as lightweight processes — they make interaction with the OS and hardware more difficult, but use less memory so more can be started.

  • Static verification

This is rust’s big trick. Rust’s rules of ownership disallow data races at compile time. In order to share data between threads you need a mutex or other protection, and this is impossible to mess up in safe rust. This makes more ambitious use of threads feasible. However it increases the complexity of the language and can only catch a subset of concurrency problems (in rust’s case, only data races).
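
As a minimal sketch of what that looks like in practice, the compiler refuses to let threads touch shared data unless it’s wrapped in something like Arc<Mutex<...>>:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
  // without the Arc<Mutex<...>>, sharing a mutable counter across threads is a compile error
  let counter = Arc::new(Mutex::new(0));
  let handles: Vec<_> = (0..4).map(|_| {
    let counter = Arc::clone(&counter);
    thread::spawn(move || {
      *counter.lock().unwrap() += 1;
    })
  }).collect();
  for handle in handles {
    handle.join().unwrap();
  }
  assert_eq!(*counter.lock().unwrap(), 4);
}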

Async

  • Promises/Futures

Promises (or Futures) are the representation of some value that will be available eventually. They provide a good abstraction for building async combinators on top of, which raw callbacks do not. Callbacks are more general, but promises are a good basis for dealing with common concurrent patterns.

  • Async/await

First coming from C#, but now spreading to many languages, this makes async programming look serial, but keeps all task switching explicit. It can also be faked if you have a coroutine abstraction. The tradeoff here is language complexity vs development efficiency.

In-depth examples

Erlang

Erlang is intended to be used in highly reliable systems. It does this by having many processes that are isolated from each other, plus a tree of processes monitoring each other, so that lower-level processes are restarted by higher-level processes. This leads to a lightweight process model: you don’t want processes to have hidden dependencies on each other, because then you can’t kill and restart them if something goes wrong, and you want to be able to start a truly huge number of processes. Erlang is deeply affected by this concurrency model — it has no types that can’t be efficiently serialized and sent between processes, even ones on different machines. This makes erlang extremely well-suited for what it was designed for (highly reliable networking infrastructure), but less well suited for many other types of programming.

Go

Go was designed as a reaction to C++, and draws some inspiration from erlang: specifically, it has goroutines, which are lightweight threads. Unlike erlang, however, goroutines are not prohibited from sharing memory (socially it’s recommended to communicate by message passing, but sharing memory is allowed, and easy to do by mistake). This takes away many of both the benefits and drawbacks of erlang’s model. It also has the side-effect of making Go more of a Java competitor than a C++ competitor: interacting with the system (as in, calling C) has lots of overhead and complexity. That said, having threads be cheap makes many nice patterns feasible that would be prohibitively slow in other languages. Go also provides good tools for communicating using message passing, and strongly recommends its use. The net effect is that concurrency is much like the rest of the language: simple, pragmatic, but full of boilerplate and pitfalls.

Rust

Rust is also a reaction to C++, but has much stronger compile-time abstractions (as opposed to Go, where almost all abstractions are run-time). For concurrency, rust experimented with many different forms: for a long time it supported go-style lightweight threads, but now only native threads are built in (though like all languages you can spawn additional OS processes, or use async functions). The advantage of rust over C++ in concurrency is that rust enforces proper memory access at compile time. This adds some complexity to the language (though rust gets great bang for the buck: the same compile-time checks that ensure proper memory use across threads also ensure proper memory use within a thread), and can be hard to learn, but matches the way that systems programmers generally already write code. This makes rust a true systems language: low runtime overhead and essentially free interaction with the system, but more difficult to program in than higher-level languages.
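
As a small illustration of native threads plus the compile-time checks (this uses std’s scoped threads, stabilized in rust 1.63, so newer than the experiments described above):

use std::thread;

fn main() {
  let data = vec![1, 2, 3, 4];
  let mut results = vec![0; data.len()];
  // Scoped threads may borrow local data; the compiler proves the borrows
  // can't outlive the scope, and that no two threads write the same element.
  thread::scope(|scope| {
    for (n, out) in data.iter().zip(results.iter_mut()) {
      scope.spawn(move || {
        *out = n * n;
      });
    }
  });
  assert_eq!(results, vec![1, 4, 9, 16]);
}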

Node

Node’s answer to concurrency issues is to just always be single-threaded, and use async functions for all concurrency. In fact, it doesn’t have blocking functions for many IO operations (and even the ones it does have are rarely used). This infamously leads to giant chains of callbacks, though these days promises and async/await can help with this dramatically. It does split all javascript functions into sync and async functions, something that has to be kept in mind at all times while writing node code. The plus side is that it doesn’t make any promises it can’t fulfill, unlike other dynamic languages (like python and ruby, which offer threads but have a global lock around running any python/ruby code). Since there’s almost no blocking IO, it also means each node process can handle quite a bit of IO, making it great for networking applications or web servers. However node doesn’t have a great story for handling computation-heavy code yet. You can spawn a different OS process, but it’s still not an easy operation. At some point node may introduce lightweight processes, but node is probably never going to offer shared-memory concurrency.

nginx

nginx is a great example of how to combine different concurrency models. It spawns a thread for each CPU, and then within each thread uses async functions to do any actual IO. This makes for a highly efficient system: it can handle lots of connections, but unlike something like node, if there’s some heavy computation that needs to happen at some point, other threads will pick up the slack while one thread is blocked. Node can work around the issue sometimes with multiple processes, but multiple threads make that kind of load balancing much more natural.
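
For a rough sketch of that hybrid shape in rust terms (my illustration, not how nginx itself is implemented; it assumes the tokio crate as the async runtime, e.g. tokio = { version = "1", features = ["full"] }):

use tokio::io::AsyncWriteExt;
use tokio::net::TcpListener;

fn main() -> std::io::Result<()> {
  // One OS thread per core, each driving an async event loop.
  let runtime = tokio::runtime::Builder::new_multi_thread()
    .worker_threads(std::thread::available_parallelism()?.get())
    .enable_all()
    .build()?;
  runtime.block_on(async {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
      let (mut socket, _) = listener.accept().await?;
      // Each connection is a cheap async task; if one worker thread is
      // stuck in heavy computation, the other threads keep serving.
      tokio::spawn(async move {
        let _ = socket.write_all(b"hello\n").await;
      });
    }
  })
}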

Conclusions

This is more of an overview than anything, but I hope that it helped you understand what different types of concurrency are available, and what the different tradeoffs are. You could write a whole book about this topic.

My own opinion has shifted over time toward thinking that lightweight threads and processes are over-hyped. They aren’t bad ideas, but they’re not the pure win that so many portray them as.

Isomorphic Javascript is just Progressive Enhancement done right

I was just at Fluent this week, and I had an interesting thought, spurred by several things, but it really crystallized when I saw this talk by Eric Meyer.

So, the (perhaps badly named) concept of Isomorphic Javascript is usually sold as a performance optimization for loading time in single-page applications, which is one benefit it provides. However it actually fixes the problem with single-page apps — they break the web. A single-page app that does not render on the server (isn’t isomorphic) doesn’t just degrade when javascript doesn’t work, it’s totally broken. Like, blank page. This is a problem on any page, but practically, it’s biggest on the open web (not behind a login). Closed sites (and especially enterprise sites/apps) can usually get away with doing various odd things, even though they probably shouldn’t. Things like web spiders, users on crappy mobile connections, users behind odd firewalls, these usually matter more on the open web. (Accessibility is also easier when rendering on the server, but can be made to work with javascript-only sites)

The thing is, older jquery-based progressively enhanced sites had a number of problems:

  1. You either had to have double the rendering code, or have your page look and work dramatically different without javascript
  2. You might have a page that was technically usable, but in practice terrible without JS — datepickers are the most common thing I can think of. In a typical jquery-type datepicker progressive enhancement situation, there’s a text input with a particular format you need to use, which is much more painful than an actual datepicker.
  3. As you move more logic into the client, maintenance and code organization become problems that traditional tools like jquery plugins just can’t solve.

The first-attempt solution to these issues was with the various first-gen javascript app frameworks. Angular 1, Backbone, Ember 1, etc. These frameworks were developed with the closed web in mind — enterprise apps, or at least ones that needed a login. I’m not sure the creators of those frameworks envisioned things like blogs using these frameworks, and indeed, it has caused problems when they do. They were tightly coupled to the actual DOM, so though they could be made to prerender the page with enough work, it wasn’t easy. Various frameworks attempted to make rendering on the client and server equally easy, but it wasn’t really until React came out that the idea went mainstream. Now all of the next-gen frameworks (including Angular 2 and Ember 2) will be much easier to render isomorphically.

Which brings me to my point: Isomorphic javascript is just progressive enhancement done right. You always serve up a usable page, but you can do it without sacrificing all the benefits of single-page apps. Of all the ways to do progressive enhancement, it’s the most:

  1. Accurate — the markup will be the same because it’s rendered by the same code
  2. Maintainable — one codebase, one rendering path
  3. Quickly rendering — we can use all the tricks of traditional html rendering sites to get the page to render fast
  4. Quickly updating — user input is captured instantly and processed by javascript when available.

Isomorphic javascript can actually do things that are usually infeasible to do in typical progressive enhancement as well — it can render the page with your open modal or datepicker in it on the server, and have it work exactly like when javascript is working. None of this comes for free — testing and hard work is still needed — but things become feasible that weren’t before.

EPR: A utility for simplifying node paths

So let’s say you’ve got a node project, with a structure somewhat like this:

- project/
  - package.json
  - server.js
  - lib/
    - file1.js
    - file2.js
    - models/
      - model1.js
      - model2.js
  - spec/
    - file1Spec.js
    - file2Spec.js
    - models/
      - model1Spec.js
      - model2Spec.js

Your require statements in your specs can easily get very ugly:

var model1 = require('../../lib/models/model1')

They’re also fragile — if you move either your spec file or your implementation file, you’ve got to update your requires. This is a good argument for using lots of small modules that can be broken out — if a module lives in your node_modules folder then requiring it is always easy:

var file1 = require('file1')

The problem is that when you’re writing an app, lots of the code can’t really be separated out into tiny modules — it’s app-specific. There have been a few suggestions on how to address this problem, but epr is my attempt at solving it in a nice, repeatable way.

EPR works by making symlinks in your node_modules folder. It gets the list of symlinks to create from your package.json file. So for the above example, you could add the following to your package.json file:

{
  "epr": {
    "file1": "lib/file1.js",
    "file2": "lib/file2.js",
    "models": "lib/models"
  }
}

You could have requires like the following:

require('file1')
require('file2')
require('models/model1')
require('models/model2')

No relative paths present, and you never need to update any requires — you just need to update your package.json if you move one of your files.

So check out epr, if you’re using node and are annoyed by relative paths.

NOTE 2019: Revisiting this, I think there are better solutions to this problem now. EPR still works, and if anyone makes a PR or requests a bugfix, I’ll take a look at it, but I’m not using it anymore.

RISC vs CISC doesn't matter for x86 vs ARM

If you’ve been following tech lately, you’ve probably heard people talking about the competition between x86 chips (mainly from Intel) and arm chips. Right now they’re used mostly for different things — phones and tablets have arm chips; desktops, laptops, and servers have x86 chips — but Intel’s trying to get into phones and arm vendors want to get into servers. This promises to lead to some exciting competition, and we’re already reaping the power benefits of Intel working on this in desktops and laptops. However, whenever this comes up, people bring up that arm is RISC and x86 is CISC, presenting RISC like it’s a pure advantage and x86 like it must be crippled because it’s CISC. This doesn’t matter and hasn’t for a long time now, but let me explain why.

RISC means Reduced Instruction Set Computing. It really comes out of the 80s, and it describes a certain style of instruction set architecture (ISA) for a computer. The instruction set is all the low-level commands the CPU supports, so it might have things like “load this value from memory”, or “add these two numbers together”. The ISA doesn’t say how those commands have to be implemented though. Despite the name, the one thing that really separates RISC from other types of instruction set is not the number of different instructions, but that most instructions do only one thing — they don’t have different addressing modes. On traditional architectures you’d have instructions that do the same thing, but can work on different types of operands. For example you might be able to add 2 registers together, or add memory and a register, or add memory to memory. This could become extremely complex, and arguably reached the height of its complexity in the VAX ISA. The VAX was very nice to write assembly code in, but the vast majority of those addressing modes aren’t needed when you use a language like C, where the compiler is responsible for making sure you load data when you need to.

The big argument that RISC proponents made was that you could cut out many of these addressing modes and focus on making your basic operations fast, resulting in a faster overall chip. Since most modes in something like the VAX were rarely used, they were usually microcoded and slow, so you had to know which modes were fast anyways, defeating a lot of the point of having so many complex modes. RISC proponents dubbed traditional ISAs CISC (Complex Instruction Set Computing); it’s not a term that anyone would use for their own work. RISC was very successful in the 80s — ARM started then, DEC (the makers of the VAX) made the Alpha, Sun made the SPARC, and even IBM got into the action with POWER. However this was mostly in “big” chips (ARM being the big exception). The other story of the 80s was the growth of the micros — tiny chips cheap enough for individuals to buy started coming out in the 70s, and by the 80s there were lots of computers using them: think of IBM PCs (using x86), Commodore 64s (using the 6510, a variant of the 6502, which was used in the Apple II and NES as well), the original Apple Macintosh, and the Amiga (the mac and amiga both used the motorola 68k family). All of these were using what we’d consider CISC chips — they had various addressing modes. Nothing crazy like the VAX, but the VAX was always the outlier in ISA complexity. All of these ISAs still exist, but most are only used in tiny embedded chips (other than x86). Of those computer ecosystems, the PC took over the world, and the mac survived, but it’s still a small portion of the computer market (and uses x86 these days anyways, after using RISC chips for a while).

So with that story set, why don’t RISC and CISC matter anymore? Well, there are 2 big reasons: out-of-order execution (ooo), and the fact that an ISA doesn’t specify how a chip is implemented. Out-of-order execution was the end result of a lot of things people were trying to do with RISC chips in the 80s — each instruction basically executes asynchronously, and the CPU only waits for the results of an instruction if they’re being used by something else. This makes the ISA matter a lot less, because it doesn’t really matter if you load data and use it in one instruction or 2. As a matter of fact, since the late 90s Intel has been internally splitting its CISC instructions into RISC-like micro-ops, which points out how the whole RISC vs CISC thing is pointless these days.

That doesn’t mean that ISA doesn’t matter, but the devil is really in the details now. x86 is honestly a bit of a mess these days, and decoding it is more complex than decoding ARM instructions (or really any other extant ISA). ARM also just updated its ISA for 64 bits, and from what I’ve heard it sounds like they did a really good job, basically making a totally generic RISC ISA with no weird stuff that makes it hard to use. x86 was never even close to the complexity of something like the VAX, so it avoided a lot of the VAX’s problems. RISC chips are also not without strange things that hurt them down the line — they often exposed internal details of their early implementations, which they had to emulate in later, faster versions. So if you want to compare the x86 and arm ISAs, that’s actually an important and interesting comparison to make, but the acronyms RISC and CISC don’t actually add anything.

osc - play with the web's audio generation capabilities

I’ve been playing around with the WebAudio api for a bit and come up with a nice little demo program that shows the basic capabilities of the OscillatorNode interface (plus some fun canvas programming). It’s not a serious project, but it is fun. I’m calling it osc, and you can also check out the source code.