Capy and Boost.Cobalt: A Comparison

Both libraries use C++20 coroutines for asynchronous programming. The differences begin with the foundation.

Cobalt is a coroutine layer built on Boost.Asio. It adds coroutine syntax — promise, task, generator — on top of Asio’s existing I/O infrastructure. Asio is not coroutines-only. It supports callbacks, futures, and coroutines equally. Cobalt inherits this foundation. It can add coroutine types on top, but it cannot change what lies beneath.

Capy is a coroutine-native I/O foundation designed from the ground up. The design started from the ideal use case and worked backward to the implementation. The concept hierarchy, the type-erased wrappers, the allocator model — these fell out naturally from use-case-first design, without compromise.

The Dimovian Ideal

An I/O library should make the implementation completely invisible to its consumers. Public headers declare the interface — types, functions, contracts. All platform-specific machinery lives in the translation unit. No implementation detail leaks into the consumer’s code.

Capy achieves the Dimovian Ideal. The proof is in example/asio/.

The Header

api/capy_streams.hpp is the public interface. It contains zero Asio includes:

#include <boost/capy/ex/execution_context.hpp>
#include <boost/capy/ex/executor_ref.hpp>
#include <boost/capy/io/any_stream.hpp>

#include <utility>

namespace boost { namespace asio { class io_context; } }
namespace net  = boost::asio;
namespace capy = boost::capy;

class asio_context : public capy::execution_context
{
    struct impl;
    impl* impl_;

public:
    using executor_type = capy::executor_ref;

    asio_context();
    ~asio_context();

    net::io_context& context() noexcept;
    executor_type get_executor() noexcept;
    void run();
};

std::pair<capy::any_stream, capy::any_stream>
make_stream_pair(asio_context& ctx);

Asio appears only as a forward declaration. The context uses pimpl. The factory returns capy::any_stream — a type-erased stream that hides the concrete socket type entirely.

The Translation Unit

api/capy_streams.cpp is where every Asio header lives. The concrete asio_socket wraps tcp::socket. The concrete asio_executor wraps io_context::executor_type. All of it is invisible to consumers of the header.
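
The pimpl technique carrying this separation can be sketched without any library machinery. In the sketch below a plain string stands in for the Asio state; the names are illustrative, not Capy's actual source:

```cpp
#include <string>

// --- what the public header declares: only an opaque pointer ---
class context
{
    struct impl;    // defined only in the translation unit
    impl* impl_;

public:
    context();
    ~context();
    std::string backend_name() const;
};

// --- what the translation unit defines: all backend state ---
struct context::impl
{
    // in the real library this would hold e.g. an asio::io_context;
    // a string stands in for the backend here
    std::string backend = "asio";
};

context::context() : impl_(new impl) {}
context::~context() { delete impl_; }

std::string context::backend_name() const
{
    return impl_->backend;
}
```

Because the header mentions only `impl*`, the backend can change without touching any consumer's compiled code.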

The Algorithm Code

any_stream.cpp demonstrates the result. It includes api/capy_streams.hpp and Capy headers. No Asio headers. None.

capy::task<>
writer(capy::any_stream& stream, std::size_t total)
{
    char buf[128];
    std::memset(buf, 'X', sizeof(buf));

    std::size_t written = 0;
    while(written < total)
    {
        std::size_t chunk = (std::min)(sizeof(buf), total - written);
        auto [ec, n] = co_await stream.write_some(
            capy::make_buffer(buf, chunk));
        if(ec)
            co_return;
        written += n;
    }
}

capy::task<>
reader(capy::any_stream& stream, std::size_t total)
{
    char buf[128];

    std::size_t read_total = 0;
    while(read_total < total)
    {
        auto [ec, n] = co_await stream.read_some(
            capy::make_buffer(buf));
        if(ec)
            co_return;
        read_total += n;
    }
}

writer() and reader() operate on capy::any_stream&. They don’t know what I/O backend produced the stream. They never need to know.

What Cobalt Does Instead

Cobalt’s cobalt::io namespace provides wrappers around Asio I/O objects. These wrappers expose concrete Asio types through their interfaces. A cobalt::io::steady_timer is an asio::basic_waitable_timer. A cobalt::io::socket is an asio::basic_stream_socket. The wrappers preserve direct access to the underlying Asio types.

Consumers of Cobalt I/O objects must include Asio headers. The backend remains part of the public interface.

A library written against Capy’s type-erased streams can be relinked against entirely different stream implementations. TCP today. QUIC tomorrow. A test mock in CI. The polymorphism is the same as what templated Asio code achieves — except the library does not need a recompile. The binary is the interface. Drop in a new .so or .dll that implements the stream contract, relink, and behavior changes.

Templates can achieve the same effect only by type-erasing every customization point individually, and that cost makes it impractical.

  Aspect                                 Capy                              Cobalt
  ---------------------------------------+---------------------------------+----------------------------
  Backend includes in header              None (forward declaration only)   Required
  Implementation hiding                   Pimpl + type-erased returns       Concrete Asio types exposed
  Algorithm code depends on backend       No                                Yes
  Relink without recompile                Yes                               No
  ABI stability across implementations    Yes                               No

Stream Concepts

Capy defines seven coroutine-only stream concepts. Cobalt inherits Asio’s AsyncReadStream and AsyncWriteStream, hybrid concepts that support callbacks, futures, and coroutines alike. Cobalt’s cobalt::io wrappers simplify the API, and Cobalt also defines stream abstractions (write_stream, read_stream, stream) as abstract base classes, a distinct approach from Capy’s concept-based hierarchy; the wrappers still include full Asio headers. See Write Stream Design for a detailed comparison of the two approaches.

Capy’s concepts form a refinement hierarchy that emerged naturally from use-case-first design:

  ReadStream                WriteStream
  (partial reads)           (partial writes)
       |                         |
       v                         v
  ReadSource                WriteSink
  (complete reads)          (complete writes + EOF)


  BufferSource              BufferSink
  (zero-copy pull)          (zero-copy prepare/commit)

BufferSource and BufferSink implement callee-owns-buffers I/O. The source provides buffers; the caller processes them in place. No copies. Memory-mapped files, hardware DMA buffers, and kernel-provided memory all work naturally through this pattern.
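
The shape of the callee-owns-buffers handshake can be sketched without coroutines or library types. Here a plain string stands in for a memory-mapped region, and pull/consume are illustrative names, not Capy's API:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// callee-owns-buffers: the source hands out a view of memory it owns,
// the caller processes it in place, then reports how much it consumed;
// no bytes are ever copied into a caller-supplied buffer
struct memory_source
{
    std::string data_;       // stands in for a memory-mapped region
    std::size_t pos_ = 0;

    // return a view of up to max unread bytes owned by the source
    std::string_view pull(std::size_t max)
    {
        auto n = std::min(max, data_.size() - pos_);
        return std::string_view(data_).substr(pos_, n);
    }

    // the caller reports how many bytes it processed in place
    void consume(std::size_t n) { pos_ += n; }
};
```

The same handshake maps directly onto hardware DMA regions or kernel-provided buffers, since the source never needs the caller to supply storage.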

  Concept         Capy   Cobalt
  ----------------+------+--------
  ReadStream       Yes    No
  WriteStream      Yes    No
  Stream           Yes    No
  ReadSource       Yes    No
  WriteSink        Yes    No
  BufferSource     Yes    No
  BufferSink       Yes    No

Type-Erased Streams

Traditional approaches to type erasure in Asio focus on the lowest-level elements: the completion handler, the executor, the allocator. This is not the right layer. Type-erasing these individually adds overhead at every customization point while still leaving the stream type concrete and visible.

Capy type-erases the stream itself. This is possible because coroutines provide structural type erasure — the continuation is always a handle, not a template parameter. When the library is coroutines-only, one virtual call per I/O operation is the total cost. The completion handler, executor, and allocator do not need individual erasure because they are not part of the stream’s operation signature.
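
Stripped of the coroutine machinery, the mechanism is ordinary type erasure with one virtual dispatch per operation. The names below are illustrative, not Capy's implementation:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// type-erases the stream itself: the concrete backend type appears
// nowhere in the wrapper's interface
struct any_write_stream_sketch
{
    struct base
    {
        virtual ~base() = default;
        virtual std::size_t write_some(void const* p, std::size_t n) = 0;
    };

    template<class Stream>
    explicit any_write_stream_sketch(Stream s)
        : p_(std::make_unique<model<Stream>>(std::move(s)))
    {
    }

    // one virtual call per operation -- the total cost of erasure
    std::size_t write_some(void const* p, std::size_t n)
    {
        return p_->write_some(p, n);
    }

private:
    template<class Stream>
    struct model : base
    {
        Stream s_;
        explicit model(Stream s) : s_(std::move(s)) {}
        std::size_t write_some(void const* p, std::size_t n) override
        {
            return s_.write_some(p, n);
        }
    };

    std::unique_ptr<base> p_;
};

// a concrete "backend" the wrapper hides: appends bytes to a string
struct string_stream
{
    std::string* out;
    std::size_t write_some(void const* p, std::size_t n)
    {
        out->append(static_cast<char const*>(p), n);
        return n;
    }
};
```

With coroutines, the continuation is already a `std::coroutine_handle<>` rather than a template parameter, so this single indirection is all that type-erasing the stream costs.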

Cobalt defines stream abstractions (write_stream, read_stream, stream) as abstract base classes in cobalt/io/stream.hpp, taking a different approach from Capy’s concept + type-erased wrapper model. See Write Stream Design for a side-by-side analysis.

The wrappers compose. any_buffer_source also satisfies ReadSource — natively if the wrapped type supports both, synthesized otherwise. any_buffer_sink also satisfies WriteSink. You pick the abstraction level you need.

  Concept              Type-Erased Wrapper
  --------------------+------------------------
  ReadStream     ----->  any_read_stream
  WriteStream    ----->  any_write_stream
  Stream         ----->  any_stream
  ReadSource     ----->  any_read_source
  WriteSink      ----->  any_write_sink
  BufferSource   ----->  any_buffer_source  ----> also satisfies ReadSource
  BufferSink     ----->  any_buffer_sink    ----> also satisfies WriteSink

This is how the Dimovian Ideal is mechanically achieved.

  Type-Erased Wrapper   Capy   Cobalt
  ----------------------+------+--------
  any_read_stream        Yes    No
  any_write_stream       Yes    No
  any_stream             Yes    No
  any_read_source        Yes    No
  any_write_sink         Yes    No
  any_buffer_source      Yes    No
  any_buffer_sink        Yes    No

Mock Streams and Testability

When algorithms operate on type-erased interfaces, testing becomes deterministic. Capy provides mock implementations for every stream concept. Cobalt defines stream abstractions as abstract base classes but does not provide mock implementations for testing. See Write Stream Design for a comparison of the two stream designs.

Capy’s mock types:

  • test::read_stream, test::write_stream — partial I/O mocks

  • test::stream — connected pair for bidirectional testing

  • test::read_source, test::write_sink — complete I/O mocks

  • test::buffer_source, test::buffer_sink — zero-copy mocks

test::fuse injects errors systematically at every I/O operation point. test::run_blocking executes coroutines synchronously for deterministic unit tests. max_read_size and max_write_size simulate chunked delivery. expect() validates written data.

Tests run without sockets or network access, eliminating non-determinism.
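
A toy version of the idea, with a countdown fuse that injects an error after a fixed number of writes. The names and semantics here are assumptions for illustration, not Capy's actual test API:

```cpp
#include <cstddef>
#include <string>
#include <system_error>

// mock write stream with fuse-style error injection: after `fuse`
// successful operations, every subsequent write fails deterministically
struct mock_write_stream
{
    std::string written;   // records everything successfully written
    int fuse;              // operations remaining before injected failure

    std::error_code write(char const* p, std::size_t n)
    {
        if(fuse-- <= 0)
            return std::make_error_code(std::errc::io_error);
        written.append(p, n);
        return {};
    }
};
```

Running the same algorithm with the fuse set to 0, 1, 2, ... N exercises every error path without a socket in sight.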

  Testing Feature                        Capy   Cobalt
  ---------------------------------------+------+--------
  test::read_stream                       Yes    No
  test::write_stream                      Yes    No
  test::stream (connected pair)           Yes    No
  test::read_source                       Yes    No
  test::write_sink                        Yes    No
  test::buffer_source                     Yes    No
  test::buffer_sink                       Yes    No
  Error injection (fuse)                  Yes    No
  Synchronous execution (run_blocking)    Yes    No
  Chunked delivery simulation             Yes    No
  Data validation (expect)                Yes    No

Threading Model

Cobalt is single-threaded by design. One executor per thread. Channels are restricted to a single thread — Cobalt’s own documentation states: "Channels can be used to exchange data between different coroutines on a single thread." Primitives cannot be shared between threads.

Capy supports multi-threaded execution. thread_pool distributes work across threads. strand serializes execution without blocking OS threads. The Executor concept is open — implement your own.

  Threading                             Capy                   Cobalt
  --------------------------------------+----------------------+---------------------------
  Multi-threaded execution               thread_pool            No
  Serialized execution                   strand                 Single-threaded only
  Executor model                         Concept-based (open)   Single-threaded (closed)
  Cross-thread channels                  Yes                    No
  Primitives shareable across threads    Yes                    No

Context Propagation

Cobalt stores executor context in thread-local variables, which coroutines access via this_coro::executor. This design is scoped to single-threaded, single-executor configurations.

Capy introduces the IoAwaitable protocol and uses it for context propagation. When you co_await, the caller passes its execution environment to the child structurally:

auto await_suspend(std::coroutine_handle<> h, io_env const* env);

No thread-local state. No ambient context. The executor and stop token flow forward through the call chain via the io_env parameter.
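
The shape of structural propagation, with the coroutine plumbing elided and a string standing in for the executor and stop token (purely illustrative):

```cpp
#include <string>

// structural context propagation: each step receives its environment
// as an explicit parameter instead of reading an ambient thread-local,
// so two calls on the same thread can run under different executors
struct io_env
{
    std::string executor_name;   // stands in for executor + stop token
};

std::string child(io_env const& env)
{
    // the child sees exactly the environment its caller passed
    return env.executor_name;
}

std::string parent(io_env const& env)
{
    return child(env);   // context flows forward through the call chain
}
```

In Capy the same flow happens through the `io_env const*` parameter of await_suspend rather than through ordinary function arguments.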

  Context Propagation             Capy                    Cobalt
  --------------------------------+-----------------------+-------------------------------
  Mechanism                        await_suspend(h, env)   Thread-local variables
  Works with strands               Yes                     No
  Works with multiple executors    Yes                     No
  Stop token delivery              Structural (io_env)     this_coro::cancellation_state

Cancellation

Both libraries propagate cancellation automatically through coroutine chains. Both support OS-level cancellation of pending I/O operations (CancelIoEx on Windows, IORING_OP_ASYNC_CANCEL on Linux).

Capy uses std::stop_token, propagated via the IoAwaitable protocol’s io_env parameter. The token flows forward structurally alongside the executor.

Cobalt uses Asio’s cancellation_signal and cancellation_slot. Propagation is wired automatically in await_suspend via forward_cancellation. this_coro::cancellation_state provides filtering control over which cancellation types pass through.

  Cancellation             Capy                 Cobalt
  -------------------------+--------------------+--------------------------------
  Token type                std::stop_token      asio::cancellation_signal
  Propagation               Automatic (io_env)   Automatic (slot/signal wiring)
  Filtering                 Application-level    this_coro::cancellation_state
  OS-level cancellation     Yes (via Corosio)    Yes (via Asio)

Buffer Sequences

Capy adopts Asio’s buffer sequence model — ConstBufferSequence, MutableBufferSequence — because it works. Capy’s buffer types are fully compatible with Asio’s. You can pass Capy buffers to Asio operations and vice versa, seamlessly. Then Capy extends the model with additional types and algorithms, while still achieving the Dimovian Ideal — none of this requires exposing Asio headers to consumers.

Cobalt does not provide buffer sequence types or dynamic buffer support. Users who need these features use Asio’s types directly, inheriting the DynamicBuffer_v1/DynamicBuffer_v2 split.

Capy has one DynamicBuffer concept. The v1/v2 split in Asio exists because of a fundamental ownership problem: when an async operation takes a buffer by value and completes via callback, who owns the buffer? The original design had flaws, and the fix created two incompatible versions. By going coroutines-only, Capy avoids this entirely. The coroutine frame owns the buffer. Parameters have their lifetimes extended by the suspended frame, and the awaitable lives in the frame alongside them. There is no decay-copy, no ownership transfer, no ambiguity. One concept is sufficient.

  Buffer Feature             Capy      Cobalt
  ---------------------------+---------+---------------------------
  ConstBufferSequence         Yes       Via Asio
  MutableBufferSequence       Yes       Via Asio
  DynamicBuffer               Unified   None (use Asio directly)
  flat_dynamic_buffer         Yes       No
  circular_dynamic_buffer     Yes       No
  buffer_pair                 Yes       No
  slice                       Yes       No
  front                       Yes       No
  consuming_buffers           Yes       No
  buffer_array                Yes       No
  Byte-level trimming         Yes       No

Allocator Control

Cobalt sets up a thread-local PMR resource via cobalt::main or cobalt::thread. All coroutines on that thread share it. Every awaitable embeds a fixed small-buffer-optimization (SBO) buffer:

// cobalt/op.hpp
struct awaitable : awaitable_base
{
    char buffer[BOOST_COBALT_SBO_BUFFER_SIZE]; // default: 4096
    detail::sbo_resource resource{buffer, sizeof(buffer)};
};

If the buffer is exhausted, allocations fall back to the upstream PMR resource or operator new. The buffer size is a compile-time constant. Changing it requires recompiling the library.

Capy leaves these decisions to the user. run_async(executor, allocator)(my_task()) sets the allocator before the task is created. The task’s operator new reads it from thread-local storage. This is a small, flexible customization point that permits usage patterns the authors did not anticipate: per-connection arenas, bounded pools, tracking allocators, per-tenant memory budgets. The allocation strategy is a deployment decision, not a library decision.

recycling_memory_resource provides zero-overhead recycling after warmup. Memory isolated per connection. Reclaimed instantly on disconnect.
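
A per-connection arena along these lines can be sketched with standard PMR types; recycling_memory_resource itself is Capy's, but the pattern is plain C++17:

```cpp
#include <cstddef>
#include <memory_resource>

// per-connection arena: every allocation the connection makes comes
// from its own monotonic_buffer_resource, and all of it is reclaimed
// at once when the connection object is destroyed
struct connection
{
    std::pmr::monotonic_buffer_resource arena;
    std::pmr::string inbox{&arena};   // allocates from the arena

    void on_read(char const* p, std::size_t n)
    {
        inbox.append(p, n);
    }
};
```

With Capy, passing such a resource through run_async(executor, allocator) scopes it to one task rather than one thread, which is what makes per-connection and per-tenant budgets expressible.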

  Allocator Control           Capy                        Cobalt
  ----------------------------+---------------------------+----------------------------
  Granularity                  Per-task                    Per-thread
  Allocation model             Forward-flow                Thread-local PMR
  Per-connection arenas        Yes                         No
  Recycling allocator          recycling_memory_resource   No
  Custom allocator support     run_async(ex, alloc)        Global setup only
  Deterministic freeing        Yes                         Non-deterministic on MSVC

Execution/Platform Separation

Cobalt is coupled to Asio’s io_context. The execution model and the platform abstractions are one thing.

Capy separates them. The execution model — executors, cancellation, allocation — lives in Capy. Platform abstractions live in Corosio, a companion library that provides native TCP sockets, acceptors, TLS streams, timers, DNS resolution, and signal handling — all built on Capy’s IoAwaitable protocol with native IOCP and epoll backends. You can test Capy’s execution model without a network stack. You can swap the I/O backend without changing your application code.

  Architecture                    Capy                         Cobalt
  --------------------------------+----------------------------+------------------------
  Execution model                  Capy (independent)           Coupled to io_context
  Platform abstractions            Corosio (separate library)   Asio (same dependency)
  Testable without I/O backend     Yes                          No
  Swappable backends               Yes                          No

Coroutine Overhead

To measure the overhead that coroutines add to a real workload, an experimental JSON serializer drives output through a chain of co_await calls instead of direct function calls. Each JSON value type — null, bool, integer, double, string, array, object — is handled by a coroutine that writes fragments through a write_sink. The baseline is boost::json::serialize, a highly optimized non-coroutine implementation.

The input is numbers.json from the Boost.JSON benchmark suite. Results are best-of-four runs, Clang 20, -O3, Windows x64:

  Serializer                Time        vs baseline
  --------------------------+-----------+------------
  boost::json::serialize     317 us      1.0x
  capy::task                 537 us      1.69x
  cobalt::promise            1,361 us    4.29x
  cobalt::task               26,079 us   82.3x

Capy’s coroutine-driven serializer runs at 1.69x the baseline. Cobalt’s promise is 4.29x. Cobalt’s task is 82x.

The Capy implementation:

namespace {

template<class WS> task<> serialize(json::value const& v, WS& ws);

template<class WS> task<> write(std::nullptr_t, WS& ws) {
    co_await ws.write("null", 4);
}
template<class WS> task<> write(bool v, WS& ws) {
    if(v) co_await ws.write("true", 4);
    else  co_await ws.write("false", 5);
}
template<class WS> task<> write(std::int64_t v, WS& ws) {
    char buf[32];
    auto r = std::to_chars(buf, buf + sizeof(buf), v);
    co_await ws.write(buf, r.ptr - buf);
}
template<class WS> task<> write(std::uint64_t v, WS& ws) {
    char buf[32];
    auto r = std::to_chars(buf, buf + sizeof(buf), v);
    co_await ws.write(buf, r.ptr - buf);
}
template<class WS> task<> write(double v, WS& ws) {
    char buf[32];
    auto r = std::to_chars(buf, buf + sizeof(buf), v);
    co_await ws.write(buf, r.ptr - buf);
}
template<class WS> task<> write(std::string_view v, WS& ws) {
    co_await ws.write("\"", 1);
    co_await ws.write(v.data(), v.size());
    co_await ws.write("\"", 1);
}
template<class WS> task<> write(json::array const& v, WS& ws) {
    co_await ws.write("[", 1);
    bool first = true;
    for(auto const& x : v) {
        if(!first) co_await ws.write(",", 1);
        first = false;
        co_await serialize(x, ws);
    }
    co_await ws.write("]", 1);
}
template<class WS> task<> write(json::object const& v, WS& ws) {
    co_await ws.write("{", 1);
    bool first = true;
    for(auto const& x : v) {
        if(!first) co_await ws.write(",", 1);
        first = false;
        co_await write(x.key(), ws);
        co_await ws.write(":", 1);
        co_await serialize(x.value(), ws);
    }
    co_await ws.write("}", 1);
}
template<class WS> task<> serialize(json::value const& v, WS& ws) {
    return visit([&](auto const& v) { return write(v, ws); }, v);
}

struct write_sink {
    std::string r;
    task<> write(void const* p, std::size_t n) {
        r.append(static_cast<char const*>(p), n);
        co_return;
    }
};

} // namespace

std::string serialize_capy_task(json::value const& jv) {
    write_sink ws;
    capy::test::run_blocking()(serialize(jv, ws));
    return std::move(ws.r);
}

Every co_await ws.write(...) call creates a coroutine frame, suspends, resumes, and destroys it. This is the worst case for coroutine overhead: many tiny operations that complete synchronously. In a real application where I/O operations take microseconds or milliseconds, the coroutine machinery becomes negligible.

Summary

  Feature                     Capy                                      Cobalt
  ----------------------------+-----------------------------------------+------------------------------------------
  Design methodology           Use-case-first, coroutines-only           Coroutine layer on hybrid Asio
  Implementation hiding        Dimovian Ideal achieved                   Backend types exposed
  Stream concepts              7 coroutine-only (refinement hierarchy)   Asio’s (hybrid)
  Type-erased streams          7 wrappers                                None
  Mock streams                 7 mock types + fuse                       None
  Threading                    Multi-threaded (thread_pool, strand)      Single-threaded
  Context propagation          Structural (await_suspend parameters)     Thread-local
  Cancellation                 std::stop_token, automatic, OS-level      cancellation_signal, automatic, OS-level
  Buffer sequences             Extended, unified DynamicBuffer           None (use Asio directly)
  Allocator control            Per-task, forward-flow                    Per-thread, global setup
  Execution/platform           Separated                                 Coupled
  Relink without recompile     Yes                                       No