Minimal Site Monitor in Rust

Web Tools

⏱️ 1h 0min
📦 6 modules
Rust

Is your site still up? Teach Rust to keep watch

A site monitor is the smallest useful network tool you can build: it pings a URL on a timer and tells you whether the server is alive and how slow it's being. Yet packed into those few lines is a tour of the most important ideas in modern Rust: blocking vs. async I/O, the Tokio runtime, error handling that survives the night, and a real CLI. We'll grow one from a single GET request into a tool you can leave running in a terminal for a week, one small piece at a time.

Start blocking, on purpose

The first version does the dumbest possible thing: one request, one answer, on the current thread. We reach for reqwest, the standard HTTP client crate, with its blocking feature enabled.

Start with the target and a client:

// Requires reqwest's `blocking` feature.
use reqwest::blocking::Client;

static SITE: &str = "https://knowledge.dev/";

fn main() {
    let client = Client::new();
}

Client::new() builds a reusable client that holds a connection pool, so you create it once and share it, never one per request. Keeping SITE as a static for now means the URL is baked into the binary; we'll hand that choice to the user much later. Notice main is an ordinary synchronous function: no runtime, no async, nothing clever yet.

Now fire the request and print what came back:

    let resp =
        client.get(SITE).send().unwrap();
    println!("{SITE} - {}", resp.status());

.get(SITE) builds a GET request and .send() blocks the thread until the server answers; the whole program simply parks here until bytes arrive. The .unwrap() panics on any failure; that's deliberate scaffolding we'll tear out later. Starting blocking matters pedagogically: it's one straight line of control flow, and it makes the motivation for async obvious the moment we want to do two things at once.

Why async, and what Tokio buys us

A monitor spends almost all its time waiting: for DNS, for the TCP handshake, for bytes to trickle back. That's I/O-bound work, and parking a whole OS thread on each wait is wasteful. Async lets one thread juggle many in-flight waits. In Rust that means the Tokio runtime plus reqwest's async client, reqwest::Client, which is the crate's default: drop the blocking feature, import Client from the crate root instead of reqwest::blocking, and add tokio as a dependency.

The signature changes first:

// The async client lives at the crate root.
use reqwest::Client;

#[tokio::main]
async fn main() {
    let client = Client::new();
}

#[tokio::main] rewrites this into a normal main that boots an async executor under the hood, and main itself is now async. The Client::new() line is unchanged because the async client has the same constructor, but the client it returns now speaks futures instead of blocking calls. That single attribute is the door into the whole async world.

The request grows one keyword:

    let resp =
        client.get(SITE).send().await.unwrap();
    println!("{SITE} - {}", resp.status());

The visible behaviour is identical (one request, one line printed), but .send() now returns a future and .await drives it. That .await is the yield point: while the network is busy, the runtime is free to run other tasks. The compiler enforces it: forget the .await and you are holding a Future, not a response, so the code that tries to use it is rejected with a type error. This is the pivot the whole tool rotates on.

A heartbeat: loop, sleep, measure

A one-shot check is a debugging aid. A monitor checks forever. We'll wrap the request in an infinite loop, pause between rounds with Tokio's async sleep, and time each request with Instant so latency comes for free.

First, name the interval:

use std::time::Duration;

const INTERVAL: Duration =
    Duration::from_secs(3);

A const is fine here because the value is fixed at compile time, and Duration::from_secs(3) reads better than a bare 3 floating in the code. Pulling it out as a named constant also marks the one knob we'll later expose to the user. Three seconds is a friendly default: frequent enough to feel live, gentle enough not to hammer the server.

Now the loop body, starting the clock and sending:

loop {
    let start = Instant::now();
    let resp =
        client.get(SITE).send().await.unwrap();
}

Instant::now() (from std::time) reads a monotonic clock, one that never jumps backwards when the system time is adjusted, so the elapsed time we compute later is always sane. We snapshot it before .send() so the measurement spans the entire round trip. The .unwrap() is still here, a known liability we'll fix in the next section.

Finally, report and pause:

    let took = start.elapsed();
    println!("{} - {:.0?}", resp.status(), took);
    sleep(INTERVAL).await;

start.elapsed() is the latency, and {:.0?} formats the Duration compactly as something like 120ms. Then sleep (from tokio::time) returns a future that completes after the duration without blocking the thread; unlike std::thread::sleep, it hands the runtime back so it can do other work while we idle. Now the terminal shows a rolling pulse, exactly what an uptime tool should feel like.
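The timing and formatting step can be tried without any network at all. The sketch below uses only the standard library; timed and pulse_line are hypothetical helper names, not part of the tutorial's code, and the closure passed to timed stands in for the real request.

```rust
use std::time::{Duration, Instant};

/// Run `work` and return its result together with the elapsed time.
/// The clock starts before the work begins, so the measurement covers all of it.
fn timed<T, F: FnOnce() -> T>(work: F) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

/// Build one line of monitor output, e.g. "200 OK - 132ms".
/// `{:.0?}` is Duration's Debug format with zero fractional digits.
fn pulse_line(status: &str, took: Duration) -> String {
    format!("{} - {:.0?}", status, took)
}
```

Note that the unit adapts to the magnitude: a sub-second duration prints in ms, while a duration of whole seconds prints in s.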

Surviving the night

Here's the bug that turns a demo into a disappointment: that lingering .unwrap(). One DNS blip, one timeout, one refused connection, and the whole monitor dies, precisely when you most wanted to know the site was down. A monitor must treat failure as data, not as a crash.

Fallible calls in Rust conventionally return a Result, either Ok(value) or Err(error), and a match forces us to handle both arms:

let start = Instant::now();
match client.get(SITE).send().await {
    Ok(resp) => {
        let took = start.elapsed();
        println!("{} - {:.0?}", resp.status(), took);
    }
    Err(e) => println!("ERROR - {e}"),
}
sleep(INTERVAL).await;

We stopped unwrapping and started observing. On Ok we have a real resp in hand, so we pull the status and the elapsed time and print the same healthy line as before. On Err, the {e} formatter uses the error's Display impl, which for a reqwest error typically includes the URL; how much of the underlying cause it prints varies between reqwest versions, so walk the error's source() chain if you need the full detail. Crucially, both arms fall through to the same sleep and loop again, so a transient outage produces a logged ERROR line and a recovery on the next cycle. That is the difference between a prototype and something you trust overnight.
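The same shape can be exercised offline. In this sketch a plain Result<u16, String> stands in for reqwest's response and error types (an assumption made so the example needs no crates), and report is a hypothetical helper that turns either outcome into one log line.

```rust
use std::time::Duration;

/// Turn the outcome of one check into a log line.
/// Ok carries an HTTP status code; Err carries a description of what went wrong.
fn report(outcome: Result<u16, String>, took: Duration) -> String {
    match outcome {
        // Healthy path: status plus latency.
        Ok(status) => format!("{} - {:.0?}", status, took),
        // Failure path: the error becomes a line of output, not a panic.
        Err(e) => format!("ERROR - {}", e),
    }
}
```

Because both arms produce a value instead of aborting, the caller can print the line, sleep, and poll again regardless of which arm ran.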

Configuration belongs to the user

So far the URL and interval are baked into the binary. Recompiling every time you want to watch a different site is absurd. We hand control to the user with clap and its derive feature, which turns a plain struct into a full argument parser: flags, defaults, validation, and --help text generated for you.

Declare the struct and the first flag:

// Requires clap's `derive` feature.
use clap::Parser;

#[derive(Parser)]
#[command(about = "Simple async site monitor")]
struct Args {
    #[arg(short, long,
        default_value = "https://knowledge.dev")]
    url: String,
}

#[derive(Parser)] is what wires the struct to clap, and #[command(about = ...)] becomes the one-line description in --help. On the field, short gives -u, long gives --url, and default_value makes the flag optional by supplying a fallback. So the URL is now data the user provides, not a static we compiled in.

Add the interval field the same way:

struct Args {
    // ...
    #[arg(short, long, default_value_t = 3)]
    interval: u64,
}

This field becomes -i/--interval, and default_value_t = 3 supplies a typed default: the _t variant takes a real u64 rather than a string clap has to parse. A u64 is the natural type for "seconds" and feeds straight into Duration::from_secs. Putting this struct in its own src/args.rs module keeps main.rs about monitoring, not parsing.

Wire it into main:

let args = Args::parse();
let interval =
    Duration::from_secs(args.interval);

Args::parse() reads the process arguments, applies the defaults, and exits with a friendly message if something is malformed, all from that one derive. We drop the old static SITE and const INTERVAL, then feed args.url and the computed interval into the loop. The result, assuming you also print a one-line banner at startup, is a genuine CLI:

$ cargo run -- --url https://example.com -i 5
Monitoring https://example.com every 5s
200 OK - 132ms

…and --help works without a line of extra code.
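For a sense of what the derive saves you from writing, here is a rough std-only sketch that parses the same two flags by hand. It is not how clap works internally, and it omits --help, --url=value forms, and most validation; Config and parse_args are hypothetical names introduced only for this illustration.

```rust
/// Settings the monitor needs; mirrors the two fields of `Args`.
#[derive(Debug, PartialEq)]
struct Config {
    url: String,
    interval: u64,
}

/// Parse `-u/--url <URL>` and `-i/--interval <SECS>` from an argument list,
/// falling back to the tutorial's defaults when a flag is absent.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Config, String> {
    let mut cfg = Config {
        url: "https://knowledge.dev".to_string(),
        interval: 3,
    };
    let mut it = args.into_iter();
    while let Some(flag) = it.next() {
        match flag.as_str() {
            "-u" | "--url" => {
                // The flag must be followed by its value.
                cfg.url = it.next().ok_or("missing value for --url")?;
            }
            "-i" | "--interval" => {
                let raw = it.next().ok_or("missing value for --interval")?;
                // Reject anything that is not a non-negative whole number.
                cfg.interval = raw
                    .parse::<u64>()
                    .map_err(|_| format!("invalid interval: {}", raw))?;
            }
            other => return Err(format!("unexpected argument: {}", other)),
        }
    }
    Ok(cfg)
}
```

In a real binary this would be called as parse_args(std::env::args().skip(1)), skipping the program name. Every branch here is something the derive macro generates from the struct definition instead.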

Where to go next

The tool is honest, but production wants more:

  • Watch many sites at once. This is where async finally flexes: spawn a Tokio task per URL with tokio::spawn, and one thread happily supervises hundreds of targets.
  • Timeouts. A hung server can stall a poll indefinitely. Client::builder().timeout(...) caps how long any request may wait.
  • Treat 500s as failures. A 200 is healthy; a 503 is not. Branch on resp.status().is_success() instead of trusting any response.
  • Alert, don't just print. Pipe a state change (up→down) into a webhook, an email, or a log aggregator. The polling loop is already the perfect place to detect the transition.
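Two of the ideas in this list, treating non-2xx responses as failures and alerting only on a state change, can be sketched with the standard library alone. Health, classify, and transition are hypothetical names, and a Result<u16, String> again stands in for the real response and error types.

```rust
/// What one poll observed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Health {
    Up,
    Down,
}

/// Classify a poll result. Only a 2xx status counts as Up; any other status,
/// and any transport error (modelled here as `Err`), counts as Down.
/// 200..=299 is the HTTP success class that `is_success()` checks for.
fn classify(outcome: &Result<u16, String>) -> Health {
    match outcome {
        Ok(code) if (200u16..=299).contains(code) => Health::Up,
        _ => Health::Down,
    }
}

/// Return an alert message only when the state changes between two polls.
fn transition(prev: Health, now: Health) -> Option<&'static str> {
    match (prev, now) {
        (Health::Up, Health::Down) => Some("site went DOWN"),
        (Health::Down, Health::Up) => Some("site RECOVERED"),
        // Same state as last time: nothing to report.
        _ => None,
    }
}
```

In the polling loop you would keep the previous Health in a variable, compare it with the new one after each request, and send a notification only when transition returns Some.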

Why build it

A site monitor is small enough to finish in an afternoon and rich enough that, by the end, async/await, the Tokio runtime, Result-based error handling, and clap stop being words you've read and become things you've wired together. You started with a thread blocked on one request and ended with a resilient, configurable service, and you can see every line that got you there.
