Build a Calculator in Rust

Interpreters

⏱️ 6h 40min
📦 40 modules
Rust

Parsing math the way computers do: a calculator REPL in Rust

Type (2.5 + 3) * 4 - 10 / -2 and get 27. That one line hides three classic problems: splitting text into meaningful pieces, respecting operator precedence, and computing a result without panicking on garbage. Build it once and you've built a tiny compiler front-end. We'll grow it one small piece at a time.

The shape of the whole thing

A calculator is a three-stage pipeline, and the top-level function reads exactly like one — one stage per line:

pub fn calculate(
    input: &str,
) -> Result<f64, CalcError> {
    let tokens = tokenize(input)?;
    let rpn = to_rpn(tokens)?;
    eval_rpn(rpn).map(|n| n.0)
}

The input is a borrowed string slice — we only read it, never own it — and the output is a Result whose error half is our own CalcError enum (defined below). Each stage feeds the next: tokenize turns "2+3" into [2, +, 3], to_rpn reorders that to [2, 3, +], and eval_rpn reduces it to 5.0. The ? after the first two stages short-circuits the moment anything fails, so a malformed number never reaches the evaluator. The final .map(|n| n.0) unwraps our Number newtype back into a plain f64 for the caller.

Tokenizing: text in, meaning out

Raw input is a soup of characters. The tokenizer scans it once and emits Tokens — and a Token is one of exactly four things:

pub enum Token {
    Number(Number),
    Op(Operator),
    LParen,
    RParen,
}

A token is either a number, an operator, or one of the two parentheses. Modelling these as an enum means the later stages can match on a token and have the compiler check that every case is handled. Nothing else can sneak through as a "token".

The scan itself walks a Peekable iterator over the characters, matching on each one:

while let Some(&ch) = chars.peek() {
    match ch {
        // ...one arm per kind of character
    }
}

peek lets us look at the next character before deciding to consume it. That distinction is the whole trick for multi-character tokens: we need to know what kind of thing is coming before we commit to reading it.

Numbers are the case that spans several characters:

'0'..='9' | '.' => {
    let n = consume_number(&mut chars)?;
    tokens.push(Token::Number(n));
}

A digit or a . means a number is starting, so we hand the iterator to consume_number, which keeps pulling characters until it hits a non-digit. The range pattern '0'..='9' matches any ASCII digit in one stroke, and the ? propagates a parse failure like 1.2.3 as a clean error.

Parentheses and whitespace are simpler:

'(' => {
    tokens.push(Token::LParen);
    chars.next();
}
')' => {
    tokens.push(Token::RParen);
    chars.next();
}
c if c.is_whitespace() => {
    chars.next();
}

Each paren pushes its token and then calls chars.next() to actually advance past the character we only peeked at. Whitespace pushes nothing — it just advances — so spaces in the input simply vanish. Forgetting that chars.next() would loop forever on the same peeked character.

Everything else is an error:

_ => return Err(
    CalcError::UnexpectedCharacter(ch),
),

The wildcard arm catches any character we don't recognize and returns a typed error instead of crashing. Notice we never call unwrap() on parsing — bad input is data, not a bug, and the function's signature already promised we'd report it.

Errors are values, not panics

The lazy version of this program would unwrap() everywhere and explode on 1 / abc. The honest version makes every failure a variant of one enum:

pub enum CalcError {
    InvalidNumber(String),
    UnexpectedCharacter(char),
    UnmatchedOpenParen,
    UnmatchedCloseParen,
    DivisionByZero,
    NotEnoughOperands,
    InvalidExpression,
}

Each variant names a distinct way the calculation can go wrong, and several carry data — the offending string or character — so the message can be specific. Because every stage returns Result<_, CalcError>, the compiler forces callers to handle failure rather than letting it slip by.

To turn a variant into something a human reads, implement Display — the body just matches each variant to a sentence:

fn fmt(
    &self,
    f: &mut Formatter,
) -> fmt::Result {
    match self {
        DivisionByZero =>
            write!(f, "Division by zero"),
        // ...one arm per variant
    }
}

Display is the standard trait behind {} formatting, so once it exists the REPL can println!("Error: {e}") on any CalcError. The signature is fixed by the trait: we receive a Formatter to write into and return a fmt::Result. Each arm writes a fixed phrase, except the data-carrying variants, which interpolate their payload — the error type owns its own wording.

Parsing a number folds into this same error type via FromStr, delegating the hard part to the standard f64 parser:

fn from_str(
    s: &str,
) -> Result<Self, CalcError> {
    s.parse::<f64>()
        .map(Number)
        .map_err(|_| {
            CalcError::InvalidNumber(
                s.to_string(),
            )
        })
}

Implementing FromStr is what makes s.parse() work for our Number, and choosing CalcError as the associated Err type means a failed parse already speaks our domain's error language. We lean on the built-in f64 parser for the hard part, then .map(Number) wraps a success into our newtype. The .map_err replaces the standard parse error with InvalidNumber, keeping the original text so the message can show exactly what failed — now num_str.parse() anywhere in the codebase yields our error for free.

A Number newtype that behaves like a number

We could pass raw f64 around. Instead we wrap it in a one-field tuple struct:

pub struct Number(pub f64);

This "newtype" costs nothing at runtime but buys a distinct type the compiler can reason about — you can't accidentally mix a Number with some other f64. It also gives us a home for domain-specific methods.

The first such method handles a float gotcha:

impl Number {
    pub fn is_zero(self) -> bool {
        self.0.abs() < f64::EPSILON
    }
}

Comparing floats with == 0.0 is fragile, so is_zero asks instead whether the value is within a tiny tolerance of zero. f64::EPSILON is that tolerance — the smallest meaningful gap between floats near 1. The division guard later will lean on this.

To keep the evaluator readable, we teach Number real arithmetic by implementing the std::ops traits:

impl Add for Number {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Self(self.0 + rhs.0)
    }
}

Implementing Add is what lets us write a + b on two Numbers directly. The method just unwraps both inner floats, adds them, and re-wraps the result. We implement Sub, Mul, Div, and Neg the same way — and the payoff lands in the evaluator, where a + b reads like math instead of Number(a.0 + b.0).

Shunting Yard: human notation into machine notation

Here's the heart of it. 2 + 3 * 4 is 14, not 20 — multiplication binds tighter. Humans encode that priority with precedence rules; computers prefer Reverse Polish Notation, where order is the rule: 2 3 4 * + has no precedence and no parentheses. Dijkstra's Shunting Yard algorithm does the conversion with an output queue and an operator stack.

The cleverest part is how precedence is represented. We define the operators in a deliberate order:

pub enum Operator {
    Add,
    Sub,
    Mul,
    Div,
}

Then we derive PartialOrd, and Rust orders enum variants by declaration position. Listing them low-to-high precedence means Add < Mul holds automatically — no comparison function, no precedence table. The lower-binding operators simply come first.

When a new operator arrives, we peek at the top of the operator stack and decide whether it should move to the output:

while let Some(Token::Op(top)) =
    op_stack.last()
{
    if top >= op {
        output.push(op_stack.pop().unwrap());
    } else {
        break;
    }
}

The while let keeps looking at the top operator without removing it — last() borrows, it doesn't pop. If the stacked operator binds higher or equal, we pop it and emit it; otherwise we break and leave it in place. That top >= op test — using the ordering we derived for free — is exactly what guarantees * is emitted before a following +. The unwrap() is safe because the while let just confirmed last() returned a value.

Once the higher-precedence operators are flushed, the new one goes on, completing the arm:

Token::Op(ref op) => {
    // ...flush loop above
    op_stack.push(token);
}

The current operator now sits on top of the stack, waiting until something of equal-or-higher precedence (or the end of input) flushes it in turn. This push-after-flush dance is the entire precedence mechanism.

Parentheses override all of that. A closing paren pops back to its partner:

Token::RParen => loop {
    match op_stack.pop() {
        Some(Token::LParen) => break,
        Some(op) => output.push(op),
        None => return Err(
            CalcError::UnmatchedCloseParen,
        ),
    }
},

We pop operators into the output until we reach the matching LParen, which we discard — its only job was to mark a boundary. Running off the end of the stack (None) means there was no opening paren to match, so we report UnmatchedCloseParen. A leftover ( discovered at the very end is the mirror error, UnmatchedOpenParen.

Evaluating RPN is almost trivial

RPN's whole point is that evaluation needs no cleverness — just a stack. Push numbers as you see them; on an operator, pop two, combine, and push the result back. Start with popping the operands:

let b = stack
    .pop()
    .ok_or(NotEnoughOperands)?;
let a = stack
    .pop()
    .ok_or(NotEnoughOperands)?;

We pop twice, and ok_or turns an empty stack into a NotEnoughOperands error instead of a panic — that's what catches input like 1 +. The order matters enormously: the second pop is the left operand. Get it backwards and a - b silently becomes b - a.

Division gets a guard before it runs, while the other three operators are unguarded:

let result = match op {
    Operator::Div => {
        if b.is_zero() {
            return Err(DivisionByZero);
        }
        a / b
    }
    Operator::Add => a + b,
    Operator::Sub => a - b,
    Operator::Mul => a * b,
};

We check b.is_zero() — using the tolerance-based method from earlier — before dividing, so 1 / 0 returns a clean error rather than the float inf. Only after the guard passes do we perform a / b, which dispatches to our Div impl on Number. Addition, subtraction, and multiplication can't fail on finite floats, so they map straight to the operator impls we wrote.

Whatever the branch produced gets pushed back onto the stack, completing the arm:

Token::Op(op) => {
    // ...pop b, pop a, compute result
    stack.push(result);
}

The freshly computed value is now ready to serve as an operand for the next operator, exactly as RPN intends.

When the tokens run out, a well-formed expression leaves exactly one value behind:

match stack.len() {
    1 => Ok(stack[0]),
    _ => Err(CalcError::InvalidExpression),
}

One value means the expression balanced perfectly. Anything else — zero values, or several stranded numbers like 1 2 3 + — means the input was malformed, so we return InvalidExpression rather than guessing.

The REPL wrapper

The user-facing loop is deliberately dumb: read a line, hand it to calculate, print the result or the error message.

match calculate(input) {
    Ok(result) => println!("= {result}"),
    Err(e) => println!("Error: {e}"),
}

A success prints = 27; a failure prints Error: Division by zero straight from the Display impl. The loop itself doesn't know a single thing about tokens or precedence — all the logic lives in the library, and main is just I/O.

Two small touches make it pipe-friendly:

print!("> ");
stdout().flush()?;

We flush after printing the > prompt because print! (no newline) otherwise sits in a buffer and the prompt never appears. Breaking the loop when a read returns zero bytes handles end-of-file cleanly, so piping a file of expressions into the program just works.

Where to go next

The core is honest, but a real expression engine grows in a few obvious directions:

  • Functions and variablessin(x), pi, let r = 5. Add token kinds and a small symbol table; the pipeline shape doesn't change.
  • Right-associative operators — exponentiation (2 ^ 3 ^ 2 = 512, not 64) needs the precedence check to use > instead of >= for that one operator.
  • Better numbers — swap f64 for a decimal type if you care about 0.1 + 0.2; because everything goes through Number, you change one struct.
  • Span-tracked errors — remember where a bad character was so you can underline it.

Why build it

A calculator is the smallest program that forces you to think like a language implementer: separate scanning from parsing from evaluation, model your errors as data, and lean on the type system instead of fighting it. A few hundred lines later, "tokenizer", "RPN", and "precedence" stop being words from a textbook — they're functions you wrote. The next time you read about a real compiler, you'll recognize the skeleton.

Practice