Four ways to get or insert in a Rust map

The Rust standard library contains two map types: HashMap and BTreeMap. Both of them implement the entry API to allow efficient insert-or-update behavior. I will give examples using HashMap, but BTreeMap implements the same API, too.

The Simplest Case: Inserting the default

This one is pretty trivial. If the value type implements Default with the right behavior, you can just use Entry::or_default.

Here’s a snippet that consumes an iterator and returns a map containing the counts of each element:

let mut counts = HashMap::new();
for x in iter {
    *counts.entry(x.to_owned()).or_default() += 1;
}

For example, if iter is "abracadabra".chars(), then counts ends up being {'b': 2, 'a': 5, 'd': 1, 'c': 1, 'r': 2}.

Still Simple Case: A very cheap constructor

If the value you want to insert isn’t the default, but has a literal representation or is otherwise cheap, you can use Entry::or_insert.

The example is a little contrived (you could easily use Entry::or_default and increment before printing instead of after) but it shows how it works:

let mut numbers = HashMap::new();
for x in "abracadabra".chars() {
    let number = numbers.entry(x).or_insert(1);
    println!("{x}: {number}");
    *number += 1;
}

Output:

a: 1
b: 1
r: 1
a: 2
c: 1
a: 3
d: 1
a: 4
b: 2
r: 2
a: 5

Moderately Complicated: An expensive constructor

Suppose now that the value type has a constructor, but you want to avoid running it when the value already exists. In this case, use Entry::or_insert_with.

This example is again contrived (Entry::or_default would also work here) but it shows how it works:

let mut words_by_length = HashMap::new();
for word in "Four score and seven years ago".split_ascii_whitespace() {
    words_by_length
        .entry(word.len())
        .or_insert_with(|| Vec::with_capacity(4))
        .push(word.to_owned());
}
println!("{words_by_length:?}");

Output:

{5: ["score", "seven", "years"], 3: ["and", "ago"], 4: ["Four"]}

Hard case: A fallible or async constructor

Here’s the reason for this blog post’s existence: if the constructor isn’t a synchronous, infallible function, you have to reach for the full generality of the entry API to get the job done.

This writes lines into many files, where (file, line) pairs arrive in random order, but the files are kept open to avoid the overhead of opening each one multiple times. I’ll show this one in its full glory, including imports and function boilerplate:

use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fs::File;
use std::io::prelude::*;

fn main() -> std::io::Result<()> {
    let mut files = HashMap::new();

    for (path, line) in [
        ("a.txt", "Hello"),
        ("b.txt", "Hello"),
        ("b.txt", "This is file B"),
        ("a.txt", "This is file A"),
        ("c.txt", "This is file C"),
    ] {
        let file = match files.entry(path.to_owned()) {
            Entry::Occupied(o) => o.into_mut(),
            Entry::Vacant(v) => v.insert(File::create(path)?),
        };
        writeln!(file, "{line}")?;
    }

    Ok(())
}

The thing that tripped me up (twice) about this example is that it feels like OccupiedEntry::get_mut is the right thing to use in the Entry::Occupied branch because it has the same return type as VacantEntry::insert, but that actually results in the borrow not living long enough:

error[E0597]: `o` does not live long enough
  --> examples/fallible.rs:17:39
   |
16 |         let file = match files.entry(path.to_owned()) {
   |             ---- borrow later stored here
17 |             Entry::Occupied(mut o) => o.get_mut(),
   |                             -----     ^         - `o` dropped here while still borrowed
   |                             |         |
   |                             |         borrowed value does not live long enough
   |                             binding `o` declared here

For more information about this error, try `rustc --explain E0597`.

Final Thoughts

I feel like this may be pointing to a gap in the documentation somewhere, but I’m undecided about where I think the documentation needs to go. For the moment, I’m happy having written this down where I can find it again, and I hope it’s helpful for you, too.

I’ve posted fully-runnable examples if you want to see all the detail.

Update History

2024-09-08

  • Updated “expensive” example from Vec::with_capacity(1) to Vec::with_capacity(4), as suggested by @mo8it.

links

social