First steps in Rust
I’ve been reading about Rust for the better part of 3 years. After having been exposed to Haskell, I begun writing my C++ programs in a more Haskell-y way, which I quickly found to feel very shoe-horned in (std::tuple
and std::get<N>
, std::optional
’s interface), and syntactically not pleasant at all. Things like ranges help, but this doesn’t address the core problem, which is that these features should be a part of the language and not of the standard library.
Rust came into the scene promising to be the result of lessons learned from C++ and Haskell, with reasonable defaults and a good macro system.
After reading about Rust for the better part of 3 years I finally got around to trying to write a program in it.
Program
I was watching a Jane Street mock interview and they presented an interesting problem: you are given a list of unit equivalences of the form:
m = 3.28 ft
ft = 12 in
hr = 60 min
min = 60 sec
and a list of queries of the form:
2 m = ? in
13 in = ? m
13 in = ? hr
where the task is to find the appropriate missing value (?
) for each query, or “not convertible” if the query is meaningless.
The video follows the interviewee as he works through his solution, but I paused the video before he started and gave it a try myself.
My first idea was to store the unit equivalences in a hash-table and then query the table recursively, multiplying the unit values along the way.
If we model the problem as a graph with the units as nodes and the conversion value between them as the edge weight, as the interviewee did. This is essentially a DFS on the graph.
For starters, I’ll define some convenience type aliases:
1type Fact = (String, f32, String);
2type Query = (f32, String, String);
3type UnitMapper<'a> = HashMap<&'a str, Vec<Edge<'a>>>;
4
5struct Edge<'a>
6{
7 unit: &'a str,
8 value: f32,
9}
where the shape of Fact and Query come directly from the way the input is provided, and UnitMapper
is the adjacency list representation of our graph. Nodes/units are &str
for simplicity.
For the solution, we just have to implement a simple DFS:
1fn answerQueryDFS((base_value, base_unit, target_unit): &Query,
2 list: &UnitMapper
3) -> Option<f32>
4{
5 if ! list.contains_key(&target_unit.as_str()) {
6 return None;
7 }
8 let mut visited: HashSet<&str> = HashSet::from([base_unit.as_str()]);
The destructuring in the parameter list is definitely one of the better things imported from Haskell.
The actual recursion part is straightforward:
1 fn rec_dfs<'a>(from_unit: &'a str,
2 value_accum: f32,
3 list: &'a UnitMapper,
4 target_unit: &str,
5 visited: &mut HashSet<&'a str>)
6 -> Option<f32>
7 {
8 if from_unit == target_unit {
9 return Some(value_accum);
10 }
11 for Edge { unit: new_unit,
12 value: new_unit_val } in list.get(from_unit)? {
13 if ! visited.contains(new_unit) {
14 visited.insert(from_unit);
15 if let ret@Some(_) = rec_dfs(new_unit, value_accum * new_unit_val, list, target_unit, visited) {
16 return ret;
17 }
18 }
19 }
20 None
21 }
22
23 rec_dfs(base_unit, *base_value, list, target_unit, &mut visited)
24}
It seems we cannot capture a closure in itself, so we have to use a normal function to perform recursion; but functions can’t capture variables from an outer scope, so we have to pass everything we need as explicit arguments. This isn’t difficult, but it looks syntactically heavy, despite it being just language bureaucracy.
Note the type destructuring in for
and if let
. So nice!
The interviewee’s solution is smarter than mine: he did the graph search breadth-first, which is a better solution since we want to get from the source unit to the target unit with as fewer multiplications along the way as possible, from both a correctness and a performance point-of-view.
The Rust solution almost looks like pseudo-code, if we ignore the syntax around handling types.
1fn answerQueryBFS((base_value, base_unit, target_unit): &Query,
2 list: &UnitMapper
3) -> Option<f32>
4{
5 if ! list.contains_key(&target_unit.as_str()) {
6 return None;
7 }
8 let mut frontier = VecDeque::from([Edge::new(base_unit, *base_value)]);
9 let mut visited: HashSet<&str> = HashSet::new();
10
11 while let Some(Edge { unit: current_unit,
12 value: current_value }) = frontier.pop_front() {
13 if current_unit == target_unit {
14 return Some(current_value);
15 }
16 if ! visited.contains(current_unit) {
17 visited.insert(current_unit);
18 for Edge { unit: new_unit,
19 value: new_unit_val } in list.get(current_unit)? {
20 frontier.push_back(Edge::new(new_unit, new_unit_val * current_value));
21 }
22 }
23 }
24 None
25}
Again, note the type destructuring in the while let
and for
. Multi-level, even! Syntax shouldn’t generally matter to an experienced programmed, but if your language’s syntax reduces friction in writing a certain style of programs, it will naturally promote more programs written in that style. It’s much easier to write in Rust:
1let x = opt?;
2...
than it is to write the equivalent in C++:
1if (opt) {
2 return std::nullopt;
3}
4auto x = opt.value();
5...
No wonder Rust touts algebraic datatypes. C++ also has them, but since they’re so unnatural to write programs in, it’s as if they didn’t exist, giving Rust a perceived feature over C++.
Rust impressions
Rust’s highs are really high:
Like: lifetimes and the borrow checker. It’s just intuitive to me… I really can’t understand why people have trouble with it. Maybe I’ve just not written a large enough program where this really annoys me. Dangling references are always an error in your program, so having their lifetime be checked at compile-time – like types – is just smart. The compiler, more often than not, has this information, so why not use it? In the situations where you do need to specify a lifetime, what you’re doing is not trivial and you can’t reasonably assume the compiler can deduce what you’re doing, so it’s just a matter of making explicit to the compiler what’s already implicit in your head. Plus, if you really need to go around the borrow checker, you can use
std::cell
.Love: Variables are const by default. I don’t think I need to expand too much on this… the less mutable state your program has, the easier it is to understand, and the fewer bugs it has… why shouldn’t everything be const everywhere? Also, unlike in C++, a variable being const doesn’t impede moves!
Love: Algebraic data types. This was one the the things I most appreciated about Haskell, and the feature that initially drew me in to Rust. The ease of use of ADTs (both constructing or destructuring) makes programming in a semi-functional style something that feels natural and is encouraged.
Like: The language isn’t afraid to use its own standard library. This is one of my main gripes with C++: the language seems very afraid to just return standard containers. You see some of this here or there (e.g.,
std::unordered_map::find
returning astd::pair
), but these are the exceptions. In Rust, string split returns something that can easily be converted to aVec
; why can’t we have the same in C++?
And its lows are not that low:
Dislike: some things are implicit which could be explicit: below, in the
readFacts
function, I have to specify*target_value
on lines 52/53 – which makes sense, sincetarget_value
is a reference – but afterward I can just write1.0 / target_value
on lines 57/58, without explicitly dereferencingtarget_value
. I get what’s happening (I think): the compiler knows that division can only work with values and not references, so it dereferencestarget_value
automatically for us instead of giving us a type error; however, if I write in my editor a variablex
of a given type&X
, I don’t want the exact same symbolx
to sometimes mean/have typeX
. I’ll take that type error, thank you very much!Dislike: compiler opinions. there is definitely a default way of doing things and the compiler will nag you if you deviate from it. I like to write variables in snake_case, functions in camelCase, and types in PascalCase, and this makes the compiler nervous. I can turn these warnings off, but I would prefer not having to do it in the source-code.
Strongly dislike: the standard library documentation. I am used to cppreference, and rust-lang’s std just doesn’t cut it for me. It’s a bit hard to read, and I’d like some more details at times, or at least a right-side index which is ordered the same way as its topics appear on the page! It also appears to be incomplete: you can use
operator[]
in aHashSet
, but the documentation doesn’t say that it implements the Index trait.
All in all, this was a very pleasant introduction to Rust. I left with the feeling that I want to explore it more. I have some ideas for some programs I want to write (keep an eye out on the website!), so I think I should try to write them in Rust just for the fun of it.
The complete code is down below:
Complete code listing
1#![allow(non_snake_case)]
2/*
3example facts:
4 m = 3.28 ft
5 ft = 12 in
6 hr = 60 min
7 min = 60 sec
8example queries:
9 2 m = ? in -> answer = 78.72
10 13 in = ? m -> answer = 0.330 (roughly)
11 13 in = ? hr -> "not convertible!"
12*/
13
14use std::collections::{HashMap, HashSet, VecDeque};
15use std::io;
16use std::str::FromStr;
17
18
19type Fact = (String, f32, String);
20type Query = (f32, String, String);
21type UnitMapper<'a> = HashMap<&'a str, Vec<Edge<'a>>>;
22
23
24struct Edge<'a>
25{
26 unit: &'a str,
27 value: f32,
28}
29impl<'a> Edge<'a>
30{
31 fn new(unit: &'a str, value: f32) -> Self
32 {
33 Edge { unit, value }
34 }
35}
36impl<'a> std::fmt::Display for Edge<'a>
37{
38 fn fmt(&self, f:&mut std::fmt::Formatter) -> std::fmt::Result
39 {
40 write!(f, "Edge({}, {})", self.unit, self.value)
41 }
42
43}
44
45
46fn readFacts<'b>(facts: &'b Vec<Fact>) -> UnitMapper
47{
48 let mut unit_map: UnitMapper = HashMap::new();
49 for (base_unit, target_value, target_unit) in facts {
50 unit_map
51 .entry(base_unit)
52 .and_modify(|e| e.push(Edge::new(target_unit, *target_value)))
53 .or_insert(vec![Edge::new(target_unit, *target_value)]);
54
55 unit_map
56 .entry(target_unit)
57 .and_modify(|e| e.push(Edge::new(base_unit, 1.0 / *target_value)))
58 .or_insert(vec![Edge::new(base_unit, 1.0 / *target_value)]);
59 }
60 unit_map
61}
62
63
64fn answerQueryBFS((base_value, base_unit, target_unit): &Query,
65 list: &UnitMapper
66) -> Option<f32>
67{
68 if ! list.contains_key(&target_unit.as_str()) {
69 return None;
70 }
71 let mut frontier = VecDeque::from([Edge::new(base_unit, *base_value)]);
72 let mut visited: HashSet<&str> = HashSet::new();
73
74 while let Some(Edge { unit: current_unit,
75 value: current_value }) = frontier.pop_front() {
76 if current_unit == target_unit {
77 return Some(current_value);
78 }
79 if ! visited.contains(current_unit) {
80 visited.insert(current_unit);
81 for Edge { unit: new_unit,
82 value: new_unit_val } in list.get(current_unit)? {
83 frontier.push_back(Edge::new(new_unit, new_unit_val * current_value));
84 }
85 }
86 }
87 None
88}
89
90
91fn answerQueryDFS((base_value, base_unit, target_unit): &Query,
92 list: &UnitMapper
93) -> Option<f32>
94{
95 if ! list.contains_key(&target_unit.as_str()) {
96 return None;
97 }
98 let mut visited: HashSet<&str> = HashSet::from([base_unit.as_str()]);
99
100 fn rec_dfs<'a>(from_unit: &'a str,
101 value_accum: f32,
102 list: &'a UnitMapper,
103 target_unit: &str,
104 visited: &mut HashSet<&'a str>)
105 -> Option<f32>
106 {
107 if from_unit == target_unit {
108 return Some(value_accum);
109 }
110 for Edge { unit: new_unit,
111 value: new_unit_val } in list.get(from_unit)? {
112 if ! visited.contains(new_unit) {
113 visited.insert(from_unit);
114 if let ret@Some(_) = rec_dfs(new_unit, value_accum * new_unit_val, list, target_unit, visited) {
115 return ret;
116 }
117 }
118 }
119 None
120 }
121
122 rec_dfs(base_unit, *base_value, list, target_unit, &mut visited)
123}
124
125
126fn main() -> io::Result<()>
127{
128 let mut buffer = String::new();
129 let mut facts: Vec<Fact> = Vec::new();
130 let mut queries: Vec<Query> = Vec::new();
131
132 loop {
133 match io::stdin().read_line(&mut buffer) {
134 Ok(0) => { break; }
135 Ok(_) => {
136 let line: Vec<_> = buffer.split_whitespace().collect();
137 if line.len() == 4 { // fact: a = x b
138 facts.push((
139 String::from(line[0]),
140 f32::from_str(line[2]).unwrap(),
141 String::from(line[3]),
142 ));
143 }
144 else if line.len() == 5 { // query: x a = ? b
145 queries.push((
146 f32::from_str(line[0]).unwrap(),
147 String::from(line[1]),
148 String::from(line[4]),
149 ));
150 }
151 else {
152 println!("Read invalid fact/query: {buffer}");
153 }
154 buffer.clear();
155 }
156 Err(e) => { return Err(e); }
157 }
158 }
159
160 let fact_table = readFacts(&facts);
161 for query in &queries {
162 if let Some(conversion_value) = answerQueryBFS(&query, &fact_table) {
163 println!("BFS: {} {} are {} {}", query.0, query.1, conversion_value, query.2);
164 }
165 else {
166 println!("BFS: no conversion");
167 }
168 if let Some(conversion_value) = answerQueryDFS(&query, &fact_table) {
169 println!("DFS: {} {} are {} {}", query.0, query.1, conversion_value, query.2);
170 }
171 else {
172 println!("DFS: no conversion");
173 }
174 }
175
176 Ok(())
177}
Tags: #programming #rust