As you might have gleaned from my last post, RC is pretty unique environment, and it takes a lot of work to figure out what to make of it. I think I’m starting to get the hang of how to spend my time effectively here.
What did I do in weeks 2-4?
My main focus for this batch is figuring out how to use machine learning to improve computer networking protocols. I don’t have a particular deep background in either domain, so the last couple weeks have largely been about getting ramped up. I also made pair programming a priority, which has been an awesome way to stay productive and get to know other RC-ers better.
- Started learning Rust, and ended up writing a bunch of blog posts(people really seemed to like this one). Had a couple good pairing sessions with Henry where we got started with the basics, and with Kevin, where we worked on solving some problems in concurrency
- Made this PR in smoltcp, a userspace TCP library written in Rust. Required reading a bunch of Linux kernel code (documented what I learned here).
- Started building an “ssh” implementation in C with Elad
- Rory and I started building a DNS Resolver in Rust
- Built a procedural map generator with Emanuel and Zach as part of the RC art jam
- Spent some time learning about how NATs work and gave a talk about it
- Attended a couple linear algebra sessions
I spent a bunch of time at the machine learning study group trying to better understand neural networks better. This ended up not feeling super productive, so I’ve dropped that for now. Since we’ve started working on different projects, Jenn, Nick and I ended up disbanding the group.
In the process of working on these projects, I’ve learned a lot about what projects/learning techniques work best.
When does the learning happen?
My goal for a bunch of these projects has been fairly nebulous–“understand computer networks better”. My hope was that by picking a few different protocols to implement I would get a better understanding for how those protocols work and get more used to the vocabulary of computer networking. However, through working on these projects, my perspective on what it means to understand computer networks has changed. Rather, I care a lot less about whether I understand every acronym and protocol I come across in networking documents, and a lot more about whether I trust myself to solve interesting problems in computer networks.
This is a problem I’ve been thinking about in the context of math for a while. Math textbooks typically have a chapter of content followed by a bunch of exercises that demonstrate concepts in the previous chapter. Being able to successfully complete these exercises probably means that you understand that particular chapter well. However, it doesn’t necessarily mean that you are equipped to contextualize it and use it in a “real” problem presented outside of the context of that chapter. In other words, it doesn’t necessarily mean that your mathematical problem-solving ability has improved.
The nice thing about programming is that any task you pick up is an opportunity to be a better practitioner of your programming language, or improve your debugging ability. However, not all projects require doing the work that makes you better at problem solving in a particular domain.
The “ssh” implementation I worked on with Elad ended up being a fantastic exercise for learning about networks, and is one I would highly recommend to anybody trying to get better at systems programming.
We decided pretty early on to not actually implement the official ssh spec, since it would take forever and mostly not be that interesting. Our goal was to build software to accomplish the same task and allow a client to remotely execute bash commands on another machine, with all communication end-to-end encrypted.
We’re building this in C and took the approach of on the server
waiting for incoming connections, and when a connection is made
fork off a new process and execute bash. The original
process will read data from the socket, pass it along to the
and then take the output of the
bash process and pass it back to the socket.
One of the problems we encountered while working on this was figuring out
how to handle bash commands that don’t produce any output. This ended up being complicated because we
were writing to the bash process using
write, and then immediately calling
read. If a command like
cd .. was run, this doesn’t produce any output,
read blocks indefinitely.
Working through this problem was challenging, and we ended up cycling through a couple different solutions, none of them perfect. This ended up requiring us to learn a lot more than we were expecting about Linux syscalls and concurrency, and in the process I think we gained a much better intuition for what kinds of solutions are possible for problems like this.
It wasn’t obvious to me that I would learn that much about networks in the process of building this project. I knew from the start that we would “simply” be using the C socket APIs to communicate between the server and client, APIs that I’ve used before. This actually led me to being a little hesitant about embarking on this in the first place, because I wasn’t sure I would learn that much new about any real networking protocols in the process of doing this.
However, if framed differently, and instead of asking if this will help me learn the details of any networking protocol, asking if the project will make me better at solving problems in computer networks, this ended up being a pretty ideal project.
It’s hard to predict what projects will have moments like this–you only really discover them once you’ve started making progress on a certain project. That being said, I definitely feel a lot better about the time that I spent than I would have if I had instead spent my time reading the details of different protocols.
This is a big shift from other educational environments I’ve been in. In college, there is a big focus on comprehensiveness–a lot of courses have a list of a material that they aim to cover in some short amount of time. It’s easier to assess and measure what people learn when you are judging on understanding some fixed canon of material. However, now I think that striving for comprehensiveness isn’t really worth it–especially if it’s at the cost of not doing projects that can really further intuition and problem-solving ability.
My aversion to re-learning things
In addition to getting better at understanding computer networks, I’ve also been focused on understanding machine learning (ML) algorithms.
We decided at our ML study group to break off and start working on individual projects. I started out initially by deciding to build a classifier using Tensorflow, but didn’t feel super satisified using these libraries without actually having a good intuition for how the algorithms work. To that end, I ended up watching a bunch of 3Blue1Brown videos and reading this great page on matrix calculus.
However, I think I was fundamentally hamstrung by the fact that I don’t have the sharpest calculus and linear algebra fundamentals.
It’s interesting that that’s case, given that I’ve tried many times and failed to make any progress learning ML. I think one of my problems here is that I tell myself that I “know” calculus and linear algebra already, since I took classes on these in school, and it feels like a shameful waste of time to have to revisit these and do the work again. Life is too short to take linear algebra twice!!
I’m realizing now that this a silly mentality to have, and that having such a strong aversion to “re-learning” things is getting in the way of me of actually learning these important concepts. The root cause of this is probably all the time I’ve spent in Silicon Valley startups obsessed with speed (Not that this is a bad thing, just not a great mindset for learning math properly).
Machine learning is full of hard concepts, and it’s going to take time to fully grasp them. Peter Norvig’s “Learn Programming in 10 Years” post I think sums this up well–expecting to be able to learn hard concepts in short, expedited periods of time is a little foolish.
Will definitely be spending some time shoring up my knowledge on these more fundamental areas of math. Some of this also applies to thinking about other domains too–you are never “done” learning, and covering material you’ve dealt with can still be an extremely valuable experience.
Thinking like a programming language designer
I also spent a bunch of time in the last couple weeks learning a new programming language, Rust. I’m still getting used to common idioms and approaches in the language, but I’ve made a bunch of a progress towards getting comfortable with it.
One of the fun things about being at RC is that there are a lot of folks interested in programming languages and PL-design in general, and this has definitely had an impact on how I think about learning new languages. One fellow RC-er in particular, Adam, who’s been spending a bunch of time implementing the SKI combinator in different statically-typed languages, inspired me to be rigorous about the capabilities of languages. Rather than just thinking about the “happy paths” in a programming language and focusing on the idioms in common frameworks like Ruby on Rails, I’ve made a more concerted effort to learn about the “edges” of the language. For a statically-typed language like Rust, what are the possibilities of the type system and how complex a type can you express? With the ownership/borrow system, are there any approaches that can break the guarantees that the language makes?
In the next couple weeks, I have two main priorities–my primary project of using ML to generate TCP congestion control algorithms, and doing some projects to further my knowledge of calculus and linear algebra.
I’m also excited to pair with Lucy this coming week on a FFT-related project, and to finish up the DNS resolver and SSH implementation that I’ve been working on with Rory and Elad.
Thanks for reading and making it this far! Stay tuned for more random musings on getting better at programming.