* Fooled By Randomness (NN Taleb): Taleb is a complicated personality, but this book gave me a heuristic for thinking about long-tails and uncertain events that I could never have derived myself from a probability textbook.
* Designing Data Intensive Applications (M Kleppmann): Provided a first-principles approach for thinking about the design of modern large-scale data infrastructure. It's not just about assembling different technologies -- there are principles behind how data moves and transforms that transcend current technology, and DDIA is an articulation of those principles. After reading this, I began to notice general patterns in data infrastructure, which helped me quickly grasp how new technologies worked. (most are variations on the same principles)
* Introduction to Statistical Learning (James et al) and Applied Predictive Modeling (Kuhn et al). These two books gave me a grand sweep of predictive modeling methods pre-deep learning, methods which continue to be useful and applicable to a wider variety of problem contexts than AI/Deep Learning. (neural networks aren't appropriate for huge classes of problems)
* High Output Management (A Grove): oft-recommended book by former Intel CEO Andy Grove on how middle management in large corporations actually works, from promotions to meetings (as a unit of work). This was my guide to interpreting my experiences when I joined a large corporation, and boy was it accurate. It gave me a language and a framework for thinking about what was happening around me. I heard this was 1 of 2 books Tobi Luetke read to understand management when he went from being a technical person to CEO of Shopify (the other book being Cialdini's Influence). The Hard Thing About Hard Things (B Horowitz) is a different take that is also worth a read to understand the hidden--but intentional--managerial design of a modern tech company. These are some of the very few books written by practitioners--rather than management gurus--that I've found to track pretty closely with my own real-life experiences.
Compiler engineer here. In practice, compilers for higher-level languages often have a lot of difficulty getting anywhere close to the efficiency of comparable C code. If you take Python, for example, you have to do a lot of inlining to eliminate various abstractions. Inlining is actually non-trivial. Yes, inlining, by itself, is an easy program transformation, but knowing where to inline to get the best performance is very hard. If you inline too much, you increase code size, and you lose performance. You have to know precisely which inlining decisions will pay for themselves, and your large codebase might have tens of thousands of call sites and a call hierarchy 40 functions deep. Python also adds the fun little problem that you can redefine any function at run time, which makes it hard for the compiler to know which function you're actually going to be calling. To complicate things further, inlining decisions affect other optimizations. For example, if you inline foo into bar, do you then also inline into bar the functions that foo is calling? Do you unroll the loops from foo into bar? Etc.
Also, there's an aspect that I feel is constantly overlooked, and this is the way that objects are laid out in memory. In Python, JavaScript, Ruby, you have a lot of pointers to objects. You get a lot of pointer-chasing as a result. This is BAD for caches. Each object you touch, each pointer you dereference means pulling in a new cache line. In C, you can design very compact and flat data structures. You can use 32-bit floats, 8-bit, or even one-bit integers if you want. You can have a struct within a struct, with an array inside of it, all without any pointer-chasing.
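To make the contrast concrete, here is a tiny C sketch (the types and field names are invented for illustration): the "boxed" version forces a pointer dereference -- and likely a fresh cache line -- for each sub-object, while the flat version keeps everything contiguous.

    #include <stdint.h>

    /* Pointer-heavy layout, roughly what a dynamic-language runtime gives
     * you: every value is a separately allocated object. */
    struct vec3_heap { double x, y, z; };
    struct particle_boxed {
        struct vec3_heap *position;   /* separate heap allocation */
        struct vec3_heap *velocity;   /* separate heap allocation */
    };

    /* Flat layout: a struct within a struct, smaller fields, no
     * pointer-chasing. */
    struct vec3 { float x, y, z; };
    struct particle {
        struct vec3 position;
        struct vec3 velocity;
        uint8_t flags;                /* 8-bit field, as mentioned above */
    };

    struct particle particles[1024];  /* one contiguous, cache-friendly block */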
Modern CPUs are constrained by memory bandwidth, and it's very hard for any programming language to beat C on achievable memory efficiency. What's worse is that we have little to no academic compiler literature on automatically optimizing for memory-efficient data layouts. It's an overlooked and understudied problem. You would have to prove that integer values lie within certain ranges (hard) and also do object inlining (collapsing objects into parents), which AFAIK is also hard and not done by any mainstream compiler.
So, yeah, keep thinking that a sufficiently-smart compiler will do everything for you. You will assuredly be very disappointed. Until we have strong AI, the sufficiently smart compiler is basically unobtainium. If you want efficiency, the best route is generally to have fewer layers of abstraction, or to rely only on compiler optimizations you know for certain will happen.
I can certainly see a lot of parallels with Oculus / Facebook.
Perhaps unusually, I actually wanted FB to impress itself more strongly on Oculus post acquisition because, frankly, Oculus was a bit of a mess. Instead, Oculus was given an enormous amount of freedom for many years.
Personally, nobody ever told me what to do, even though I was willing to "shut up and soldier" if necessary -- they bought that capability! Conversely, I couldn't tell anyone what to do from my position; the important shots were always called when I wasn't around. Some of that was on me for not being willing to relocate to HQ, but a lot of it was built into early Oculus DNA.
I could only lead by example and argument, and the arguments only took on weight after years of evidence accumulated. I could have taken a more traditional management position, but I would have hated it, so that's also on me. The political dynamics never quite aligned with an optimal set of leadership personalities and beliefs where I would have had the best leverage, but there was progress, and I am reasonably happy and effective as a part time consultant today, seven years later.
Talking about "entitled workers" almost certainly derails the conversation. Perhaps a less charged framing that still captures some of the matter is the mixing of people who Really Care about their work with the Just A Job crowd. The wealth of the mega corps does allow most goals to be accomplished, at great expense, with Just A Job workers, but people that have experienced being embedded with Really Care workers are going to be appalled at the relative effectiveness.
The communication culture does tend a bit toward the passive-aggressive for my taste, but I can see why it evolves that way in large organizations. I've only been officially dinged by HR once for insensitive language in a post, but a few people have reached out privately with some gentle suggestions about better communication.
All in all, not a perfect fairy tale outcome, but I still consider taking the acquisition offer as the correct thing for the company in hindsight.
Yes. That's what Go is for. Go is for doing the things that Google does on servers, which mostly means web apps of one kind or another. It has all the parts for that, and they're well-exercised because they're doing billions of operations per second on Google's own work. It's hard-compiled, so you don't have all the extra overhead of the interpreted languages. Also, Go has a better async story than most web-oriented systems. Goroutines, or "green threads", do both the job
of threads and "async". In async land, if anything blocks or uses much CPU time, you're stalled. Not so in Go. A goroutine can block without stalling other goroutines.
Right now, I'm writing a client for a virtual world in Rust. There's a big need for concurrency, but, unlike web stuff, it's all tightly interrelated. There's a GPU to keep busy. There's network traffic of several different kinds. There are compute-bound tasks which need to be kept out of the frame refresh loop.
Rust is good for this. The existing C++ client is too much of a mess to make concurrent; people looked at it and gave up. In Rust, it's coming along nicely.
1. Learning (fundamental) stuff deeply is the way to become great.
2. But knowing how stuff works fundamentally at an abstract level (e.g. being able to write your own toy OS) is not directly economically useful (and/or very time consuming to get to that stage). So it won't get you a job.
3. So for starters learn just one thing that people actually use (say git) deeply at a concrete level. That is economically useful, because most devs will only have fairly superficial git skills, so you becoming the go-to person for git will provide economic value to the company.
4. At the same time, this is an effective way to bootstrap a more abstract deep understanding. E.g. if you really master git, you will also learn a fair amount of abstract concepts that go beyond the concrete tool (deep understanding of git implies the ability to implement at least a toy version yourself, which will teach you useful stuff -- see the small sketch after this list; in particular, if you can implement your own version of git you are probably already better than 90% of programmers).
5. Be mindful that you will be blinkered at this stage (because you only know a single thing and thus lack a basis for comparison), so don't become opinionated yet.
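To make point 4 concrete, here is one small thing you pick up by re-implementing a toy git: how a blob's object id is computed (SHA-1 over a "blob <size>\0" header followed by the content). This is only a sketch -- the function name is made up, and it assumes OpenSSL is available for SHA-1 (compile with -lcrypto).

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    /* Compute the git object id of a blob: sha1("blob <len>\0" + content). */
    static void git_blob_oid(const char *content, size_t len, char out_hex[41]) {
        char header[64];
        int header_len = snprintf(header, sizeof header, "blob %zu", len) + 1; /* keep the NUL */

        unsigned char hash[SHA_DIGEST_LENGTH];
        SHA_CTX ctx;
        SHA1_Init(&ctx);
        SHA1_Update(&ctx, header, header_len);
        SHA1_Update(&ctx, content, len);
        SHA1_Final(hash, &ctx);

        for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
            sprintf(out_hex + 2 * i, "%02x", hash[i]);
    }

    int main(void) {
        const char *msg = "hello world\n";
        char hex[41];
        git_blob_oid(msg, strlen(msg), hex);
        printf("%s\n", hex);  /* same id that `git hash-object` would print */
        return 0;
    }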
I think this is a great early-career strategy as long as you pick the right thing to deep-dive into. E.g. if you picked AspectJ or Agent Learning or CASE tools or the semantic web around 15-20 years ago (at the peak of their hype cycle), this possibly wouldn't have worked out so hot. In particular, learning some things like Spring or Freudian Psychology will probably harm rather than help your intellectual development.
Making a good pick is hard if you lack experience; I'd say if in doubt pick something really everyone in your line of work uses but most people have not mastered and where there is value in mastery (and you can see how mastery might tie into things you want to learn more about at a fundamental level). Also, preferably pick something that has been around for at least 5 years (unless you are quite confident in your nose for trends).
I liked that. I am 68 and although I 'retired' last year (from managing a deep learning team at Capital One), I stay active. Currently I am writing a combined iOS and macOS app (so I needed to spin up on Swift and SwiftUI, and on how to use the built-in support for building deep learning models, instead of TensorFlow, which is what I am used to). I recently published a book on the very cool Hy Lisp programming language (you can read it free at https://leanpub.com/hy-lisp-python).
I get inspiration from my father. He is 98 and has taught himself 3D animation and video editing, and keeps himself very busy creating videos for the international right to die organization. (He was a physicist and is still a member of the National Academy of Sciences.)
So, for all you young 50 year olds, I say to you: it is not over until it is over :-)
A very nice book about assembly programming is "Assembly Language Step-by-Step: Programming with Linux, 3rd edition" (http://www.amazon.com/dp/0470497025).
The nice thing about this book is that it guides the reader toward understanding how the machine works first, and only then moves on to assembly programming.
The sad thing about this book is that it covers 32-bit Intel-compatible processors.
My guess is that the original author has grown old and is not interested in producing a fourth edition of the book.
On this matter, I would like to ask: is it worth learning assembly for the 32-bit x86 instruction set, now that pretty much every computer is built on the amd64 architecture?
I had two long breaks between jobs. I've also had 4 different jobs in the last 5-ish years. So - I've had to do some prep every year. When I hit 1 year at any job - I start prepping again. I prepped probably 200 hours for the first job, 200 for the second, and then 100 for the third and fourth in the last 5-ish years. I pretty much never stay more than a year because it's the only way to get compensation and title to go up at all the companies I worked at.
I'd suggest always working on paper or a whiteboard -- never use a computer. (You have to emulate the interview environment.) Set a timer for every problem (1 hour max -- aim for 45 minutes on medium/hard when starting out; lower it as you familiarize yourself with a subject). Start throwing data structures and algorithms at the problem (pattern matching them, basically). Solve it or don't, but learn something to take away from it. If you don't solve it, learn how to solve it, and where you were lacking. If you did solve it, be sure to check other solutions, because sometimes yours is crap. Then practice the thing you were lacking. (Needed to know BFS? Implement BFS from scratch 10 times using spaced repetition.)

If you're planning on 8-hour study sessions, focus on one type of problem at a time (BFS, DFS, graph problems, tree problems, binary search, dynamic programming, etc.) and only do those kinds of problems in that session. Set it up like you would for a math class: you do one chapter in that session/day/week/whatever and proceed to do a few chapters (say a BFS chapter, a DFS chapter, and a binary search chapter).

Once you've done a few, give yourself a midterm covering the topics you learned -- say 2 DFS, 2 BFS, and 2 binary search problems -- and see if you can solve them in a timed environment. If you rocked the midterm, proceed on to more chapters and more midterms. If you didn't, relearn those subjects before moving on. Finally, when you feel like you've learned everything, give yourself a final exam to see if you remember everything and can still do it right.
Basically - treat it like a math class. It's not just memorization (you need to memorize the strategies, that's for sure - you need to be really fast at implementing BFS, DFS, binary search, etc. That should be rote memorization, but with a deep understanding).
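For reference, "implement BFS from scratch" means being able to rattle off something like this minimal C version without thinking about it (the graph here is just made up for the drill):

    #include <stdio.h>

    #define N 6  /* number of vertices in the toy graph */

    /* Breadth-first search over an adjacency matrix, printing visit order. */
    static void bfs(int adj[N][N], int start) {
        int visited[N] = {0};
        int queue[N], head = 0, tail = 0;

        visited[start] = 1;
        queue[tail++] = start;

        while (head < tail) {
            int u = queue[head++];
            printf("visit %d\n", u);
            for (int v = 0; v < N; v++) {
                if (adj[u][v] && !visited[v]) {
                    visited[v] = 1;
                    queue[tail++] = v;
                }
            }
        }
    }

    int main(void) {
        int adj[N][N] = {0};
        int edges[][2] = {{0,1},{0,2},{1,3},{2,4},{3,5}};  /* small undirected graph */
        for (int i = 0; i < 5; i++) {
            adj[edges[i][0]][edges[i][1]] = 1;
            adj[edges[i][1]][edges[i][0]] = 1;
        }
        bfs(adj, 0);
        return 0;
    }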
I regularly follow these people; these are mostly slow-moving blogs or websites:
1. Fabrice Bellard: https://bellard.org/ This is not really a blog, but a traditional website with links to the work of its author (who is probably one of the top programmers ever). It updates rarely, but when it does, it is to present yet another monumental piece of computing work that will blow your mind.
3. Uriel l'Étranger: http://cat-v.org/ The only self-described "philosopher of software" that I have ever seen. His website is of course not updated anymore, but you can re-read it many times and always learn something new.
Do a good job as a parent, and ignore any demand on you that threatens to compromise that. If switching jobs makes you a better parent, switch. You will get other opportunities to lead startups, and you will never get this time with your children back.
If you want smaller chunks, Denis Bakhvalov is starting to hit his stride with a nice series of blog posts: https://easyperf.net/notes/
If that's not enough, Ingve (who submitted this) has made literally thousands of submissions, and a surprising number are excellent articles on optimization: https://news.ycombinator.com/submitted?id=ingve
If you live in a major metropolitan area almost anywhere in the US, and your skillset is in tune with the market, you should be able to find some W2 or 1099 contract quite easily through local recruiters. Yeah they will take their cut, but you still should be able to negotiate close to market value. You will definitely get paid more than using something like Upwork.
I’ve been building up a curated list of local third party recruiters for years. I always engage with them when they reach out to me.
Even though they call the above “consulting”, it’s actually closer to just freelancing.
Set up a fixed-size binary radix trie of 32-bit IP addresses, say 1000 entries. Track the nodes of the trie in a list, LRU order; insert an IP, its node goes to the top of the list.
When you exhaust the available nodes, reclaim from the bottom of the LRU list --- but first, find either a sibling for the node already in the trie, or a parent, or a sibling of what that parent would be, and "merge" the IP address you're losing.
(So in reclaiming 10.0.0.1/32, merge with 10.0.0.0/32 to make 10.0.0.0/31, etc).
Over time, "important" /32s --- really, important prefixes, period, not just /32s --- will "defend" their position towards the top of the LRU, while the noise will get aggregated up, into /16s, /15s, /4s, whatever.
What you're doing here is inferring prefix lengths (netmasks), which is kind of magical.
You can do the same thing with memory addresses in a debugger to infer (coarse-grained, but without much effort) data structures and allocation patterns. There are probably other integers you can do this with that nobody's thought of.
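Here is a minimal C sketch of just the merge step described above -- checking whether two prefixes are siblings and collapsing them into their parent (the trie and LRU bookkeeping are omitted; the function names are just for illustration):

    #include <stdint.h>
    #include <stdio.h>

    /* Mask an address down to the network bits of a prefix of length len. */
    static uint32_t prefix_base(uint32_t addr, int len) {
        return len == 0 ? 0 : addr & (0xFFFFFFFFu << (32 - len));
    }

    /* Two prefixes of length len are siblings if they share a parent
     * (the first len-1 bits) but differ in bit len. */
    static int siblings(uint32_t a, uint32_t b, int len) {
        return len > 0
            && prefix_base(a, len - 1) == prefix_base(b, len - 1)
            && prefix_base(a, len) != prefix_base(b, len);
    }

    int main(void) {
        uint32_t a = 0x0A000001;  /* 10.0.0.1 */
        uint32_t b = 0x0A000000;  /* 10.0.0.0 */
        if (siblings(a, b, 32)) {
            uint32_t parent = prefix_base(a, 31);
            printf("merged into %u.%u.%u.%u/31\n",
                   parent >> 24, (parent >> 16) & 0xFF,
                   (parent >> 8) & 0xFF, parent & 0xFF);
        }
        return 0;
    }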
A systems programmer will specialize in subdomains much like any other programmer, but there are some characteristic skills and knowledge that are common across most of them. Systems programming coding style is driven by robustness, correctness, and performance to a much greater degree than higher up the stack. Most systems programming jobs these days are C/C++ on Linux and will be for the foreseeable future.
- Know how your hardware actually works, especially CPU, storage, and network. This provides the "first principles" whence all of the below are derived and will allow you to reason about and predict system behavior without writing a single line of code.
- Understand the design of the operating system well enough to reimplement and bypass key parts of it as needed. The requirements that would cause you to build, for example, a custom storage engine also mean you aren't mmap()-ing files, instead doing direct I/O with io_submit() on a file or raw block device into a cache and I/O scheduler you designed and wrote. Study the internals of existing systems code for examples of how things like this are done; it is esoteric but not difficult to learn.
- Locality-driven software design. Locality maximization, both spatial and temporal, is the source of most performance in modern software. In systems programming, this means you are always aware of what is currently in your hardware caches and efficiently using that resource. As a corollary, compactness is consistently a key objective of your data structures, to a much greater extent than people think about it higher up the stack. One way you can identify code written by a systems programmer is the use of data structures packed into bitfields (see the small sketch after this list).
- Understand the difference between interrupt-driven and schedule-driven computing models, the appropriate use cases for both, and how to safely design code that necessarily mixes both e.g. multithreading and coroutines. This is central to I/O handling: network is interrupt-driven and disk is schedule-driven. Being explicit about where the boundaries are between these modes in your software designs greatly simplifies reasoning about concurrency. Most common concurrency primitives make assumptions about which model you are using.
- All systems are distributed systems. Part of the systems programmer's job is creating the illusion that this is not the case to the higher levels in the stack, but a systems programmer unavoidably lives in this world even within a single server. Knowing "latencies every programmer should know" is just a starting point; it is also helpful to understand how hardware topology interacts with the routing/messaging patterns to change latency -- tail latencies are more important than median latencies.
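A small concrete illustration of the bit-packing point from the locality bullet above (the struct and its fields are invented for the example): a handful of values squeezed into a single 32-bit word instead of five separate ints.

    #include <stdint.h>

    /* Metadata packed into one 32-bit word; the widths sum to 32. */
    struct packet_meta {
        uint32_t queue_id  : 6;   /* 0..63 */
        uint32_t priority  : 3;   /* 0..7 */
        uint32_t retries   : 4;   /* 0..15 */
        uint32_t length    : 16;  /* payload bytes */
        uint32_t is_urgent : 1;
        uint32_t reserved  : 2;
    };
    /* sizeof(struct packet_meta) is 4 on typical compilers, versus 20 bytes
     * if each field were its own int -- sixteen packed structs per 64-byte
     * cache line instead of three. */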
The above is relatively abstract and conceptual but generalizes to all systems programming. Domains of systems programming have deep and unique knowledge that is specific to the specialty e.g. network protocol stacks, database engines, high performance graphics, etc. Because the physics of hardware never changes except in the details, the abstractions are relatively thin, and the tool chains are necessarily conservative, systems programming skills age very well. The emergence of modern C++ (e.g. C++17) has made systems programming quite enjoyable.
Also, the best way to learn idiomatic systems programming is to read many examples of high-quality systems code. You will see many excellent techniques in the code that you've never seen before that are not documented in book/paper form. I've been doing systems work for two decades and I still run across interesting new idioms and techniques.
A great talk recently posted here: What Bodies Think About: Bioelectric Computation Outside the Nervous System [1] by Prof. Michael Levin. He talks about how long-term, low-energy electrical networks between all cells in living organisms shape how the organism grows. I think what he talks about will be the future of medicine, as it allows for an amazing degree of high-level control over how animals grow.
Never heard of the two people you mention, but I'll share what I've observed:
The two biggest time sinks in everyday programming are 1.) communicating with everyone else and 2.) looking things up. If you can avoid these and have a quick analytical mind, you can basically program as quickly as you can type. So the secret to being insanely productive is to work as a solo programmer in a domain where you are thoroughly familiar with all the tools, enough that you have your API calls completely committed to memory.
If you look at most programmers who have a reputation for being insanely productive, they all follow this pattern. Fabrice Bellard works pretty much exclusively in C, mostly on signal-processing or low-level emulation. Rob Pike works all in Go using the plan9 stack. John Carmack is all games and 3D graphics, using OpenGL. Jeff Dean has two major interests - distributed systems and large-scale machine learning - and works basically exclusively in C++.
I read an interview with Guido van Rossum, Jeff Dean, and Craig Silverstein within Google where they were asked basically the same question, and the answer all of them gave was "The ability to hold the entire program in your head."
I find these “shorter work weeks are just as effective” articles to be nonsense, at least for knowledge workers with some tactical discretion. I can imagine productivity at an assembly line job having a peak such that overworking grinds someone down to the point that they become a liability, but people that claim working nine hours in a day instead of eight gives no (or negative) additional benefit are either being disingenuous or just have terrible work habits. Even in menial jobs, it is sort of insulting – “Hey you, working three jobs to feed your family! Half of the time you are working is actually of negative value so you don’t deserve to be paid for it!”
If you only have seven good hours a day in you, does that mean the rest of the day that you spend with your family, reading, exercising at the gym, or whatever other virtuous activity you would be spending your time on, are all done poorly? No, it just means that focusing on a single thing for an extended period of time is challenging.
Whatever the grand strategy for success is, it gets broken down into lots of smaller tasks. When you hit a wall on one task, you could say “that’s it, I’m done for the day” and head home, or you could switch over to something else that has a different rhythm and get more accomplished. Even when you are clearly not at your peak, there is always plenty to do that doesn’t require your best, and it would actually be a waste to spend your best time on it. You can also “go to the gym” for your work by studying, exploring, and experimenting, spending more hours in service to the goal.
I think most people excited by these articles are confusing not being aligned with their job’s goals with questions of effectiveness. If you don’t want to work, and don’t really care about your work, fewer hours for the same pay sounds great! If you personally care about what you are doing, you don’t stop at 40 hours a week because you think it is optimal for the work, but rather because you are balancing it against something else that you find equally important. Which is fine.
Given two equally talented people, the one that pursues a goal obsessively, for well over 40 hours a week, is going to achieve more. They might be less happy and healthy, but I’m not even sure about that. Obsession can be rather fulfilling, although probably not across an entire lifetime.
This particular article does touch on a goal that isn’t usually explicitly stated: it would make the world “less unequal” if everyone was prevented from working longer hours. Yes, it would, but I am deeply appalled at the thought of trading away individual freedom of action and additional value in the world for that goal.
I can highly recommend it to anyone who wants to learn more about the kernel. This book is particularly great because it does not assume any kernel programming background knowledge - a superficial understanding of C should be enough.
Hey, my name is Keith Adams, and I'm an engineer at Facebook.
I work on the HipHop virtual machine, a dynamic compiler and runtime for the PHP language. It is the hardest challenge I've tackled professionally, and the people I'm working with are brilliant and work like animals.*
I think the larger significance of our work on HipHop is in the context of software as a whole. PHP is one of those "developer productivity languages," like Python, JavaScript, Lua, etc. And, though I was skeptical before I came here, developers really are more productive in these languages. It would blow your mind what world-class people working in this medium can accomplish in compressed timeframes. This means that increasing these languages' performance is incredibly leveraged; to the extent we're successful at making PHP faster, we change the set of problems that PHP can address. Moving a problem from the "must be solved in C++" category to the "can be solved in PHP" category frees up our fellow developers to get more done with their finite professional lives.
*Edit: By "like animals", I mean, "with a survival-level sense of urgency." It doesn't necessarily mean long hours.
fast.ai 1 is much better than Udacity's ML if you want to do purely Deep Learning. Udacity's DLF ND is IMO better than fast.ai 1. However, fast.ai 2 is way better than Udacity's DLF ND, and only their self-driving car program has some Deep Learning models that are even more awesome (imagine driving using Deep Learning only). Take both fast.ai 1 and the DLF ND if you are a beginner; after that, jump on Stanford's CV & NLP courses; Oxford's NLP course might be a good idea as well. Then do UCL's Reinforcement Learning and you are set.
This is more of a quick tour of the basic API for libsodium than a workshop for learning cryptographic engineering. Many of the exercises have no cryptographic component at all; the remainder basically exercise the most basic libsodium sign/verify/secretbox functionality. None of the exercises explain the rationale behind any of the libsodium constructions, and (because libsodium) nonce-based authenticated encryption is used without explaining any of the details of what a nonce is and what the requirements are for generating and handling them. Finally, the service model for the crypto involved doesn't make a whole lot of sense; most of the exercises build a "tamper proof log" from a simple chained hash function, and ultimately encrypts that log using a key exposed to the server anyways.
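For reference, the nonce handling the workshop leans on without explanation looks roughly like this with libsodium's secretbox API (a sketch; error handling mostly omitted). The one rule that matters: a nonce must never be reused under the same key.

    #include <stdio.h>
    #include <sodium.h>

    int main(void) {
        if (sodium_init() < 0) return 1;

        unsigned char key[crypto_secretbox_KEYBYTES];
        unsigned char nonce[crypto_secretbox_NONCEBYTES];
        const unsigned char msg[] = "attack at dawn";
        unsigned char ct[crypto_secretbox_MACBYTES + sizeof msg];

        crypto_secretbox_keygen(key);
        randombytes_buf(nonce, sizeof nonce);  /* fresh random nonce per message */
        crypto_secretbox_easy(ct, msg, sizeof msg, nonce, key);

        /* The nonce is not secret -- it travels with the ciphertext -- but
         * reusing it under the same key destroys confidentiality. */
        printf("ciphertext is %zu bytes\n", sizeof ct);
        return 0;
    }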
Respectfully, I don't think this is how you should learn cryptography (certainly: you shouldn't call this kind of work "cryptographic engineering").
I'm talking my own book here a little (but just a little, since it's not like I make a dime from this) when I say that the better way to learn and understand cryptography is stuff like the Matasano Cryptopals challenges:
https://cryptopals.com/
These exercises will try to teach you crypto engineering by breaking cryptography, and without wasting much time structuring a trivial JSON interface. You'll understand what a nonce is by the end of set 3 because you'll have written exploits for otherwise sane cryptosystems that mishandle nonces. By the end of set 8 you'll have implemented invalid curve attacks and built and broken short-tag GCM AEAD encryption and, hopefully, be a little nauseous any time someone asks you to use crypto again --- which is the way it should be.
Even after you've undergone our cryptographic Ludovico Process, you still won't be a "cryptographic engineer". I've been testing and building exploits for random cryptosystems for over a decade and I'm nowhere close. The simple, blunt reality of it is that if you're going to build anything close to new with cryptography, you really do have to understand the math, and anyone who claims you can get to "securing a banking interface" without a detour through abstract algebra is, I think, doing you a disservice.
I recommend Shai Shalev-Shwartz and Shai Ben-David's Understanding Machine Learning: From Theory to Algorithms [0]. I've also used Mohri's Foundations of Machine Learning and found it quite insightful [1]. Usually, between the two books, at least one of them has a proof of a given result that is easy to follow.
Get both unless you're only getting one, in which case, get Shai Shalev-Shwartz's.
Yeah, a doubly-linked list is basically the worst possible choice for your first program in Rust. This is because Rust likes all of your memory to have one clear owner, which makes cyclic data structures unusually difficult compared to almost anything else you might pick as a first project.
The "Learning Rust With Entirely Too Many Linked Lists" tutorial will walk you through multiple different kinds of linked lists, and show you how to implement each in Rust. Along the way, you'll learn more about Rust's ownership system than most working Rust programmers generally need to know.
Personally, I've written quite a bit of production Rust code, and I've done so without ever touching 'unsafe' or needing a linked list. If I really did need a linked list, I'd either grab a good one off crates.io, or just knock out a quick one using `unsafe` and raw pointers. I mean, Rust can do basically anything C does if I ask nicely. It's just that if I choose to write "C in Rust", then I get about the same safety guarantees that C offers.
Ah, I think we're just arguing semantics then. I see that as very much a part of C++ (new vs malloc only has some small, subtle differences, for instance).
I will say if you don't understand pointers you're in for a world of pain in C++.