My favorite lens on this comes from Hamming:

> It is well known the drunken sailor who staggers to the left or right n independent random steps will, on the average, end up about sqrt(n) steps from the origin. But if there is a pretty girl in one direction, then his steps will tend to go in that direction and he will go a distance proportional to n. In a lifetime of many, many independent choices, small and large, a career with a vision will get you a distance proportional to n, while no vision will get you only the distance sqrt(n). In a sense, the main difference between those who go far and those who do not is some people have a vision and others do not and therefore can only react to the current events as they happen.

Just a tiny bit of bias towards a direction will get you very far very fast.

I once modeled+visualised this with a bit of javascript[1] and it's quite surprising to see the huge difference from even a tiny multiplication factor on each random/probabilistic decision.

[1] https://swizec.com/blog/your-career-needs-a-vision/
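(For anyone who wants to reproduce the effect without clicking through, here is a minimal sketch, in Python rather than the JavaScript from [1], comparing an unbiased +/-1 walk with one that has a small per-step bias. The specific numbers are illustrative, not taken from the linked post.)

    import random

    def walk(steps, bias=0.0):
        """Random +/-1 walk; `bias` is the extra probability of stepping in the + direction."""
        pos = 0
        for _ in range(steps):
            pos += 1 if random.random() < 0.5 + bias else -1
        return pos

    n, trials = 10_000, 200
    unbiased = sum(abs(walk(n)) for _ in range(trials)) / trials
    biased = sum(abs(walk(n, bias=0.05)) for _ in range(trials)) / trials
    print(round(unbiased))  # on the order of sqrt(n), roughly 80 here
    print(round(biased))    # close to 2 * bias * n = 1000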


When Jeff Hodges gave a presentation of his "Notes on Distributed Systems for Youngbloods"[1] at Lookout Mobile Security back in like 2014 or 2015, he did this really interesting aside at the end that changed my perception of my job, and it was basically this. You don't get to avoid "politics" in software, because building is collaborative, and all collaboration is political. You'll only hurt yourself by avoiding leveling up in soft skills.

No matter how correct or elegant your code is or how good your idea is, if you haven't built the relationships or put consideration into the broader social dynamic, you're much less likely to succeed.

[1] https://www.somethingsimilar.com/2013/01/14/notes-on-distrib...


Anecdote here, but when I was in grad school, I was talking to a PhD student i respected a lot. Whenever he read a paper, he would try to write the code out and get it working. I would take a couple of months but he could whip it up in a few days. He explained to me that it was just practice and the more you practice the better you become. He not only coded things quickly, he started analyzing papers quicker too and became really good at synthesizing ideas, knowing what worked and didn't, and built up a phenomenal intuition.

These days, I'm fairly senior and don't touch code much anymore but I find it really really instructive to get my hands dirty and struggle through new code and ideas. I think the "just tweak the prompts bro" people are missing out on learning.


I'm seeing the same question come up about general performance versus specialized performance, so let me offer a longer explanation here. This might be worth a blog post at some point.

We now live in a world of both readily available small specialized models and general models.

In the last couple of years, we've seen an explosion of capability in generative models built and trained to be performant on a general set of capabilities. In Google's case, this model is Gemini. Gemini can summarize text, count the number of ducks in an image, generate a pelican SVG, play Pokemon, play chess, and do so many other things. It can do this all with a vague set of inputs across many modes. For models of this scale (many billion parameters), it's quite incredible how, with even vague or misspecified inputs, the computer can still produce useful results in complex scenarios.

However, there is an entire ecosystem of generative models that are purpose-built for ONE specific task. The ones I worked on are typically referred to as Bayesian models. These are models that can give probabilistic estimates of how many customers a restaurant will get in a day, or, given penguin dimensions, predict the probability of each penguin species, or take measurements from composite material testing and estimate whether your airplane will stay together in flight. At this scale, it's incredible how a model with tens or hundreds of parameters can assist humans in making better decisions. I write about this specifically in the PPL book I wrote a couple of years back; Chapter 9 provides the most "real world" workflow.

https://bayesiancomputationbook.com/markdown/chp_09.html

If you look through all the chapters you can see examples of forecasting models, bike sharing demand estimators, and all sorts of other narrow tasks. The tradeoff at this small scale, though, is the models have to be designed bespoke to your situation, and once you build one, it only works in that narrow task. No one expects to be handed a small Bayesian model that is already perfect at their task; it's implicit that users will bring their own data to update the model parameters.
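For a feel of that "tens of parameters" regime, here is a toy sketch using PyMC (one of the PPLs the book uses); the daily counts and priors below are made up purely for illustration, estimating a restaurant's average customers per day:

    import numpy as np
    import pymc as pm

    customers = np.array([52, 47, 61, 55, 49, 58, 50])  # made-up daily counts

    with pm.Model():
        rate = pm.Gamma("rate", alpha=2.0, beta=0.05)     # prior on average customers per day
        pm.Poisson("obs", mu=rate, observed=customers)    # likelihood for the observed counts
        idata = pm.sample(1000, tune=1000, progressbar=False)

    print(float(idata.posterior["rate"].mean()))          # posterior mean, around 53 for this data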

So with this said, Gemma 270m sits between these two paradigms. It's not at Gemini-level general performance and never will be, but it's not as rigid as an "old school" PPL-style Bayesian model where you need to build one by hand for every problem. However, since it does need to be shaped to match specific tasks, we did our best to design it as a flexible starting point for LLM-style tasks and worked with partners to put it into the right frameworks and places so you all can shape it into what you need it to be. In short, consider it a tool in the toolbox between fully custom, truly tiny generative models with 10 parameters and general generative models with lots of capability. Maybe not everyone needs this tool, but now you all have the choice.

Stepping aside from the technology for a moment, as a model builder and open ecosystem advocate, you never quite know how the community will receive these models until you release them. I genuinely appreciate you all commenting here; it helps me get a sense of what's working and what to focus on next.

And thanks for being kind about my typos in these answers. Trying to answer as many questions as possible across HN and various other forums.


The text has some great explanatory diagrams and looks to be a very high-quality overview of ML through the lens of probability, with lots of math.

I was also recently impressed by Zhao's "Mathematical Foundation of Reinforcement Learning", a free textbook with video lectures on YT: https://github.com/MathFoundationRL/Book-Mathematical-Founda...

If you don't have a lot of time, at least glance at Zhao's overview contents diagram; it's a good conceptual map of the whole field, imo. Here:

https://github.com/MathFoundationRL/Book-Mathematical-Founda...

and maybe watch the intro video.


For the memory order not covered in this talk, see Fedor Pikus: https://www.youtube.com/watch?v=ZQFzMfHIxng

And this whole blog series: https://preshing.com/20120612/an-introduction-to-lock-free-p...


I read through the second edition and would recommend it as well [1] (it has NumPyro, Pyro, and PyMC code). He also has a great lecture series on YouTube [2].

[1] https://xcelab.net/rm/

[2] https://www.youtube.com/watch?v=FdnMWdICdRs&list=PLDcUM9US4X...


There was an HN discussion some time ago about choosing a textbook that is optimal for oneself (https://news.ycombinator.com/item?id=41016650), and this passage from the answer about deliberate practice stuck out to me:

In the field of talent development, there is absolutely no debate about the most superior form of training. It's deliberate practice: mindful repetition on performance tasks just beyond the edge of one's capabilities.

Deliberate practice is about making performance-improving adjustments on every single repetition. Any individual adjustment is small and yields a small improvement in performance – but when you compound these small changes over a massive number of action-feedback-adjustment cycles, you end up with massive changes and massive gains in performance.

Deliberate practice is superior to all other forms of training. That is a "solved problem" in the academic field of talent development. It might as well be a law of physics. There is a mountain of research supporting the conclusion that the volume of accumulated deliberate practice is the single biggest factor responsible for individual differences in performance among elite performers across a wide variety of talent domains. (The next biggest factor is genetics, and the relative contributions of deliberate practice vs genetics can vary significantly across talent domains.)


Here is my advice on learning C++. Don't learn it first. Learn C first, become good at it. Then ease your way into using certain functionality of C++ in your code, and so on. Don't try to bite it all off at once. I've been developing in C++ for over 30 years now, and I'm still learning something new about it.

Look at it like Mathematics. You learn Algebra and Trig, but Calculus is something different. If you don't have a solid understanding of the previous material, you're not going to understand the symbolic manipulation of Calculus. It then gets worse from there as you go into Differential Equations.

IMO procedural and imperative programming should be understood before trying to grasp polymorphic, object-oriented and functional programming. There is a lot of abstraction, and training of your mind to abstract is a process.

Be careful: there are purists out there, and they will scream and complain, but ignore them. Don't let others control how you think. A late professor of mine once told me something I thought I understood, but it didn't hit me until years later.

"Any program written in any language that produces the correct results is a correct program."

As long as it does what it is supposed to do, it doesn't matter how you get there.

Also, code purity is a myth, and those seeking it are believers in mysticism and false aesthetics. There is good code organization, but when you're up against deadlines, good luck keeping it organized.

This memory safety issue is important, don't get me wrong, but it is completely overblown. SpaceX built incredible rockets and systems in C/C++. We have external tools to check for these issues, and they still need to be used with Rust.


There are many others that are better.

1/ The Annotated Transformer (Attention Is All You Need): http://nlp.seas.harvard.edu/annotated-transformer/

2/ Transformers from Scratch https://e2eml.school/transformers.html

3/ Andrej Karpathy has a really good series of intros: https://karpathy.ai/zero-to-hero.html

Let's build GPT: from scratch, in code, spelled out: https://www.youtube.com/watch?v=kCc8FmEb1nY

GPT with Andrej Karpathy: Part 1: https://medium.com/@kdwa2404/gpt-with-andrej-karpathy-part-1...

4/ 3Blue1Brown:

But what is a GPT? Visual intro to transformers | Chapter 5, Deep Learning: https://www.youtube.com/watch?v=wjZofJX0v4M

Attention in transformers, visually explained | Chapter 6, Deep Learning: https://www.youtube.com/watch?v=eMlx5fFNoYc

Full 3Blue1Brown Neural Networks playlist: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_6700...


Author here. Seeing all of the positive comments about my book is really warming my heart. I appreciate everyone and I'm glad so many people have enjoyed the book. I put a ton of time and love into it and it's gratifying to see it had the effect I'd hoped for.

Ian Lance Taylor, author of gold, major binutils contributor, and one of the core Go team, wrote a book on linkers as a series of 20 blog posts in 02008, which of course also covers using tools like objdump, nm, and readelf to explore compilation artifacts: https://lwn.net/Articles/276782/

Levine's book was the book in the 01990s.


For a different but useful approach, one book I liked is:

How to Solve it by Computer - by R. G. Dromey, a lecturer or professor at a university in Australia, IIRC:

http://en.wikipedia.org/wiki/How_to_Solve_it_by_Computer

According to Wikipedia (and also mentioned in his book, IIRC), Dromey was inspired by the techniques of the book "How to Solve it" by George Polya, a famous mathematician, and applied some of his techniques to programming with data structures and algorithms.

http://en.wikipedia.org/wiki/George_P%C3%B3lya

But he doesn't just translate the Polya book from the domain of mathematics to software; he builds up the solution to each problem posed in his book, in an incremental way, sort of as he discovers or invents them, sharing the reasoning behind the decisions as he goes along. I found this to be one of the good features of the book.

I had the good fortune to read Polya's book too, early on in college - I think it was rare even then. It had gems like "solve the more general problem" (if there is one), which will automatically solve the specific problem you are working on as just another special case, and sometimes the solution to the more general problem may turn out to be simpler than the solution to the specific problem. But the book has much more than that, in terms of problem-solving advice.


The basic problem, as I've written before[1][2], is that, after I put in Nagle's algorithm, Berkeley put in delayed ACKs. Delayed ACKs delay sending an empty ACK packet for a short, fixed period based on human typing speed, maybe 100ms. This was a hack Berkeley put in to handle large numbers of dumb terminals going into time-sharing computers using terminal-to-Ethernet concentrators. Without delayed ACKs, each keystroke sent a datagram with one payload byte and got back a datagram with no payload, just an ACK, followed shortly thereafter by a datagram with one echoed character. So they got a 30% load reduction for their TELNET application.

Both of those algorithms should never be on at the same time. But they usually are.

Linux has a socket option, TCP_QUICKACK, to turn off delayed ACKs. But it's very strange. The documentation is kind of vague, but apparently you have to re-enable it regularly.[3]

Sigh.
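For reference, a rough sketch of what this looks like from Python on a Linux box (the constant may not be exposed by every Python build, hence the fallback; the host and request are placeholders):

    import socket

    # TCP_QUICKACK is 12 on Linux; fall back to the raw value if the module lacks the constant.
    TCP_QUICKACK = getattr(socket, "TCP_QUICKACK", 12)

    sock = socket.create_connection(("example.com", 80))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # turn off Nagle (sender side)
    sock.setsockopt(socket.IPPROTO_TCP, TCP_QUICKACK, 1)        # turn off delayed ACKs (receiver side)

    sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    data = sock.recv(4096)
    # The quirk mentioned above: TCP_QUICKACK is not sticky, so re-apply it after receives.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_QUICKACK, 1)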

[1] https://news.ycombinator.com/item?id=10608356

[2] https://developers.slashdot.org/comments.pl?cid=14515105&sid...

[3] https://stackoverflow.com/questions/46587168/when-during-the...


Oh wow that's a really good question. I would pick the following but keep in mind it's important to go through all the different solutions.

1. Unique Paths II (https://leetcode.com/problems/unique-paths-ii/). This problem can be solved with dynamic programming, graph traversal, and recursion (see the DP sketch after this list).

2. Max Product of K Integers. Not exactly on Leetcode, but this one I really like: https://leetcode.com/problems/maximum-product-of-three-numbe.... It's essentially finding the maximum product obtained by selecting k numbers from a list of N integers (1 <= k <= N). This problem can be solved with a greedy algorithm, two pointers with sorting, a heap, and dynamic programming.

3. Word Ladder (https://leetcode.com/problems/word-ladder/). I think it would be wrong not to include a string problem. There are a lot of different ways to solve this one: you could use a trie or treat the strings as a graph. There are also a lot of optimizations you can make depending on the length of words vs the number of words.

Honorable Mention: LRU Cache (https://leetcode.com/problems/lru-cache/). At one point this may have been the most commonly asked interview question. It tests being able to implement a doubly linked list, leverage a dictionary, and tie them together in a class with object-oriented principles.
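As promised above, here is a minimal dynamic-programming sketch for Unique Paths II (the other approaches, and the other problems, are left as exercises):

    def unique_paths_with_obstacles(grid):
        """Count right/down paths from top-left to bottom-right; cells equal to 1 are obstacles."""
        if not grid or grid[0][0] == 1:
            return 0
        cols = len(grid[0])
        dp = [0] * cols
        dp[0] = 1
        for row in grid:
            for c in range(cols):
                if row[c] == 1:        # an obstacle blocks every path through this cell
                    dp[c] = 0
                elif c > 0:            # paths from above (dp[c]) plus paths from the left (dp[c-1])
                    dp[c] += dp[c - 1]
        return dp[-1]

    print(unique_paths_with_obstacles([[0, 0, 0], [0, 1, 0], [0, 0, 0]]))  # 2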

These don't cover everything, but give pretty good coverage of all the topics.

Hopefully that answers your question!


I highly recommend modern Emacs over VSCode. If you don't want to tinker, Emacs has multiple levels of preconfigured distributions like Prelude, Nano, and Doom Emacs that give the same out of the box working experience. Modern Emacs includes all of the great features of VSCode, plus much more. You get things like:

- LSP support for completion, refactoring, and linting.

- Tree sitter support for super fast highlighting.

- A file explorer, just like VSCode's.

- Project management tools.

- Fuzzy finding for everything, including files, commands, and buffers.

- Probably the best git client out there (magit). It's seriously magical.

- Terminals (can pipe terminal contents directly into other buffers).

- DAP mode for debugging, ctags, cscope.

It's all FOSS, worked on by a non-profit foundation and volunteers, it can be used as a GUI or in the terminal, and it's insanely extensible thanks to it being a small C core running a lisp environment. Lisp is easy to learn and fun, and if you don't like lisp there's still a bazillion more packages you can install for extra features.


> They also lost some muscle mass, as expected.

Not sure why that would be expected. Fasting leads to significantly increased levels of HGH which preserves muscle and improves body composition. [1]

Fasting yields an up to 1250% increase in HGH depending on the duration of the fast. [2]

[edit] > ...exercise has a much larger magnitude of benefit than dietary interventions...

This could not be further from the truth. Exercise is not a good way to lose weight. It has a ton of health benefits and you should do it anyway, but you should not expect to lose weight from exercise alone.

Think about it: 1 lb of fat loss requires a caloric deficit of about 3500 kcal. An average human burns roughly 100 kcal per mile of running, so that's 35 miles of running per pound of fat. Humans are well known for compensatory eating; after all, if you're running 1.5 marathons, you're going to be hungry, right? Someone who needs to lose 16 lbs would need to run from San Francisco to LA along the entirety of the high-speed rail Phase 1 track, via Merced, Bakersfield and Palmdale.

This is bad advice and it simply does not hold up.

> Contrast this with the utterly confusing state of nutrition science.

Fasting is the simplest nutrition science there is. BMR of 2000 kcal? Don't eat for a day and lose about 0.6 lbs. Or, you know, run an entire marathon without treating yourself to any more food that day than you would normally have eaten.
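The back-of-envelope math, spelled out (all numbers are the rough assumptions used above, not precise physiology):

    KCAL_PER_LB_FAT = 3500   # rough caloric deficit per lb of fat
    KCAL_PER_MILE = 100      # rough cost of running one mile
    BMR = 2000               # rough basal metabolic rate, kcal/day

    miles_per_lb = KCAL_PER_LB_FAT / KCAL_PER_MILE     # 35 miles of running per lb
    miles_for_16_lb = 16 * miles_per_lb                # 560 miles, roughly SF to LA
    lbs_per_fasted_day = BMR / KCAL_PER_LB_FAT         # ~0.57 lb per day of fasting

    print(miles_per_lb, miles_for_16_lb, round(lbs_per_fasted_day, 2))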

[1] https://pubmed.ncbi.nlm.nih.gov/3127426/

[2] https://pubmed.ncbi.nlm.nih.gov/6758355/


Avoid all these other types except Solver (we call it Fixer at FB). Anyone who calls themselves these things is weird, and will be hard to rely on.

Disregard levels. Disregard titles. Fix what needs to be fixed. Unit tests, CI, docs, there's no such thing as grunt work. Real leadership happens in the IDE and only code matters - everything else is overhead.


Although you already understand the importance of practice, I still want to reiterate it. Practice is the key! As far as I know, ACRush and Petr have many years of experience practicing and competing in serious algorithmic contests such as IOI, ACM, etc. They have solved thousands of algorithmic problems and counting. Through that hard work, they have polished their problem-solving and coding skills to the very high level you know. I am not sure hard work is enough to reach the level of Petr or ACRush, but I believe it can push you further than you think you can go. Here are my concrete suggestions:

1) Read the relevant chapters in CLRS (Introduction to Algorithms, kind of the bible textbook).

2) Try to solve as many problems as you can in:

http://acm.uva.es/ (mix, some hard)

http://acm.timus.ru/ (mix, some hard)

http://www.spoj.pl/ (mix, many hard)

http://acm.sgu.ru/ (mix, many hard)

Topcoder.com (of course)

3) Solve the problems first, then read experts' code to learn coding skills from them. Petr publishes his own TC screencasts on his blog: http://petr-mitrichev.blogspot.com/ You can learn a lot from watching them.

4) Practice consistently and never give up.

Good luck.


I haven't been doing anything at all, besides some mechanical tax-loss harvesting. I've suffered -21% YTD returns (~$800k lost on paper) on my 3-fund portfolio, which is roughly 85/15 stocks/bonds.

Thanks to new contributions, my net worth since last year is "just" down -11%. I haven't increased the rate of my new contributions because I have always dumped all my savings in the market since forever; I never kept any "dry powder" for moments like these, as I always thought it was too much opportunity cost. Actually, thinking about it, the rate of new contributions significantly decreased this year because a good chunk of my pay is in FAANG RSUs :-)

It sucks, but I know I don't know anything and I can't risk timing the market and being double-slapped (first slap: taxes to liquidate the portfolio; second slap: not catching the market rebound, when/if it will happen).

Again, it sucks when you think how much stuff I could have bought with the ~$800k I lost on paper. My parents, old-school folks, think I am completely insane and irrational for letting this happen, and they also thought this in March 2020 when I didn't sell. I literally have my dad texting me every other day begging me to sell. But surprisingly you get used to it, and I do not think I would panic, nor change my plans, even if we go much lower from here.


Part of being a software engineer is constantly feeling stupid. Until you realize this is normal, you're going to feel bad and have imposter syndrome.

If it were easy, everyone would do it and it wouldn't pay as much. It's supposed to be uncomfortable in the same way that exercising at the gym feels uncomfortable. Thinking is effort.


I have worked at Microsoft, Google and Facebook as a software engineer, going through the full interview process every time.

The thing to realize is that being good at technical interviews (as done by the above companies) is a skill unto itself, but it is a skill an intelligent person with a comp sci background can get significantly good at after 1 to 2 months of disciplined preparation. I went to a top-ranked school myself and had a comp sci degree, but was very intimidated by technical interviews until I realized that this was no different from all the other intellectual hurdles/gauntlets I had successfully navigated up to that point by giving myself time to thoroughly prepare.

Get "Elements of Programming Interviews" and give yourself 2 months to prepare. Start to with "1-month" plan in the book spending at least an hour a day at the very minimum. (I have worked through both Elements of Programming Interviews and Cracking the Coding Interview in their entirety and while both are good, in my experience Elements of Programming Interviews was clearly the better preparation in terms of technical depth, breadth of exposure to the kinds of questions I faced in the full-day interviews, and succinctness of coding solutions)

Get dry-erase paper/notebook or a whiteboard and work through the problems by hand, including the coding (important!). For the first week or two, give yourself an honest, focused couple of hours to wrestle with a problem before looking at the solution. It is not enough to settle for "I think I know how to solve this" - actually code up the solution by hand and step through it with some simple cases. This is important, and it allows you to develop confidence in your ability to think methodically through a problem, as well as giving you an opportunity to develop mental heuristics for how to tackle and test unfamiliar problems. Developing confidence in your ability to think through interview-style problems is every bit as important as exposing yourself to interview-style problems. As you progress, you will be working towards being able to deconstruct a problem and be ready to start coding up a high-confidence solution in 15-20 minutes.

"Talk to yourself" as you try to solve a problem to simulate explaining your thought process to someone as you go along.

When going through the solutions in the book, do not gloss over a detail you do not understand. Go online and find alternative explanations/references if you don't understand some detail of the solution provided.

After a few weeks of this kind of daily disciplined prep, you should start feeling pretty good and your confidence should start building nicely. Lots of interview questions are variants of each other and once you have enough breadth, you start quickly being able to key into the "type" of question and possible solution approaches almost as soon as you hear it.

Last thing: when you feel ready to start doing interviews, do not interview with your "top choice" first. If you can find someone that has done interviews to give you a mock interview, great! If not, schedule interviews whose outcome you are not as attached to (relatively speaking) first.

Hope that helped.


I've been working remotely on and off for 20 years, and consistently for the last 10 years.

A few suggestions:

- Choose to not make every day the same. You're free! Once or twice a week, take a train or bus to a different place for the day, somewhere with good cell signal. Work from a national park, or a beach, a café, pub, museum, library, take a river cruise... whatever. There's nowhere in the world that has nothing to offer. Just get into the habit of picking somewhere different than home and going there to work for a day. In your breaks and before/after working, explore, relax... Just enjoy yourself somehow.

- Be deliberate about arranging calls with coworkers, friends, family. Just hang out on a call. Go for a walk while you take these calls, or do something other than sit in a chair.

- If you don't already have hobbies, you want to discover some that you enjoy. Do something that creates nice memories with your spare time, and that isn't the same as your work, or just relaxing. Once you've got one or more hobbies, if you're missing social interaction, find a (local/regional/national/International) group, meet up, society or some other form of community. Join it and participate. In person, online, maybe travel to go to events related to it. You'll quickly find your people this way. There are friends and partners out there for everyone.

- If you're not happy with where you live... move! It might not seem possible, but it is. Whatever the hurdles, you can get over them. Remote work is freedom to choose your environment. Choose it.


https://overthewire.org/wargames/bandit/

I did a bunch of those panels in my time. Imo, if you read 2-3 books on Unix/Linux/networking and can complete this wargame in a couple of evenings, you'll stand out quite a bit.


Everyone recommends The Manager's Path, but I don't think it's a good book to explain HOW to become a manager. The book's goal is to explain the career path of a manager from tech lead to CTO.

My #1 recommendation these days is "Become an Effective Software Engineering Manager" by James Stanier. This book explains how to approach the work a manager is involved in and what you can expect from the day-to-day: planning, hard conversations, performance reviews, etc.

Also, look for general management books. Leadership is something all humans do – software management is about managing creative people. Some other books I recommend are:

• Creativity, Inc. by Ed Catmull

• Crucial Conversations

• Team of Teams

For email newsletters, I recommend Software Lead Weekly (https://softwareleadweekly.com/) and Better Allies (https://betterallies.com/more-content/).

Lastly, I also write a blog called Build the Stage (https://www.buildthestage.com) about managing SWEs. I've got posts about performance reviews, team meetings, how to give feedback, etc. It'll help you out.


I ran a few dozen kafka clusters at MegaCorp in a previous life.

My answer to anyone who asks for kafka: Show me that you can't do what you need with a beefy Postgres.
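To make that concrete, a hypothetical sketch of the kind of thing I mean: a work queue on plain Postgres using SELECT ... FOR UPDATE SKIP LOCKED, so multiple workers can pull jobs concurrently. The table name, DSN, and schema are made up; this is a sketch, not a drop-in Kafka replacement.

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # assumes a jobs(id serial, payload text, done bool) table

    def enqueue(payload):
        with conn, conn.cursor() as cur:
            cur.execute("INSERT INTO jobs (payload, done) VALUES (%s, false)", (payload,))

    def claim_one():
        # Lock one unclaimed row; SKIP LOCKED lets concurrent workers grab different rows.
        with conn, conn.cursor() as cur:
            cur.execute(
                "SELECT id, payload FROM jobs WHERE NOT done "
                "ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1"
            )
            row = cur.fetchone()
            if row:
                cur.execute("UPDATE jobs SET done = true WHERE id = %s", (row[0],))
            return row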


Note that at Big Tech companies, almost all code is written by L3/L4 (the untitled "Software Engineers") and the few L5s ("Senior") that enjoy coding. L6-L8 ("Staff" through "Principal") are the political levels, where EQ is more important than IQ; your job is as much wrangling/convincing/selling/negotiating with other people as it is writing code. Most L6-L8 engineers are managers, but even those who are ICs spend the majority of their time politicking.

Being under the protection of a good L6+ is essential to actually getting anything done, launching, and getting promoted at a big company. Without it, you'll find that you won't get any cooperation from other teams, your projects will get canceled, and nothing you do will ever launch.

Staff+ engineers burn out at significantly higher rates than lower levels. I know a lot of Senior engineers that are quite content to just sit there and never get promoted, because the quality of life at that level is IMHO significantly better than at higher levels. Was talking with another recently-promoted Staff engineer (I'm Staff myself), and we were lamenting that Big Tech doesn't really do demotions, because the job expectations at higher levels get progressively higher, you have progressively lower control over outcomes, and the consequences for screwing up are significantly worse.


Exercise.

You don't have to run marathons. Start with 30 min walks every morning. (Outside helps). Work your way up to moderate exercise that gets your heart rate up.

It's amazing the difference it will make.


No one is guaranteed a job at Big-N companies.

It's important to understand that Big-N job offers or rejections aren't a perfect indicator of a person's programming ability.

The interview process isn't perfect. It's not designed to be perfect, because that's an impossible goal. In reality, Big-N companies have so many applicants that some of them have higher rejection rates than Ivy League universities.

At this scale, the goal isn't to admit any and every qualified candidate. The goal is to select for the best of the best and minimize false positives, even if it results in a large number of false negatives.


https://www.thebalancecareers.com/work-life-balance-and-jugg...

“Imagine life as a game in which you are juggling some five balls in the air. You name them - work, family, health, friends and spirit - and you're keeping all of these in the air. You will soon understand that work is a rubber ball. If you drop it, it will bounce back. But the other four balls - family, health, friends and spirit - are made of glass. If you drop one of these, they will be irrevocably scuffed, marked, nicked, damaged or even shattered. They will never be the same. You must understand that and strive for balance in your life." — Bryan Dyson, then the President and CEO of Coca-Cola Enterprises, delivering a commencement speech at Georgia Tech

