Ask HN: How to transition from academic programming to software engineering?
220 points by fdsvnsmvas on Sept 9, 2018 | 103 comments
I taught myself how to code and ended up doing a PhD in a computational discipline. Programming has been a big part of my life for at least the last decade, during which I've written code almost every day, but always by myself. After graduating I joined a medium-sized company (~10^2 developers) as a machine learning engineer and realized how much I don't know about software engineering. I feel very comfortable with programming in the small, but programming in the large still feels mostly opaque to me. Issues like testing / mocking, code review, management of dev / stage / prod workflows and, most importantly, the judgment / taste required to make maintainable changes to a million LOC repository, are areas where I can tell I need to improve.

Former academics who moved into software engineering, which resources did you find most useful as you made the transition? Python-specific or language-agnostic books would be most helpful, but all advice would be welcome.




I have hired some PhDs in your situation and worked with others; I personally just went to work after I got my BSEE.

My observation is that you're halfway there when you realize that you need to improve. The folks I saw who did poorly struggled because they didn't realize that you could be both the smartest person in the room and the least capable at the same time.

Right now, on your first job experience, even a kid who never went to college is better at programming than you are because they've been experiencing all the pitfalls that can happen and have built up a knowledge from that which is perhaps more intuitive than formal but serves them well. What you have over that person is that you've trained yourself to consume, classify, organize, and distill massive amounts of information in a short amount of time.

Use that training to consume everything you can on the art of writing programs. Read "Test Driven Development", read "Code Complete", read "Design Patterns", read "The Design of the UNIX Operating System", read "TCP/IP Illustrated Volume 1", and a book or two on databases. With your training you should be able to skim through those to identify the various "tricky bits" and get a feel for what is and what is not important in this new field of yours.

Soak in as much as you can and ask a lot of questions. Pretty soon no one will know you haven't been doing this your whole life.


This is pretty good advice overall. One small change I suggest is to take it easy on Design Patterns and the like. I’ve seen people in OP's position (general smarts but limited production experience) turn into architecture astronauts and start overengineering everything. It can be useful if you’re working on a legacy codebase and need to understand the jargon that can appear in [possibly overengineered] existing codebases.


I've written software for over a decade and I loathe the Design Patterns book.

I would not give it to a beginner as it would corrupt their mind with useless drivel.

Its authors exhibit themselves as morons who celebrate renaming existing computer science concepts while occasionally mixing and matching them.

I would not be this uncharitable towards it if it were not such a famous (and hence harmful) book.

It's harmful because software engineering is really hard, and they just add to the load by trying to have the reader memorize their junk instead of doing something that would actually make them a better programmer. And if you try to use their concepts while programming - shudders - oh, god help you.

I'm not going to iterate over every pattern. Here's an example:

Flyweight? You assholes, just call memoization what it is. Why don't you rename existing data structures as well? Good thing those come out of the box, otherwise you would have begun the book by renaming array, linked list and dictionary. Maybe you would have called linked list "the chaingang" or something.

You don't simplify things by giving them cuddlier names. You stunt people's growth that way.

The Gang of Four book is a malignant offshoot of the practice of attempting to make software engineering better by legalizing and dogmatizing it, decomposing it into trivial details that hurt your brain. It's a branch of 'consultancy driven software development' where people attempt to acquire a halo of professionalism by calling things by fancy names and over-complex descriptions (while skipping the practical things with equally complex names, though at least those weren't made up by a bunch of idiots).


While I agree that the GoF book is way overrated, I think you're being a bit harsh. Most of the patterns are actually tricks to overcome limitations of statically-typed class-based languages like C++ and Java. They can be useful in these languages but are not necessary with dynamic languages. Take Strategy for instance, with Python or JS, we'd just use a callback but in old Java you couldn't do that, hence the need for a more complicated technique. Creational patterns also mostly fall in this category. Some are so trivial (Template Method, Adapter), that it's hard to understand why they make them such a big deal.
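To make the Strategy point concrete, here is a minimal Python sketch in which the "strategy" is just a function passed as an argument. The names `total_price`, `no_discount`, and `ten_percent_off` are made up for illustration and do not come from the GoF book.

```python
from typing import Callable, Iterable

# A "strategy" here is simply any function from a price to a discounted price.
DiscountStrategy = Callable[[float], float]


def no_discount(price: float) -> float:
    return price


def ten_percent_off(price: float) -> float:
    return price * 0.9


def total_price(prices: Iterable[float],
                discount: DiscountStrategy = no_discount) -> float:
    """Sum the prices after applying the chosen discount strategy to each one."""
    return sum(discount(p) for p in prices)


cart = [10.0, 20.0]
print(total_price(cart))                    # default strategy
print(total_price(cart, ten_percent_off))   # swapped-in named strategy
print(total_price(cart, lambda p: p / 2))   # ad hoc strategy, no class needed
```

In a language without first-class functions, each of those callables would need its own class behind a shared interface, which is roughly the machinery the pattern describes.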

Among the few patterns that are more generally useful and have stood the test of time, we'd find Composite, Command and of course Observer.


Your comment is very interesting. I recently took a course on Design Patterns. I sat squirming during the lectures because I didn't like what was being said, but couldn't put my finger on what exactly I disliked.

What I understand from your comment is that you dislike the Gang of four book because it renames concepts that don't need the cutesy names that they give them. Do you have a problem with the _concept_ of design patterns? Or just the names they are given? Are the concepts themselves sound and worth paying attention to?


I wrote a long rant as a response to another comment above. To quote myself:

"It reads like it was written by a clever, verbose, and over-eager novice."

Design patterns are a really useful concept. The GoF book just totally botches that concept by delving deep into trivial language details while missing the big picture.

Christopher Alexander's "The Timeless Way of Building" and "Notes on the Synthesis of Form" are the books on architecture that prompted a lot of discussion in software design circles, and from which I presume GoF got their idea of software design patterns.

What are good design patterns in software? IMO they are composed from the programming models presented in a basic CS book like Aho and Ullman's "Foundations of Computer Science" and further developed in books like "Structure and Interpretation of Computer Programs".

GoF is an ok anecdotal reference after those, but it really is not suitable as a didactic resource.

Peter Norvig wryly commented that 16 of the 23 patterns are either invisible or simpler in Lisp [0].

[0] http://www.norvig.com/design-patterns/design-patterns.pdf


I do not like the GoF book. Some of the patterns sometimes appear in my code, but the book has never helped me to program better or helped me to think about my code. The single benefit I got from this book is that it gave me words to explain my code to other people.


Your instinctive reaction against design patterns may be because you've realised that design patterns are patches over flaws in the programming language's design - they're a terrible basis to build an architecture on. There's some discussion of this idea at the C2 wiki: http://wiki.c2.com/?AreDesignPatternsMissingLanguageFeatures


Thanks for this post, I remember reading about "Flyweight" and being confused at the time, if it's just memoization that instantly makes sense and is just such a better way of describing things.
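For anyone trying to picture the comparison, one common way to get Flyweight-style sharing in Python is to memoize a factory so that equal arguments return the same immutable object. This is only a sketch of that reading: `Glyph` and `get_glyph` are hypothetical names, and the pattern as described in the book also covers separating shared from per-use state, which this does not show.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Glyph:
    """Immutable state, so one instance can safely be shared by many users."""
    char: str
    font: str


@lru_cache(maxsize=None)
def get_glyph(char: str, font: str) -> Glyph:
    # The cache makes repeated calls with equal arguments return the same object.
    return Glyph(char, font)


a1 = get_glyph("a", "mono")
a2 = get_glyph("a", "mono")
b = get_glyph("b", "mono")

print(a1 is a2)  # identity check: both names point at the cached instance
print(a1 is b)   # a different key produces a different instance
```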


I disagree.

Design Patterns are very useful, but they are a tool like anything else.

They're not just a bunch of things with cutesy names, they are the most common and core patterns that we can use in OO.

Yes, they are way overused, particularly in Java, but they are useful.

The book could be summarized however.


Everybody recommends it for no particular reason, and then I read one anonymous post ripping it apart in detail. I’m strongly inclined toward believing the latter and massively discounting most people’s opinion as unfounded, assuming they didn’t read it, and are just trying to convince others that they did. Is this the right way to think about this?


Being charitable, you could say at least part of the former are experiencing confirmation bias - that they enjoyed reading a different take on their existing knowledge - and cannot be arbiters of whether the book imparts anything useful to those without that knowledge.


as a middle ground: my advice would be to learn and internalize design patterns. and then “forget them.” let experience show you when to break the rules. keep things simple.

this level of the craft is subtle, intuitive, and often inaccessible to the conscious intellect.


I agree. I would like to add, doing something like The Rails Tutorial from Michael Hartl would be a great place to start. It takes you through a reasonably real world application including testing and how to think about the constituent parts. A PhD is like being an expert in materials science while lacking in craft-level skills such as running a vertical mill.


To all readers who've never experienced it: working on a large Java (or other) codebase where the original developers completely overused patterns will, in my opinion, make your life hell, just as much as a codebase written totally without thought or patterns.

Patterns can be useful but only in moderation and only with reason. They're spice.

A contrived example, but:

https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...

FizzBuzz Enterprise Edition. It's enterprisey!


> folks I saw who did poorly it was because they didn't realize that you could be both the smartest person in the room and the least capable at the same time

Any advice on dealing with this kind of person would be much appreciated. I had a hard time trying to convey basic software engineering practices such as not hard-coding file paths, using version control instead of filenames to manage versions, removing dead code, keeping the code at least moderately DRY, more or less sticking to a code convention, etc. They would use all their reasoning power to refute what they saw as meaningless pet peeves. It was tiring. I tried to provide pointers so they could find out for themselves, but they showed zero interest. At the moment I've given up.

The simple fact that the OP is asking this question here means she or he will almost certainly do great.


If you have two such people, you might get there by making each of them the sole maintainer of the other's code. They might - might - see the problems when it's being done to them, instead of them doing it to others.

More in general, the problem is that these people have been in an environment where programs were small enough that those practices didn't cause problems. They've moved into an environment where the scale is different, and those practices now cause real pain. Until they experience the pain (rather than just being told about it), they aren't really going to get it.


Sadly it is very much a person by person process. In my experience with folks like this there are people they can "hear" advice from and those that they cannot. One option is to talk to their original PhD adviser (if they are still around) and get them on board. Your interests and the adviser's are aligned: you both want the person to succeed, and to do so they have to be able to hear your guidance on how to be better at the job. Sometimes they will be able to hear it from someone they have already put in the 'trusted advice' category.


I would also add that lots of new people waste more time doing TDD wrong than they would save doing TDD right. There are lots of things you only really understand once you've done them wrong.


Yeah, a lot of software engineering schools of thought are just based on superstition, but they're all somewhere in the neighbourhood of "productive practices" regardless of how different they are from each other.

If you do

* 100% pairing,

* buttloads of unit tests,

* lots of microservices,

* use an autoformatter,

* in a Node codebase,

to order-of-magnitude approximations you're probably going to be about as effective a team as if you

* always code alone,

* mostly write integration tests,

* write a big old monolith,

* mostly ignore code style (except for egregious cases),

* in C.

These variables do make a difference, but those differences are context-dependent and can be swamped by other factors. Use version control. Do at least some testing and some code review. Make sure engineers can actually run their code before submitting it (on their laptops or in special staging environments or whatever, it doesn't matter much.) Help coworkers when they're stuck, and ask for help when you're stuck. And try to hire Harrison Bergeron onto your team :-).


When programs are simple, usually time to write code is dominated by finding the requirements. As programs become more complex, the time to write code becomes more and more dominated by understanding and working with the existing code.

In code bases you have written yourself, often you remember enough about it that you can mitigate the effects of complexity. Possibly you already had a plan for how to implement some functionality. Or maybe you know how to avoid certain problems.

As programs become more complex, you will find that "legacy team members" are disproportionately productive. Learning the code base becomes more important than learning the problem domain.

Teams that expend effort on maintaining a status quo where exploring the problem domain dominates the cost of development (even with new developers) will usually lag behind teams that do not. Mostly, because it is difficult to do.

Some of the practices you mention can be used to help in this, but for a "legacy team member" it might not be attractive. If you have a more typical situation, knowing the code base better means you are much more productive and therefore more valuable. This secures a senior position on the team. Turnover is usually high, so most people leave before it gets to the point where even "legacy team members" become unproductive. When you go to a new shop, you can always complain about the "crappy" code and suggest a complete rewrite (at which point, you secure your senior status again).

While your list does not necessarily comprise what I think of as "productive practices", I think it is incorrect to imply that "productive practices" cannot have a profound impact on the outcome of a project. The productive practices that work best for each team will likely be different. It's important to pick practices that work for your team and produce code that is simple to read and extend. Not thinking about it is a good way to ensure that you will end up with a complex system.


Whoa I just read about Harrison Bergeron on wiki [0]. That was an interesting read. I'm sure the full story would help capture a lot more of Harrison's spirit and qualities

[0]: https://en.wikipedia.org/wiki/Harrison_Bergeron


In this vein: "Ian Cooper - TDD, where did it all go wrong?"

https://www.youtube.com/watch?v=EZ05e7EMOLM


I wish I had watched that video two years ago! I learned all of that the hard way, and it took lots of failures.


I want to second the "Code Complete" recommendation. IMO, it provides a much more practical and hands-on approach than most of the other books, and does this in great depth. If I were to recommend only one book on good software engineering practice, it would probably be that one.

On a related note, I just this week stumbled upon a post / essay, which was probably the best piece of writing I've read in computer-aided engineering, and working in a disciplined manner. I'd recommend reading it right now, as the very first thing you do: https://queue.acm.org/detail.cfm?id=3197520


> read "The Design of the UNIX Operating System", read "TCP/IP Illustrated Volume 1"

What? Those are unnecessary details as far as software engineering is concerned.


I think you are interpreting "go read" as in go learn these by rote. He's not saying go memorize the details of these Unix and TCP books. He's saying go read them and absorb their gestalt approaches to decomposing complex systems so you can think about your own problems that way.

To this list, I would also suggest compiler construction or database systems as different topics exhibiting the same thing.

A more basic challenge for a scientific programmer might be absorbing the mindset of tool building rather than problem solving. It's a different mental perspective to think of your code as a product for other developers rather than as a means to producing a data artifact.


I would be interested to know why you think of them as unnecessary details.

What I like about them are that they describe large complex systems that have to work with a bunch of people plugging in their own implementation of the parts. For me, both books give a good sense of what you need to do to enable the product you are working on to evolve successfully, and highlight areas where a lack of flexibility impacted that evolution.

So often in programming you look at your code and have to evaluate not only what features you are delivering but also what paths you are closing off with your design. In my experience the folks who excel at programming hold both the model of the system in their mind while they execute some part of it.

I think both books give a flavor of that.


Here is some general advice I've given multiple junior developers over the years. You probably aren't a junior, but it is most likely applicable to what you are asking. It was passed down to me by other developers. Other HN folk will have links to literature, but hopefully my advice will serve as a precursor.

* testing - write your functions small enough to be readable, but not so small their abstractions are meaningless (because you have to test them all)

* testing - don't reach into your code's modules and mock. Instead use dependency injection with non-testing defaults

* code review - It shouldn't be personal. If it feels personal, ask yourself: are you reading it wrong, or are they attacking you personally?

* code review - when referencing style complaints ask for reference material. Don't get caught in a cyclic, pedantic style war between lead devs.

* code - your code should be environment agnostic. If you have environment/context specific things to do, pass along an environment/configuration dict or make a global config singleton. As long as your code depends only on that, you can write code more discretely.

* code - personal preference, but try not to nest your loops too deeply; use itertools when you can.

* code - if you can help it, try not to mutate dicts/objects in place while in a loop. It makes testing difficult.

* code - exit early if possible, test for failures instead of nesting your entire function inside a single `if`. Helps identify the bad inputs faster as well.
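Two of the "code" bullets above can be sketched in a few lines of Python. This assumes a made-up hyperparameter sweep and a made-up record-cleaning step; `grid_search` and `normalize_keys` are hypothetical names. The only library call is `itertools.product` from the standard library.

```python
from itertools import product


def grid_search(learning_rates, batch_sizes, depths):
    """Yield every parameter combination with one flat loop instead of three nested ones."""
    for lr, batch, depth in product(learning_rates, batch_sizes, depths):
        yield {"lr": lr, "batch": batch, "depth": depth}


def normalize_keys(record):
    """Return a new dict with lower-cased keys rather than mutating `record` in a loop."""
    return {key.lower(): value for key, value in record.items()}


combos = list(grid_search([0.1, 0.01], [32, 64], [2]))
print(len(combos))  # one entry per (lr, batch, depth) combination

raw = {"Name": "ada", "AGE": 36}
clean = normalize_keys(raw)
print(clean)  # new dict with normalized keys
print(raw)    # the input is left untouched, so tests can compare before and after
```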

Above all, remember code isn't perfect. It's a tool to get to an end goal. If you aren't solving for the end goal you aren't solving the right problem. At the end of the day, you are employed to build a product and that product needs to perform its job. (that isn't a pass to write super shitty code)

edit: formatting


I quite like this short piece as a way to explain why things like exiting early and not nesting loops are important: https://medium.com/@matryer/line-of-sight-in-code-186dd7cdea...


> write your functions small enough to be readable, but not so small their abstractions are meaningless

More precisely, you want your functions to be deep. The interface to implementation ratio must be as low as is reasonable: you want to hide implementation behind interfaces that are much smaller than them.

This goes for functions (something with 5 arguments that takes 2 lines is too shallow, something with 2 arguments that spans 20 lines is deep), classes (something that is mostly getters and setters is too shallow, something with 3 methods full of business logic is deep), or anything else.

This is not just for testing, this is to make sure you can understand the program at all. Deep functions (and deep classes, and deep modules…), are also about good old decoupling: the less you have to understand about a piece of code to be able to use it, the better.
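A rough illustration of shallow versus deep, using a made-up word-counting task. Both function names are hypothetical, and the numbers are smaller than the 5-argument and 20-line figures above, but the interface-to-implementation ratio is the point.

```python
import re
from collections import Counter


# Shallow: five parameters guarding two lines. The caller has to understand
# nearly everything the function does just to be able to call it.
def count_shallow(text, pattern, lowercase, counter_cls, top_n):
    words = re.findall(pattern, text.lower() if lowercase else text)
    return counter_cls(words).most_common(top_n)


# Deeper: one required argument and one optional knob. Tokenizing, case
# folding, and counting are hidden behind the interface.
def top_words(text, top_n=3):
    """Return the `top_n` most frequent words in `text`, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


print(top_words("the cat and the hat and the bat", top_n=2))
```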


I couldn't agree more with you about nesting loops. It seems clever at the moment when you're writing but when you have to come back after a while or worse, another developer has to, it becomes a nightmare.

I would also go a bit further and include nested if statements. Sometimes nesting is really required, but other times it can be avoided. I try to avoid nesting as much as possible.
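As a sketch of what un-nesting ifs can look like, here is the same check written twice. The order dict and both function names are invented for the example.

```python
def ship_order_nested(order):
    # Nested version: the happy path is buried three levels deep.
    if order is not None:
        if order["items"]:
            if order["paid"]:
                return f"shipping {len(order['items'])} item(s)"
            else:
                raise ValueError("order is not paid")
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("order is missing")


def ship_order(order):
    # Guard clauses: reject bad inputs first, then the happy path reads straight down.
    if order is None:
        raise ValueError("order is missing")
    if not order["items"]:
        raise ValueError("order has no items")
    if not order["paid"]:
        raise ValueError("order is not paid")
    return f"shipping {len(order['items'])} item(s)"
```

Both versions behave the same; the second keeps each failure next to the condition that triggers it.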


Quick tip: extract the contents of loops out into functions. That allows you to test the function outside of the context of the loop. It makes it easy to test boundary conditions and other exceptions. Often it is unnecessary to actually test the loop.

This is also true of branches. If you have an if statement (or other branch), consider extracting the contents of each half out. It allows you to test each half independently. If you have nested branches, you need to have 2^n tests (where n is the number of nested branches). If you extract the contents, then you need 2n tests.

This is one of the reasons for unit rather than integration tests: you can dramatically reduce the number of tests while still getting full test coverage. Of course the downside is decomposing the structure of the code more than you might be used to. It's always a judgement call.
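A small sketch of that extraction, assuming an invented "sensor,value" line format. `parse_reading`, `is_valid`, and `load_readings` are hypothetical names.

```python
def parse_reading(line):
    """Loop body extracted into a function: turn 'sensor,value' into a tuple."""
    name, raw_value = line.split(",")
    return name.strip(), float(raw_value)


def is_valid(reading, low=0.0, high=100.0):
    """Branch condition extracted so its boundaries can be tested directly."""
    _, value = reading
    return low <= value <= high


def load_readings(lines):
    # The loop itself is now trivial; the interesting logic lives in the helpers.
    readings = (parse_reading(line) for line in lines)
    return [reading for reading in readings if is_valid(reading)]


def test_parse_reading_strips_whitespace():
    assert parse_reading(" t1 , 3.5") == ("t1", 3.5)


def test_is_valid_boundaries():
    assert is_valid(("x", 0.0))
    assert is_valid(("x", 100.0))
    assert not is_valid(("x", 100.1))
```

The boundary cases are tested on the helpers alone, without building input for the whole loop.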


> testing - don't reach into your code's modules and mock. Instead use dependency injection with non-testing defaults

Could you please go into more depth with this?


As an example: NoopTelemetry would be some type of empty class not dependent on mock (I've used a meta class, a singleton, in this case, but whichever; it could be a module just the same). To test, you'd pass a mock object in as telemetry and check that both timer_start and timer_stop are called with the correct function name.

In your main or context, you set up your application with the needed pieces.

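  from unittest.mock import MagicMock

  class NoopTelemetry:
      """Do-nothing default, so production code never depends on mock."""
      @staticmethod
      def timer_start(name):
          pass

      @staticmethod
      def timer_stop(name):
          pass

  def main():
      # StatsClient stands for the real telemetry client; its arguments are elided.
      context = {'telemetry': StatsClient(...)}
      start(context)

  def start(context):
      algo_5(1, 4, context['telemetry'])

  def algo_5(param1, param2, telemetry=NoopTelemetry):
      telemetry.timer_start('algo_5')
      ret = param1 / param2 ** param2  # whatever
      telemetry.timer_stop('algo_5')
      return ret

  def test_start():
      context = {'telemetry': MagicMock()}
      start(context)
      # more testing.

  def test_algo_5_math():
      # No telemetry argument needed: the no-op default is used.
      ret = algo_5(4, 2)
      assert ret == 1.0  # 4 / 2 ** 2

  def test_algo_5_telemetry():
      mm = MagicMock()
      algo_5(1, 1, mm)
      # assert_called_with raises AssertionError if the call did not happen.
      mm.timer_start.assert_called_with('algo_5')
      mm.timer_stop.assert_called_with('algo_5')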


Mocking everything is the unwise conventional wisdom. It complicates the code base for no discernible benefit. In practice, mocks are only useful for external dependencies, and computationally heavy internal dependencies. Possibly not even the latter.

Why external dependencies are a good thing to mock is not just related to testing, it's related to code rot: external dependencies tend to change under your feet, so isolating them from your code base is a good thing, especially if you want to minimise vendor lock in. And for tests, it helps you simplify the test environment.

Internal dependencies aren't like that: they only change if you change them, and they're already part of your environment. Mocking them just complicates your code without simplifying your tests.
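One way to read "isolate external dependencies" is a thin wrapper with a tiny interface plus an in-memory stand-in for tests. The sketch below assumes a hypothetical exchange-rate service; the URL parameter, the JSON shape, and every name here are made up for illustration.

```python
import json
from urllib.request import urlopen


class HttpRates:
    """Thin wrapper around the external service; the only code that knows about HTTP."""

    def __init__(self, url):
        self.url = url  # hypothetical endpoint, supplied by the caller

    def rate(self, currency):
        # Assumes the service answers with JSON like {"rate": 0.5}.
        with urlopen(f"{self.url}?currency={currency}") as response:
            return json.load(response)["rate"]


class FixedRates:
    """In-memory stand-in with the same interface, used in tests."""

    def __init__(self, rates):
        self.rates = rates

    def rate(self, currency):
        return self.rates[currency]


def convert(amount, currency, rates):
    # Internal logic depends only on the small `rate()` interface.
    return round(amount * rates.rate(currency), 2)


def test_convert():
    assert convert(10, "EUR", FixedRates({"EUR": 0.5})) == 5.0
```

If the vendor changes, only `HttpRates` changes; `convert` and its test stay as they are.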

---

Now there is this idea that modules should be tested "in isolation". Imagine you have three classes, A, B and C, such that A uses B, and B uses C (and C is self contained). Without mocking, testing A will automatically involve B's and C's code as well. Mocking lets you test A alone, without the distraction of B and C.

This is mostly bullshit. If A depends on B all the time, there's no point in making B a parameter (dependency injection). It's simpler to just hard code the dependency. As for the tests, who cares about isolation? If A is only used when it depends on B, there's no point in testing it under other conditions. More generally, it is okay to pull in the transitive closure of dependencies when you test a component.

Of course, you should have tested each dependency (C, then B) before you test your module (A). That way you can mostly ensure the absence of bugs in the dependencies. For instance, if the tests for C and B succeed, and the test for A fails, the bug is probably in A. The order of the tests also matters for the test suite: if the tests for the dependencies are run first, the first failed test should point to the failed module (instead of a user of the failed module).
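A toy version of the A, B, C arrangement, with the dependencies hard coded and the tests ordered from the leaf upwards. The three functions are invented stand-ins for the three modules.

```python
from collections import Counter


def tokenize(text):            # C: self-contained
    return text.lower().split()


def count_words(text):         # B: uses C directly, no injection
    return dict(Counter(tokenize(text)))


def most_common_word(text):    # A: uses B directly, no injection
    counts = count_words(text)
    return max(counts, key=counts.get)


# Tests run from the leaf dependency up, so the first failure points at
# the broken piece rather than at one of its users.
def test_tokenize():
    assert tokenize("Hello  World") == ["hello", "world"]


def test_count_words():
    assert count_words("a b A") == {"a": 2, "b": 1}


def test_most_common_word():
    assert most_common_word("a b A") == "a"
```

The test for A exercises B and C as well, with no mocks involved.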

Now if your program is highly stateful, you might have to mock. Testing a highly stateful program without mocking requires setting lots of bits of states, and making sure all the pieces of runtime state interact with each other as they are supposed to. But then the problem isn't the lack of mocking, it's the fact your program is highly stateful in the first place. Make your program less stateful, first, then it will no longer need mocking.

---

There is this mistaken notion that we should mock because if we don't, we're no longer making unit tests, we're making integration tests. The answer to that is simple: unit tests aren't the holy grail they are touted to be. Don't force them. If the dependencies are well tested (which they should be), then you can trust them like you would the standard library.

Besides, the integration tests will catch more bugs for less effort than pure unit tests (with mocking) would have done. Don't waste your time with mocking, just go catch the bugs.


Thanks everyone, the comments are much appreciated. Here's a list of books and other media resources recommended so far in the thread:

Robert C. Martin, Clean code: https://www.amazon.com/Clean-Code-Handbook-Software-Craftsma...

Vaughn Vernon, various: https://vaughnvernon.co/?page_id=168

Steve McConnell, Code Complete: https://www.amazon.com/Code-Complete-Practical-Handbook-Cons... 2

Clean coder: https://cleancoders.com/ videos

Hunt and Thomas, The Pragmatic Programmer: https://www.amazon.com/Pragmatic-Programmer-Journeyman-Maste...

Hitchhiker's Guide to Python: https://docs.python-guide.org/

Dustin Boswell, The Art of Readable Code: https://www.amazon.com/Art-Readable-Code-Practical-Technique...

John Ousterhout, A Philosophy of Software Design: https://www.amazon.com/Philosophy-Software-Design-John-Ouste... This one looks particularly interesting, thanks AlexCoventry!

Kent Beck, Test Driven Development: https://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/...

Dan Bader, Python Tricks: The Book: https://dbader.org/

Ian Sommerville, Software Engineering: https://www.amazon.com/Software-Engineering-10th-Ian-Sommerv...

Svilen Dobrev, various: http://www.svilendobrev.com/rabota/


There are a lot of good recommendations here, and I certainly relate to the instinct to go to books when you're looking to level up a skill set, but I really think what you need is not a bunch of books to read, but a few people to watch do the work. The only real way to do that is to get a job alongside them. You can read the books at the same time; you can ask your new coworkers which recommendations they agree with and read those ones first.


I'll supplement this good advice by recommending pair programming. You will pick up a ton of stuff just sitting down at one keyboard with another programmer.


Yeah, software engineering is a craft, and generally the only way to learn those fast is to learn from others.


It's not a craft, in its purest form it's an engineering discipline with specific rules, procedures and standards.

The crucial point is that most of us are doing programming, and not software engineering. Learning from others is hit or miss. One can certainly learn to program from others, but that's not enough to be able to do software engineering.


"It's not a craft, in its purest form it's an engineering discipline with specific rules, procedures and standards."

Sorry, but I have to strongly disagree. In its purest form the core of software engineering - i.e. programming - is a craft. The other parts are mostly about creating processes so that craftsmen can create something together without stumbling into each other.

The differences between a craft and engineering are numerous.

- engineers generally need a license

- engineering is about repeatability and creating dependable cost estimates

- engineers are required to study for years for a very good reason. You can be a rockstar programmer out of high school.

Just having a bunch of cargo cult gibberish bound into a book does not make a craft into an engineering discipline.

It's harmful to call programming engineering. Engineers have curriculums that can teach them pretty well what is expected of them once employed.

Not so for programmers - or, well, software engineers. If there were even one curriculum that could dependably churn out good programmers, don't you think this model would be copied instantly elsewhere? If such a curriculum existed, do you think software interviews would be filled with whiteboarding just to check that the candidates understand even the basics?

I think this inability to create a curriculum that reliably produces good programmers is the best evidence that programming is a craft. It's such a complex topic that you can't create a mass curriculum that would serve all equally. Not with our current understanding, anyway. Maybe if we could teach everyone assembly and Haskell, and have them implement compilers and languages as standard, things would be different.

The second best way to learn programming without being born a programmer savant is to learn from others while doing. Apprenticeship is the traditional way to train craftsmen.

Programming is so much more like a craft than engineering that it's best to call it a craft.

Craft is not a derogatory term. It just means we don't understand it theoretically well enough to teach it properly.


Software development as practiced now by a huge number of individuals and companies is closer to a craft, but it can be and must be more than that if we want to be able to tackle the growing complexity of software and improve its overall barely adequate quality.

Crafts don't scale and are a poor fit for highly complex domains.

The curse of software development is its huge financial success, anemic legislative specification and the observed reality that customers will still buy poor quality software.

These are preventing the craft-like programming from turning into software engineering, but the craft is already failing to reach expectations: countless security disasters, unethical programmers enabling spying on millions, software literally killing users. This stuff will only get worse.

And finally, we do understand software engineering well enough to teach it properly. It's just not done, because it's not considered necessary when one can get by with a computer science degree, no degree or a bootcamp certificate.


"And finally, we do understand software engineering well enough to teach it properly."

This is news to me. I would very much like a citation, please. Or do you mean applying formal proof verification to everything?


Engineering doesn't mean using formal methods or specific fancy proofs; it's a systematic, disciplined, quantifiable approach to software. It's described in an ISO standard and the more approachable IEEE SWEBOK.

The above is neither widely known (I only found out about it after many years of doing professional programming), nor is it necessary in order to be successful in the profession and/or make a lot of money.

Commercial software development is mostly a wild west and we're calling that craftsmanship.


I tend to come down on your side of the craft vs. engineering debate, but then I disagree with basically your entire argument for it :)

You list three distinguishing features of an engineering discipline. The first and third swap cause and effect; a field doesn't become engineering because it requires licensure and years of study. Surely you wouldn't agree that a successful campaign to require those things for software developers would bestow the status of engineering upon the work.

Your second point seems closer to the truth, but I'm not so sure it's true. If someone comes to a civil engineering firm and says, "design us a road between this city and that city", that is often a unique challenge because the terrain between those two cities is likely unique, maybe it requires a big bridge or a tunnel, which will be akin to but not identical to other bridges and tunnels you've built before.


"The first and third swap cause and effect;"

Sorry, I wrote that in a hurry. I wasn't claiming either was a cause or effect. It was more about finding characteristics that we can use to tell one from the other. I.e. following the argumentation "If it quacks like a duck and walks like a duck it's likely a duck, and if it doesn't, we don't really have much evidence of the duckish quality of the observed thing".

So, I was not aiming to claim that licensure would turn software engineering into actual engineering. Rather, that the requirements of the field are so poorly understood in the general context that there would be very little to agree on the specific requirements. Poorly understood -> not engineering, really.

I totally agree with what you wrote above.

On the third point: I'm not claiming 100% truthiness to my argument, but it's pretty close. Software engineering projects are still among the riskiest ventures you can think of investing capital in. If you want to build a road:

1. The language of the requirements is pretty well understood: from point A to B, this many lanes.

2. Unless some unforeseen calamity arises, and you have the capital to pump into the project, eventually you will get the road.

I think we can agree that these define any engineering project. Of course, engineering is not cut and dried either - that's why you need to have actually trained professionals who can react to the events that come along as the project progresses.

I don't think 1. or 2. can hold for a software project in the general sense. Furthermore, you can end up accidentally, without anyone's fault, wasting an arbitrary amount of capital on a feature that could, in the worst case, be replaced by a few lines of Python.

This poor quality of our general understanding of software development and the lack of a common language to describe anything mean that most of the time software development is closer to R&D than engineering.

Generally, you can get better estimates when you are implementing a similar project the nth time. Like some general website, or a server backend. And in these instances you have the language to describe features and requirements. But in the general sense, software development isn't anything like this.


Convince me that its "purest form" is an engineering discipline rather than a craft. What distinguishes it from things that you would agree are crafts? Or are all crafts actually engineering disciplines in their purest form?

I think this is a pretty interesting question. Personally, when I was young, I would have said what you said: it's engineering, specifications go in and properly engineered finished product come out. I was proud that my CS program was contained in an engineering school and that all my friends were in other engineering disciplines. The longer I do this though, the more I think it's a craft: fuzzy idea is put to a skilled practitioner and through discussion and creativity, one or more artifacts are created to satisfy that idea. You could argue this is true of more traditional engineering disciplines as well, and I would probably agree. I'm not totally sure what the distinction between the two things is. So tell me what the distinction is and convince me what we do is more the one than the other.

For what it's worth though, apprenticeship is also a very important part of engineering education. At some point, you have to see people do the thing, regardless of whether it's engineering or a craft.


When we say software development is a craft, we're saying that it's like shoemaking, pottery or woodworking.

Can the immense complexity of today and tomorrow's software be tamed by applying the same principles of building a cupboard? No, it requires an engineering mindset.

We're now limping along as an industry and it's not obvious because SW is bringing in massive amounts of money and we can basically get away with a lack of quality.


"When we say software development is a craft, we're saying that it's like shoemaking, pottery or woodworking."

The point being, there are intricate details that are very hard to deliver in the traditional class-room oriented school environment with well defined requirements.

Those do not state anything about scalability. Crafts can scale - like for example how the old giant cathedrals and castles were built in medieval Europe.

They don't say anything about mindsets either. You need an engineer's mindset to build a cathedral or a castle.

The specific problem with crafts is that adapting to new requirements is a complete hit-and-miss process. The reason for this problem is the lack of a proper theoretical framework in which to pose one's work and into which to embed the requirements themselves.

The program verification people are working towards solving this problem, see for example Leslie Lamport's work in TLA+.

But until we have a general, mathematical proof backed compiler for requirements, as well as for the program implementation, we are pretty much stuck with craftsmen.

(Well, we have proof compilers but those are at the moment completely unusable for general programming since they are so complex to use.)


Skip anything by Robert Martin (clean coder series) and read at first Ousterhout and then McConnell instead.

Martin is well intentioned, but very dogmatic about some things like TDD, functions size, personal responsibility, etc. You need to already have some decent engineering experience to be able to detect and ignore the harmful stuff from his books.


I'd like to re-emphasise sanderjd's point not to focus too much on reading books. I myself went from doing a PhD and a lectureship in mathematics (with some coding here and there) to a decent software engineering job in a smallish company. I've learnt everything on the fly by reading code, searching stack overflow, trying stuff out and coding alongside others. The great thing coming out of a PhD is not just that you have to be pretty smart to have done it: you now know you can grasp almost any aspect of human knowledge with sufficient brain racking. This is a vastly underrated piece of self-awareness which enables one to stay humble and tenacious.


Yeah, I think of a PhD as a kind of intellectual adolescence. It led me to far greater intellectual independence.


In my experience, it's easy to just learn on the job. Some basic points though:

* Follow whatever formatting and style rules your workplace uses. It's religion and not worth getting into; as long as everyone uses the same style, it's a win.

* Dev/stage/prod is also workplace specific. Just go with the flow and avoid time wasting arguments on these topics, it's not usually worth it.

* Try to break your work into small commits. This is both easier to review and easier to estimate time on.

* Architect your code so that you can add unit tests. Make sure all your commits have this.

* Prefer longer simpler code to clever code, you're optimizing for newcomers to your code reading it.

* When a one line comment explains it to you, you'll probably need a paragraph at least for someone outside the field to get started understanding it.

* Think about how you'll respond to someone coming to you and saying "something something prod something something your code is buggy." How will you get enough information to determine if this is true, and to debug it when it is? Logging is one good tool here, so consider what you log carefully.
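As a rough illustration of the logging bullet, here is a minimal sketch using Python's standard `logging` module. The function `score_batch`, its arguments and the logger name are hypothetical, picked only to show the idea of logging enough context to answer "is it really my code?":

```python
import logging

logger = logging.getLogger("scoring")  # hypothetical logger name


def score_batch(batch_id, values):
    """Return the mean of `values`, logging enough context to debug a prod report."""
    logger.info("scoring batch_id=%s n_values=%d", batch_id, len(values))
    if not values:
        # Log the identifying context before failing, so a vague report
        # can be traced back to one concrete input.
        logger.error("empty batch batch_id=%s", batch_id)
        raise ValueError(f"batch {batch_id} is empty")
    result = sum(values) / len(values)
    logger.debug("batch_id=%s result=%.4f", batch_id, result)
    return result
```

The identifiers in each message (`batch_id` here) are what let you search the logs for the exact request someone is complaining about.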

Finally, don't be too surprised if you find people talk down to you. Unless you are in a FAANG company, which it sounds like you are not, developers can be very condescending towards academics (and people from other fields).


That is a wonderful point. There is no replacement for an on-the-job experience as the understudy of a more experienced professional. I have experienced this first hand and can vouch for it.


- Most of your job is to make people happy. Communicate well. Coming from pure research, it might feel a little uncomfortable at first. Remember, you're there to consult, and that happens to involve writing code.

- Go to hackathons to learn to ship code fast and get used to building "skateboards". Learn how to make tradeoffs that optimize for development speed. It's not about writing crappy code, it's about optimizing for the right variables at the right time. There are now a lot of real world variables to consider.

- Practice Kanban. Divide and conquer your projects. Make small and focused pull requests. You will naturally start strategizing on how to do things right while you're doing things quickly.

- Using category theory and functional programming in your code, but being practical about it so others can read it, will really help when it comes to writing unit tests. Unguided polymorphism is from the devil.


I'd suggest "The Pragmatic Programmer" by Hunt and Thomas. It's a compendium of advice on being an effective programmer compiled from experience. Also, take look at "The Practice of Programming" by Kernighan and Pike. It's a bit more narrowly-focused, but Kernighan and Pike are models for clarity in programming and in writing.


It is critical for a company to have a concrete on-boarding process. If your present company doesn't have a good one take this as an opportunity to design one. You will learn a lot and also help others in the process.

Here are some of the guidelines:

  # Code style/guidelines
  # Git/version control workflow
  # Testing methodologies & tools used
  # Agile/project management tools used and best practices
  # Read the wiki about infra/services used in production/dev/staging and its workflow
  # Release guidelines & workflow
  # Mentoring process
  # Engineering style/culture


Your coworkers are your best resource.

- Ask them to review your code and suggest changes

- Look for questions of taste and ask more. It may feel intuitive to them, but if you dig in you can often find a good reason/principle behind it.

- Read your coworkers' code

- Read the comments people leave on others' code


Here are some principles that inform my taste in maintainable code:

- Each function should do just one thing.

- Don't reuse a variable if making a new variable with a new name would describe the value better.

- Give functions verb names that describe what they do. If that's hard, they may be doing too many things.

- A function should either change something or return a value (command-query separation)

- Any data should have a single, canonical source of truth. https://en.wikipedia.org/wiki/Single_source_of_truth

- When deciding between making code DRY https://en.wikipedia.org/wiki/Don%27t_repeat_yourself or not, decide if future changes should affect both places at the same time (use DRY) or not (probably doesn't need DRY, maybe shouldn't have it.)

- Avoid spooky action at a distance https://en.wikipedia.org/wiki/Action_at_a_distance_(computer... and if you can't, refactor or at least add comments.

- Write modular functions that can be used without understanding much about the function beside what the name/arguments tell you.

One measure of your success in this area is how quickly someone who'd never seen your code could describe what it does.
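A small sketch of two of the principles above (verb names, and command-query separation), assuming a hypothetical `Cart` class that exists only for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical shopping cart, used only to illustrate the principles."""
    prices: list = field(default_factory=list)

    def add_item(self, price):
        """Command: changes state, returns nothing."""
        self.prices.append(price)

    def total(self):
        """Query: returns a value, changes nothing."""
        return sum(self.prices)
```

Because `total` never modifies the cart, it can be called any number of times (in logs, assertions, a debugger) without changing the program's behaviour.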


This is the best answer here. You've come to the conclusion that you're good at coding alone, but you don't know how to do it well in a team or company— your team and company. Other answers frame this as a technology problem (patterns and practices) but you'll hack it faster as an acculturation task. Get mentors. Plural. Grab one person in each department where you feel shaky— QA's, solutions architects, operations, maybe product, etc. Tell them you're new at this and you want to ask them questions and work closely with them to get better. (This will not offend them and it will not make them look down on you. If it does, you don't want what they're selling anyway.) Two months into asking them for code reviews and just taking them to lunch and asking about things you know they care about in their areas of responsibility, you'll notice results in terms of your own thinking and output. 1 year in, you'll be very, very good at this.


I was in the same boat as the op. Dove in head first into a software engineering role. Ended up working with the only true 10x'er I have known to this day. Nothing improves you faster than getting feedback from someone like that.


I have made the jump from writing academic code to working on a product where actual software engineering was encouraged. Although I did jump back to academia pretty quickly.

Hitchhiker's Guide to Python is a very good book (freely available online, or get a copy from O'Reilly); some of it may be obvious but some might not be.

It is true IMO that making your code testable will also make it better designed. It might even be worthwhile to do completely dogmatic test-driven development (i.e. always write tests first, then stub out everything with NotImplementedError, then write actual code until all tests pass) for a while to get used to it, and force yourself to become familiar with tools for dependency injection/mocks/etc.
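To make that concrete, here is a minimal sketch of dependency injection in a test-first style. The `greeting` function and the fake clocks are hypothetical; in the dogmatic flow described above, `greeting` would start out as just `raise NotImplementedError` and the two tests would fail until the body was written:

```python
import datetime


def greeting(now=datetime.datetime.now):
    """Return a greeting based on the hour; `now` is injected so tests control time."""
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    # The fake clock replaces the real one; no patching of globals is needed.
    fake_now = lambda: datetime.datetime(2018, 9, 9, 8, 0)
    assert greeting(now=fake_now) == "Good morning"


def test_greeting_after_noon():
    fake_now = lambda: datetime.datetime(2018, 9, 9, 15, 0)
    assert greeting(now=fake_now) == "Good afternoon"
```

Passing the clock in as a parameter is the design change that testability forces here: the function no longer reaches out to a hidden global.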

This is complicated by the fact that unit-testing machine learning code can be unusually tricky; normal unit-testing practices and metrics (e.g. code coverage) may not be very effective.


Oh and I don't think you'll be as hopeless a coder as some other posters might think, because you did your PhD in a computational discipline and know Python and have heard of unit testing.

There are, say, physics PhDs who only write numerical Fortran or C++ routines (in one big file, sometimes even in one big function), who really might want to attend a boot camp or something but it doesn't sound like you're in that boat.


I went back for a PhD after a few years in industry.

As a PhD student, my code and habits do not meet the standard of industry. This is because I'm constantly changing the whole architecture to try new ideas, so I optimize for small and simple code at the expense of testability, modularization, robustness, etc.

It's important to recognize this. You will need to change your style.

I can't recommend any one book. I feel like I mostly learned these lessons through random articles, lecture videos, conversations, and personal experience.

IMO, some of the most important principles:

* Implement as much as possible with pure functions (but don't contort the code to achieve this).

* Make your commits as small as possible. Well structured version control history is valuable.

* Spend lots of time thinking about how data flows through your program, more than how the code is organized.

* Strongly prefer DAG dependency structure. Write a set of libraries and then a top level program that uses them.
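A tiny sketch of the first and last points, assuming a hypothetical `normalize` helper: the pure function forms the reusable "library" part, and side effects stay in the top-level program that calls it:

```python
def normalize(values):
    """Pure: output depends only on the input, and the input is not modified."""
    total = sum(values)
    return [v / total for v in values]


def main():
    # Side effects (here, printing) live at the top level, outside the pure core.
    weights = normalize([1, 1, 2])
    print(weights)


if __name__ == "__main__":
    main()
```

Dependencies point one way only (`main` uses `normalize`, never the reverse), which is the DAG structure in miniature.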


Read "Software engineering" by Ian Sommerville. Any edition (maybe from 6 onwards, though they are slightly different.. pick latest u can get). Maybe skim/skip the (technical) parts u think u know, and read the rest. Most will not make sense initially.. does not matter, keep reading. u need to get all that "uploaded" in brain in order to be able to grasp it one day.

It took me 10 years to be able to skip all the technicals. And another 10+ years to understand why u may ever need the rest..

for judgment etc... Maybe pick some big-enough open-source project in a domain u know well and follow it - how and when they do change what. Don't worry, it does take years to really form your own judgment.

btw u will need some philosophy/methodology/human-side too.. there's not many of it in the above book.

For more, see the recommended readings on www.svilendobrev.com/rabota/

have fun


First of all, SW engineering is a practice with a lot of responsibility. The main responsibility lies in writing code that is easy to understand. For example, if you think you write good code, then try reading code that you wrote a couple of months ago. Usually, a very painful experience :D

So try to write code for an audience. This has been the trigger for me. Also I encourage code reviews and TDD.

The main learning resources for me have been the Clean Coder videos by Robert C Martin aka Uncle Bob. They are pure gold. They can feel awkward, but after a while they make sense.

Also DDD domain driven design is a key topic to tackle.

Books:

- Clean Code

- DDD by Vaughn Vernon

Videos:

- Clean Coder E1-E52

With these two books and videos you are on a good track! These worked for me.


I can vouch for Clean Coder. We watched them in our company. It's a small dev team so we took the time together. Afterwards we implemented a 4-line rule amongst other things.

We don't always hold ourselves to it, sometimes 5-6 line functions make sense, but we strive toward 4. Sometimes it's as easy as breaking code out into a new function, but sometimes you just simply have to create a class for it. That way a lot of complicated code suddenly becomes very easy without much effort.


And this is why I don't like the clean code series and usually recommend against reading it.

Limiting function length to < 10 lines is like limiting your walking stride to 30cm. You'll spend most of your time chasing useless functions when trying to understand the system.


This honestly sounds extremely limiting. I do get why you'd want to make functions short in general but I think there's a tipping point where making the functions shorter actually increases overall complexity and 4 lines seems to be past that tipping point in my experience.


It's honestly not as limiting as you might think. Readability has definitely improved a ton since switching.

Of course we don't count every bracket or blank lines. Only the rows with any logic or assignment.

And yes, I agree, there are occasions where the complexity goes up. If there is a good argument for that, then we of course go with it.

But so far, almost everything anyone in our team has made, has been improved by rewriting it to something that works well in 4 lines, be it using polymorphism, object oriented, or functional.


Be patient with yourself. You have talents that are quite useful, but good code design and architecture are rarely thought about in academics. Realize that while you spent 4-6 years getting your PhD, your coworkers were becoming better engineers. That doesn't mean you can't do good work, but for a while you'll mostly be learning from others.


The exception is if you are in the area of study whose entire existence is to understand what is good code design and architecture.


Agree with a lot of the comments here. I went through this very issue a couple of years back and I did end up reading a few of the books suggested in that thread and while they were good reads, I think I learned the most from my colleagues’ harsh code reviews and developing a slightly thicker skin to those review comments, and not getting triggered at every single slight disagreement. I used to write a lot of grad student code at my current work and got rightly flamed for it (when appropriate)…

These days, I do try and think about the software engineering side of things first just so that I get quicker +1s from my team, and honestly, all the _quick and dirty prototypes_ I used to write (I still do, but a lot less) ended up requiring me to do a lot more debugging/redo-ing/thinking about scaling up etc later on anyway.

Books can get you a decent idea of what to do, but I think I found reading code (and especially code reviews of my colleagues for other people’s code) much more useful. I think reading a few 800 paged books to improve your software engineering skills is a very grad student thing to do. :p. I admit I did that way too much.


whenever you want or need to do something more than shuffling bits between buckets with different names, do some research. most likely someone already did it and published a library for it.

test third-party libraries. it's uncommon to find bugs, but it's not so rare that it happens only to others.

don't forget to leave comments. a lot of my code review questions could be answered (hence could be not asked in the first place) by a well placed comment.

sometimes people say that code is documentation or code should read like documentation. this is false. code can explain (usually poorly) what it does but it can't give a rationale why it does it the way it's been written, can't say what it doesn't do, etc. always write some documentation - comments and commit messages at least. this should be enforced in code review.
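a small sketch of the difference, with a hypothetical `retry_delays` helper: the code says what happens, the docstring says what it does not do, and the comment carries the rationale that the code alone cannot.

```python
def retry_delays(attempts, base=0.5, cap=8.0):
    """Return the wait time in seconds before each retry attempt.

    Does NOT sleep or perform the retries; callers decide how to wait.
    """
    # Why exponential: a fixed delay keeps hammering a service that is
    # already struggling. Why a cap: without one, the later waits grow
    # longer than any caller would tolerate.
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```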

i'd say engineering is about not writing code unless absolutely necessary. code is an asset, but it's also a liability. you really don't want more than you need.


I made a very similar transition four years ago. Finished a Ph.D. Started a job in a related field writing lots of Python. My advice is to take advantage of your analytical and abstract reasoning skills. Your peers may have more concrete experience writing software, but you did a Ph.D., which means you have the patience and tenacity to follow all the threads until you figure out where they go. That means that where other people might give up, you can actually figure out how the system works and where a new feature fits in it. Or why it doesn't work the way anybody thinks it does. Think of reading other people's code like doing a lit review -- multiple authors, different schools of thought, arguments about how to do things right -- these have all played out in the code base and they're there for you to read. As a Ph.D., you have the ability to pull this all together into something that makes sense.


My two cents: you don't need to read anything at this point, you need to apprentice. Go work somewhere where there are experienced developers. Spend your first weeks there sussing out who is highly respected among your coworkers and choose one or more of those people that you click with. Then just brazenly copy all their techniques and opinions for awhile. Pretty soon you'll find yourself disagreeing with some of what they're doing or thinking. That's natural, but you should resist the urge for awhile; some of that stuff comes from hard-won experience that is hard to explain. Eventually you'll start going your own way more and more. Sometimes that will blow up in your face, and that will give you your own hard-won experiences. Before you know it, you'll be one of the highly respected engineers that the newbies are cribbing from.


Code review, dev/stage/prod workflows all vary on a team-by-team basis. If you already know what they are and why they exist, there isn't a better way to "prepare" for these than to just roll up your sleeves and look at how your current team implements these things.

Good testing practices:

1. Minimize mocking as much as you can -- as a rule of thumb, mocking is inversely proportional to test confidence.

2. Don't test implementation details, test public-facing APIs. This way, your implementation can change. Mocking makes this harder. Don't test how you get things done -- test that they are done.

3. Make sure your API is well defined before you start writing tests, or you will waste time.

You can find loads of Python testing guides on Google on the first two points. There will be times when you have to break some of those rules, but knowing when will come with experience.
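As a sketch of the first two points, here is a test that uses no mocks and only touches the public interface. `WordCounter` is a hypothetical class invented for this example:

```python
class WordCounter:
    """Hypothetical class; `add` and `count` are its public API."""

    def __init__(self):
        self._counts = {}  # implementation detail, free to change

    def add(self, text):
        for word in text.lower().split():
            self._counts[word] = self._counts.get(word, 0) + 1

    def count(self, word):
        return self._counts.get(word.lower(), 0)


def test_counts_words_case_insensitively():
    counter = WordCounter()
    counter.add("Spam spam eggs")
    # Assert on observable behaviour, not on the private `_counts` dict,
    # so the storage could change without breaking this test.
    assert counter.count("spam") == 2
    assert counter.count("eggs") == 1
    assert counter.count("ham") == 0
```

A test that inspected `_counts` directly would fail the moment the storage was swapped for something else, even though nothing a caller can see had changed.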


> the judgment / taste required to make maintainable changes to a million LOC repository

Try The Art of Readable Code (a pair of Google authors, IIRC), and Ousterhout's A Philosophy of Software Design.


Recently I've been involved in transitioning an academic software piece to an open source library. One of the most noticeable things is the different priorities and emphasis on what is driving value in these different environments. The people who were making the code before had priorities mostly to do with research, the main artifacts were papers and research, the software itself was not the main artifact. The interesting thing is that they had good software and research skills so it wasn't a matter of bad skills muddying the waters and hence gave a great spotlight into how different people can have different priorities with code. So when we were making it into a library which others could base their work off there was a big shift in priorities because the code became an artifact worthy of directly spending more time/money on. You may find what we wrote about this process interesting as it highlights the things from a software engineering/open source perspective that were now important and had to be done to make the project a standalone library useful for consumption by other developers: https://www.customprogrammingsolutions.com/blog/2018-02-25/P...


"Software Engineering" is merely a collection of principles, techniques, heuristics, structure and practice all validated by trial and error. As such you have to read a variety of books to get the overall picture. Specifically books with sizable code for various problems. You may find the following helpful to get started (many of these can be bought used and cheap);

* Fundamentals of Software Engineering by Ghezzi, Jazayeri and Mandrioli

* The Practice of Programming by Kernighan & Pike.

* Code Complete by Steve McConnell.

* The Unix Programming Environment by Kernighan & Pike

* Advanced Unix Programming by Marc Rochkind.

* C Interfaces and Implementations by David Hanson.

* Large-Scale C++ Software Design by John Lakos

* Unix Network Programming by Richard Stevens.

The key is that while reading the above you need to "get" how the code is "structured" rather than the details. For example, how does the code for a TCP server and client "look like"? It is a kind of spatial knowledge which you can then consider as one "module" of functionality and reproduce as needed. Large Systems consist of a bunch of layered and well partitioned modules exposing simple and clean interfaces. There will also be modules which cross-cut all the functional modules like "Error-Handling", "Logging" etc. This is the core of "Software Engineering", everything else is details.
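To give a feel for that "shape", here is a minimal sketch of a TCP echo server and client using Python's standard `socket` module. It is not taken from any of the books above; the function names `serve_once` and `echo_roundtrip` are made up for this example, and a real server would loop over many connections and handle errors:

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept one connection and echo back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # client closed its sending side
                break
            conn.sendall(data)


def echo_roundtrip(message):
    """Start a local echo server, send `message` as a client, return the reply."""
    # Port 0 asks the OS for any free port.
    with socket.create_server(("127.0.0.1", 0)) as server_sock:
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()
        # Client side: connect, send, then read until the server closes.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal "no more data"
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        worker.join()
        return b"".join(chunks)
```

The details matter less than the outline: listen, accept, read/write loop, close on one side; connect, write, read, close on the other.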

Finally, you would also need to read a book/source where you can see all of the above principles put into practice while building a non-trivial (initially not overly complex) system.


I read systems science at university. While we did some technical stuff like basic Java programming and database design, we mostly focused on WHAT a system is and how to design one from certain requirements. So when I got my first job, as a web developer, I hit the ground quite hard. I hadn't really programmed in my spare time either so I didn't have that backbone experience. (You might wonder why I got hired in the first place, in hindsight, I also do).

Nowadays, some X years later, I identify myself as a backend developer and I tend to stay out of those "up in the cloud" discussions about what a system should look like. So how did I get here? First of all, I think I was pretty lucky having a boss at my first job who wasn't interested in me being really productive during my first time there, but rather wanted me to learn and grow with the company. I also had great colleagues, especially my then team lead, who really took the time and showed me the ropes so to speak. I bought one book, which I didn't really read. I did some online classes, but I mostly learned programming, problem solving, TDD, etc. by working.


I've worked in a variety of companies as employee and consultant and I have some counterintuitive advice that applies to building things on the web: In almost all cases, things like code maintainability, coding "standards", and TDD, go out the window during actual development. I'm not saying this is good, just that it happens (this is more a management problem than a software development one). There are deadlines to meet, changes to make, surprise features to add, etc. And usually you're just throwing things away and building new things before any of this comes back to bite you. Being able to go with the flow and handle chaos --flexibility-- is probably the most important skill to have. The job ends up being a lot of communication. If you're lucky, you'll get to code some cool stuff. But you'll also have to hardcode something clunky and ugly because there was no time to do it right. Be OK with that.


> things like code maintainability, coding "standards", and TDD, go out the window during actual development

As someone with experience in both academic and professional programming, I'd say that the difference is that while in the professional world those principles may be sacrificed at times, in the academic world you are lucky to work with someone who even knows what they are, much less how to implement them.


To quote another great academic, Ray Stantz: "I've worked in the private sector. They expect results."

Just remember that the business world is going to expect results driven by their current business needs, not by solving interesting problems. So you want the shortest path that'll get you from here to there, which means bone up on the libraries or frameworks that are germane to your company's needs. Learn what your company's coding standards are from developers who are in the know, and apply them to your code.

Also, might I suggest finding a company that's at least tangentially related to what you did your Ph.D. in. That doctorate is going to look great, and your expertise is going to be super valuable, giving you a much-needed opportunity to strengthen yourself in the areas of industrial development where you are weak on the job, while still contributing value.


A bunch of comments on this post give pretty good advice about what to expect, and if you're in your first commercial job it's probably worth going with the flow. However, one thing I'll add is that it's worth watching out for the "academic bad, commercial practices good" mindset. Keep your eyes open, form your own opinions about what is and isn't working. Don't necessarily kick up a fuss about the things you don't like right now but do file them away for the future.

https://yosefk.com/blog/why-bad-scientific-code-beats-code-f... is an interesting counter-point to some of the usual commercial-vs-academic thinking.


Your best bet is to look into a programming boot camp training program. As you have seen Ph.D. academic programming deals more with research while commercial programming deals more with delivering a product as fast as possible. It's two different mindsets.

1st decide on what area you want to specialize in and then look at a reputable boot camp that fits your goals. You can do it on your own but it's going to take you a lot longer and it's hard to focus on what you need to learn. Also, if you could do it, you would have done it already. The advantage is that you're already used to the scholastic environment and you'll be able to do very well and be even well ahead of everyone if you challenge yourself rather than strictly following the curriculum.


I'm also self taught, have done a lot of research code and product code, have worked on 'production' teams for over ten years and have had several PhD students as contributing team members.

The number one thing you can do is read through other people's code. If your colleagues are very good, then you will learn a lot and pick up good habits; if they are so-so, then you will build your self-esteem and sharpen your critical thinking skills. Some developers are shifty, and others love to talk about what they are doing and share insights. Spend time with the latter type.

Don't try to be an expert at everything. Most teams should have self-selected individuals who choose to specialize in different areas that the team depends on.


You won't be alone. In any real organization, you'll be part of an experienced team that's working towards the same goal as you, and if you're curious, open and appropriately humble, they'll teach you everything you need to know.


In fake organisations you'll be alone, surrounded by passive-aggressive office culture and every personality on the spectrum from always-hostile to Vulcan-autistic. If you're able, shop around for a job until you find one of these real organisations. They do exist.


If you are OK with reading books, I will recommend Code Complete.


I made this move (and back): treat the goal of software engineering like another topic to do a lit search on, map out the domain, implement a few papers, etc. Instead of journals, you have Stack Overflow, coworkers, Google. I got out-coded more times than I can count, but as a PhD you can catch up quickly by treating it as a domain and problem to analyze and solve.

For Python, the built in docs are already very good, but I use devdocs.io a lot.


Does your institution have a Research Software Engineering group? I think increasingly universities acknowledge the gap between how academics use software and how industry approaches it, and I think that would be a fantastic first step if you were looking for a change.

https://rse.ac.uk/


If your code is reviewed by colleagues, you review your colleagues' code, and you do some occasional pair programming, you probably don't need to read books about programming. Concentrate on books that help you with things like estimation and communication, e.g. "The Clean Coder".


This really depends on what specifically you're struggling with, so I'm going to take some shots in the dark:

* "Refactoring" by Martin Fowler would probably help with writing good quality code and doing code reviews (or understanding the reasons for changes requested in others' code reviews).

* In my experience, "academic" code tends to be far more prone to very long functions. Understanding the Single Responsibility Principle was a very important part of the transition from academic scripts to software engineering for me. If you regularly write functions of 30+ lines, start looking at breaking these down better: what are you actually doing in each chunk of code? Can it be broken down further?

* In academia, building software is 90% coding. Now, reading code will be far more important, and you might even spend more time reading code than writing code. Relatedly, the readability of your code is now the most important thing to optimize for, sometimes even at the expense of computational efficiency: you should aim to reduce the number of developer-hours spent wherever possible. This means writing more readable code (good variable names, learning and following style guides and other conventions, following a process even if it feels like a waste of time), using better tooling, more focus on testing, more time documenting, and more time communicating with programmers than actual programming.

* At times this will be frustrating because you'll remember when you could just go into the zone for several days straight and produce something fairly significant. Remember that a lot of the stuff that feels like a waste of time isn't - it's necessary to get out of the local maximum of what was possible when you did everything yourself and could keep the entire project scope in your mental model at all times.

* Whenever you have written code and are 99% sure that it will run fine and not break any other parts of the system ("it was such a small change"), assume instead that it is very likely to break something else. I still find I constantly need to recalibrate my confidence that what I'm changing has a limited area of effect.

* Spend a decent amount of time getting more familiar with things like source control. Assuming you use git, go down this rabbit hole to read about the different git workflows and get as comfortable as possible with flows that your team uses, dealing with conflicts, reversing mistakes, etc.

* Most of what you specifically asked for can only be learned through experience. Experience is gained most quickly by doing, even if you make mistakes. For software quality, the best (only?) way to learn is to have someone more experienced review your code. The more pedantic they are, the more you will learn. If it takes you 3-4 attempts to get a code review through, then you have a team that will accelerate your learning. If you're getting mainly LGTMs, you might be super competent and had no need to ask this question, but more likely you need to try to find people who will push you harder to learn new standards.
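To make the Single Responsibility point above concrete, here is a minimal Python sketch of the kind of split it describes. Everything in it (the function names, the file format with "#" comments, the outlier limit) is a made-up illustration rather than anything from a real codebase: a typical "one long script" that parses, filters and summarizes is broken into three small functions plus a thin function that chains them.

```python
from statistics import mean


def parse_readings(lines):
    """Turn raw text lines into floats, skipping blanks and '#' comments."""
    readings = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        readings.append(float(stripped))
    return readings


def remove_outliers(readings, limit):
    """Keep only readings whose absolute value is within `limit`."""
    return [r for r in readings if abs(r) <= limit]


def summarize(readings):
    """Return (count, mean) for a non-empty list of readings."""
    return len(readings), mean(readings)


def analyze(lines, limit=100.0):
    """Thin orchestration layer: each step above can be tested on its own."""
    return summarize(remove_outliers(parse_readings(lines), limit))
```

Each piece now has one reason to change, and a reviewer can read `analyze` in a single line before deciding which step to look at in detail.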


Code Complete 2 is kinda the bible on disciplined software engineering practice. That's a good start.

Other than that - there is not that much theoretical basis. The core principle is that software engineering is about dealing with a situation where you have far too many variables to fit into a single person's working memory, and how to organize a group of people in a way that they can co-operate without turning the thing into a mess, really fast.

It's more about having a set of understood processes, rather than what those processes really are, so that people can communicate about and co-ordinate their work effectively. Of course the processes need to make sense, but there's no "silver bullet" process that would either fix everything or, conversely, lead to the end of the world if not followed.

"Issues like testing / mocking"

Testing has sadly developed bookish dogma around itself. But it's extremely practical. The most important automated test is the high-level integration test: will this and that work when the customer uses it?

Unit tests are about creating enforceable rules for the production system, which makes things break sooner and, hence, faster to fix.

You don't want someone else to accidentally break your code - especially that kinda weird corner case? Write a test - now the rule becomes enforced as a part of your domain model.
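As a rough illustration of "write a test and the rule becomes enforced", here is a small sketch using Python's standard unittest module. The `normalize` function and its all-zero corner case are hypothetical, invented only for this example; the point is that the second test pins down the weird case so nobody can remove it by accident.

```python
import unittest


def normalize(weights):
    """Scale weights so they sum to 1.0.

    Corner case (hypothetical rule): an all-zero input returns all zeros
    instead of raising ZeroDivisionError.
    """
    total = sum(weights)
    if total == 0:
        return [0.0 for _ in weights]
    return [w / total for w in weights]


class NormalizeTests(unittest.TestCase):
    def test_typical_input_sums_to_one(self):
        self.assertAlmostEqual(sum(normalize([1, 1, 2])), 1.0)

    def test_all_zero_input_returns_zeros(self):
        # This is the "weird corner case" written down as an enforced rule.
        self.assertEqual(normalize([0, 0]), [0.0, 0.0])
```

Saved in a file, this can be run with "python -m unittest". If someone later deletes the zero check, the second test fails right away instead of a customer finding out.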

" code review,"

Same principle as in writing text. Having someone proofread the things you write generally improves the quality. No need to be dogmatic.

The second aspect is the zenlike increase in code quality. People know that their work will be looked at by someone, hence they have a higher intrinsic motivation not to fudge things.

"management of dev / stage / prod workflows"

The only thing that matters there is that there is one agreed process in-house. Otherwise things turn really messy, real fast. It's kinda tricky to wrap one's head around the ramifications of the chosen rules the first time, so that's why there are a lot of published ways of working.

"the judgment / taste required to make maintainable changes to a million LOC repository,"

"Working effectively with legacy code" by Michael C. Feathers is a good start. Now, if the corpus of code has a thorough integration and unit test suite, you can change things, and if you accidentally break something, the tests will tell you.

If there are no tests, then you had better start writing them. You can't do any large scale modifications - especially to production code - without them.

Have some tool that automatically tells you the test coverage.
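One way to start when there are no tests is what Feathers' book calls a characterization test: record what the code does today and assert that it keeps doing it. Below is a minimal Python sketch. `legacy_price_cents` is a made-up stand-in for some untested legacy function, and the recorded cases are simply the outputs of running that stand-in, not a claim about what the "right" answers are.

```python
def legacy_price_cents(quantity, unit_cents):
    """Hypothetical stand-in for an untested legacy function."""
    total = quantity * unit_cents
    if quantity > 10:
        total = total * 9 // 10  # undocumented bulk discount
    if total > 100000:
        total -= 5000            # undocumented rebate on large orders
    return total


# (arguments, output observed from the current code). These record existing
# behaviour, quirks included, so refactoring can be checked against them.
CHARACTERIZATION_CASES = [
    ((1, 1000), 1000),
    ((11, 1000), 9900),
    ((20, 10000), 175000),
]


def find_behaviour_changes(func):
    """Return (args, expected, actual) for every case whose output changed."""
    changes = []
    for args, expected in CHARACTERIZATION_CASES:
        actual = func(*args)
        if actual != expected:
            changes.append((args, expected, actual))
    return changes
```

An empty list from `find_behaviour_changes` means a refactored version still behaves like the original on the recorded inputs. For the coverage part, a tool such as coverage.py can report which lines the tests actually exercise.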


I personally came from the other side of the table: I got a German "Ausbildung" as a sysadmin, started developing software as a hobby, and have now made the switch to professional software engineer. While I feel I'm pretty good at what I do, I'm lacking the algorithm education, the basic concepts and especially team-related skills. Always something new to learn, I guess.


Get into a startup where a lot of these practices/ideas aren't yet fully ingrained/adhered to and grow with your team. This will also let you learn more skills than "just coding" as you will have to wear multiple hats.

Once you are confident you can move on to bigger engineering shops. Or just stay and have a great time building new things in startup world. :)


Look at code design patterns, and look at testing.


If you're smart enough to do a PhD, you're smart enough to figure out all of the scaling and operational bits. It's not rocket science. It's just operations.

Simply by asking relevant questions, practicing some things a bit, you'll get used to it.

And consider that it's different in every organization, and nobody does it really well frankly.

Given the pace of change, the varying technologies and flavour-of-the-month processes, it may always seem a little unwieldy and opaque: the feeling of 'not knowing everything' never goes away.

And I concur with itronitron: read the code of the people on your team who are known to 'code well'.

It does not mean that algorithmically it's genius or even good - it just means that those are the styles/patterns that may be expected of you. It's like learning to say certain words a certain way. You'll get it soon and then forget you're doing it.

Don't fret, you'll get all of this quickly.


This. I transitioned from a (non-STEM) PhD and am now working as a "real" engineer. It is called a "practice" for a reason: you will get better with time as long as you are self-reflective about it.

One practical tip - take a look at Dan Bader's book if you are deep in Python, it has a lot of good stuff.

One philosophical tip - depending on your organization, remember that slow is fast in engineering. This is somewhat different from more academic computing environments (at least that I know of). So take the time to get it right and really deeply understand your solution.


> remember that slow is fast in engineering

Coming from an academic field myself... be careful with this one. Depending on your personal habits in academia, you might have to learn the opposite - your code doesn't have to be perfect, it has to work.

Be careful not to end up overthinking the code/design and under-delivering on the timeline. Missing a time estimate once in a while is usually OK; missing it consistently and by a lot might become a problem.


The thing is, "not knowing everything" is the default in science too, so that should be familiar. It's just the expectations and soft stuff you got used to in your first two years of grad school that you have to learn all over again, I suppose.



