Does this have anything to do with the original Pragmatic Programmer book or the https://pragprog.com/ publishing company?
If not I think the name should be reconsidered. It's a distraction from the content of the book itself if it's not actually related to that other text.
The phrase "Pragmatic Programmer" is a common one to denote somebody focused on "pragmatic issues" in Software Development and can be used in that capacity wherever it is applicable.
This book deals with such practical issues in ML software engineering and is hence very much worthy of the name.
I tend to agree with the parent comment. The name of the book sounds like it could be part of a series authored by the same people as “The Pragmatic Programmer”. For me, I subconsciously internalized the grammar of the title as “The Pragmatic Programmer: for ML”.
I have no idea what the expectations are legally, but given the original “Pragmatic Programmer” book has been out for ~25 years and is extremely well known, it seems like a reasonable name collision to avoid.
The cover of the book has an Addison Wesley logo on it, and the hard cover also has a Pearson logo on it. So that name has some textbook companies backing up PragProg as well.
It also DOESN'T have any indication that "The Pragmatic Programmer" is any sort of trademark, so who knows. Either way, IMO calling your own writing "X for Y" where "X" is a commonly known specific work, and "Y" is a generic term, just means that you've diluted your own discoverability into a very big pot.
Why are we fixated on the name? "A Rose, by any other name, would Smell as Sweet" and all that.
What I am looking for in this submission is insights/opinions from people working in this domain on the topics presented in the book. For example, the book talks about "Concept/Data Drift"; so what is it exactly, how does an ML engineer encounter it in his data, and how does he deal with it over time?
Because names mean a lot in and outside software. Try naming it “ML: The Big Nerd Ranch Guide” or use an O’Reilly/No Starch-style cover and you will get a similar reaction.
Let the authors deal with it; it doesn't concern us here.
What I am looking for is a discussion of the contents of the book, which they have kindly made available for free (the book is expensive).
PS: I am always very appreciative of and thankful to people who make their knowledge/books/software available for free, and am sure they would like us to focus on the core contents rather than ancillary issues (which they are doubtless aware of and have cleared with their publishers).
Honestly, I only clicked into this thread because I associated the phrase "The Pragmatic Programmer" with the famous book, and if it's not by the same people, I am less interested in their content specifically because of the "borrowed" (stolen?) term.
It's possible, however I've already wasted time with a click based on the book title, and based on that I would prefer not to give the authors any more of my time regardless of missing out.
That makes no sense, since you are the one losing out. To paraphrase a proverb: "Hate hurts you more than the person you hate, since they are unaware of/unaffected by it".
My primary purpose is to use good judgement in choosing where to spend my time. Thus I choose not to read the book; it's not hate, it's judgement. My secondary purpose is to not reward those who use underhanded schemes to get ahead. This may not have been their intention, but it is how I perceive it.
Your logic, perception and judgement are all flawed. You merely looked at a familiar phrase in the title and immediately jumped to a conclusion with negative connotations. That's on you. You have no idea about the book, have read no summary/review of it and hence do not have a clue about it, and yet you are trying to justify your "judgement"? The book is published by well-known publishers who would have cleared its title to make sure that there are no legal violations (i.e. underhanded schemes) which could get them into trouble. So on that count also your "judgement" fails.
I am advising you to browse/read the book because we (I and a few others in this thread) have browsed/read the book and found it worthwhile (if you are interested in the domain in the first place, of course).
I have seen some silly arguments in my time but you take the cake on "judging a book by its cover" to a whole new absurd level.
Because that name is associated with one of the best and most successful books about software engineering.
I am almost sure that "The Pragmatic Programmer" is a trademark, so it comes naturally to associate the book with either the same authors or the same publisher as the original book.
I am quite surprised there is no discussion here. The book actually gives a nice overview of practical Software Engineering principles applied to ML Engineering and hence is of use to regular programmers moving to ML from other domains. I personally found it quite useful for understanding the practices employed in ML Engineering and how they differ from "normal" programming, which is where I come from.
Part II titled "Best Practices for Machine Learning Pipelines" and starting from chapter 5 is where the meat lies.
It's new to me! It certainly looks good, or at least like something that could be increasingly useful to an increasingly large group of people. But it's a whole book and I have only a few minutes' break to check HN, so I can't evaluate it for quality until I've had a chance to read it (and I definitely will, because it looks useful to me). I assume, being new, few other people have experience with it to comment. And fortunately whenever actual math and code show up, the AI maximalist/doomer blabbermouths tend to stay away.
+1 adversarial robustness [0] & privacy were included in the analysis stage. People forget that stuff.
+1 on having to rewrite academic code (or code from some Jupyter notebook). Bane of my life sometimes.
+1 versioning data and code, running pipelines based on changes in either
+1 ingest your data, then validate, then use it. Data/model drift etc.
+1 on consistent tooling and language use.
+1 references everywhere
Wasn’t sure about the super specific approach to the commit history (squashing specific file changes together with validation/safety changes in a separate commit).
But then I’ve rebased my MRs to do something similar before and enjoy doing it. I guess I’m just pointing out that trying to get other people to do this regularly is a massive PITA and usually doomed to fail.
—
[0]: adv robustness was a bit light on content unfortunately. But then I researched the topic full time for three years, so it's probably always gonna be light for me KEKW
Finally! Somebody who is actually talking about the contents :-)
Could you clarify a little bit what is meant by "Concept/Data Drift"? Any examples/links you can point us to? Wikipedia (https://en.wikipedia.org/wiki/Concept_drift) describes it, but without a specific example to walk through I am not really "getting" it.
Probably a very oversimplified example below, because data doesn’t usually drift in this obvious way. It’s usually more subtle and happens over a longer period.
My model is learning on my business data of orders over time.
People keep ordering every day, but usually in small amounts.
But today we got a new customer, and they put in monthly orders which are 1000x larger than all the others combined.
They are going to keep making orders for the next year or so. At which point they stop ordering from us.
Two data drift “episodes” here:
1. When we get the new customer. We’ve now got an outlier. They aren’t like all the other customers. How will the model react to this when being trained? Will it skew the output? Do we exclude the new customer from the training data? Or do we change the model to account for them?
2. When that customer stops ordering after a year. Now the outlier is gone. But maybe we changed some model settings and tweaked it a bit to account for it. Now we need to account for that customer not being around anymore.
I got the obvious way. What I was asking about is how you identify drift in the data in the first place. The model has been deployed after the training/test data-set passes. Presumably with drift in the input the model's predictions will not be "good" anymore. How do you disambiguate this case from the model itself being wrong for other reasons?
It's a continuous process of checking that the training data is from the same "distribution", usually through automated pipelines running against the ingested training data (i.e. once you've got the new data fully processed and ready for training, but prior to actually training the model).
In the pipelines you do some checks on statistical outliers/differences. Check the current training data against historical versions of the training set. If anything goes beyond some specified tolerances, you highlight that for manual testing/checks.
Using the toy example from before, something like checking the sum of orders per customer in a month compared to the last N months. If the maximum per-customer order total this month is 100x higher than in any previous month, then something has significantly changed in the data. It may affect training, so we need to investigate.
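Not from the book, just a rough sketch of what such a pipeline check might look like (the column names and the tolerance factor are placeholders I made up for the toy example):

```python
import pandas as pd

def check_order_drift(current_month: pd.DataFrame,
                      history: pd.DataFrame,
                      factor: float = 100.0) -> bool:
    """Flag the newly ingested month if any customer's order total is far
    outside anything seen in the historical training data."""
    # Maximum per-customer order total in the month being ingested.
    current_max = (current_month.groupby("customer_id")["order_value"]
                   .sum().max())

    # Maximum per-customer monthly total over the historical window.
    historical_max = (history.groupby(["month", "customer_id"])["order_value"]
                      .sum().max())

    # Beyond tolerance: fail the pipeline step / raise an alert for review.
    return current_max > factor * historical_max
```

In practice you'd run a battery of such checks (means, variances, missing-value rates, class balance) rather than a single max comparison.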
If you've identified some statistical changes/differences, that's usually where someone needs to investigate in more depth. Train a dev model on the brand new training data. Pass multiple unseen test dataset(s) through it. What happens?
* Is global test accuracy up or down?
* Is robustness affected?
* Is the accuracy degrading for specific classes?
* How does this compare to drifts we've seen before?
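For the per-class question in particular, a minimal sketch using scikit-learn (which the book may or may not use; the candidate model and held-out test set are assumed to exist):

```python
from sklearn.metrics import classification_report

def per_class_report(dev_model, X_test, y_test) -> str:
    """Report per-class metrics for a candidate model trained on the new
    (possibly drifted) data, evaluated on an unseen test set."""
    y_pred = dev_model.predict(X_test)
    # Per-class precision/recall/F1 makes it easy to spot a single class
    # degrading even when the global accuracy still looks fine.
    return classification_report(y_test, y_pred)
```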
Then you make decisions about whether you need to:
* exclude parts of the new training data?
* tweak some model hyperparameters?
* tweak the architecture of the model?
There's no single right answer on what to do at this point. This is the difficult and expensive bit of machine learning. It requires a lot of continuous experimentation even after you've got something running initially.
Nice. It is these sorts of issues that made me realize that ML Engineering/MLOps is a very different kind of beast, where statistics and the coupling of input data to the model play a very significant part. Awareness of the data domain is vital.
I haven't read the text, but data drift refers to how, after deploying a machine learning model, the input data changes over time to something that wasn't tested on. For instance, let's say you create a gradient boosting forecasting model that does a great job at predicting tomorrow's earnings. At the time of training, the earnings might be in the $1000 per day range. But a year later, the earnings might be in the $100k range. The model has never seen numbers this high before, so it doesn't know how to handle them well. That is data drift.
The most common solution is to frequently retrain on the latest data. A forecasting model might retrain every week, including the last week's data, and might even drop older data, for instance training data older than a year.
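As a rough illustration of the rolling-window retraining (not from the book; the feature columns and window length are placeholders):

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

def weekly_retrain(df: pd.DataFrame) -> GradientBoostingRegressor:
    """Retrain on a one-year rolling window so the model keeps seeing
    values close to the range it will be asked to predict."""
    cutoff = df["date"].max() - pd.Timedelta(days=365)
    recent = df[df["date"] >= cutoff]

    model = GradientBoostingRegressor()
    model.fit(recent[["feature_a", "feature_b"]], recent["earnings"])
    return model
```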
It's best to transform your target variables, like "number of orders", to "number of orders per customer per day" or something like that. And then in your pipeline, you feed in the latest estimate of your number of customers (e.g. the average of the last two weeks). That's way more robust over time.
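A quick sketch of that transformation (the column names are made up; the two-week window is just the example from the comment):

```python
import pandas as pd

def to_rate_target(orders: pd.DataFrame) -> pd.DataFrame:
    """Turn raw daily order counts into orders-per-customer-per-day,
    which stays comparable even as the customer base grows."""
    daily = orders.groupby("date").agg(
        n_orders=("order_id", "count"),
        n_customers=("customer_id", "nunique"),
    )
    daily["orders_per_customer"] = daily["n_orders"] / daily["n_customers"]
    return daily

def rescale_forecast(predicted_rate: float, daily: pd.DataFrame) -> float:
    """Convert a forecasted rate back into an order count using a recent
    estimate of the customer base (here: the last two weeks' average)."""
    recent_customers = daily["n_customers"].tail(14).mean()
    return predicted_rate * recent_customers
```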
Makes sense. We need to continuously monitor the performance of the model deployed in the field with our preexisting statistical knowledge of the data and then accordingly schedule regular "model updates".
> Even so, it is difficult to understate the impact that machine learning is having on many aspects of our lives.
I'm struggling to parse this. Does this mean that the impact has been so small that it is very difficult to understate it? Do they mean "it is difficult to OVERstate the impact", meaning that the impact has been large?
Wow, what's the context behind this? Is it worth reading? Who are the authors? Looks like this was first published in print in 2023, but there are basically no reviews online.