I understand the "first rule of optimisation" (and I am sure you do, too) but I hate it because people cite it too often so they can simply ignore performance aspects of their code (not you right now, just a general rant).
You have to know when to apply it, like in your example. If one task runs every hour and takes 1 minute, and another runs every minute and takes 10 seconds, optimize the second one from 10 down to 9.5 seconds before you bother cutting the first from 1 minute to 50 seconds (the small fix saves 30 seconds per hour, the big one only 10). You also have to know how to tell which one to optimize. Maybe don't optimize either if you have other stuff to do first.
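To put rough numbers on that example, here is a quick sketch in Python. The helper name `seconds_saved_per_hour` is made up for illustration, and it assumes both tasks run exactly on schedule and the speedups turn out as estimated.

```python
def seconds_saved_per_hour(runs_per_hour, before_s, after_s):
    """Total wall-clock seconds saved per hour by one optimization (hypothetical helper)."""
    return runs_per_hour * (before_s - after_s)

# Task that runs once an hour: 1 minute cut down to 50 seconds.
hourly_job = seconds_saved_per_hour(runs_per_hour=1, before_s=60.0, after_s=50.0)

# Task that runs every minute: 10 seconds cut down to 9.5 seconds.
minutely_job = seconds_saved_per_hour(runs_per_hour=60, before_s=10.0, after_s=9.5)

print(hourly_job, minutely_job)
```

The half-second win is worth three times as much per hour because it is multiplied by how often the task runs. In practice you would get the "before" numbers from a profiler or from logs rather than guessing.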
I think another thing to note is that the rule is mostly about low-level optimization, which can be done later. Things like protocol design and software architecture also affect performance and are hard to modify later, so for those you probably want to take performance into account up front...