The author begins with a fairly idiomatic shell pipeline, but in the search for performance the pipeline transforms into an awk script. Not that I have anything against awk, but I feel like that kinda runs against the premise of the article. The article ends up demonstrating the power of awk over pipelines of small utilities.
Another interesting note: the script as-is could mis-parse the data. The grep should use '^\[Result' instead of 'Result'. I think this nicely demonstrates the fragility of the ad-hoc parsers that are common in shell pipelines.
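A minimal sketch of the anchoring issue, assuming PGN-style tag lines such as [Result "1-0"] (my assumption about the input; games.pgn is a hypothetical filename):

    # Unanchored pattern: matches any line that merely contains "Result",
    # e.g. a comment or annotation inside a game body.
    grep 'Result' games.pgn

    # Anchored pattern: matches only lines that begin with the [Result tag.
    grep '^\[Result' games.pgn

The anchored version is what you actually mean, and the difference is invisible until the data happens to contain the word elsewhere.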
It probably depends on what you are trying to accomplish... I think a lot of us would reach for a scripting language to run through this (relatively small) amount of data... node.js does piped streams of input/output really well. And perl is the granddaddy of this type of input processing.
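For instance, a rough sketch of the same line filtering done the perl way, as a stream over stdin rather than slurping the file (my own illustration, not from the article; games.pgn is again a hypothetical filename):

    # Keep only the [Result tag lines, then tally the distinct results.
    perl -ne 'print if /^\[Result/' games.pgn | sort | uniq -c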
I wouldn't typically reach for a big data solution short of hundreds of gigs of data (which is borderline, but will only grow from there). I might even reach for something like ElasticSearch as an interim step, which will usually be enough.
If you can dedicate a VM in a cloud service to a single one-off task, that's probably a better option than creating a Hadoop cluster for most workloads.