Namespaces (GNU Awk 5.0+) make Awk pretty well suited for larger projects (as demonstrated by aho), but it never quite took off; this article (by the author of GoAwk[1]) is a nice look at the relevance of Awk in 2020: https://lwn.net/Articles/820829/
I forget where I read it, but Brian Kernighan said at some point that the lack of namespaces in the original Awk was probably the biggest mistake that prevented bigger adoption in large projects. Now that we have them, I'm hoping to see a brilliant IDE surface in the near future.
I read the book written by the creators and it's pretty clear they never intended it to be a general purpose language lol. I'm paraphrasing but they basically wrote in the book "People are completely mad and are intent on using our DSL as a general programming language so we added half baked functions to it but it's bad and you should feel bad"
Perhaps; but Awk is such a compact approach. My head feels cleaner when using it compared to Perl - perhaps as it's fairly close to C. The interpreter is a single binary on practically any OS, and is usually around 5MB in size. It's installed by default on essentially every OS (aside from Windows, which is an easy single exe download). Also with implementations like Mawk, it will destroy Perl in speed.
Perl is widely known as a write once read never language for a reason. In that sense I'll grant you awk is no better, but if you are stepping into scripting land you may as well use something generally readable (e.g. python, or just use bash).
I am a Gen X-er whose first experience with Perl was working in enterprise software tech support in 1999 and diagnosing a crash due to a syntax error that had somehow shipped in the install script for a GA product with a ~$100K annual license/support fee.
While “write once, read never” is hyperbolic, it captures a real issue for Perl compared to many other competitive languages (of which, to be fair, there were far fewer widely supported for the tasks Perl was most often chosen for 20+ years ago.)
That's not to say I don't find some things nice about Perl, and I’d love to be able to spend more time with Raku which grew out of it.
ok, I found https://github.com/noyesno/awka so it is possible, though it uses its own lib to link against - I don't know how much of that is based on AWK C vs original code.
My mother-in-law, who's into this sort of stuff, warned me about this a few days before the Facebook blackout. Her exact words were "Mercury is in retrograde and that is bad news for technology". I just brushed it off, and a couple of days later Facebook had the worst outage in its history. Gave me chills. But not enough to change my sceptic ways (again, her words).
> Astrologers you say.. But is no one going to talk about how mercury retrograde wiped out Facebook ..?
For a fact, Facebook engineering uses mercurial instead of git in their tooling. They are going to do some 'hg rebase' whenever the stars are out of alignment =)
It isn't a good idea for a production system or anything, but the code itself seems pretty clean. I've seen infinitely worse bash scripts than this compiler.
> A class of minimal bootstrap binaries that has a reproducible build on all platforms. Providing a verifiable base for defeating the trusting trust attack.
HoMM3 has a cult following among lots of 25-45 year old males. During recent elections, one of the candidates mentioned that he enjoys playing HoMM3 a lot, and the comments under the video were like 80% "I'll vote for anyone who plays HoMM3" and the like.
In the 2000s they were among the go-to strategy games. The hot-seat mode probably helped. Of course, the publishers and devs saw little profit from all that following—unlike middle-aged dudes selling pirate CDs in mall stands.
HOMM 5 was developed by Nival, a major Russian dev and publisher at that time—though the 4 and 5 games weren't as famed, afaik.
someone needs to make a script that can scrape tweets/medium articles and predict what the next hot pet project will be. you could have your git implementation out and on your resume six months before the rush with a system like this.
The only really horrid thing about awk is the way you declare local variables in a function ... by just adding more parameters (and letting them be default-initialised on calls). Otherwise it can act like a relatively conventional scripting language, aware of associative arrays (ahead of its time).
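For anyone who hasn't run into the local-variable convention, here is a minimal sketch. The function name `sum` and the data are made up for illustration; the extra spaces before `i, total` are just the usual way of marking "these are locals, not arguments".

```shell
# Hypothetical example: sum the first n elements of an array.
# "i" and "total" are declared as extra parameters, so they are local to
# sum() and start out empty on every call. Callers never pass them.
sum_demo() {
  awk '
    function sum(arr, n,    i, total) {
      for (i = 1; i <= n; i++)
        total += arr[i]
      return total
    }
    BEGIN {
      i = 99                          # global i, untouched by the call
      a[1] = 10; a[2] = 20; a[3] = 12
      print sum(a, 3), i
    }'
}
sum_demo
```

If `i` were left out of the parameter list, the loop inside `sum` would overwrite the global `i`.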
Back in Google New York, sometime after 2006, but prior to my team changing floors after the YouTube acquisition, my team sat about 2 cubicles over from Peter Weinberger, and a bit further from Brian Kernighan's office. My manager, John Sarapata was standing up and complaining how much he hated awk, and I ducked a bit and said "You know AWK stands for Aho, Kernighan (motion toward Brian's office), and Weinberger (motion toward Peter's desk), right?" They're both very nice people, so I don't think they would have been too offended, but it just felt odd to be boldly talking behind the backs of such well-recognized luminaries. Last I checked, John was very senior in Google NYC, so I guess he's a bit more careful to get full context before complaining loudly. Good times.
1. This is an incredible story haha. 2. I constantly talk about how much I hate python or other languages that I actually like. Sometimes the things you complain about most are the things you need most too
A big problem with Awk is that it lacks garbage collection (including ref counting), and that puts major limitations on the language. You can't return an associative array from a function:
$ awk 'function f() { a[1]=2; return a } BEGIN { f() }' </dev/null
awk: cmd. line:1: fatal: attempt to use array `a' in a scalar context
You also can't have nested associative arrays, i.e. recursive or cyclic data structures are not allowed.
As far as I can tell, this is because a stack frame owns everything allocated within it, and when a function returns the whole stack frame is cleaned up unconditionally. You can pass arrays down but not up. It's very naive memory management (by modern standards; it's probably better than BASIC).
So I'd say it's clearly not expressive enough for general purpose programming. There are Lisps in awk but they do weird tricks with text as far as I remember.
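The "pass arrays down but not up" point suggests the usual workaround: the caller owns the array and the callee fills it in. A rough sketch, with `squares` and `result` as hypothetical names:

```shell
fill_demo() {
  awk '
    # Arrays are passed by reference, so the callee can fill in an
    # array owned by the caller instead of returning one.
    function squares(out, n,    i) {
      for (i = 1; i <= n; i++)
        out[i] = i * i
      return n                        # scalars can be returned as usual
    }
    BEGIN {
      split("", result)               # portable way to create an empty array
      count = squares(result, 4)
      for (i = 1; i <= count; i++)
        printf "%s%d", (i > 1 ? " " : ""), result[i]
      print ""
    }'
}
fill_demo
```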
> There are Lisps in awk but they do weird tricks with text as far as I remember.
Not quite -- https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/... was the first and used two global associative arrays, car[n] and cdr[n], where n is an integer which iirc had to have a particular tag modulo 4, as the type tag. (Much as you'd do it in a lower-level C implementation.) I used almost the same scheme in my later Lisp.
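A minimal sketch of that car[n]/cdr[n] scheme, not the original code: the particular tag values here (pairs are multiples of 4, small integers carry tag 1, NIL is 2) are my own assumption for illustration.

```shell
cons_demo() {
  awk '
    # Hypothetical tagging: n % 4 == 0 is a pair, n % 4 == 1 is a number.
    function cons(a, d)  { ncells += 4; car[ncells] = a; cdr[ncells] = d; return ncells }
    function make_num(v) { return v * 4 + 1 }
    function num_val(x)  { return (x - 1) / 4 }
    function is_pair(x)  { return x != NIL && x % 4 == 0 }
    BEGIN {
      NIL = 2
      # Build the list (1 2 3) out of three cells in the global arrays.
      list = cons(make_num(1), cons(make_num(2), cons(make_num(3), NIL)))
      # Walk the cdr chain and add up the car values.
      for (p = list; is_pair(p); p = cdr[p])
        total += num_val(car[p])
      print total
    }'
}
cons_demo
```

Since the cells live in global arrays, nothing has to be returned "up" from a function except an integer handle.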
Yeah this is cool, I was thinking of the "normal" way of doing it where evaluating a Lisp function reinvokes the interpreter function. But there are obviously other ways to do it if you have global arrays, as awk does.
Oil has recursive data structures, although I'm still working out how to compose shell-like "procs" and functions which could return an associative array. We could just add them both naively but I think it would cause many shell programs to be a lot messier.
Sure but you can also declare global arrays in Awk. The point is that Awk and bash are the same in several respects, so you don't really gain anything by switching to one or the other:
- they don't have any garbage collection or ref counting
- they don't have recursive data structures
- you can't return any kind of array from a function
In bash you also can't pass an array to a function, but in Awk you can.
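On the "no recursive data structures" item: gawk does have true arrays of arrays as an extension, but in portable awk the common workaround is a flat array with composite keys joined by SUBSEP. A small sketch with made-up data:

```shell
nested_demo() {
  awk 'BEGIN {
    # One flat array; "alice", "shell" becomes the single key alice SUBSEP shell.
    user["alice", "shell"] = "bash"
    user["alice", "uid"]   = 1000
    user["bob", "shell"]   = "zsh"

    if (("alice", "shell") in user)
      found = user["alice", "shell"]

    # Recover the "outer" key by splitting each composite key on SUBSEP.
    for (key in user) {
      split(key, part, SUBSEP)
      if (part[1] == "alice") n++
    }
    print found, n
  }'
}
nested_demo
```

This only fakes nesting: there is no way to hand the "alice" sub-array to a function on its own.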
You definitely want to work with it if you can. I live much of my life in python, but there has never been a week, and usually not a day, in the last 10+ years where I don't write a quick awk snippet to do something useful to data.
The advantage of awk over python is that it almost always takes < 60 seconds from the point at which you say, "I want to summarize, review, filter, report" on some columnar data to the point at which you have output in front of you.
I'm fast with python doing that - but it's usually a 3-4 minute set of steps.
The only exception being json, which awk is clumsy with. (jq is my go-to tool for the simple queries, python for anything that requires more than a minute of thought.)
Also - it's almost always the case that I'm using awk in conjunction with a bunch of other tools, cut/sed/uniq/grep to get what I want.
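To make the "under 60 seconds" claim concrete, here is the sort of thing I'd guess is meant: a group-by-and-sum over whitespace-separated columns, piped through sort. The column layout and sample rows are invented.

```shell
# Sum column 3 grouped by column 1, then sort the groups by name.
summarize() {
  awk '{ total[$1] += $3 } END { for (k in total) print k, total[k] }' | sort
}

printf '%s\n' 'web GET 120' 'db QUERY 300' 'web POST 80' | summarize
```

The `sort` is there because `for (k in total)` visits keys in no particular order.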
I have really warm feelings about AWK, actually! It’s such a simple language that you can learn pretty much all of it in an afternoon, but it is still expressive enough to be really useful as a text processing language. I have an AWK script that turns my CSV bank statements into Ledger transactions, and AWK is just perfect for that kind of thing.
Didn't mean to sound negative about it, I love awk too. It is definitely small enough to learn in a day (or two), and it fills a niche people don't realise they have.
be wary that the initial parsing is probably going to be bespoke for your bank account csv export format. my own bank doesn't even share consistency in these exported csvs between the checking, savings, and the credit card I have with them.
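A rough sketch of what such a CSV-to-Ledger script might look like, assuming a hypothetical export with a header row and the columns date,description,amount (negative amounts are spending). The account names are placeholders, and the naive comma split will break on quoted fields that contain commas.

```shell
csv_to_ledger() {
  awk -F, '
    NR == 1 { next }                  # skip the header row
    {
      amount = $3 + 0
      printf "%s %s\n", $1, $2
      if (amount < 0)
        printf "    Expenses:Unknown  %.2f USD\n", -amount
      else
        printf "    Income:Unknown  -%.2f USD\n", amount
      printf "    Assets:Checking\n\n"
    }'
}

printf '%s\n' 'date,description,amount' '2021-10-04,Coffee Shop,-4.50' | csv_to_ledger
```

Only the field numbers and the sign convention should need changing per bank, which is roughly the bespoke part mentioned above.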
SNOBOL4 was earlier than AWK with associative arrays by a little under ten years. However, from the standpoint of manipulating strings with regular expressions, AWK introduced the concept as far as I know.
I did say "more or less" and "modern". Who's heard of SNOBOL4 now? :) My guess is that every other language that got AA's got them by way of influence from awk (or something else that got them from awk); awk itself may have gotten them from SNOBOL4 though.
But ok, we can at least say that awk is the oldest programming language to have associative arrays that's still in widespread use. (Yes, I'm hoping that someone will argue for SNOBOL4's current widespread use. :) )
> However, from the standpoint of manipulating strings with regular expressions, AWK introduced the concept as far as I know.
Two can play at this game! `awk` is really a successor of `sed`, which is all about manipulating strings with regular expressions. sed maybe isn't turing complete though (but someone's gonna prove me wrong here too) or at any rate not convenient to use in as general a way as awk.
> Christophe isn't the first person to realize that sed is almost a general purpose programming language. People have written tetris, sokoban and many other programs in sed.
Wow! But clearly, you wouldn't WANT to except for the fun of it.
I wouldn't want to :). It looks very expressive, much more than I had known just 3 days ago, but maybe it's a little loose around the edges for my taste.
Yep, I recommend Gimpel's old book Algorithms in SNOBOL4 for fun examples (from the days when such a title did not mean an undergrad data-structures-and-algorithms curriculum).
We are used to seeing awk as very condensed one-liners, so the code in the repository is surprisingly readable. This makes it a fun project which can be used to learn a lot about both awk and git. Nice!
Technically this isn't quite compatible, because it uses a different hash algorithm. git as of v2.13.0 uses a hardened SHA-1 algorithm to counter the SHAttered attack. In normal cases you won't see the difference.
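To illustrate the "normal cases" part: for ordinary content, a plain SHA-1 over the header "blob SIZE" plus a NUL byte plus the file bytes gives the same ID that git reports. A sketch, assuming `sha1sum` from coreutils is installed; `git_blob_id` is a made-up helper name.

```shell
# Compute a git blob ID for a file using plain (non-hardened) SHA-1.
git_blob_id() {
  size=$(wc -c < "$1" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | awk '{ print $1 }'
}

tmp=$(mktemp)
printf 'hello\n' > "$tmp"
git_blob_id "$tmp"      # should match: git hash-object "$tmp"
rm -f "$tmp"
```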
— which both could play cartridges and had Basic, with absolutely unlicensed Mario sprites in the mem.
But never connected the dots that there was an official thing like that, even though vaguely heard of the Famicom brand's various forays into neighboring markets of the day.
Yes, the Famicom had Family BASIC (it was a Family Computer after all!), an official Nintendo release only in Japan. What you had on your Subor was probably a totally unlicensed copy of Family BASIC, as that program did come with Mario and Donkey Kong assets to play with.
A version of PROLOG in awk is here - https://github.com/prolog8/awkprolog - but it would be interesting to see those problems for students in AI which are done with awk.
If one is gonna develop on an SBC, he can utilize proper GNU/Unix tools, not trimmed-down equivalents from busybox/toybox. In fact, I fail to see a use-case for these tools in embedded space whether on SBCs or not.
See also awk-jvm [1], a toy JVM in awk. Though, inexcusably, that one also uses gawkisms rather than awk proper, when portability is the one thing awk has over other mini langs ;)
Fun fact: early awk did not. The Lisp interpreter in awk which I linked in another comment was from those days and worked entirely as a big nested loop, using no functions besides the built-ins.
[1]: https://github.com/benhoyt/goawk