If you enjoy stretching what you can get done with limited environments, I recently discovered BusyBox includes not only an AWK implementation but an HTTP server[0] that supports CGI as well. I spent a weekend setting up a really simple web-app using recutils [1] as the database and BusyBox AWK as the programming language.
If you enjoy bricolage and you feel like wasting a weekend, I would recommend giving it a try!
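For anyone curious what the AWK side of a setup like this might look like, here is a minimal sketch of a CGI handler, wrapped in a shell function so it can be tried without a server. It relies only on the CGI convention that the query string arrives in the QUERY_STRING environment variable. The name cgi_hello is made up for illustration, and this is not the app described above.

```sh
#!/bin/sh
# Hypothetical CGI handler: a CGI program writes its headers, a blank
# line, then the body to stdout. The server passes the query string in
# the QUERY_STRING environment variable (standard CGI behaviour).
cgi_hello() {
    # A BEGIN-only awk program never reads stdin, so this returns at once.
    awk 'BEGIN {
        printf "Content-Type: text/plain\r\n\r\n"
        printf "query string: %s\n", ENVIRON["QUERY_STRING"]
    }'
}
```

As far as I remember, BusyBox httpd runs executables under cgi-bin/ in its document root as CGI, so the same awk program could live there as a script with an awk shebang; check your build before relying on that.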
That looks so featureful (thinking of attack surface here), you might as well install a regular web server at that point. What's the benefit of using busybox for this?
Edit: ah is this file size? Or does this refer to RAM? Either way, sounds like that's the benefit
> BB httpd can be compiled with only basic features like CGI and ETag and will have only 8Kb
BusyBox provides a complete set of POSIX utilities and more in a single minimal binary; there are lots of embedded systems with pretty much just BusyBox installed, although I'm not sure how common it is to compile it with httpd. GP linked an OpenWRT article, so I'm guessing (the site seems to be down) it's included in that OS.
All this tells me is that preventing directory traversals can only be done by checking absolute file paths are within a bounded range, and nothing else.
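A rough sketch of that check in shell, assuming a realpath utility is available (GNU coreutils and BusyBox both ship one). The function name is hypothetical; the idea is just to canonicalise first and only then compare against the root prefix.

```sh
#!/bin/sh
# resolve_under_root ROOT REQUEST
# Prints the canonical path and returns 0 only if REQUEST, interpreted
# relative to ROOT, still lies inside ROOT once symlinks and ".."
# components have been resolved. Assumes `realpath` exists and that
# ROOT is not "/" itself.
resolve_under_root() {
    root=$(realpath "$1") || return 1
    resolved=$(realpath "$1/$2") || return 1
    case "$resolved" in
        "$root"/*) printf '%s\n' "$resolved" ;;  # inside the root: allow
        *) return 1 ;;                             # anything else: refuse
    esac
}
```

Something like `resolve_under_root /www "$request_path" || exit 1` would then refuse both `../` tricks and symlinks that point outside the root.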
There are other layers too: running the server as a service account that can only read its own directories, running it in a chroot, running it in mount and PID namespaces, or using SELinux to further restrict what files it can read even in principle.
Of course, if you're trying to go superminimal anyway, it's not that big a deal to create a server that doesn't even have sensitive data on it. You can make init simply mount a root filesystem that only has busybox and whatever files you want to serve and starts up the httpd process and nothing else. Turn Linux into a unikernel basically. If you compile busybox yourself, you're also able to remove all the subcommands you don't actually need.
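To make the "init starts httpd and nothing else" idea concrete, here is a sketch that writes such an init script to a file. It assumes busybox is installed at /bin/busybox inside the target root filesystem and that the files to serve sit in /www; both paths, the port, and the function name are my own assumptions. Network setup is left out entirely (it would need extra lines or a kernel-level configuration).

```sh
#!/bin/sh
# write_minimal_init FILE
# Writes a hypothetical init script for a root filesystem that holds
# only busybox and a /www directory, then marks it executable.
write_minimal_init() {
    cat > "$1" <<'EOF'
#!/bin/busybox sh
# Runs as PID 1. -f keeps httpd in the foreground so PID 1 never
# exits; -p sets the listening port and -h the document root.
exec /bin/busybox httpd -f -p 80 -h /www
EOF
    chmod +x "$1"
}
```

Calling `write_minimal_init ./rootfs/init` would drop the script into a staging directory; building the actual image around it is a separate exercise.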
Neat! I was aware of bash's built-in TCP client at /dev/tcp, but gawk being able to function as a TCP server is way cooler. Thanks for that bit of knowledge :>
A small nitpick: it uses inetd, which actually handles the network stuff, allowing you to work on stdin/stdout. At least that's what I remember from 2003(?), when I wrote an identd for conntracked connections in bash.
What does the loop and sleep 1 do? Is that to respawn upon crashes (why'd it crash?) or does it exit socat after handling a request?
I recently made a netcat webserver that returns only one static response, just to have a tiny info page for my new email server; it needs a while true loop but no sleep. I then benchmarked the performance and was very surprised to find that a slow VPS manages [spoiler answer] https://lgms.nl/p/cau/?b64&bY%2FBTsQwDER%2FZbivViDxAxw58Q1pO... Performance graph: https://snipboard.io/Vtn0MO.jpg
The sleep is probably to prevent endless revival of the script when you try to cancel it with Ctrl-C. But yeah, it's not ideal considering you can't make more than one request within a second.
Instead of sleeping, you can create that file when you want it to stop. But perhaps that'd imply that someone would genuinely use this rather than only developing it as a toy
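The loop under discussion isn't reproduced here, so as a rough sketch of the stop-file idea: rerun a one-shot server command until a sentinel file exists. The function name and passing the command as arguments are my own assumptions, not the original script.

```sh
#!/bin/sh
# run_until_stopped STOP_FILE COMMAND [ARGS...]
# Re-runs COMMAND (e.g. a server that exits after one request) until
# STOP_FILE exists. No sleep is needed: creating STOP_FILE ends the
# loop once the current run of COMMAND has finished.
run_until_stopped() {
    stop_file=$1
    shift
    while [ ! -e "$stop_file" ]; do
        "$@"
    done
}
```

You'd start it with something like `run_until_stopped ./stop your_one_shot_server` and run `touch ./stop` from another terminal to end it. The trade-off for dropping the sleep is that a command which fails instantly will spin the loop.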
I've heard of Awk for years and (probably like many) only used it for single-line snippets for the vast majority of that time.
Imagine my surprise when I just decided to look into it one day (after finding slightly more complex Awk scripts that did a lot with very little code, which piqued my curiosity) and found this very nice line-oriented DSL that has aged SHOCKINGLY well given how old it is.
I just wish it had interrupt handling of some sort without running a custom fork/patched version
Then on top of that you can run werc, a web framework written entirely in rc, save for a few of the filters, which I think are awk files (like the markdown2html converter).
Sort of. Gawk supports magic file names like "/inet/tcp/8080/0/0" to be a TCP client or server, but you don't get control of accept()... it's all mooshed together. So you can make a slow, single-threaded webserver with it.
Which is too bad, because the usually shipped gawk extensions also give you fork. If they had separated listen() and accept(), you could actually make a reasonable webserver. See my other comment for an example of what you're limited to.
I don't think so. Poking around the source code, the only thing exposed in the cli client around listen() and accept() is related to FTP, because of the way FTP works. It does mean, though, that libcurl has functions like Curl_conn_tcp_listen_set() and Curl_conn_tcp_accepted_set() that could be used for what you're describing. It's just that they are only used for FTP now.
Something that handles the TCP connections and passes them to the wrapped program’s stdin and stdout. Back in the day, lots of network daemons were written as command line tools and executed via inetd.
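As a sketch of what "written as a command line tool" means in practice, here is a toy HTTP handler that only ever touches stdin and stdout; whatever wraps it (inetd, or socat with fork) owns the socket. The function name is hypothetical, and it ignores everything about the request except the first line.

```sh
#!/bin/sh
# handle_request: read one HTTP request on stdin, answer on stdout.
# The wrapper (inetd, socat, ...) connects both to the client socket.
handle_request() {
    cr=$(printf '\r')
    # First line looks like: METHOD PATH VERSION
    IFS= read -r request || return 1
    request=${request%"$cr"}
    method=${request%% *}
    rest=${request#* }
    path=${rest%% *}
    # Skip the headers; they end at the first empty line.
    while IFS= read -r header; do
        header=${header%"$cr"}
        [ -n "$header" ] || break
    done
    # HTTP/1.0 without Content-Length: the body ends when we exit
    # and the wrapper closes the connection.
    printf 'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n'
    printf '%s %s\n' "$method" "$path"
}
```

Saved as a script with a final `handle_request` line, it can be tried locally with `printf 'GET /x HTTP/1.0\r\n\r\n' | ./handler.sh` before pointing any listener at it.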
Is there a modern solution for this that isn't some variant of FastCGI? I know it's inadvisable but sometimes I would like to write a proof of concept that doesn't require me to code a reliable, long-running process.
For tcp. There's another one which uses GAwk's internal tcp stack, but I wasn't sure if that would work in normal awk, and I'm a purist like that so you got this one.
True, but you can go the other way and do it with only socat: socat -v tcp-listen:8888,reuseaddr,fork exec:'cat RESPONSE' is a (Very Static) entire web server, if RESPONSE has 2 header lines, a blank, and an html body. (A little too late in the morning for me to do the golfing needed to eliminate the "cat" and do the cr-nl from whatever shell you're already running, though.)
(scrolls down further in the socat manpage) Ohhh, under "EXAMPLES": you can combine a crlf option to TCP-LISTEN, SYSTEM instead of EXEC, and "echo -e", and there's actually a slightly useful http server already done :-)
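For completeness, a sketch of generating the RESPONSE file the first socat command expects: two header lines, a blank line, and an HTML body, with CRLF line endings on the header part. The status line, content type, and body text here are placeholders I picked, and the function name is hypothetical.

```sh
#!/bin/sh
# write_response FILE
# Writes a canned HTTP response: status line and Content-Type (the two
# header lines), a blank line, then a small HTML body.
write_response() {
    printf 'HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n' > "$1"
    printf '<html><body><h1>Hello</h1></body></html>\n' >> "$1"
}

# Then, in the directory containing the file (command quoted from above):
#   write_response RESPONSE
#   socat -v tcp-listen:8888,reuseaddr,fork exec:'cat RESPONSE'
```

Using printf for the \r\n sequences sidesteps the differences in how various shells' echo treats -e.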
For those who were drawn to "awk" in the title it is probably uninteresting. For those who were drawn to "simple web server", it is likely of some interest.
[0] https://openwrt.org/docs/guide-user/services/webserver/http....
[1] https://www.gnu.org/software/recutils/