My 90's ISP homepage URL makes more sense. inreach.net/~myusername
Actually, I miss when ISPs all came with some space for simple web hosting. It was a given that a lot of people would want to make their own sites, not just consume them.
Not just Apache, but POSIX shell syntax! It's called Tilde Expansion[0], so in your dash, bash, whatever, ~USER expands to the home directory of USER. It's the general form of the standard "bare tilde" syntax that stands in for $HOME.
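Outside the shell you can get the same expansion programmatically; e.g. Python's os.path.expanduser mirrors it (a quick sketch; "alice" is just a placeholder user):

    import os

    # Bare tilde: the current user's home directory ($HOME)
    os.path.expanduser("~")        # e.g. '/home/me'

    # ~USER form: USER's home directory, looked up in the passwd database
    os.path.expanduser("~alice")   # e.g. '/home/alice'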
GEEK PERL CODE [P+++++(--)$]
My tendencies on this issue range from: "I am Larry Wall, Tom Christiansen, or Randal Schwartz.", to: "Perl users are sick, twisted programmers who are just showing off." Getting paid for it!
As a team lead on a typical SaaS app, I've banned them. I'd rather see a chain of individual string checks than a long regex: they're usually brittle and incomprehensible to anyone but the author.
How is a chain of string checks less brittle and easier to understand? If they're checking for the same pattern, the intrinsic complexity is the same; the string checks just add extra code and more room for bugs.
How many programmers do you think understand that perfectly at first glance? I've programmed with regexes for decades and can admit I don't. Is it even correct? Who knows, unless I waste time deciphering both it and the RFC side by side.
I'd much rather have a handful of single checks, preferably commented. As is usually the case, performance is not the primary concern.
I’ve just grepped my codebase for regex matches, and this is not true. The most common use case is matching a filesystem path or a URL that is known to conform to a schema (e.g. file names prefixed with dates) and extracting parts of the schema from it.
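For instance, a sketch of that schema-extraction case, with a made-up file-name schema:

    import re

    # Extract the date and base name from files like '2021-03-15_backup.tar.gz'
    # (hypothetical schema, just to illustrate):
    m = re.match(r"(\d{4})-(\d{2})-(\d{2})_(\w+)", "2021-03-15_backup.tar.gz")
    if m:
        year, month, day, name = m.groups()  # ('2021', '03', '15', 'backup')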
> Just Google the first result for 'email address regex validation.'
That is an abomination and not a good way to validate emails, because, as you say, it’s super complicated and barely understandable. Draw a finite-state automaton corresponding to this regex to see why. Equivalent code written without regex, implementing the same FSA, would easily be >100 LOC and equally incomprehensible.
In practice, it’s better to check whether the string contains an @ and maybe a dot, and that’s it. Sure, you won’t be RFC 5322 compliant, but who cares? Your users are much more likely to make a typo in the domain name anyway than misspell the characters that would render the email invalid. Just send an email and see if it arrives.
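A minimal sketch of that check, assuming nothing stricter is wanted before the confirmation mail:

    def looks_like_email(s: str) -> bool:
        # Deliberately permissive: just 'contains an @ and a dot after it'.
        # The confirmation mail does the real validation.
        return "@" in s and "." in s.split("@")[-1]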
All of the regexes in said codebase of mine are simple. The longest is 75 characters and a straightforward one to check for UUIDs; you can understand it at a glance:
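For illustration, the canonical UUID pattern is about that length:

    ^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$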
Now rewrite that as a sequence of string checks and show me the code. For a fair comparison you should remove all comments and whitespace, as you have done with the above regex.
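To make the comparison concrete, one possible string-check version in Python; even stripped of comments it is several statements to the regex's one line:

    def is_uuid(s):
        parts = s.split("-")
        if len(parts) != 5:
            return False
        if [len(p) for p in parts] != [8, 4, 4, 4, 12]:
            return False
        return all(c in "0123456789abcdefABCDEF" for p in parts for c in p)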
The problem with the above is not the regex per se, the problem is that the email address grammar is really complex for historical reasons. If you insist on validating email syntactically, you can’t avoid that complexity by rewriting to multiple string checks.
The solution is to use a library or just perform a simpler validation (e.g. check for an '@'), since full syntactic validation doesn't provide much value anyway: the address might still be invalid even if well-formed.
You can add comments to regexes, explaining each part. I believe it is called verbose mode.
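Python calls it re.VERBOSE; a small sketch using a UUID pattern:

    import re

    # re.VERBOSE ignores unescaped whitespace and allows '#' comments,
    # analogous to Perl's /x flag:
    UUID_RE = re.compile(r"""
        ^[0-9a-fA-F]{8}     # time_low
        -[0-9a-fA-F]{4}     # time_mid
        -[0-9a-fA-F]{4}     # time_hi_and_version
        -[0-9a-fA-F]{4}     # clock_seq
        -[0-9a-fA-F]{12}$   # node
    """, re.VERBOSE)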
> And as you mentioned (especially in the case of email validation), they're usually incorrect.
My point was that the email address might still be invalid despite being syntactically correct, e.g. if you miss a letter. This is why I don’t understand the obsession with syntax-level email validation. You still need a confirmation mail.
But of course there can be a bug in a regex - just as there can be a bug in imperative string-matching code implementing the same pattern.
> A single "/x" tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class, nor within the characters of a multi-character metapattern like "(?i: ... )". You can use this to break up your regular expression into more readable parts. Also, the "#" character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line.
for a working link to the state diagram generator.
Even with a handful of single checks there's still the need to compare those, block by block, to the RFC.
Assuming RegEx is to be used (I'm not intimidated by RegExes, but I'm generally not a fan, preferring custom parsers for many things that are hard or impossible with a RegEx), this is a better approach:
> I'm not intimidated by RegExes, but I'm generally not a fan, preferring custom parsers for many things that are hard or impossible with a RegEx
Good call not to use Regex for things that are impossible to do in Regex! But seriously, a custom parser must have some way to recognize individual tokens. If you distinguish parsing and lexing, what tool do you use for lexing?
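In most ecosystems the answer is regexes again; a toy regex-driven lexer, just to sketch the idea:

    import re

    # The token grammar is a list of named regex alternatives:
    TOKEN_RE = re.compile(r"""
        (?P<NUMBER>\d+)
      | (?P<NAME>[A-Za-z_]\w*)
      | (?P<OP>[+\-*/=])
      | (?P<SKIP>\s+)
    """, re.VERBOSE)

    def tokens(text):
        for m in TOKEN_RE.finditer(text):
            if m.lastgroup != "SKIP":
                yield m.lastgroup, m.group()

    # list(tokens("x = 40 + 2"))
    # -> [('NAME', 'x'), ('OP', '='), ('NUMBER', '40'), ('OP', '+'), ('NUMBER', '2')]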
Regexes have a particular purpose: matching patterns of characters. I haven’t seen anyone suggest how to do that in a simpler and cleaner way.
It's less about the matching and more about the validation in most of my past applications. IIRC the best RegExp matchers for the current email specification have 99% or some-such coverage but aren't complete... there are many examples of data extraction and validation where a regular expression is an imperial tool for a metric job.
Nested data, e.g. JSON, is not a good fit in general; regexes are weak at balanced tag matching, and they suck at validating numeric ranges such as lat/long, clock times, etc.
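E.g. a latitude check: a regex version is already awkward and still misses float forms like "9e1", while parse-and-compare states the actual rule. A rough sketch:

    import re

    # Regex attempt at 'latitude between -90 and 90':
    LAT_RE = re.compile(r"^-?(90(\.0+)?|[0-8]?\d(\.\d+)?)$")

    # Non-regex version: parse, then compare numbers.
    def valid_lat(s):
        try:
            return -90.0 <= float(s) <= 90.0
        except ValueError:
            return False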
Yeah use regex for its purpose (matching character patterns) and don’t use it for things it can’t do. That is just common sense and applies to any tool.
But the argument about the email address validation confuses the tool with the problem to solve. The email address grammar is intrinsically complex, so if you want to validate an email address against this grammar (which I think is silly, but that is a separate discussion), any validator implementation would necessarily be at least as complex as the grammar. Regex is not the problem here; rather, it is the simplest possible solution to a complex problem.
Interesting... you ban anything people typically suck at? At PayPal we banned HTML and made everyone write XML... turns out we just wrote shitty XML, which led to shitty XHTML :P
> At PayPal we banned HTML and made everyone write XML
That's gross, given that XML, with its pointless verbosity, is actually just a "canonical" SGML subset without tag omission and other short forms, intended for delivery to browsers, while SGML proper has all the authoring features. Goes to show how clueless and susceptible to koolaid sellers developers were and still are (cf. crypto, LLMs).
The idea was that if you asked 10 web devs how to code a <button>Save</button>, you'd get 10 different answers, so we had a <Button>Save</Button> XML tag that generated them all the same. There was only one way to create a button now. It worked until people started adding so many options to the <Button> template that it became garbage again.
Regex is a high-level domain-specific language. So in this analogy, it's the tedious substring comparisons in imperative code that are the equivalent of low-level assembler.
Using the right level of abstraction for the problem at hand is key to readability.