How many ways can you slice a URL and name the pieces? (tantek.com)
66 points by remi on Aug 26, 2011 | 12 comments



I have no idea why some think it's a good idea to 'supplement' the browser's built in navigation by overriding yet more of its standard keys. When I press the left arrow, I expect it to behave uniformly, not navigate to some other page. If you insist on doing this, then at least use the accesskey attribute. Instead I'm left reeling, hands off, staring at my keyboard as if it were an unknown enemy preparing to attack.

Really don't care about presentation (serve your blog in text/plain for all I care), but some things are unforgivable. Didn't get past the first few lines.

Is it really unfair to assume someone who makes this mistake could not hold any kind of useful, complex, technically valid opinion that would teach me something new?


"I have no idea why some think it's a good idea to 'supplement' the browser's built in navigation by overriding yet more of its standard keys."

Because they can. I have yet to meet a 'web designer' or a 'user interface designer' who doesn't redesign all buttons, scrollbars and text fields for every design they do, as if the standard built-in ones aren't good enough. Or perhaps they're afraid they won't be judged as having done a good enough job on the design?

Jakob Nielsen writes about this phenomenon, and I agree with him:

"Many Flash designers introduce their own nonstandard GUI controls. How many scrollbar designs do we need? [...] The specification of a new GUI widget is a major human-factors exercise. The current Macintosh and Windows scrollbars emerged after the world's best interaction designers worked for years testing numerous design alternatives. A new scrollbar designed over the weekend is likely to get many details wrong. And, even if the new design was workable, it would still reduce a site's overall usability because users would have to figure out how it worked. They know how to operate the standard widget. When you use standards, users can focus on content and their reasons for visiting your site. Deviate, and you reduce their feeling of environmental mastery."

The disadvantage of this approach is that my apps use only the standard GUI components, which makes some people say that "they're not of this time" and that I should give the app a "fresh new look with a contemporary skin". Maybe their taste and expectations are already numbed beyond repair by the chaos in designs and visual overkill we encounter each day on the web.


Surely there is a name for this phenomenon.

Just like human languages have more than one word for the same thing (words for sex, genitals, alcohol, people you like/love/hate).

When something is popular (like the things I mentioned above) words to describe it flourish.

Also, the URL is as close to a universal thing as we have in programming. Almost every framework evolves to eventually deal with a URL (Greenspun's Tenth Rule).

So maybe this is like asking why the words for (sex|drugs|etc) are different in English than in Russian.

</wild speculation>


I think the word "Balkanization" is pretty accurate. "Divergent Evolution" if you're into the whole PC thing.


Or "Babelization"


Yet another lovely cheatsheet to print off and hang on the wall by my desk. Thank you remi. Perhaps I should put up a blog post with all my cheat sheets some day.


I was somewhat disappointed that the author of this article left out URI "path parameters": semicolon-delimited name/value pairs that can be attached to each component of the slash-delimited path.
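For anyone who hasn't run into path parameters, here is a minimal Python sketch of what splitting them out per segment might look like. The function name `split_path_params` and the sample path are made up for illustration, and the handling of empty segments is my own assumption, not something either RFC prescribes.

```python
from urllib.parse import unquote


def split_path_params(path: str) -> list[tuple[str, dict[str, str]]]:
    """Split a slash-delimited path into (segment, params) pairs.

    Hypothetical helper: each segment may carry ";name=value" pairs.
    """
    result = []
    for raw in path.split("/"):
        if not raw:
            continue  # assumption: ignore empty pieces from leading or doubled slashes
        name, *raw_params = raw.split(";")
        params = {}
        for item in raw_params:
            if not item:
                continue  # tolerate a stray trailing ";"
            key, _, value = item.partition("=")
            params[unquote(key)] = unquote(value)
        result.append((unquote(name), params))
    return result


parsed = split_path_params("/users;format=json/42;rev=7;lang=en")
print(parsed)
```

Note that the parameters hang off each individual segment, which is what distinguishes them from the query string that applies to the whole path.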


The newer RFC (RFC 3986[1] from 2005, which I assume is being used here) for URIs doesn't specifically mention them in the spec. You have to go back to 1998 and RFC 2396[2].

RFC 3986 (circa 2005):

      path          = path-abempty    ; begins with "/" or is empty
                    / path-absolute   ; begins with "/" but not "//"
                    / path-noscheme   ; begins with a non-colon segment
                    / path-rootless   ; begins with a segment
                    / path-empty      ; zero characters

      path-abempty  = *( "/" segment )
      path-absolute = "/" [ segment-nz *( "/" segment ) ]
      path-noscheme = segment-nz-nc *( "/" segment )
      path-rootless = segment-nz *( "/" segment )
      path-empty    = 0<pchar>

      segment       = *pchar
      segment-nz    = 1*pchar
      segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                    ; non-zero-length segment without any colon ":"

      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

RFC 2396 (circa 1998):

      path          = [ abs_path | opaque_part ]

      path_segments = segment *( "/" segment )
      segment       = *pchar *( ";" param )
      param         = *pchar

      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

IIRC, somewhere in RFC 3986 it alludes to the fact that you could still do something like that, but it would be scheme-specific, and not part of the URI spec.

Also of note is that pipes ("|") are no longer even mentioned in RFC 3986, but they were characterized as 'unwise' in 1998.

RFC 2396:

   unwise      = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

[1] http://www.apps.ietf.org/rfc/rfc3986.html#sec-3.3

[2] http://www.apps.ietf.org/rfc/rfc2396.html#sec-3.3
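On the 'unwise' characters above: a rough Python sketch of percent-encoding just that set before putting text into a URI. The `encode_unwise` helper and the sample path are hypothetical, and encoding only these characters (rather than everything outside the unreserved set) is a simplification for illustration.

```python
from urllib.parse import quote

# The characters RFC 2396 lists as 'unwise'.
UNWISE = '{}|\\^[]`'


def encode_unwise(text: str) -> str:
    """Percent-encode only the 'unwise' characters, leaving the rest alone."""
    return "".join(quote(ch, safe="") if ch in UNWISE else ch for ch in text)


encoded = encode_unwise("/search/a|b/{id}")
print(encoded)  # /search/a%7Cb/%7Bid%7D
```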


I see the block at the end as "

A few conclusions: "/* is more prevalent than "/. Yet anecdotally developers use "/ more, and in practice most schemes are protocols. "/* is used consistently (to mean the same thing) as are "/* and "/. "/ has been used consistently for the past 10+ years and in a way consistent with its operating system roots. "/* is used inconsistently as to whether or not it includes the leading "#" hash/pound symbol. However, notably absent from any specification or platform was the alternative phrase "/*.

"

I feel there is something wrong


And absolutely any combination gets named "baseUrl".


I've seen people confuse "http" with "http:", as in "the 'http:' URI scheme".

The name of the scheme is "http"; "http:" is meaningless.


That's probably because the DOM's window.location.protocol returns "http:" for URLs with the http scheme. That is reflected in the diagram included in the article, where protocol extends to cover the colon in the DOM row. Overall, it's a pretty unfortunate mishmash of terminology.
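The same split shows up outside the browser. As a small sketch, Python's `urllib.parse.urlsplit` reports the scheme without the colon, so a DOM-style value has to be built by hand. `dom_style_protocol` is a made-up helper name, and the example URL is arbitrary.

```python
from urllib.parse import urlsplit


def dom_style_protocol(url: str) -> str:
    """Return the scheme plus a trailing colon, mimicking location.protocol."""
    return urlsplit(url).scheme + ":"


url = "http://example.com/index.html"
scheme = urlsplit(url).scheme        # RFC-style scheme name, no colon
protocol = dom_style_protocol(url)   # DOM-style value, colon included
print(scheme, protocol)  # http http:
```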



