This didn't seem interesting to me at first, because there are so many ways to scrape a site, in every high-level language, that learning one more way to do it is going to bring diminishing returns...and I also didn't know that "Factor" was a language.
...unfortunately, it appears that the webpage of Joy's author has been lost to the web, hence the link to the Internet Archive. And don't forget that there was a very mainstream concatenative language at one point in time: PostScript.
The Factor documentation is hosted at http://docs.factorcode.org. Feel free to browse! It's the same set of docs that you have access to locally when you download the Factor binaries[1] or pull and compile from GitHub[2].
The concatenative.org[3] wiki also has similar starting material and pointers.
Factor is a fun language. The blog that I linked to is written by one of the Factor contributors.
Do you know of any "Factor for Forth programmers" tutorial? Factor is similar enough to Forth that its "for beginners" materials irritate me, but different (and larger) enough to make normal manuals useless by themselves.
I'd especially appreciate a bottom-up write-up, starting with the stack and cells (which feel familiar), and then introducing Factor's higher-level abstractions.
Better wash your brain of that implied misconception. Factor is much more like Lisp and Haskell (esp. point-free style) than Forth. I guess "learn Lisp and/or Haskell and Factor won't seem so foreign" isn't terrific advice. But right now there aren't many newbie guides at all.
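For anyone wondering what the point-free comparison means in practice, here is a rough Python sketch. It is my own toy model with made-up names (`push`, `dup`, `mul`, `add`, `run`), not anything from Factor itself: a program is a list of functions from stack to stack, and concatenating two programs is the same as composing them.

```python
# Toy model of a concatenative language. All names here are hypothetical
# and chosen for illustration; this is not how Factor is implemented.

def push(n):
    """Return a word that pushes the literal n onto the stack."""
    return lambda stack: stack + [n]

def dup(stack):
    """Duplicate the top of the stack."""
    return stack + [stack[-1]]

def mul(stack):
    """Replace the top two values with their product."""
    *rest, a, b = stack
    return rest + [a * b]

def add(stack):
    """Replace the top two values with their sum."""
    *rest, a, b = stack
    return rest + [a + b]

def run(program, stack=()):
    """Apply each word in order, threading the stack through."""
    stack = list(stack)
    for word in program:
        stack = word(stack)
    return stack

# New "words" are just lists of existing words, with no named arguments.
square = [dup, mul]
add_one = [push(1), add]

# Concatenating programs composes them: 3 squared, plus one.
program = [push(3)] + square + add_one
print(run(program))
```

Running `square` and then `add_one` on a stack gives the same result as running the concatenated list `square + add_one`, which is the whole idea behind the name.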
I dunno, syntactically it resembles Forth quite a bit, what with : ; for defining words and () for comments and all that. Anyway, I have no problem whatsoever with high-level abstractions in Factor, nor with its concatenative nature, nor with its macros and so on. I know all these features from other languages. What I want is just a description of how these high-level things map to assembly, I guess. For example, I just learned that "Internally, a quotation is a pair, consisting of an array and a machine code entry point. The array stores the quotation's elements". This is the kind of definition I want for all the abstractions in Factor. It's probably best to go through Slava Pestov's blog and pick up such scattered descriptions, but I'd really appreciate it if someone prepared a single article with all these definitions.
Note that the stack comments aren't actually comments in Factor - they're part of the function definition and are mandatory. The compiler will do a simple check to ensure that all of your stack inputs and outputs match up for each function call.
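To make that check a bit more concrete, here is a toy model in Python. This is only a sketch of the idea, not Factor's actual inference: the `EFFECTS` table, `infer_effect` and `check_definition` are names I made up, and an effect is reduced to a pair of counts (values popped, values pushed).

```python
# Toy stack-effect checker. Hypothetical model for illustration only;
# Factor's real compiler does considerably more than count values.

# Each known word maps to (inputs popped, outputs pushed).
EFFECTS = {
    "dup": (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "+": (2, 1),
    "*": (2, 1),
}

def infer_effect(body):
    """Infer (inputs, outputs) for a list of word names and int literals."""
    needed = 0  # values the body requires from its caller
    height = 0  # stack height so far, counting the needed inputs
    for token in body:
        if isinstance(token, int):
            pops, pushes = 0, 1  # a literal just pushes itself
        else:
            pops, pushes = EFFECTS[token]
        if height < pops:
            # Not enough values yet, so the caller must supply the rest.
            needed += pops - height
            height = pops
        height += pushes - pops
    return needed, height

def check_definition(name, declared, body):
    """Register a word if its declared effect matches the inferred one."""
    inferred = infer_effect(body)
    if inferred != declared:
        raise ValueError(f"{name}: declared {declared}, inferred {inferred}")
    EFFECTS[name] = declared
    return inferred

# Roughly ": square ( x -- y ) dup * ;" in this toy notation.
check_definition("square", (1, 1), ["dup", "*"])
```

Declaring `square` as `(1, 2)` instead would raise a `ValueError`, which is the toy equivalent of the compiler rejecting a definition whose body does not match its stack effect.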
You can write an entire web scraper with just a URL using http://scrape.ly
With scrape.ly I can just do this to crawl the entire Hacker News site across pages, grab the URLs, and extract any data from the page it lands on without defining any fields (it discovers them on its own), so you don't have to 'relabel' fields when the site changes layout. It also generates new IP addresses on the fly so you don't get stuck, and launches multiple threads for you to speed up the process. It works fully with AJAX sites and single-page apps. Flash support is coming too.
http://scrape.ly/s/{https://news.ycombinator.com/}
{next:More}{Space Monkey dumps Python for Go}*{fields:'Auto'}
Honest question (I don't mind downvotes if you disagree): why would you want to waste time writing web scrapers, keeping them running, and fixing the code? Multiply that by 100 or 1000 different websites and it becomes a full-time job. I want to get the data I need with the least possible overhead and as soon as possible, and I don't really want to be bothered with setting up environments and hosting for the scrapers, or with fixing bugs when sites change layout.
Are you familiar with the idea of implementing common problems for the sake of pedagogy? For example, someone who might want to demonstrate how a particular programming language can be used might start a blog, and in that blog said person might post articles demonstrating how you could attack a particular problem in that language.
Your criticism of this post comes across as tone-deaf. You might as well have written to the editors of Beautiful Code to lecture them about how the chapter on quicksort is horribly misguided and that everything a good software craftsman should ever care to know on the subject can be found at http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.ht...
Honestly, I meant no harm. I saw that we were talking about web scraping in other languages like PHP and Python, and I wanted to add to the idea above that Factor doesn't really provide more value than an implementation of the same job in any other language would. They all share the same web scraping overhead, which rests on the shoulders of the developer. All in all, I wanted to highlight that one shouldn't put so much effort into creating web scrapers, and to suggest a different tool that is specialized for the job described in the article.
The OP's template could have some more info on what Factor is, but there are a few links, including this wiki for it: http://concatenative.org/wiki/view/Factor/Learning
I honestly didn't know what a "concatenative" language was until I saw that "Six programming paradigms that will change how you think about coding" post that fronted HN last week:
http://brikis98.blogspot.in/2014/04/six-programming-paradigm...
So this is just a long way of saying...after just learning about concatenative languages, I'm really interested in what that paradigm brings to a common task...maybe there aren't productivity gains, but I love learning different philosophies of coding, and thanks to the OP for showing one practical example.