I feel kinda silly, having just spent the long weekend writing a threaded 4chan scraper. This is a super welcome change though. Even if you don't visit 4chan regularly, you can't ignore the VAST amount of content people upload there. I imagine some interesting statistics will come of this (I know I plan to).
> Even if you don't visit 4chan regularly you can't ignore the VAST amount of content people upload there. I imagine some interesting statistics will come of this.
That was my reaction too. Beyond statistics, this will make it easier to develop all sorts of user-facing and machine-to-machine applications -- for sharing, grouping, ranking, and linking items, and even for 'overlaying' the content on top of other social networks.
I'd expect the API to grow and mature over time, and am curious to see what comes out of this experiment.
Coming from the country where needing a parking permit to the garbage dump is the norm (i.e. only way to get rid of your trash) and can actually be a desirable activity (I can't even tell you how many books, monitors, speakers, etc. I've either restored or recovered in perfect condition for no cost)... I can't wait!
4chan represents a social cauldron unlike others. The first and foremost question I had hoped to find out was "What do people talk about when they don't have an identity to think of?" There is almost zero consequence in the case of failure. If you say something and no one responds, or everyone insults you, it is completely forgotten within a few minutes. You can say almost anything you want without fear of retribution. (Where else in life does this exist?) So, with no rules, what do people want to talk about?
BUT, be careful with the concept of 4chan and "no retribution". There have been countless examples of 4chan "reacting" to threads; I'm sure I don't need to go into detail here.
There is plenty of excellent content (and consequently, statistics); you just have to look a little harder. I would hazard a guess that you've taken a quick glance and dismissed the entire community. Feel free to correct me if I'm wrong.
While you're talking about the homophobia, let's not ignore the blatant racism on 4chan. The word n-gger is not only casually thrown around on those boards; at this point it's ingrained racism that's influencing a lot of young teens visiting the site and is subtly encouraged by the moderators. It's disgusting, and even though I'm a strong advocate for freedom of speech, I fear the power of effective propaganda.
Just a little bit of confirmation bias there, from my own experience. That notwithstanding, 4chan most definitely has a higher rate of homosexuals than the rest of the internet.
While you clearly pulled that percentage out of nowhere I would agree it's probably not too far off. While it's fair to call it bisexuality, it might be more accurate to simply label it sexual opportunism.
If you really think the use of fag and faggot reflects homophobia, you need to take a step back and look at where you are - the internet. Do you think that people posting gore do it because they enjoy the content? It's all shock value.
(Not really interested in whether 4chan's use of "fag" reflects homophobia. I am responding to your next sentence.)
>Do you think that people posting gore do it because they enjoy the content?
I think there are those who post gore because they get pleasure from grossing people out, outraging people or otherwise eliciting a strong emotional reaction (which is probably what you mean by "shock value") but I do not think that is all of it.
I think a significant fraction of the people posting gore do it because it will induce others to post gory pictures that the OP has not seen yet. And I think that they want that because they derive pleasure from seeing people and animals being harmed.
One of the reasons I believe that is a book I read called Among the Thugs, in which a reporter spent some time hanging out with British football hooligans. He reported that after running with the hooligans a while, engaging in violence and contemplating engaging in violence became pleasurable. This and other things suggest to me that many people are capable of deriving pleasure from seeing people get fucked up once they've acquired a taste for it.
The main reservation I have about 4chan is that it seems to be enabling many to acquire a taste for it (and for other things like harassing people on Facebook).
At least when I still browsed /b/, gore was almost entirely to dissuade people from browsing. The logic being that if you couldn't get over it, you weren't all that welcome.
> If you really think the use of fag and faggot reflects homophobia, [...]
Yes, it does.
> you need to take a step back and look at where you are - the internet.
Well...
I suggest reading this introduction to the subject of "why second-degree and ironic gay bashing works only if there is real homophobia somewhere": http://www.queerty.com/does-calling-someone-a-fag-really-mea... I have better articles about sexism and homophobia, but they are written in French.
People really like to dismiss that by calling it transgressive humor, but it's really not? You never truly get made fun of for being part of the majority.
Normalizing the "ironic" use of slurs normalizes the unironic use of them as well, and further entrenches privilege as a norm.
It doesn't take spending a lot of time in any nerd culture group to see that the racism, sexism, and homophobia are not ironic at all. Case in point, the enormous administrator/moderator-led backlash on SA in the past ~year against anyone daring to not openly welcome the death of anyone who is not a white cis male.
Came to post this. The only redeemable value in 4chan, in my opinion, is that the fact that posts aren't archived makes for a very interesting social experiment. An API firehose pretty much puts an end to that.
I'd say the fact that your posts are most likely to be forgotten, even if it is archived, is much more of a negative aspect of the site than a positive. How many times have I spent 30 minutes on a post, only for no one to respond to it, or worse, realize that the thread 404'd? It makes you look at yourself and wonder why you bothered.
Forced anonymity is the interesting part of imageboards -- the text BBS equivalents to anonymous imageboards, based off the original 2chan, manage to maintain a very similar flavor while featuring permanent archival of all posts, and enjoy longer-form discussion as a result.
> I'd say the fact that your posts are most likely to be forgotten, even if it is archived, is much more of a negative aspect of the site than a positive.
This is the most magical aspect of 4chan, which is why I don't care for archives.
> This is the most magical aspect of 4chan, which is why I don't care for archives.
The written word allows us to lend ideas (memes, concepts, what have you) a sense of permanence that they never would have had otherwise. But at the same time, it prevents them from evolving in a way they otherwise might have, if their exact origins were not so easily recorded & referenced.
I don't think it's a coincidence that 4chan, which lacks this permanence, is the origin of so many of the top memes of the past decade (and by 'meme', I don't just mean things like LOLcats).
(Gleick argues this same point in the first few chapters of The Information, for those who are interested).
EDIT: Just realized who I was replying to - if I may ask, are you concerned at all that an official API might detract from 4chan (by making said content more traceable)?
>The written word allows us to lend ideas (memes, concepts, what have you) a sense of permanence that they never would have had otherwise.
I agree, and that's why it was horrible for 4chan. In the beginning, there were new memes, concepts, and what have you every other day, and old stuff was forgotten (or rather used to show you'd been there for a while.) Now, it's just a constant recycling of the first few years of the site.
IMHO it was caused by archives and meme dictionaries. No need to lurk moar anymoar. Also, very little reason to laugh.
I totally agree. /k/ommando here. Thanks for keeping it relatively the same. I've thought about creating a bookmarklet or something to add some features I originally thought would be nice (threading/grouping linked comments, or alerting you when you get a reply), but I realized that features like these could fragment each thread and distort the flow of the conversation.
Two features I still think wouldn't conflict with the site are buttons to expand images inline in a thread and making the text links clickable (copying and pasting on a tablet sucks). I know there are bookmarklets which do this, but I can't display my bookmarks bar in Chrome on my tablet.
I understand that 4chan isn't very/at all profitable. I think there really is an opportunity for you to branch out on some boards rather than just linking to that jlist site. Have you considered doing more contextually aware ads or even relevant Amazon affiliate links in threads to boost your revenues? There will always be detractors, but I think most users really appreciate 4chan and would love to see you better compensated for it as long as it doesn't ruin the site in the process.
One more thing: this API will no doubt be used by people to create their own sites which add the features they want to 4chan. Do you consider the API a potential source of revenue, perhaps by charging for faster versions of it?
You've mentioned before that you visit 4chan every day. Do you visit any text boards?
I guess it's my own fault for fighting against the nature of the site, but sometimes I try to go the extra mile and put some effort into a post, and feel like no one even notices when I do. It's very discouraging for a conversation to have long since moved on by the time you've posted, or for a person you were trying to help to have given up already and left their thread. While I like 4chan for what it is, I'm still a little bit sad that the textboards never really took off as much, and I wonder why they didn't -- anonymous somewhat-long-form discussion sounds appealing to me. I'd be right at home in a text board with a fraction of the userbase of a 4chan board and a slightly slower pace, but most of the ones I know of are practically dead at this point.
I guess you kind of answered the question I had in mind. Most "western" people seem to see anonymity or even pseudoanonymity as something you use when you have something to hide, something you don't want traced back to you. Everything else, they don't seem to mind having their real name (or a pseudonym that they make little attempt to disguise) attached to. Or, on a different level, people believe that you need to have some kind of identity that you care about, a reputation that you want to uphold, to keep discussion civil and meaningful, and that this should be the key element differentiating an online community.
On the other hand, the default in my mind is anonymous discussion. You only don a pseudonym or reveal your true identity when it's actually relevant to the conversation, and immediately stop when it isn't -- people usually don't care about who I am, but they might care about what I have to say. As an example of this, I've browsed Hacker News daily for over a year now, and just recently got around to creating an account. I still feel uneasy about it, even though I'm posting with nothing but a pseudonym, and not posting about anything that I would particularly care about having traced back to me. I doubt Hacker News would have the same culture if it had allowed anonymous posting, but I certainly would have started contributing much earlier if I didn't have to create a pseudo-identity to do so.
It's important to note that while it might be true that Reddit serves as the West's 2chan, the two deliver markedly different experiences. 2chan being as enormous as it is (millions of posts per day) should indicate to you that there is some itch that a giant collection of anonymous textboards can scratch that Reddit can't.
Yeah, I'm aware. The problem stems from the fact that you can't type the unambiguous name "ni-channeru" without looking like an insufferable dork/weeaboo. 2ch and 2chan (.net) are the domain names of "ni-channeru" and Futaba Channel, respectively, but it's obviously very confusing to refer to them by their domains. Therefore, in English discussion, we tend to say 2chan or 2channel when we mean "ni-channeru", and Futaba when we mean Futaba Channel.
I'm a new graduate student in an American university. As part of my Data Mining/NLP project, I'm wondering if I can do something cool with this fresh API. Any ideas?
um... make a 4chan app that lets users up/downvote threads, then builds a naive Bayesian model of which keywords (e.g. "toasting epic bread") are correlated with the kind of threads you like. Netflix-like. A sort of automatic cream-extractor.
de-anonymizer based on posting times, writing style, what baits them to respond, etc. The thread-local unique IDs would help, giving you more stuff that you knew came from the same user. Don't know if this one is practical. Kind of scraping the bottom here...
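For anyone wanting a starting point, here is a rough sketch of the vote-trained keyword model described above. It is only an illustration under assumptions: the class name `ThreadVoteModel`, the `"up"`/`"down"` labels, and the tokenizer are all made up for this example, and it takes plain thread text rather than anything fetched from the API.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a thread's text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class ThreadVoteModel:
    """Multinomial naive Bayes over 'up' and 'down' votes (hypothetical helper)."""

    def __init__(self):
        self.word_counts = {"up": Counter(), "down": Counter()}
        self.thread_counts = Counter()

    def vote(self, text, label):
        """Record one vote; label must be 'up' or 'down'."""
        self.word_counts[label].update(tokenize(text))
        self.thread_counts[label] += 1

    def log_score(self, text, label):
        """Smoothed log prior plus smoothed log likelihood of each token."""
        vocab = set(self.word_counts["up"]) | set(self.word_counts["down"])
        total_words = sum(self.word_counts[label].values())
        total_threads = sum(self.thread_counts.values())
        # Add-one smoothing keeps an unseen label or word from becoming log(0).
        score = math.log((self.thread_counts[label] + 1) / (total_threads + 2))
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def predict(self, text):
        """Return whichever label scores higher for this text."""
        return max(("up", "down"), key=lambda label: self.log_score(text, label))
```

You would call `vote()` every time the user rates a thread and `predict()` on unseen threads to decide what to surface. A real project would probably want stop-word removal and far more training data than a handful of votes.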
Probably not. Since we don't have user accounts, and already have a bit of a spam problem, I'd be pretty worried about what a real POST API might bring.
Pipes/YQL seems not to have been banned yet, so getting 4chan data back wrapped in a callback for nothing-but-front-end hacks seems doable. Also along those lines: it'd be super-awesome if the API would take ifModifiedSince as a query parameter and not just in headers.
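For reference, the header-based version of that conditional request might look roughly like this from a script. This is a sketch using only the Python standard library; the function names are invented for the example, and whether the API answers with 304 Not Modified is an assumption based on the comment above, not something verified here.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_seen_epoch):
    """Build a GET request that asks only for content newer than last_seen_epoch."""
    request = urllib.request.Request(url)
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    request.add_header("If-Modified-Since", formatdate(last_seen_epoch, usegmt=True))
    return request


def fetch_if_modified(url, last_seen_epoch, timeout=10):
    """Return the response body as bytes, or None if the server answers 304."""
    request = build_conditional_request(url, last_seen_epoch)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: nothing new since last_seen_epoch
            return None
        raise
```

A browser-only hack loading JSON through a script tag cannot set that header, which is why a query-parameter equivalent would be handy for front-end use.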
Could someone explain to me how this could be leveraged (or if it could be) to gather a sort of stream of messages, a la the Twitter streaming API or reddit.com/r/all/comments.json?
I'd be interested in doing some language statistics and comparing them to the aforementioned networks.
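Since the API is read-only and poll-based, one way to approximate a stream is to fetch snapshots repeatedly and emit only posts not seen before, then feed those into whatever counts you care about. The sketch below assumes each post is a dict with an `"id"` and a `"text"` key; those field names, and the `fetch_snapshot` callable, are placeholders for this example rather than the API's actual schema.

```python
import re
import time
from collections import Counter


def new_posts(seen_ids, snapshot):
    """Return posts in snapshot not seen before, oldest first; updates seen_ids."""
    fresh = [post for post in snapshot if post["id"] not in seen_ids]
    seen_ids.update(post["id"] for post in fresh)
    return sorted(fresh, key=lambda post: post["id"])


def stream(fetch_snapshot, polls, delay_seconds=0):
    """Yield each new post once across `polls` calls to fetch_snapshot()."""
    seen_ids = set()
    for _ in range(polls):
        yield from new_posts(seen_ids, fetch_snapshot())
        time.sleep(delay_seconds)  # be polite between polls


def word_frequencies(posts):
    """Count lowercase word tokens across the posts' text."""
    counts = Counter()
    for post in posts:
        counts.update(re.findall(r"[a-z']+", post["text"].lower()))
    return counts
```

In practice `fetch_snapshot` would wrap an HTTP call and JSON decode, and `delay_seconds` should be large enough that the polling does not hammer the site.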
Sadly read-only, though it's not much work to parse the HTML and fake a submit through a POST request. Good luck submitting a 4chan app to Apple's app store though :)
Forgive me if this is a noob question, but does 4chan restrict embedding of images.4chan.org images from external urls? I was just playing around with the API and it seems all the images are rendered as the placeholder image that says "4chan.org".
If this is true, I don't know how to utilize this API to make something valuable since all I can do is get the url or text. Somebody please enlighten me. Thanks!
This sort of protection is usually done by checking the Referer header, which is trivial to set when retrieving something programmatically or with standard tools like wget. In any case, the API seems focused on reducing the processing costs of browser extensions that let the user view the page but add extra features to it. Requests from those would probably still look like a normal browser view of the image to the site by default, even if browser plugins can't change the client-sent header (I'm not sure whether the browser plugin API exposes it).
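To make the mechanism concrete, a referer-based hotlink check on the server side typically boils down to something like the following. This is a generic sketch, not 4chan's actual code: the function name and the allowed host are made up, and treating an empty referer as acceptable is just one common policy.

```python
from urllib.parse import urlparse


def is_hotlink(referer, allowed_hosts):
    """Return True when the Referer names a host outside allowed_hosts."""
    if not referer:
        # Many sites let empty referers through, since privacy tools strip them.
        return False
    host = urlparse(referer).hostname or ""
    return host not in allowed_hosts
```

Because the client chooses what to send in that header, a check like this only stops ordinary embedding from other pages; it does nothing against a script that supplies its own value.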
Why would you hotlink to 4chan pictures? They get deleted with their thread once the thread hits page 10 anyway, which can happen in under 5 minutes (on the more active boards like /b/).
There are already mobile clients for Android and, IIRC, there were clients for iOS, but they were banned from the app store due to some kind of infringement (I think it was adult content).
So I don't think a lot of stuff is going to change, apart from a reduction in the server load that the old clients/extensions caused.
I'm sure this is about trying to improve site performance, and charging for it would inevitably cause everyone to continue scraping the HTML, thus defeating the point.