Now, saner developers might tell you that Lambda is stateless and you can't use it to store data. You and I know those people just lack strength of character. Remember: if Lambda functions don't solve your problem, you're not using enough of them.
Following the link in your readme to davywtf's twitter post, I saw this comment by dade:
>I remember last year, and probably the year before that, and probably the year before that still, people continually rediscover CSS keylogging. Glad to see something more interesting this time. Good work.
Following up on this, I found an article[0] claiming that it isn't really an issue, since a pure CSS keylogger can only catch one character. It seems to me that your method of pseudo-refreshing the page by hiding old elements and delivering new ones to replace them might be able to overcome this impediment, which would make your on-screen keyboard unnecessary; if I understand correctly, you could just keep capturing whatever the user types in a field, and deliver a new field that already has whatever they've typed so far as its default value, with autofocus on. Do I understand correctly? Is there something I'm missing, like autofocus not playing well with content being delivered in continuous chunks? I'm not saying you should put in the additional effort to do this for your beautiful-horrible-clusterfuck of a project; I'm just curious about the feasibility and thought you might already know the answer.
When I saw this had 400+ upvotes, I kind of rolled my eyes and thought, Wow, HN will upvote the absolute worst idea if it’s an implausible-enough hack. But having actually read your explanation and FAQ, I have to admit every upvote is well earned, including mine. That is really concise and entertaining prose, and the feat is well done. Bravo.
A maybe more horrible idea, but also a more fun one, would be to deliver not an HTML page that loads forever, but an MJPEG stream, like webcam monitoring software does. Server-side rendering taken to the extreme.
Thank you for elegantly solving the “don’t click order twice!!!” problem:
Thankfully, our method of receiving data fixes that for us. Here's what happens:
We show an "a" button whose background image is like "img/a".
When you press it, the server receives the image request for "a"
The server then pushes an update to the client to hide the current button and replace it with one whose background image is "img/aa".
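A rough sketch of what that streamed markup could look like (the paths, class names, and exact structure here are my guesses, not the project's actual output): each chunk hides the previous button and replaces it with one whose :active background image encodes everything pressed so far, so the next press reports the whole string to the server.

    <!-- chunk 1: streamed when the page first opens (illustrative markup) -->
    <style>.key1:active { background-image: url("img/a"); }</style>
    <button class="key1">a</button>

    <!-- chunk 2: streamed after the server sees the request for "img/a" -->
    <style>
      .key1 { display: none; }                          /* hide the old button */
      .key2:active { background-image: url("img/aa"); } /* next press reports "aa" */
    </style>
    <button class="key2">a</button>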
Reading your comment is the first time I’ve ever pronounced FAQ in such a way as to rhyme with “hack” instead of as “fax/facts” or “Ef Ay Qu” or “teh Questions.”
I am happy to add “the hack FAQ” to my snack pack.
I always read FAQ as "eff-aye-que." Relatedly, I recently saw an HN comment with the line "an .edu," referring to the TLD, and at first took it as a grammatical error since I pronounced the "." as "dot" in my head rather than dropping it as the author intended. Then I realised I was pronouncing the whole thing as "an dot ee-dee-you," whereas I would naturally think of ".com" as "dot com." I'm not sure why, but I think one possibility is that the recent proliferation of .io domains has made me think of other less common TLDs as the sum of their constituent letters rather than as fragments of words. Another possibility is that less common TLDs need to be spelled out for people in normal parlance more frequently. Yet, I'm pretty sure I've been thinking of the recently added ".dev" TLDs as "dot dev," not "dot dee-ee-vee," perhaps specifically because I've never had to spell it out for somebody in speech? Or maybe because we use "dev" as a standalone word already, as in "dev house?" I dunno. It would be interesting to see a survey of how different people naturally pronounce acronyms and word-fragment-TLDs like this.
* FAQ is "fack", though I hear "eff-aye-que" often enough.
* .edu is "dot ee dee you" not "dot ed-you", though I pronounce all of the other old-school domains as words.
* .dev is definitely "dot dev", since "dev" is a word.
It may be apocryphal, but I remember reading somewhere that Tim Berners-Lee chose the name "World Wide Web" because the acronym was harder to say than the word itself.
Same for Americans, but it’s ‘ee-oh’ for a lot of other languages. I like to say “dot yo” whenever I get the chance, and flash my best west side hand sign.
tl;dr css hover selectors that change the background image don't actually cause the browser to GET the specified background image until you hover over it, thus creating a way to send data from a web page with no javascript.
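Concretely, a single rule like the one below is enough; the logging URL is made up, but the point is that the browser only issues the GET the first time the selector actually matches:

    /* hypothetical tracking rule: the image is requested only when the element
       is first hovered, so the request itself is the signal */
    #signup-button:hover {
      background-image: url("https://example.com/log?hovered=signup-button");
    }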
We used to do that to track emails. Gmail's fix was to cache the emails' images on their own servers, so your server is only hit once. They also ignore selector:hover{} rules, so you can't have hover effects.
But it doesn't matter: if Google downloads all the images for all your emails up front, then showing them to you is just a fetch from their own servers. Similar to the ad-blocking extension that clicked all the ads on the page (in isolation) so that the tracking would be useless.
This would actually be detrimental to users. Responsible email publishers use lack of opens as a signal to reduce the volume of emails sent and, eventually, unsubscribe you automatically. Gmail causing a lot of bogus engagement would make it look like people can't get enough of your content.
I'd love to meet these people. I've yet to have a good email publisher experience. Whether it's a Fortune 500 co or the newest startup, they all terribly abuse email.
> unsubscribe you automatically
What is this magic? I've never once been automatically unsubscribed from anything.
Hi, pleased to meet you. Even though everyone on our newsletter list specifically signed up to get newsletters, we’ll still warn you and then unsubscribe you if you do not engage for a long time.
Google, in particular, will send all of a sender's mail to everyone's spam folder if it sees low engagement across all Gmail users... so it is in publishers' own self-interest to remove disengaged users.
I believe Mailchimp has something that automatically unsubscribes users if they haven't opened your emails for x amount of time, but the number of publishers that use this is probably pretty low.
> Responsible email publishers use lack of opens as a signal to reduce the volume of emails sent and, eventually, unsubscribe you automatically.
That does not remotely sound like the behaviour of responsible "email publishers". Responsible behaviour is to only email people who asked for it, and to stop when they tell you to stop. Clever trickery to spy on people is not the behaviour of responsible people.
If their intention was as you say, it would be really stupid and unreliable trickery, not just because some systems might load the images without the user reading the email, but also because the user might read the email without loading the images. And even if they were to only and reliably load on reading, reading the email does not in any way imply that the user wants to receive it. Lots of people open email before throwing it away. Some mail readers show a preview which may be enough to read the message. Does that count as reading or not?
No responsible organisation would rely on this kind of trickery, and no organisation that relies on this can be considered responsible in their handling of email.
Agreed, it's a terrible idea. I've been subscribed to the NY Times' "morning briefing" email for a long time. I'm using an IMAP client, and I never bother to load the images for this, because all I want is a text summary of the day's news.
They recently sent me an email saying something like "we noticed that you're not reading our email, so we're unsubscribing you." Apparently I hadn't been loading their tracking pixel/script/CSS, so they thought I wasn't "engaging" enough. This was despite the fact that I clicked on links to full articles, which had all sorts of tracking info embedded in a redirect.
A responsible email publisher offers a clearly-visible "unsubscribe" link at the bottom of the email, which will unsubscribe you with a single click. No nags, no checklists of email categories, maybe an "are you sure?" page at most, with equal-sized "yes" and "no" buttons. One or two clicks, and I don't hear from you again.
A dodgy email provider is more likely to "use lack of opens to reduce volume." If I don't trust some company to actually unsubscribe me when I ask, I'll just filter their domain directly to the trash. Clicking on spammers' "unsubscribe" links is usually a bad idea.
I don't know why this is being downvoted. I work for an ESP, and this is an accurate statement. Whatever you think about marketing emails, you probably don't want gmail to simulate click traffic. Trust me.
> Responsible email publishers use lack of opens as a signal to reduce the volume of emails sent and, eventually, unsubscribe you automatically.
No, they don't.
> Gmail causing a lot of bogus engagement would make it look like people can’t get enough of your content
For a few days, perhaps. Gmail accounts for a significant proportion of all email. 'Publishers' would quickly realise they are no longer able to track emails sent to Gmail. To fail to do so would be their loss; if that weren't the case, they wouldn't bother with tracking at all.
"Sorry boss, please fire me and ask one of the 200 other employees here to use that system with built-in tracking to make a newsletter for this clothing brand."
Take some pride in your work. Software is literally the most in-demand profession today; you don't have to work for corrupt employers. You can contribute something positive to society and still make decent money.
In some other profession I'd make exceptions, but seriously the amount of money flowing to developers these days, there's no excuse to sell out.
What you describe as a corrupt employer is anyone that uses Mailchimp, SendinBlue, MailPoet, CampaignMonitor, Dotmailer, MailGet, etc.
Do you think that companies who send newsletters do it without any trace of analytics? Every link is tracked, every image is tracked. On the web, there are heatmaps of every single mouse movement. Your keystrokes used to be tracked too, until GDPR hit. Anyone who works in analytics can play back the path visitors took through the site.
It doesn't take any advanced team to do that. You simply drop a .js file from some third-party CDN in your site's head and you have all that data. Any mom & pop shop that has a website has access to that data.
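To be concrete, the entire integration is typically one tag in the page head; the URL below is a placeholder, not any particular vendor:

    <head>
      <!-- placeholder script URL: swap in whichever analytics vendor you use -->
      <script async src="https://cdn.analytics.example/tracker.js"></script>
    </head>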
Everyone does it, that's the current state of the industry. To refuse work from anyone who does analytics would mean to leave the web industry.
> but seriously the amount of money flowing to developers these days
At the time, I was paid CAD$40k/year. According to glassdoor.ca, the salary for the same position would be CAD$59k/year today. Not everyone works from the inside of a bubble.
> Do you think that companies who send newsletters do it without any trace of analytics?
I don't doubt it.
> Everyone does it, that's the current state of the industry.
That's not an excuse.
> To refuse work from anyone who does analytics would mean to leave the web industry.
Analytics as a whole is not the issue. Doing shit like abusing CSS in order to track when someone opens an email and what they do in that email is evil. That violates the user's trust and expectations. I don't doubt that any time I spend on somebody's website will be tracked and analyzed by them. But they have no right to track and analyze me on my own properties, like while reading my own email.
"Everybody is doing it" is not an excuse for evil behaviour. Be better than others, don't contribute to this race to the bottom.
59k a year is a very healthy salary. I know real Engineers doing things like verifying buildings and bridges who make less than that. Honestly, to think that $59k a year in Canada is too little money to afford a moral compass shows how much of a bubble you are already in.
This tracking is done by noting when someone loads an image from our server.
When their device calls our server, we have access to that person's basic information. Usually this information isn't kept per person; it's only counted, to know how many users opened the email.
That's the equivalent of caller ID. This is the least hurtful and evil method of tracking I can think of.
I don't understand why you are so outraged by it.
Nobody is forcing you to open the newsletter email titled "AMAZING deal from [brand], get ONE FREE if you purchase THREE!" that you just received and much less to click the "request images in this email" button.
I could understand your point if we were talking about "canvas fingerprinting", where an invisible image is generated and the user's GPU is singled out with a unique token by exploiting the hardware-specific output produced while rendering the image, allowing you to track a user across browsers, sites, and logins, and even after a software format or operating system reset.
However, right now I'm merely talking about tracking the number of hits our server receives for "banner-image.jpg". This is not even information unique to the viewer.
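For context, the thing being described is just an ordinary image tag in the newsletter HTML; the hostname and path below are illustrative, and the "analytics" is simply the server counting how many times that file gets requested:

    <!-- the server-side "tracking" is a count of requests for this image -->
    <img src="https://news.example.com/img/banner-image.jpg" alt="AMAZING deal banner">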
Yeah, I'm playing around with the idea of using the background image trick to profile the integrity hash speed of the visiting browser.
That little background image feature in CSS has given up quite a bit of data in similar situations (people used to use it to check the browsing history of :visited links before browsers started blocking that).
But that hash is a regular, fast hash that takes like 1µs to compute, right? Doesn't that get lost in network jitter? Wouldn't averaging the time it takes to run for(i=0;i<Math.pow(2,18);i++); over 10 runs be much more accurate? Or is this meant to spite the 0.01% of visitors who really try not to be tracked and have turned off JavaScript?
Preloading images so they're ready to show when needed does not sound like unreasonable behaviour, especially on a connection with high bandwidth but low latency.
It will make this kind of example really slow, but if the intention is to break this kind of spying, then that's okay.
This is just a different spin on the (now fixed in most browsers?) trick of using ':visited' with a background image to uncover which sites the user has visited.
It's things like this that drove me to start browsing the web with CSS disabled by default. It's yet another vector for tracking.
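For anyone who missed that era, the :visited variant looked roughly like the rule below (URLs illustrative); browsers have since restricted which properties :visited can change, precisely so this background image can never be requested:

    /* historically fetched only if example.net was in the user's history;
       modern browsers block :visited from triggering requests like this */
    a[href="https://example.net/"]:visited {
      background-image: url("https://tracker.example/visited?site=example.net");
    }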
> tl;dr css hover selectors that change the background image don't actually cause the browser to GET the specified background image until you hover over it
This specific page uses :active, not :hover, so it is really no different from a web form that performs a web request each time you press a submit button. It just does not reload the page.
It seems all those CSS tracking tricks (a:visited, [value=...], now :hover) depend on external resources (background images) being loaded lazily, only once the selector is matched.
Wouldn't an easy solution be for browsers to always download all url('...') references found in the stylesheets, even if the selector is never matched?
It would only be bandwidth-heavy for sites that abuse this feature, so that's actually fine by me. And when CSS gets added at runtime, you can prefetch that too.
You seem to be confused about the meaning of "CSS tracking". Detecting that a user clicks on a link that you have shown them is harmless — as Google demonstrates, a website can always track its own outgoing requests by replacing all its outgoing links with redirects. That is an inherent part of hypertext.
The infamous "a:visited" tracking didn't simply track your visits from Google — it tracked all your visits across the entire Internet. Browser vendors are a bunch of lazy hacks who can't even implement per-site link history (just like they failed to implement per-site cookies). All "a:visited" states are sourced from a single SQLite database that stores your full web history. THAT is the "CSS tracking", because it can tell a page about visits from completely different domains. Instead of separating your web history per domain, those <censored> have crippled the :visited selector in several undocumented ways.
> Browser vendors are a bunch of lazy hacks who can't even implement per-site link history
But who asked them to? As far as I know, the spec says nothing about per-site histories, and I find it much more useful to know if I already visited a site, regardless of the origin - for example, if I'm researching a topic, two or more sites might link to the same place, and I don't want to open it multiple times.
Plus, the idea that one can look at modern browsers, which are some of the most complex software packages being developed, and think "clearly these people don't know how to add an 'origin' column to a SQLite database", well, it boggles the mind.
> the idea that one can look at modern browsers, which are some of the most complex software packages being developed, and think "clearly these people don't know how to add an 'origin' column to a SQLite database", well, it boggles the mind
It boggles my mind too. Imagine what would have happened if those small JavaScript snippets, used mainly to add cute visual effects to pages, could check whether some image from a different site is already in the browser cache by performing cross-site HTTP requests... That would allow a completely new dimension of spying on web users!
Fortunately, browser developers are some of the most competent people in the world. They would never give web pages too much power by letting them start CPU threads, use OpenGL, allocate arbitrary amounts of memory, or read your battery level to set exorbitant taxi tariffs for people in a pinch. Browsers are well-designed and highly secure, because they are being updated with security fixes every day, sometimes even multiple times a day.
I don't care about the goals (or competency level) of browser makers. But it is hard to deny that they are repeating the same mistakes Sun committed in the late '90s with browser applets. They don't learn.
If your connection is so slow that background images are a problem, then don't load background images at all. That also fixes the problem.
If you want to prevent this kind of spying, the solution is to load these kind of interactive background images either always or never, but not on the interaction they're supposed to track.
It does seem odd that browsers don’t preload images for hover effects. I guess in practice most hover effects just use colors? Otherwise there’d be a noticeable lag the first time you hover or something.
Seems like a compelling argument for not lazily loading urls referenced in CSS rules. If the browser progressively loaded everything in the stylesheet without user interaction it would presumably consume more bandwidth in many situations but it would also blast this server with a continuous stream of the entire alphabet.
Now I don't know if there's any causal relationship between the two (or the three), or if it's just a big coincidence, since the README says the inspiration for it came from a tweet posted a few days ago.
Exploits rely on interactions between parts of a system, not on crunching numbers. A pure Turing machine is perfectly un-exploitable since <s>its only i/o is supposed to be to the keyboard and the screen</s> (edit: it has no i/o whatsoever). CSS would have more holes than JS if it offered more APIs (which it might do unknowingly by mistakes in programming).
Indeed, and that exactly puts all the onus of exploitability on the machine's environment. The tape might as well be written by regular expressions, if some of the outputs make the ‘interpreter’ do network requests and stuff.
Probably a silly question, but is there a working demo for this? This is one of those things you kinda have to see in action, and there doesn't seem to be a link to anything like that in the readme.
It still uses CSS so as not to be ugly, to colour-code messages by user, and to position the "Enter message" box below the chat history, but I hope it gets the basic idea across. It should continue to work if you disable styling.
Skeptical because there needs to be some dynamic component to it which is what the CSS pseudo-selectors are achieving. Would love to see a working demo.
You don't need letter buttons. You can have a regular text box in an iframe and submit the form with the text.
I actually remember websites doing exactly that years ago.
Some would display the chat like this one does, by never closing the connection, as described in the repo. Another trick was to just send a refresh header (or use a meta tag) to automatically reload the iframe displaying the chat every $x seconds. The latter was, IIRC, more prominent, because you could do it on any web host with PHP and MySQL, as it doesn't require any long-lived processes.
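A minimal sketch of that meta-tag version, with made-up file names and an arbitrary interval: the outer page embeds the chat log in an iframe, and the log document simply asks the browser to re-fetch it every few seconds.

    <!-- main page (illustrative): the server-rendered log lives in its own frame -->
    <iframe src="chat-log.php" name="log"></iframe>

    <!-- output of chat-log.php: reloads itself every 5 seconds (interval is arbitrary) -->
    <meta http-equiv="refresh" content="5">
    <ul>
      <li>alice: hi</li>
      <li>bob: hello</li>
    </ul>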
That's exactly how I implemented a web chat around 2002. IIRC, Internet Explorer didn't update the iframe with the chat messages reliably, so I had to fall back to reloading. The whole thing was rather inefficient, simply polling a database for new messages every second, but it worked fine for about 10-20 users.
You could write a UI framework using these ideas to provide rich client-side functionality to Tor hidden services, where users typically disable JavaScript.
Definitely the most creative code I'll be seeing this week! I realize this is really bad practice and likely to be fixed since it can be used for tracking users, but: are there any advantages to using CSS over Javascript, in theory?
What exactly do you want to block here? The user clicked a form control, and the form control sent a request to the server. It is no different from clicking a link.
You know that HTTP allows websites to "track" you each time you visit them, right? The horror!
A Ruby DSL that's indistinguishable from JavaScript. http://kevinkuchta.com/_site/2017/07/disguising-ruby-as-java...
^ and in talk form: http://confreaks.tv/videos/rubyconf2018-ruby-is-the-best-jav...
A URL shortener using AWS Lambda - JUST Lambda. No data store. http://kevinkuchta.com/_site/2018/03/lambda-only-url-shorten...