Kind of like when I first dragged a window across multiple monitors in the 90s. We are so used to content being stuck in artificial containers that it's kind of crazy when it transcends them.
There was a clever game called Metal Marines back then that exploited this idea - the game was a 1v1 base-building RTS where each player's "base" was on its own map in its own window. When you fired missiles or launched APCs from one base into the other, the missile would fly out of your base window, over the desktop and into the other window to hit his base. The effect was jaw-dropping even though it was a simple and cartoony game.
Thank you! I was looking for that game a few days ago, but couldn't remember the name at all! And it looks like there's a reboot of it on Steam: http://store.steampowered.com/app/335940/
What's the reason for allowing web pages to get absolute screen coordinates?
This is a privacy leak. I have a 24" screen, and I don't keep the browser window maximized because it would be too big. I presume other people do too, and I'm pretty sure most have a preferred size and position.
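For anyone wondering what exactly is exposed, here's a minimal sketch of what any page's script can read today through standard APIs, with no permission prompt involved:

    // Standard window/screen properties readable by any script on the page.
    console.log('window position on desktop:', window.screenX, window.screenY);
    console.log('window outer size:', window.outerWidth, window.outerHeight);
    console.log('monitor resolution:', screen.width, screen.height);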
Just use NoScript if you are concerned. JavaScript allows a large number of other ways to identify you as well. Direct ways of course include cookies / local storage / Web SQL, but other indirect ways include localization preferences, OS / browser / plugin / extension fingerprinting, HID inputs, side channel profiling etc. etc. – some combination of these allows uniquely identifying at least 80% of web users.[1][2][3]
Allowing a site to execute JS in your browser is equal to trusting them, like it or not, and browser vendors are definitely in the business of adding new APIs rather than reducing attack surfaces.
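To make that concrete, here's a rough sketch of how just a few of those signals combine into a fingerprint; real trackers use many more (canvas, fonts, AudioContext, and so on), and the hashing step is omitted:

    // A handful of passively readable signals, joined into one string.
    // Individually weak, but the combination is often close to unique.
    var signals = [
      navigator.userAgent,                  // browser + OS
      navigator.language,                   // localization preference
      new Date().getTimezoneOffset(),       // coarse location
      screen.width + 'x' + screen.height,   // monitor resolution
      window.devicePixelRatio,              // display scaling
      navigator.plugins.length              // crude plugin fingerprint
    ].join('|');
    console.log(signals);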
NoScript is not about disabling JavaScript but allowing only white-listed domains to execute JavaScript. Generally, I trust the domains I visit frequently, and have them white-listed. I explicitly block domains of "analytics" and social networking services, since these do not offer me any value-added content.
However, if I follow a link to a domain I have never visited before, I will first see if the content is viewable without JavaScript. Most of the time, it actually is.
But sometimes the content does not render at all or the page layout is broken beyond recognition. Then I'll try "temporarily allowing" JS from the site's own domain (I've noticed most sites these days bundle JS from 5+ domains, most of which are analytics and social networking services). For maybe 80% of the sites that don't work without JS, this fixes the issue and I am able to read the content. It takes maybe 2 seconds to temporarily white-list and reload the page.
The rest are generally pages that assume the analytics library is always present in the page's JS scope and crash when it is not, leaving the content unreadable because the JS layout code never runs. A quick peek at the JS console while the page is loading generally reveals what the issue is.
Sometimes I just ignore those pages, but if I really badly want to see the content, I can launch a one-off incognito window for the page and let it execute with all the JS tracking and social network code allowed. This solves the issue for almost all remaining pages. If problems still persist, the page is generally just broken – maybe it only works with some specific browser, like Google Chrome (I use Firefox), to start with.
If you're using NoScript in the fine-grained manner you describe, and for privacy reasons (not just security), I wonder whether you've ever looked at uMatrix[0]. Same deal but a bit more performant, and it also covers the whitelisting of other privacy-leaking aspects such as cookies, CSS, tracking pixels, iframes, etc.
I have used NoScript consistently for years and I disagree with you. Essentially the only thing NoScript does is avoid automatically trusting websites you haven't visited before. For the most part every single website I go to needs to have JS enabled for anything to work beyond just reading content.
And even then, I would say 70% of the time some critical piece of content on a website does not work with JS disabled, be it images, text, video, etc.
I disable JS on my phone, and I disagree with you: the vast majority of the text content web is perfectly readable without JS.
If you want to watch video, you're out of luck. If you want to use a web app, you're out of luck. But if you just want to consume text content, the majority of the web just works, and a lot faster too.
(I've never been able to get NoScript to work right, it's always given me problems. Perhaps part of the problem is NoScript?)
> For the most part every single website I go to needs to have JS enabled for anything to work beyond just reading content.
What percentage of websites is that, though? Whether it's feasible of course depends on your browsing habits. I don't click social media stuff; I participate only if I really want to or if I am part of a community.
The majority of sites I visit are either regular revisits (rules are easily set up then) or random browsing where security & privacy by default is good.
I never used NoScript but am a bit of a uMatrix fan. There I can easily allow things. NoScript looked super complicated.
I've been using NoScript for years as well, and I have to disagree with you. Now that NoScript auto-permits the base domain (which you can switch off), I don't have to do much manual permissioning. There's the occasional bit, but really, 70% is a ridiculously high estimate.
Then there's the occasional 'funny photo' site which won't work until you enable 15 different sources - in which case, I just pop open Chrome if I really want to see that funny photo.
And then starts the crazy hunt for the one thing you need to turn on to make the page work.
Many years ago I was trying to create a public DB/wiki telling us which things we need to turn on to get each page to work, but it got abandoned before I really started.
On the contrary, I would argue that most new web platform features have been designed with more thought for security. We generally ask for user permission before divulging data, for example. Some new features help mitigate past mistakes, too, like Content Security Policy.
>> What's the reason for allowing web pages to get absolute screen coordinates?
Web developers have always pushed for more access to information about the user and their environment. Browser and tool developers are happy to provide that access. There's always some use case that sounds reasonable, but you're right that it's just a security issue waiting to happen.
These holes are also being talked about in the new Wayland display server on Linux. Warping a mouse pointer, color picking, and knowing your app's place on the desktop are all security violations. They are being very careful with that stuff because it's an insecure free-for-all with X.
Every time I upload an attachment to gmail or a picture to facebook, I wonder how secure things are. Those seem to require user action, but do they really?
> Web developers have always pushed for more access to information about the user and their environment. Browser and tool developers are happy to provide that access. There's always some use case that sounds reasonable, but you're right that it's just a security issue waiting to happen.
I'm torn. On the one hand I understand the privacy implications; on the other hand, if you wanted to be serious about them, you'd have to get rid of JavaScript and half of CSS. Every interesting feature can be turned into a privacy/security violation; how far are we willing to go in removing them?
> These holes are also being talked about in the new Wayland display server on Linux. Warping a mouse pointer, color picking, and knowing your app's place on the desktop are all security violations. They are being very careful with that stuff because it's an insecure free-for-all with X.
I know the quip about how in IT paranoia is not a sickness but a job requirement, but damn it...
As for web developers pushing for more information: no surprises there, it's so they can more precisely fine-tune the layout of the "app" (notice how they refer to what used to be called a site with a term that used to denote something running locally).
That it also can be used to fingerprint the computer, and by extension the user, is a side effect, not a goal.
I'd say that maximizing is more general than having a custom-sized window. There are only so many resolutions in use (and I would guess 50%+ are either HD, Full HD or QHD). Resizing your windows yields nearly unlimited options for browser sizes.
It's a question of consistency. A few version numbers + window sizes are going to stay the same every visit.
It only takes 33 bits of information to uniquely identify everyone on the planet, and you can likely get more than one of them from a maximized window.
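The arithmetic, for anyone who wants to check it:

    // ~7.5 billion people; log2 of that is just under 33,
    // so ~33 bits of stable information single out one person.
    Math.log2(7.5e9); // ≈ 32.8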
How so? Wouldn't there just be a one to one mapping between resolution and maximized windows size? The only way to prevent this is to not maximize windows and instead have custom sized ones, but that just makes identifying you even easier.
The laptop I'm using right now has a 1366x768 screen; a maximized browser window is 1306x768. I bet many other users of that screen size are going to have browser windows with the full 1366 px width but a height of less than 768 px. The details depend on the OS, the size and position of the taskbar/dock/whatever, ... It's not much, but it adds a bit or two of data.
Custom sized ones make you easier to identify only if you don't change them.
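To make that "bit or two" concrete, a sketch of what a script can compare (the maximized check is only a heuristic; exact values vary by OS, browser chrome and taskbar settings):

    // Rough heuristic: a maximized window usually matches the available
    // screen area; the leftover gap encodes taskbar/dock size and position.
    var maximized = window.outerWidth === screen.availWidth &&
                    window.outerHeight === screen.availHeight;
    var widthGap = screen.width - window.outerWidth;
    var heightGap = screen.height - window.outerHeight;
    console.log({ maximized: maximized, widthGap: widthGap, heightGap: heightGap });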
I've used it to automatically position auxiliary windows adjacent to the main window in a data analysis application. I don't see how it reveals anything private about the user... my screen resolution and window location are not secrets to me.
But even if it's unique, why does that make it trackable? The overwhelming majority of devices have one of a few common display resolutions, so what's there to link across sessions for identifying users?
The idea is not that your resolution is unique, it's that the maximized window size is unique. They can then track that you have more than one monitor, what size that monitor is, and, probably with some easy math, a rough idea of how many you have. That setup is probably a lot more unique than expected. That, plus possibly some canvas fingerprinting or something similar, could probably keep pretty good tabs on you. I have 3 identical 21 inch monitors and I'm browsing on the left-most monitor. Anyone else have that same setup?
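A sketch of the multi-monitor part (a rough heuristic only; the exact semantics of screenX across monitors vary by browser and OS):

    // If the window sits outside the bounds of the screen it reports,
    // there is almost certainly more than one monitor attached.
    var probablyMultiMonitor =
        window.screenX < 0 || window.screenX >= screen.width ||
        window.screenY < 0 || window.screenY >= screen.height;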
I don't think the method you're describing exists. If you want child coordinates relative to parent coordinates, you would use both the parent and child's absolute coordinates.
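For example, a sketch of deriving a relative offset from two absolute positions (the popup here is just for illustration):

    // Open a child window, then express its position relative to the opener.
    var popup = window.open('about:blank', '', 'width=300,height=200');
    var relX = popup.screenX - window.screenX;
    var relY = popup.screenY - window.screenY;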
Thanks to you (and others who've provided this information). I guess my question now is: what is the privacy leak being described? I understand that the browser is confessing information about the machine to the web page, but I don't understand why that represents a private piece of information.
I feel like the information about where the window is located on the screen is useful for implementation and I can't think what it says about me as a user.
Nothing on its own. However it is one of the pieces of information used to create a unique fingerprint for your browser which can be used to track you as you travel across the web. IIRC Tor browser recommends never resizing the window from the browser's default window size.
I think the thing I like most about this is how it's evidently very imperfect - the ball sometimes goes missing, the rules for what velocity it needs to bounce from one window to another seem hazy, etc - but none of that really matters: it's fun and cool to play about with for a minute or two, it doesn't at all take itself seriously, and you know that any real implementation of this sort of thing, if there were an actual need for it, would be implemented more robustly anyway.
I made a project in college like this that used the Windows API to enumerate the rectangles of the windows; you could create shapes and other things and throw them around your desktop.
Drawing programs such as Photoshop and Corel Painter can do it (in fact they tilt the window you paint into while keeping menus etc. straight, which is what one wants them to do). I guess it's a bit harder to do in one's own program than it ought to be though, especially with the stellar hardware support provided by the cheapest integrated GPU.
Then you could make the ball roll down towards the corner and drop into another window, of course!
The NeWS window system let you reshape the entire framebuffer to any orientation or clipping, as well as individual windows and sub-windows, in the late 1980's.
Why shouldn't I be able to lie down on my side next to a laptop and read a web page off the screen sideways, or adjust the rotation of the window to match the inclination of my pillow?
The other thing I want the window manager to support (which NeWS couldn't do since PostScript only supports 2D affine transforms) is estimating my head position relative to the screen, and projecting the window in perspective so it looks rectangular from an oblique viewing angle, so I can watch a movie on an extra screen off to the side.
If you could rotate windows around on the screen, Browser Ball should be able to use the laptop accelerometer to detect the true direction of gravity, and bounce the ball accordingly as you turned the windows and the laptop itself around.
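A sketch of how that could look with the devicemotion event; applyGravity is a hypothetical Browser Ball hook, and recent Safari additionally requires DeviceMotionEvent.requestPermission() first:

    // accelerationIncludingGravity gives the gravity vector on hardware
    // that exposes a motion sensor (it is null on most desktops).
    window.addEventListener('devicemotion', function (e) {
      var g = e.accelerationIncludingGravity;
      if (g) applyGravity(g.x, g.y); // applyGravity: hypothetical physics hook
    });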
Unfortunately Apple stopped putting accelerometers in laptops with SSD drives, since they were only used to retract the heads when falling.
You can do a lot of fun things by modeling and tracking the positions of multiple people's heads and multiple devices in augmented reality! Here are two people using two tablets, two laptops, and a desktop computer together with Pantomime: [1]
The pseudo-billboard idea is cute, but I wonder how it fares in readability on non-HD screens (font rendering in 3D space may impede it, and then there are viewing angles). I'd love to see it done anyway; it's a cute idea.
Another 'this is <year>' rant: we should have way more sensors on USB or I2C and have our laptops integrate with the real world more.
Here are some ideas and discussion about aQuery -- like jQuery for accessibility [1] -- which would be useful for implementing this kind of stuff on top of existing window systems and desktop applications.
Morgan Dixon's work is truly breathtaking and eye opening, and I would love for that to be a core part of a scriptable hybrid Screen Scraping / Accessibility API approach.
Screen scraping techniques are very powerful, but have limitations.
Accessibility APIs are very powerful, but have different limitations.
But using both approaches together, screencasting and re-composing visual elements, and tightly integrating it with JavaScript, enables a much wider and interesting range of possibilities.
Think of it like augmented reality for virtualizing desktop user interfaces. The beauty of Morgan's Prefab is how it works across different platforms and web browsers, over virtual desktops, and how it can control, sample, measure, modify, augment and recompose guis of existing unmodified applications, even dynamic language translation, so they're much more accessible and easier to use!
----
James Landay replies:
Don,
This is right up the alley of UW CSE grad student Morgan Dixon. You might want to also look at his papers.
Don emails Morgan Dixon:
Morgan, your work is brilliant, and it really impresses me how far you've gone with it, how well it works, and how many things you can do with it!
I checked out your web site and videos, and they provoked a lot of thought so I have lots of questions and comments.
I really like the UI Customization stuff, and also the sideviews!
Combining your work with everything you can do with native accessibility APIs, in an HTML/JavaScript based, user-customizable, scriptable, cross platform user interface builder like (but transcending) HyperCard would be awesome!
I would like to discuss how we could integrate Prefab with a JavaScript-able, extensible API like aQuery, so you could write "selectors" that used Prefab's pattern recognition techniques, bind those to JavaScript event handlers, write high-level widgets on top of that in JavaScript, and implement the graphical overlays and GUI enhancements in HTML/Canvas/etc. like I've done with Slate and the WebView overlay.
Users could literally drag controls out of live applications, plug them together into their own "stacks", configure and train and graphically customize them, and hook them together with other desktop apps, web apps and services!
For example, I'd like to make a direct-manipulation pie menu editor that lets you just drag controls out of apps and drop them into your own pie menus, which you could then inject into any application or use in your own GUIs. If you dragged a slider out of an app into the slice of a pie menu, it could rotate it around to the slice direction, so that the distance you moved from the menu center controlled the slider!
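Purely to make the idea concrete -- aQuery doesn't exist, so every name below is hypothetical -- something like:

    // Hypothetical: select a live slider widget (by accessibility role or by a
    // Prefab-style pixel pattern), bind a handler, and re-host it in a pie menu slice.
    aQuery('slider', { app: 'SomeApp', match: 'pixels' })
      .on('change', function (e) { console.log('value:', e.value); })
      .moveTo(pieMenu.slice(3))         // drop the control into slice 3
      .rotate(pieMenu.slice(3).angle);  // align it with the slice direction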
Video: Prefab: What if We Could Modify Any Interface? Target aware pointing techniques, bubble cursor, sticky icons, adding advanced behaviors to existing interfaces, independent of the tools used to implement those interfaces, platform agnostic enhancements, same Prefab code works on Windows and Mac, and across remote desktops, widget state awareness, widget transition tracking, side views, parameter preview spectrums for multi-parameter space exploration, prefab implements parameter spectrum preview interfaces for both unmodified Gimp and Photoshop: http://www.youtube.com/watch?v=lju6IIteg9Q
PDF: A General-Purpose Target-Aware Pointing Enhancement Using Pixel-Level Analysis of Graphical Interfaces. Morgan Dixon, James Fogarty, and Jacob O. Wobbrock. (2012). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '12. ACM, New York, NY, 3167-3176. 23%.
https://web.archive.org/web/20150714010941/http://homes.cs.w...
Video: Content and Hierarchy in Prefab: What if anybody could modify any interface? Reverse engineering guis from their pixels, addresses hierarchy and content, identifying hierarchical tree structure, recognizing text, stencil based tutorials, adaptive gui visualization, ephemeral adaptation technique for arbitrary desktop interfaces, dynamic interface language translation, UI customization, re-rendering widgets, Skype favorite widgets tab: http://www.youtube.com/watch?v=w4S5ZtnaUKE
PDF: Content and Hierarchy in Pixel-Based Methods for Reverse-Engineering Interface Structure. Morgan Dixon, Daniel Leventhal, and James Fogarty. (2011). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '11. ACM, New York, NY, 969-978. 26%.
https://web.archive.org/web/20150714010931/http://homes.cs.w...
Video: Sliding Widgets, States, and Styles in Prefab. Adapting desktop interfaces for touch screen use, with sliding widgets, slow fine tuned pointing with magnification, simulating rollover to reveal tooltips:
https://www.youtube.com/watch?v=8LMSYI4i7wk
PDF: Prefab: Implementing Advanced Behaviors Using Pixel-Based Reverse Engineering of Interface Structure. Morgan Dixon and James Fogarty. (2010). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '10. ACM, New York, NY, 1525-1534. 22%
https://web.archive.org/web/20150714010936/http://homes.cs.w...
PDF: Prefab: What if Every GUI Were Open-Source? Morgan Dixon and James Fogarty. (2010). Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '10. ACM, New York, NY, 851-854.
https://web.archive.org/web/20141024012013/http://homes.cs.w...
Today, most interfaces are designed by teams of people who are collocated and highly skilled. Moreover, any changes to an interface are implemented by the original developers and designers who own the source code. In contrast, I envision a future where distributed online communities rapidly construct and improve interfaces. Similar to the Wikipedia editing process, I hope to explore new interface design tools that fully democratize the design of interfaces. Wikipedia provides static content, and so people can collectively author articles using a very basic Wiki editor. However, community-driven interface tools will require a combination of sophisticated programming-by-demonstration techniques, crowdsourcing and social systems, interaction design, software engineering strategies, and interactive machine learning.
A small temporary answer while I unwind all the linked content: Dixon's target-aware pointing is already missing from so much. I wonder how on earth nobody in smartphone land thought to implement something similar. I'm already hooked :)
It's missing from many contexts where it would be very useful, including mobile. It's related in many ways to those mobile gui, web browser and desktop app testing harnesses. It could be implemented as a smart scriptable "double buffered" VNC server (for maximum efficiency and native Accessibility API access) or client (for maximum flexibility but less efficient).
The way jQuery widgets can encapsulate native and browser-specific widgets behind a platform-agnostic API, you could develop high-level aQuery widgets like "video player" that knew how to control and adapt many different video player apps across different platforms (YouTube or Vimeo in the browser, VLC on the Windows or Mac desktop, QuickTime on Mac, Windows Media Player on Windows, etc.). Then you can build much higher-level apps out of widgets like that.
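Again hypothetical (none of these names exist), but the shape of such a widget might be:

    // A high-level "video player" widget with per-platform adapters,
    // analogous to how jQuery UI wraps differing native controls.
    aQuery.widget('videoPlayer', {
      adapters: ['youtube', 'vimeo', 'vlc', 'quicktime', 'wmp'],
      play:  function () { this.adapter.invoke('play');  },
      pause: function () { this.adapter.invoke('pause'); }
    });
    aQuery('videoPlayer').play(); // whichever player happens to be on screen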
Target aware pointing is one of many great techniques he shows can be layered on top of existing interfaces, without modifying them.
I'd like to integrate all those capabilities, plus the native Accessibility API of each platform, into a JavaScript engine, and write jQuery-like selectors for recognizing patterns of pixels and widgets, creating aQuery widgets that track input, draw overlays, implement text-to-speech and voice control interfaces, etc.
His research statement sums up where it's leading: imagine Wikipedia for sharing GUI mods!
Berkeley Systems (the flying toaster screen saver company) made one of the first screen readers for the Mac in 1989 and Windows in 1994. https://en.wikipedia.org/wiki/OutSpoken
Richard Potter, Ben Shneiderman and Ben Bederson wrote a paper called Pixel Data Access for End-User Programming and Graphical Macros, which references a lot of earlier work. https://www.cs.umd.edu/~ben/papers/Potter1999Pixel.pdf
Everything was made to be broken. Got this neat effect where the ball was spinning rapidly in the corner of each spawned window! http://imgur.com/gmwQOCb
I remember playing with this at least 8 years ago. It's really scary that there's enough information disclosure in browsers that it's possible to do this.
This is really scary to you? I get that you can do fingerprints, and honestly there's a LOT more than just browser window position/size in them, but "really scary"?
Maybe we need to stop exaggerating on this sort of stuff if we want people to take us seriously.
Fine. It's scary that browsers allow for so much fingerprinting. This toy webapp is an example of a larger picture of "things that we really shouldn't allow to happen". So "really scary" was referring to the issue of browser fingerprinting as a whole, not just one toy webapp.
I just wanted to try that since it seems interesting. Sadly it tells me to enter the address of the home where I grew up, but then promptly refuses to recognize anything that I put in. It also says I can try schools, but does not recognize any of those either.
Maybe it only recognizes American addresses or something, but I don't know any of those.
edit: Ok I got it working by putting in Amsterdam, which happens to be a city and not an address.
Security issues aside, has anyone tackled how to intentionally span multiple monitors using similar techniques? e.g. A dedicated four monitor web-based media dashboard.
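One low-tech sketch, assuming you know the monitor layout in advance (here four 1920x1080 screens in a row; panel.html is a placeholder, and browsers are free to clamp or ignore the requested positions):

    // Open one window per monitor and position it via window.open features.
    var MON_W = 1920, MON_H = 1080;
    var panels = [0, 1, 2, 3].map(function (i) {
      return window.open('panel.html?i=' + i, 'panel' + i,
        'left=' + (i * MON_W) + ',top=0,width=' + MON_W + ',height=' + MON_H);
    });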