List of Printers Which Do or Do Not Display Tracking Dots (eff.org)
586 points by prawn on June 6, 2017 | 210 comments



I wrote this article/originally created this list, and I would like to emphasize that there is a second generation of this technology that probably uses dithering parameters or something of that sort, and that does not produce visible dots but still creates a tracking code. We don't know the details but we do know that some companies told governments that they were going to do this, and that some newer printers from companies that the government agencies said were onboard with forensic marking no longer print yellow dots.

That makes me think that it may have been a mistake to create this list in the first place, because the main practical use of the list would be to help people buy color laser printers that don't do forensic tracking, yet it's not clear that any such printers are actually commercially available.


Well, that being said, it sounds like there needs to be a more rigorous way of detecting these new codes.

One way I can think of, is to record data on the CMYK pins on the inkjet head itself. IIRC, they activate between 17v and 22v, and pulse per high.

The goal here is to make the printer think it's printing, while recording all the data of the pulse operations. We would get a lengthy file out.

Ideally, the pulse coding should be consistent if printing the same image. "Printing" the same thing over multiple times could show time/date codes embedded.

I should also be able to compare underlying system internals, using multiple VM clones that differ only in small config details. The recorded data should be the same. If it isn't, we know it's encoding system stuff.
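A minimal sketch of that comparison step, assuming the capture rig writes one log per "print" as a list of (tick, channel, nozzle) tuples. The log format and the function name are made up for illustration; nothing here reflects a real printer interface.

```python
from collections import Counter

def diff_pulse_logs(log_a, log_b):
    """Return the pulses present in one capture but not in the other.

    Each log is a list of (tick, channel, nozzle) tuples, a hypothetical
    format for whatever the capture harness would record.
    """
    count_a, count_b = Counter(log_a), Counter(log_b)
    only_a = sorted((count_a - count_b).elements())
    only_b = sorted((count_b - count_a).elements())
    return only_a, only_b

# Two captures of the "same" page; the second has one extra yellow pulse.
run1 = [(0, "K", 3), (1, "K", 4), (2, "K", 5)]
run2 = [(0, "K", 3), (1, "K", 4), (2, "K", 5), (7, "Y", 12)]

only_in_run1, only_in_run2 = diff_pulse_logs(run1, run2)
print(only_in_run1)  # []
print(only_in_run2)  # [(7, 'Y', 12)]
```

If two runs of the same job differ at all, whatever shows up in the diff is the place to start looking for time/date or host data.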

But yeah, there is a way to attack this, and that's by going lower in the stack and treating the printers as a black box. It's not the best way, but a way I've thought of that could at least detect this new technique.


Examining printer firmware might also reveal the details more readily.


Probably. But my experience is that these chips are fabless fab chips with half a dozen things on them. Reverse engineering is of course doable. But you're getting into ARM territory, with custom ARM licenses and different opcodes. It gets "unfun", and quickly. And ARM has no concept of probing for hardware; your best chance at a memory map is whatever you build yourself, without frying anything.

But I would imagine that a simple 3D-printed harness would work a lot better, allowing signals to be recorded while making the printer think it's printing. Then the bypass harness could have an ARM on it and spool instructions to either an SD card or USB serial.

The goal here is to be as transparent as possible, just in case there are other security systems that try to detect this attack. But I'd guess they haven't got to that point yet.


Why make the printer "think" it's printing? Why not just literally measure and record the pulses sent to the actual print heads?

The "yellow dot" method would be picked up pretty quickly by the yellow being triggered while printing entirely B&W documents.

Things like dithering, if they encode things like printer serial numbers might be catchable by printing an identical set of documents across two examples of the printer.


> Why make the printer "think" it's printing. Why not just literally measure and record the pulses sent to the actual print heads?

That's what my shim would do: record the signals to the ink solenoids. The reason to make the printer think it's printing is primarily all the DRM lockout crap all manufacturers use. Ideally, I'd even let it print so that when the firmware sees ink levels going down, nothing on the firmware side would look amiss.

> The "yellow dot" method would be picked up pretty quickly by the yellow being triggered while printing entirely B&W documents.

Indeed. However, the yellow dots are old hat. I know they've moved on to something much harder to detect, and also likely resilient to scanning and reprinting. What is this new type? No bloody clue. I'd assume a bad actor using heavy stego on chip. And if I were designing it, I'd watch for things like test images coming through and not mark them.

My first attempt would be with a high-res $100 bill scan. I betcha that'd trigger something interesting.

> Things like dithering, if they encode things like printer serial numbers might be catchable by printing an identical set of documents across two examples of the printer.

Yeah, I figure there's a serial number, time/date, hostname, IP, logged-in username.. All sorts of data. This is also corporate espionage area as well as national security, so I'd figure they would put out all the stops to catch, if they can't prevent the print itself.

Just chalk it up to me and my paranoid mind. Still doesn't mean they aren't out to get you!


Would filling all the color tanks with black ink before printing make initial detection easier?


I doubt it. There are tons of tricks that can be used to steganographically hide data in images. And given the algo is hidden, and the data is hidden in plain sight, we have to go to a level that can't be hidden from us.

Printing in different inks also wouldn't show us a way to diff 2 printed images. Whereas, saving the pulses from the CMYK pins would do that.

When you have a datalog of lots of pulses that represent a picture, you can back-calculate it into an image. You can also diff it without relying on losing data from scanning (or paying attention to the wrong thing). And with enough samples, we can recalculate the algorithm. With the knowledge of what they're doing, we can then start scanning other images for this... But only once we know what they're doing.
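A rough sketch of the "back-calculate into an image, then diff" step. It assumes each logged pulse has already been resolved to a (row, col, channel) position on the page, which is the hard part and is not shown; the function names and data layout are illustrative only.

```python
def rasterize(pulses, width, height):
    """Turn logged pulses into a grid; each cell holds the set of channels fired."""
    grid = [[frozenset() for _ in range(width)] for _ in range(height)]
    for row, col, channel in pulses:
        grid[row][col] = grid[row][col] | {channel}
    return grid

def diff_grids(a, b):
    """List the (row, col) cells where two rasterized captures disagree."""
    return [(r, c)
            for r, (row_a, row_b) in enumerate(zip(a, b))
            for c, (cell_a, cell_b) in enumerate(zip(row_a, row_b))
            if cell_a != cell_b]

# Same source image captured from two printers; the second fires one extra dot.
page_a = rasterize([(0, 0, "K"), (1, 2, "K")], width=4, height=2)
page_b = rasterize([(0, 0, "K"), (1, 2, "K"), (1, 3, "Y")], width=4, height=2)

print(diff_grids(page_a, page_b))  # [(1, 3)]
```

Because the comparison runs on the recorded signals rather than on scans, there is no scanner noise to filter out first.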


What about printing two images, each with only 1 black pixel in a different place?


What is annoying is that the user pays for this. How much more yellow toner do I need to buy because my print outs are covered in yellow dots?

I wonder how many million extra gallons of yellow toner and ink are wasted every year printing these tracker dots?


On some printers, this is the reason why the printer will refuse to print a BW-only printout while the only empty cartridge is the color one.


That needs a citation or a sarcasm tag.


Does an anecdote count? My Samsung refuses to print b&w when a color cartridge is empty.


Most likely it tries to print rich black, which uses color to make the b&w printout look better (and uses up all that costly color). You can disable that in your printer settings with most printers.


I heard that before, I think it's a tactic to bully people into buying more ink they don't need. Ensuring tracking dots is maybe just an unfortunate side effect.


I've always assumed that is just lazy design (some resources that _might_ be needed are unavailable, so report and refuse to print without checking whether they are actually needed for the current job) rather than a security (or even ink-selling) measure. I've seen the behaviour as far back as the first colour photocopier I ever used.


I've seen that as well a long time ago on a black+red photocopier, no yellow and probably no tracking patterns since red would have been quite visible.


No, the thing happening does not count as proof why it is happening. Printers refusing to print bw is widely known.


I would also be interested in a citation, but it does make sense; if tracking dots are well known (which they are), and they were easy to circumvent by just removing yellow toner, anyone somewhat technically adept would just remove their toner before printing their illegal documents.


This makes less sense than you might think because the original motivation from the government side was tracing counterfeit currency, not tracking all documents. Printing convincing counterfeit currency without yellow toner would be a challenge!

However, the governments did not succeed in limiting their technology to use in counterfeiting investigations, and may not even have attempted to do so.


If that's true that just makes me angrier because a far better solution would be to improve the currency, like Canada and many other countries did long ago.


I can feel my blood pressure rising...


Imagine the twist this would put on the whole Hanlon's razor thing if this were true.


No idea if this is the real reason, but it's still very annoying.

My previous printer did this, and my current Canon refuses to boot when any cartridge is missing or empty.


My Epson does this, and it's apparently because not having enough ink in the system leads to maintenance issues with heads drying out, etc. Which is understandable, but I wish there was like a 48-hour override so running out of a color didn't prevent me from printing the documents I urgently need RIGHT NOW. So instead, I have to purchase a couple hundred dollars worth of cartridges in advance.


I'm not so sure about that. Having had an inside view into one such program I can say confidently that at least in that particular major manufacturer's situation, that had absolutely nothing to do with it. It was entirely engineering-driven and about preserving the print head.


or so they said... Do you really know the ultimate reason why?


I do, yes. There were sound technical reasons for it.


That doesn't make much sense to me. B/W only (not greyscale) is currently understood to not output these dots at all, so why would the presence of yellow matter?


This is the first thing I thought of when I read about this. It made me quite angry and then a little satisfied finally knowing why this happens.


On the other hand, I didn't think of it until I read this. It made me a little satisfied, and then it made me quite angry.


I ought to be able to do better because I used to know more precisely how many dots per page are printed and how large they are.

But a back-of-the-envelope calculation suggested to me an upper limit of about 1000 kg to 10000 kg of extra toner per year in the United States. However, there are several factors that make me think that even the low figure is an overestimate.

I agree with your frustration about paying for this. Mako Hill used it as an example of an antifeature

https://wiki.mako.cc/Antifeatures

(It might be more accurate to define antifeatures in terms of buyers' willingness to pay to have the features removed, rather than sellers' insistence on being paid to remove them, since we can't, in fact, routinely pay for many of the antifeatures he mentions to be removed.)


Now that you mention it, yellow was the color that ran out first on my printer. By a very large margin.

I mostly print black & white.


But doesn't the comment above yours suggest that the 'yellow' dot thing may no longer be an issue, and that there's now variation in the dithering parameters instead? That would suggest extra ink is not required.

More annoying are the privacy concessions that are the result of secret anti-counterfeiting measures (which is what I assume the measures are for).


I'm pretty sure any kind of such appeal can be refuted by connecting this tracking to hunting terrorists.


Almost none, considering that the dots are nearly invisible, and most of a printed page is visible ink


Is somebody working on identifying these modern watermarks? A start would be to print out test pages and compare high resolution scans. Maybe also multiple printouts from the same printer to see what the natural variation is, and if there is a timestamp component.
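One way to sketch that comparison, assuming the scans are already aligned and binarized into equal-sized grids of 0/1 (a big assumption; registration is most of the real work). Pixels that are stable within each printer but differ between printers are candidates for a static mark; the function names are hypothetical.

```python
def stable_pixels(scans):
    """Map (row, col) -> value for pixels identical across all scans from one printer."""
    first = scans[0]
    return {(r, c): first[r][c]
            for r in range(len(first))
            for c in range(len(first[0]))
            if all(s[r][c] == first[r][c] for s in scans)}

def candidate_marks(scans_a, scans_b):
    """Pixels that are stable on both printers but disagree between them."""
    stable_a, stable_b = stable_pixels(scans_a), stable_pixels(scans_b)
    return sorted(p for p in stable_a.keys() & stable_b.keys()
                  if stable_a[p] != stable_b[p])

# Toy data: pixel (1, 1) is natural noise on printer A, pixel (0, 1) differs by printer.
printer_a = [[[1, 0, 1], [0, 0, 1]],
             [[1, 0, 1], [0, 1, 1]]]
printer_b = [[[1, 1, 1], [0, 0, 1]],
             [[1, 1, 1], [0, 0, 1]]]

print(candidate_marks(printer_a, printer_b))  # [(0, 1)]
```

A timestamp component would change between printouts from the same printer, so it would show up among the unstable pixels and need a separate look.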

I would start, but I'm currently not around a printer...


I suppose the approach is to create a machine learning dataset that maps hi-res scans of sample documents to the printers that produced them. If the resulting classifier can accurately id the printer, you have probably found a watermark, but it might just be natural variations in the manufacturing.
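As a toy illustration of the idea, here is a nearest-centroid classifier in plain Python. It assumes each scan has already been reduced to a short numeric feature vector; how to extract useful features from a hi-res scan is the open question and is not addressed here. Printer names and numbers are invented.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(samples):
    """samples maps a printer id to the feature vectors of its known scans."""
    return {printer: centroid(vectors) for printer, vectors in samples.items()}

def classify(model, vector):
    """Return the printer whose centroid is closest to the given feature vector."""
    return min(model, key=lambda printer: math.dist(model[printer], vector))

samples = {
    "printer_a": [[0.1, 0.9], [0.2, 0.8]],
    "printer_b": [[0.9, 0.1], [0.8, 0.2]],
}
model = train(samples)
print(classify(model, [0.15, 0.85]))  # printer_a
```

High accuracy on held-out scans would only show that the printers are distinguishable, not that the distinguishing signal is a deliberate watermark.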


The difficulty in this approach is that you have an extremely large number of classes. Each printer is its own class. Typically, as the number of classes goes up, accuracy goes down. That isn't to say it isn't possible, but it would require a lot of custom hacks to any learning algorithm.

Also to convince anyone that it works, you would need to test it out on an extremely large number of printers, including ones of the same model. In practice that could be expensive.


Nah, it's not feasible to know the printer model if you want to identify a laserprinted dollar.

A few variants at most.


still, if the result is a "fingerprint" of a printer, it'd be interesting to know, because it can be used by law enforcement too


Surely we can skip the paper stage and hook up the motors used for head positioning control to a rig, either reading through rotary measure or preferably reading the signal to the motors directly.

Print the same page, compare the signals sent to the motors? Won't that be a more easily and accurately measured proxy for what's actually being printed? One might need the timing data for the jets on an inkjet too, etc.


No printer controls their motors with >600 dpi resolution. Inkjet printers have print heads with many nozzles; the motors do the rough positioning / slide the head over the paper, the nozzles do the hard work. In laser printers a motor only moves the paper along (all rollers are either free-running or synchronized by gears).

So for an inkjet you'd have to look at the nozzle timing, which might be difficult depending on how integrated the drivers are (e.g. if they're a custom chip on a flexprint behind the heads... uhm...). For a laser printer you'd have to look at the laser modulation signal. That should be much easier, bugs have done that before.

Reverse engineering the firmware might be easier... on the other hand, the firmware is probably bolted shut rather well — the printer manufacturers cartridge DRM is in there somewhere.


I was assuming the nozzles have some sort of actuator that approximates to the term "motor". I think you're either driving a tiny heater, or a piezo, or a charge deflector plate in inkjet printing? That presumably is where the jitter would physically manifest, so you'd look at the input signal to those elements?

Reversing the firmware though, good call.


Maybe "just after the electronics, but before the print heads/motors" is the appropriate place to probe. It might be more work than anyone's prepared to put in (and of questionable utility), but you could emulate the motors and heads, and generate an image of what would have been printed.


Shouldn't we be able to control a printer on the pixel (dither) level in the first place?


This would be nice for other reasons too. There can be better halftoning algorithms than the typical pattern-based halftoning of laser printers. It's hard to calibrate though, since printers don't print "pixels". They print dots that typically overlap a lot; DPI for printers is the resolution for positioning the dots, not the size of the dots.


I think that ALL algorithms are better than the default halftone algorithms.

I think it's possible to send saturated pixels using PCL, and tell the printer to disable half-toning. It requires that a full page fits in memory, which isn't much (512MB) but typically more than the default.

For some reason all printers use really vintage memory, so 512MB of extra memory is crazy-expensive.


Even if we are able to, if the default is to divulge the printer ID via a dithering pattern either at the driver or machine level when given a blob of image data, I think this problem becomes similar to "Can't we all just encrypt our email?" i.e. largely academic.


We're currently in a situation where we don't even really fully know which are doing it or exactly how, so it would be many giant leaps in the right direction at least.

It would likely make identifying tracking marks and algorithms a lot easier.


Is there a notice/agreement that tells customers there are tracking dots before buying? If there isn't, how is this not illegal?


Sometimes, for example

https://duckduckgo.com/?q=docucolor+"not+visible+under+norma...

We originally found that in German,

https://duckduckgo.com/?q=docucolor+"unter+normalen+bedingun...

However, I don't think that most printers currently disclose this, at least for sales in the U.S.


It's hard to imagine what law this would break.


By the same token, what is compelling them to do this? A sense of patriotic duty? The kindness of their hearts?

There's a lot of work that would go into designing, implementing, and testing this. Then you've got logistics in manufacturing. That's time and money.


> By the same token, what is compelling them to do this? A sense of patriotic duty?

The GSA Schedules: https://www.gsa.gov/portal/content/197989


The original motivation seems to have been a deal between a printer industry association and a central banks' association, perhaps as an alternative to threatened legislation, although we've never been able to get ahold of the details of that deal.


What law do you suppose this breaks? Illegal isn't defined as "something I dislike."


privacy / trade laws / 14th amendment equal protection / due process

It definitely does not feel right that some non-government entity is deciding to encode personally identifying information about you without public oversight.


FWIW: the link in "Other forensic marking techniques have been invented" is broken, now it points to: https://engineering.purdue.edu/~prints/

But it does not redirect anymore, I had to use archive.org


What about B&W printers?


There are methods of varying dithering/half-tone patterns in a way that is invisible to the eye but can carry enough bits of information that forensic analysis can identify an encoded printer serial number. Methods of making practically invisible changes to pure text are available too. Colour printers are starting to use these techniques instead of the (more easily noticed by the general public) extra yellow dots.

See the reply to SomeStupidPoint by schoen a few hours ago (and a couple of other posts on this thread) for more detail.


But that would be possible for a photo. But typically you would print some text. I wouldn't expect any dithering / half-tone in a text document.


"Methods of making practically invisible changes to pure text are available too" as mentioned in the post referred to. If using printer fonts rather than explicit vectors then it is possible to hide enough data in small "mistakes" invisible to the naked eye.

I'm not sure what would be possible for pure vector graphics.


For vector graphics, I suppose you could encode information in the least-significant bits (with redundancy, error correction and whatnot) if the printer can guarantee sufficiently high precision (not accuracy, mind you: https://www.tutelman.com/golf/measure/precision.php)
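A bare-bones sketch of the least-significant-bit idea, assuming coordinates are integers in some fine device unit. The redundancy and error correction mentioned above are left out, and nothing here is claimed to match what any printer actually does.

```python
def embed_bits(coords, bits):
    """Force the least-significant bit of each integer coordinate to a message bit."""
    if len(bits) > len(coords):
        raise ValueError("not enough coordinates to carry the message")
    marked = list(coords)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked

def extract_bits(coords, n):
    """Read the first n message bits back out of the coordinates."""
    return [c & 1 for c in coords[:n]]

original = [1200, 3401, 5600, 7803]   # hypothetical device-unit coordinates
message = [1, 0, 1, 1]

marked = embed_bits(original, message)
print(marked)                   # [1201, 3400, 5601, 7803]
print(extract_bits(marked, 4))  # [1, 0, 1, 1]
```

Each coordinate moves by at most one device unit, which is why the scheme depends on the printer placing marks with high repeatability.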


Could you elaborate/speculate on how dithering patterns would be used?


This isn't an answer, but I used to examine patents in this space. There are very advanced watermarking methods out there that are stable through transcoding, compression, obfuscation, etc. while being invisible to the naked eye. Really amazing stuff, wouldn't be surprised if there were lots of watermarks on media (audio, video, still image) that aren't readily apparent. One of the big use cases I remember was watermarking movies so that it would be possible to identify the time and place that a cam bootleg was recorded. That's a camcorder aimed at a movie screen and then heavily compressed and distributed over the internet, and the watermarks would still be detectable.


A simple but reasonably robust solution to the cam problem is just to screw around with black frames, that is, the frames in the middle of a fade-out. Give yourself 20 places where you can insert an extra frame and choose 10 to insert, and you've already given yourself about 17 bits to play with (C(20,10) = 184,756 patterns), or 20 bits if each slot is independently used or not.

(It's trivial to deal with the audio sync issues.)
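The capacity of a frame-insertion scheme like this can be checked by counting patterns. This is plain combinatorics, not a description of any real watermarking product; the two variants (exactly 10 of 20 slots used, or each slot independently used) are both shown.

```python
import math

slots = 20

# Variant 1: exactly 10 of the 20 slots get an extra frame.
patterns_fixed = math.comb(slots, 10)
bits_fixed = math.log2(patterns_fixed)

# Variant 2: each slot independently gets an extra frame or not.
patterns_free = 2 ** slots
bits_free = math.log2(patterns_free)

print(patterns_fixed, round(bits_fixed, 1))  # 184756 17.5
print(patterns_free, round(bits_free, 1))    # 1048576 20.0
```

Either way that is enough to distinguish hundreds of thousands of screenings, and more slots add capacity quickly.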

Cams may have a lot of spatial unreliability, but they have a lot of temporal resolution.

And that's just my stupid off-the-cuff answer, which is already off to a decent start. And there are in fact purely-spatial solutions that do work, to which the temporal solutions can be added. The upshot is don't expect to beat these anytime soon. There are just too many bits to hide in, and so few bits needed for the identification.


> The upshot is don't expect to beat these anytime soon. There's just too many bits to hide in, and so few bits needed for the identification.

I agree. Some off-the-top-of-my-head ideas that I literally just came up with:

- if printing an image, drop a few dots in some rows (or columns); data is hidden in the pattern of dropped dots

- if printing text (as in, actual text goes to be rendered on the driver or printer firmware level, and not by the OS / text editor), slightly alter the shape of some letters (by adding or dropping a dot) to hide a pattern

- if printing an image, try to hide some data in its FFT (e.g. by adjusting differences between low frequencies and hiding a pattern there)

- if recording a video, slightly alter some otherwise stable global characteristic (like avg brightness of a bunch of consecutive frames in an animated movie)

- if recording a video, screw with timing patterns, as you mentioned

There are just so many properties, that the difficulty is probably mostly in picking something that's stable through usual transformations a document will undergo (e.g. scanning, JPG compression).
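To make the first idea concrete, here is a toy version of "drop a few dots in some rows". It assumes a 0/1 bitmap and encodes one bit per row as the parity of that row's dot count, dropping at most one dot per row. This is an illustration invented for this thread, not a known printer scheme, and it would not survive scanning noise without error correction.

```python
def embed_in_rows(bitmap, bits):
    """Drop at most one dot per row so that row i has dot-count parity bits[i].

    Assumes every carrier row contains at least one dot.
    """
    marked = [list(row) for row in bitmap]
    for row, bit in zip(marked, bits):
        if sum(row) % 2 != bit:
            # Drop the last printed dot in this row to flip the parity.
            last = len(row) - 1 - row[::-1].index(1)
            row[last] = 0
    return marked

def extract_from_rows(bitmap, n):
    """Read one bit per row from the parity of the number of dots."""
    return [sum(row) % 2 for row in bitmap[:n]]

image = [[1, 1, 1, 0],
         [0, 1, 1, 0],
         [1, 0, 1, 1]]
message = [0, 1, 1]

marked = embed_in_rows(image, message)
print(marked)                        # [[1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 1, 1]]
print(extract_from_rows(marked, 3))  # [0, 1, 1]
```

The reader needs no copy of the original image, only the convention that row parity carries the data.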


No. It's not that simple.

For a real example that really works, see, for example, digimarc:

https://www.digimarc.com/support/product/digimarc-guardian-f...

Images can be cropped, rotated, recompressed, scaled, etc. and the digital watermark remains.

Also see: https://en.wikipedia.org/wiki/Digimarc

and read some of their patents, referenced in the Wikipedia article.


>Images can be cropped, rotated, recompressed, scaled, etc. and the digital watermark remains.

And none of those would impact the timing of black interim time lengths.

Also, this makes digimarc sound crappy (from their site):

>Facebook compresses images once they are posted, sometimes heavily, which can damage our invisible identifiers. Fortunately, there is a simple solution: if you pre-compress your images, then apply our identified, they should survive.

So they don't survive compression.


"And that's just my stupid of-the-cuff answer, which is already off to a decent start. And there are in fact purely-spatial solutions that do work, to which the temporal solutions can be added."


I have implemented one of those technologies at a previous company. Their claims and what the software was actually capable of were vastly different.

I would say about 20% of the files we sent over had enough recoverable watermark to be useful.


One of the things I wondered when I read this story is if it would be possible to develop software that would somehow circumvent this type of situation. For example, using autoencoding or something to lose watermarking details intentionally or something. But what you're discussing seems more advanced than the yellow dots idea.

I was surprised to see printers being involved--I thought something like this leak would all be digital. I was also surprised that the Intercept would not be more savvy about printer identification because it's been publicized so much over the years.


Speculation: Dithering algorithms traditionally include randomness. If you can determine a portion of this randomness by inspecting the output of the algorithm, then the dithering can be used to send a message, for example by using the message as a seed to a PRNG or as an input to a hash whose output serves as the randomness for the dithering operation. If the underlying message space is small enough, you could recover the message by brute force, examining each possible message and the results that it would have produced, and seeing which set of results matches the observed document.

The hand-waving part of this is "if you can determine a portion of this randomness by inspecting the output of the algorithm", because I don't really understand how easy this could be made without knowing the exact underlying signal that the dithering algorithm needs to quantize.
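A toy version of the seeded-randomness speculation, under the strong assumption that the underlying signal is known exactly (here a flat 50% gray test page) and that the observed dots are read back without error. The dithering function and the 4-digit message space are stand-ins, not any printer's real algorithm.

```python
import random

def dither(gray_levels, message):
    """Random-threshold dithering whose randomness is seeded by the message."""
    rng = random.Random(message)
    return [1 if level > rng.random() else 0 for level in gray_levels]

def recover(gray_levels, observed, message_space):
    """Brute force: re-dither with every candidate message and look for a match."""
    for candidate in message_space:
        if dither(gray_levels, candidate) == observed:
            return candidate
    return None

flat_gray = [0.5] * 256             # known test page: uniform 50% gray
printed = dither(flat_gray, 4242)   # 4242 stands in for a serial number

print(recover(flat_gray, printed, range(10000)))
```

With 256 dots and only 10,000 candidates, an accidental match with the wrong seed is vanishingly unlikely; the real difficulty is the one described above, namely not knowing the exact input the dithering saw.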

An alternative might be slightly changing some of the values in the matrices at

https://en.wikipedia.org/wiki/Ordered_dithering

in a way that barely reduces perceptual image quality (although I'm not certain how well that can be done). Perhaps there is an algorithm that uses statistics to deduce what matrix was used, and then the perturbations can be read out of the matrix.
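A sketch of the perturbed-matrix idea using the standard 4x4 Bayer matrix from the ordered dithering article. The perturbation (swapping the thresholds 7 and 8 to encode one bit) and the recovery procedure (dithering flat gray patches at 16 levels and counting how often each cell fires) are assumptions made up for illustration, and the recovery assumes a noise-free readout.

```python
BAYER_4 = [[0, 8, 2, 10],
           [12, 4, 14, 6],
           [3, 11, 1, 9],
           [15, 7, 13, 5]]

def ordered_dither(image, matrix):
    """Threshold each pixel (0..1) against the tiled dither matrix."""
    n = len(matrix)
    return [[1 if pixel > (matrix[y % n][x % n] + 0.5) / (n * n) else 0
             for x, pixel in enumerate(row)]
            for y, row in enumerate(image)]

def perturb(matrix, bit):
    """Encode one bit by swapping two adjacent-valued thresholds (7 and 8)."""
    swap = {7: 8, 8: 7} if bit else {}
    return [[swap.get(v, v) for v in row] for row in matrix]

def recover_matrix(dither_fn, n):
    """Deduce the matrix from flat patches: a cell with value v fires at n*n - v levels."""
    counts = [[0] * n for _ in range(n)]
    for k in range(n * n):
        level = (k + 1) / (n * n)
        patch = dither_fn([[level] * n for _ in range(n)])
        for y in range(n):
            for x in range(n):
                counts[y][x] += patch[y][x]
    return [[n * n - c for c in row] for row in counts]

secret = perturb(BAYER_4, 1)
seen = recover_matrix(lambda img: ordered_dither(img, secret), 4)
print(seen == secret, seen == BAYER_4)  # True False
```

Reading the perturbation back out of an arbitrary printed photo, rather than a chosen test chart, would need the statistical approach speculated about above.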

This is related to research in digital watermarking that's been going on for decades, and I'm definitely not an expert in that or in digital image processing, so I'd love to hear from people who know more.

Nonetheless, looking up close at how printers produce different colors out of CMYK dots, I'm pretty confident that they have some degrees of freedom, and that some of them probably don't make a lot of difference perceptually, and can probably be used to encode a message.


Isn't this mostly a driver issue, i.e. something that could change for any model at any time should the vendor decide to add tracking to a driver update?


I'm afraid it's probably built into the printer firmware, not drivers in OS.


Driver updates can usually modify the firmware in modern devices. I don't see a real dichotomy here...


The Lives of Others, which takes place in East Germany and includes a typewriter that is not registered with the government, was one of the best movies I've seen in a long time. And it's ultimately the life we'll live as tracking technologies continue to get better.


When we start invoking the Stasi in a discussion about this surveillance, it always feels a bit like a Godwin point. But the reality is that all of this surveillance is exactly what the Stasi used to do and what the West was fighting the communist bloc over. If I had told people in the 80s that soon, in all Western countries, all communications would be monitored, it would be illegal to have any discussion that the government cannot eavesdrop on, and the state would be compiling a file on every one of its citizens and would want a list of every book and article each citizen reads, they would have thought that the Russians had invaded us.


Pedantry mode: All of this surveillance is exactly what the Stasi _dreamed_ of doing.

(I fully agree with your point; however, I'd argue that the (relative, back then) lack of digital storage and communication made gathering of information much, much harder back then than it is now - even for the Stasi.)


The Stasi would've likely loved these tools, but I'm not sure they're necessary. The point of the Stasi behavior was to instill fear in the masses, to turn neighbor against neighbor. The fact that the average citizen today is hardly even aware of things like printer dots means it's serving a different purpose.


Then again, it could be argued that the need to instil fear in the masses arose precisely because they didn't have the resources to scrutinize all the people all the time.

(So they needed to make sure all the people expected to be under surveillance all the time, to keep them from doing anything undesirable (to the state, that is) while not being watched.)

IIRC the Stasi had a hard time connecting the dots (pun not intended) as the massive data sets mostly existed on file cards.

Today's problem is somewhat different: You've got loads of data, you've got the means to rapidly search and index it - but still, for some reason or the other, massive data collection doesn't appear to lead to much by way of desirable (to the populace, that is) results - actual terrorists apprehended, actual conspiracies unearthed, etc.

A cynic would assume that means the data is collected for other, more nefarious purposes. Cough.


I agree. Without moralizing on the motives, it seems like the objective of establishing an institution like the Stasi is control. If you can have that control with a velvet glove and spare the resentment so much the better.


In a Person of Interest episode an ex-Stasi officer was amazed by surveillance cameras, "These small cameras ... they're extraordinary. The Stasi would have killed for this technology!"

Fun-fact: The Stasi installed Caesium-based gamma ray scanners in some border checkpoints. To this day no one knows for sure how strong the radiation exposure was.


Western surveillance has surpassed Stasi surveillance in quality and scope. The difference at the moment is in intent. But this is subject to political winds.


Yeah. Back in the day in Poland, when communism was still 'alive and well', every typewriter was registered. Also, phone calls were monitored, but at least they had the courtesy to play a message "calls are monitored" before each call.


> they had the courtesy to play a message "calls are monitored"

Probably meant they didn't actually have the resources to monitor all calls, so they were requesting people to not plot rebellions on the phone.

Now we're richer than that.


It would be neat if there was a privacy printing app that added random yellow dots to a document to obscure the info.

Perhaps simply printing a single yellow dot through a few different printers would be enough to accomplish the same thing. Then using the resulting paper for "real" prints.

The more I think about it, this could even be a service: "preprinted" paper that went through a bunch of printers, each adding its own unique identifier each time, then sold and distributed.

That or just paying cash for a printer.


I'd just tape over the yellow cartridge head


Or just complete the yellow dot pattern to make it a full grid.
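As a small sketch of that idea, assuming the printer's dot pattern has already been read into a grid of 0/1 cells (the grid size below is arbitrary, and actually printing extra dots in register with the originals is not shown):

```python
def missing_dots(grid):
    """Return the (row, col) cells that lack a dot and would need one to fill the grid."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, dot in enumerate(row)
            if not dot]

detected = [[1, 0, 1],
            [0, 1, 0]]

print(missing_dots(detected))  # [(0, 1), (1, 0), (1, 2)]
```

Once every cell holds a dot, the pattern no longer distinguishes one encoded value from another.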


What are the odds that you'd be shut down over "national security" concerns?


I hate to be the "security is not important if you aren't doing anything wrong" guy... but I use a printer to do things like print a picture for my kid to color. What the heck are you doing with your printer that needs pre-dotted paper?


Your kid enjoys the nice things in life in part thanks to the thankless work of activists. Lots of the freedoms we have today exist because people put themselves at great risk to benefit the rest of us. They might be printing pictures for their kids, or evidence of powerful entities doing things they shouldn't be doing.

So it's nice that you print stuff for your kid, but what about those who print for society's sake?


That's the most important argument. The second has to do with the vast number of different kinds of abuses that will occur--that you previously couldn't even imagine would--when we give up our privacy^1

[1]-https://www.privateinternetaccess.com/blog/2016/09/police-ro...


I'll be honest. I like to reserve the ability to one day do something "criminal", if I have to. If I'm desperate and have to feed my family, and have no other option, I'd like to be able to resort to petty fraud. If the government in my country turns more authoritarian, and into a dictatorship, I'd like to be able to break some laws, too. I can imagine how a printer might be useful in either case.

Also, many things that are legal now are only so because people blazed the trail by breaking laws before. Think marijuana, homosexuality, ... heck, even things like religious freedom, freedom of speech, democracy a few hundred years ago.


The idea isn't that everyone needs this privacy every day. The problem is that someone will need this privacy one day and this monitoring will have made it impossible. Think whistle-blowing, anonymous tipping. Think challenging the power in place.

I heard that quote in a Snowden interview though perhaps he was quoting someone else: it's not because one has nothing controversial to say that one shouldn't support free speech.


I'm just not sure printer dots are the fight to make. So many other fights. Snowden, as far as I know, never printed off documents. So not sure how it would apply to him.


printing documents that could expose government corruption, without fear of backlash or even death?

If we all lived in a perfect world then we wouldn't need a way to print things without being tracked.


It's about making "hard to track" the norm, not the exception.

Similar ideas apply when using Tor, VPNs, do-not-track, etc for "innocent" internet browsing.


I don't even own a printer myself. However, I don't think you need to be doing something wrong to not want to be tracked by others.


Isn't this just another "If you're not doing anything wrong, you've got nothing to hide" argument?


I need pre-dotted paper so that the next Reality Winner or the next Edward Snowden can print documents showing unconstitutional government activity without fear that the dots will give them up.

Why wouldn't I want pre-dotted paper?


It is the principle of the thing. You know, live free and all that jazz.


It could just print the MAC addresses of all the nearby wifi networks. One subpoena to Google's mapping service and you know exactly where the paper was printed. Yay, surveillance.


Seems like a lot of data to print. I suppose just a unique ID would be enough. If the printer has access to wifi or an internet connection in general (even through USB drivers), it can report the details.


Well, modern printers print at hundreds of ppi, which on doing some maths suggests around 100 dots in a box of side 1 mm (at 250 ppi; most printers print at higher ppi), enough to encode a 64-bit number with a hash, even if each dot only works in binary mode. If more than just on/off can be detected, it can store much more.
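A quick way to check that arithmetic, as a rough sketch. The 250 ppi figure and the 1 mm box come from the comment above; treating every addressable dot position as exactly one bit is a simplifying assumption, and the function names are made up for illustration.

```python
MM_PER_INCH = 25.4


def dots_per_box(ppi: float, side_mm: float = 1.0) -> float:
    """Addressable dot positions inside a square box with the given side length."""
    dots_per_mm = ppi / MM_PER_INCH
    return (dots_per_mm * side_mm) ** 2


def capacity_bits(ppi: float, side_mm: float = 1.0) -> int:
    """Upper bound on payload bits, assuming one on/off bit per dot position."""
    return int(dots_per_box(ppi, side_mm))


if __name__ == "__main__":
    for ppi in (250, 300, 600):
        print(f"{ppi} ppi: about {capacity_bits(ppi)} bits per square mm")
```

At 250 ppi this gives a bit under 100 positions per square millimetre, which leaves room for a 64-bit value plus some check bits. A real scheme would need much sparser dots to stay invisible, so this is a ceiling, not an estimate of what printers actually do.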


> ...google's mapping service...

I guess we'd immediately be aware if google's _nomap wifi flag /actually/ did anything or not.


I just realized something.

- Office IT maintenance all hate printers

- I'm sure you've gaped at the control software that came with the little $40 inkjet you bought (or had to use) at some point

- Even the creator of MINIX cited "buggy printer drivers" as his rationale for preferring a microkernel architecture (all drivers run in userspace) over the monolithic approach (all drivers run in ring 0; the printer driver sits next to the crypto keyring).

So. Printers are terrible.

Remember Brinks' fireproof safes that were absolutely rock-solid but were running fully unpatched WinXP and had a USB port on the side of the keypad for "security updates"? Hackable with a keyboard stuffer that looked like a flash drive. https://news.ycombinator.com/item?id=9961024

I remember reading (somewhere) a conclusion that went along the lines of, Brinks are awesome at making safes - and this safe was truly amazing - but they weren't a software company, to a profound extent.

It's clear to me that printer companies are similarly really, really bad at software design too.

So, reverse-engineering the firmware to figure out what printers are doing probably wouldn't be all too difficult.

The printer companies were told to write the code, not write it perfectly and make it impossible to unravel as well.

Of course, if anyone actually takes a crack at this (excuse the pun) that'll make things change a bit, but printer firmware is probably at the "open sesame in a big way" stage right now, and the printer industry is huge and slow to change, which suggests reverse engineering could remain trivial for a little while, even with publishing.


I was actually discussing this the other day at work (I work in a relatively small laboratory of about 50 people and we don't have a dedicated IT department so most of us contribute a bit to keep the IT infrastructure).

It is incredible how fickle printers are, all the hassle they give, printing problems, network connection problems, special drivers to install even on modern operating systems, paper jams (one would have guessed by now they should have at least solved the paper jams) even on our quite expensive printers.

It's like everything but the print quality got stuck in the year 2000 and never again evolved.


The real question you should be asking yourself is how hard it is to fake these. If I get hold of someone's copies, can I use them as a template?


I think this is an excellent example of why security through obscurity is a bad idea. Now that we know they are there, it's only a matter of time before they are all broken and duplicated. How hard is it? I don't know, but I can't imagine that it's impossible. Given time and technology, someone will figure out how to forge these without difficulty.

They were clearly betting on the fact that no one would notice they are there. What scares me is we're just finding this out. How long have criminal organizations and rogue nations known about this and what have they used it for?


I'm confused about why people consistently think that this was a total secret, no matter how many waves of press coverage it gets.

https://en.wikipedia.org/wiki/Printer_steganography

There were press articles about it by 2004 (and I think some earlier), we had written the tool that Rob Graham used to decode these scans by 2005, and I gave a number of TV interviews about it during 2005. A small number of manufacturers (maybe worried about European data protection laws) also alluded to the existence of the technology in their user manuals. Some of the people from industry who contacted me also said that this was common knowledge to people in the printing industry since at least the turn of the millennium.


None of those are enough. Unless the spying feature is directly marketed to consumers, e.g. a TV ad that says "Buy a color printer THAT SPIES ON YOU today!", >92% of the population will never learn about it. (That estimate being from the # who don't read license agreements: https://measuringu.com/eula/)

Generally, anything that less than half of the population knows about is a secret (e.g., menstruation is still called a "secret" in some circles...), so you shouldn't be confused, just disappointed at how gullible / uninformed the average person is.


>Unless the spying feature is directly marketed to consumers, e.g. a TV ad that says "Buy a color printer THAT SPIES ON YOU today!", >92% of the population will never learn about it.

Heh. The tagline for this car HUD (http://www.jbl.com/connected-car/CP100+LEGEND.html) says, "Now your car can be on the grid too". That's getting pretty close to your tagline.


people consistently think that this was a total secret

As far as I've seen, this isn't true.


Maybe I should say "regularly"?


It sounds like you're being surprised that anybody doesn't know about it, even if they're in a risky position themselves, which seems disingenuous.

Before today, what was the most likely path to this knowledge? As in one month ago...and how many 26 year olds have occasion to learn themselves the details of printers? Nobody uses printers.

Yes, working in that position it would be more likely, but she still could merely be a corner case when it comes to laser printer dot awareness, even within the IC.


I'm curious if there is a "half life" to this knowledge. Or, rather, what that would be.


Print sensitive documents at FedEx Office or Staples and pay in cash. It's the only way to be sure.

There were magazine articles, newspaper articles, and news site discussions about this years ago. They covered it being added to stop color laser printers and dye sublimation printers from being used for currency counterfeiting. That the tech community has this short of a communal memory astounds and saddens me.

Even beyond the public knowledge of this tactic, that Reality Winner was working at an intelligence agency and was silly enough to think said intelligence agency couldn't track what had been printed in its own offices is laughable. Either she had no business working in that environment as she clearly doesn't understand their mission and methods or she's a scapegoat.

* 2014 - PC World - http://www.pcworld.com/article/229647/counterfeit_money_on_c...

* 2004 - PC World - http://www.pcworld.com/article/118664/article.html

* 2005 - Washington Post, stating it had been in use at least ten years, and that at least one version of the yellow dot code had been broken. - http://www.washingtonpost.com/wp-dyn/content/article/2005/10...

* 2004 - Slashdot - https://hardware.slashdot.org/story/04/02/06/1513255/hp-disc...

* 2004 - Geek.com - https://www.geek.com/news/color-laser-printers-allow-feds-to...

I could probably easily find more.


FOSS seems to be the best, if not only, solution. (As usual, when it comes to freedom and privacy...)


It's likely any tracking mechanism would be implemented inside the hardware, not as part of drivers.


It would be in the firmware. It should be possible to hack a printer with open source firmware.

You could even make a printer with open source hardware - something like this, but higher resolution: https://www.youtube.com/watch?v=zX09WnGU6ZY, or think http://reprap.org/ - home made 3d printers made only from commonly available and 3d printed parts.


> It should be possible to hack a printer with open source firmware.

You'd think so. After all, Stallman created FSF in part because of his frustrations with a printer!

> In 1980, Stallman and some other hackers at the AI Lab were refused access to the source code for the software of a newly installed laser printer, the Xerox 9700. Stallman had modified the software for the Lab's previous laser printer (the XGP, Xerographic Printer), so it electronically messaged a user when the person's job was printed, and would message all logged-in users waiting for print jobs if the printer was jammed. Not being able to add these features to the new printer was a major inconvenience, as the printer was on a different floor from most of the users. This experience convinced Stallman of people's need to be able to freely modify the software they use.


(Disclaimer - I know nothing about actual printers.) Really? I would expect that things like location, IP address, time and date, etc would be the most important privacy threats and would be implemented in software. Although I agree that hardware tracking is still an important issue that open-source drivers wouldn't solve.


Do you mean FOSS printer drivers? Are you sure that would make a difference?


No, FOSS printer hardware, firmware and software. Obviously, nothing mainstream like that exists. It's not just the dots. Since printers are connected to networks, and there have been countless printer vulnerabilities, it's a worthy cause in its own right.


I know next to nothing about printers, so I'm not sure of anything!

But what I'm seeing in the discussions here is a ton of uncertainty about what models even use tracking methods and what methods they use. So I'm guessing this is mostly because we don't know what the software in these printers is doing.


Paying cash for the printer should mitigate things. At best they can tell that the page was printed by something that passed through a BestBuy warehouse in your town in the first quarter of last year, and that's it.

Buy a printer hundreds of miles away from home while on a road trip, pay cash, and then do whatever you want with it: print yourself a hundred million dollars and enjoy your print-irement. :)

Simply the awareness about the possibility of tracking goes a long way.


If your printer is on a network, it can 'phone home' in most cases. They now know which IP address the printer lives behind. Get compromising document > look up serial number from dots > look up IP address from serial number > get address for IP from ISP (through means fair or foul). Payment option is irrelevant.


In those "most cases" when it's not blocked from doing so by your firewall.

If you were paranoid enough to pay cash for a printer somewhere far from home because of the tracking issue, you will probably block it from sending messages outside of your LAN.


That's all they're going to get anyway. I've never heard of a retail store recording the serial number of all their merchandise, much less associating it with a customer's credit card.

The utility to law enforcement is being able to prove a connection between the evidence and a suspect after they've obtained a warrant. Or in this case, it was a NSA owned printer so they already had the serial number without needing a warrant.

And if you are foolish enough to register with the manufacturer, the NSA already has your serial number without needing a warrant.


> I've never heard of a retail store recording the serial number of all their merchandise, much less associating it with a customer's credit card.

Isn't that how they caught Chelsea Manning? Serial numbers from CD-RWs. Also, the store itself doesn't need to associate it with the customer, they just need to know where those CDs were distributed, and the investigators can follow up the transaction details.


Could they narrow down to any printers of that brand bought with cash in the country and request surveillance footage? Then build a shortlist from there. I imagine cash purchases of printers would be fairly rare and help limit the list.

Buying a printer second-hand with cash might cover this situation.

Burner printers...


The problem is that it gets easier and easier to track everything.

In the future maybe each pack of paper will have a steganographic tracking code - slight white variations for example, so you could track the paper to the selling shop.

The camera, the scanner could also do this.

And as more and more systems get integrated, they could see that your car and your phone were present at the time in the shop where the paper was sold.

A second hand printer can still be tracked if you buy it through Craigslist, even through a long chain - the original printer was sold in a shop, the CCTV and facial recognition and phone location identified the original buyer, the buyer's Craigslist account sold such a printer second hand on some date, he was tracked to some location, you were also tracked to the same location at the same time, you were pictured carrying a big package, your high-resolution electricity metering suddenly shows that you have a laser printer running in your house, and that you printed exactly 53 pages on the day of the leak, ...

As you can see, it will become harder and harder to do anything anonymously.

CSI will look naive compared to what will be possible in the future. Infinite zoom and camera viewpoint rotation will be trivial if you'll have one little camera in every door, every street sign, every corner, every car, every little thing.


> In the future maybe each pack of paper will have a steganographic tracking code - slight white variations for example, so you could track the paper to the selling shop.

Unfortunately, it might not even be necessary to mark the sheets of paper in order to be able to track them: https://citp.princeton.edu/research/paper/


Then the question will start to be, where to house all the criminals? Because if it's trivial to catch most criminals, but impossible to discourage every idiot who can't resist or doesn't understand they're being tracked, someone's going to be on the hook for detaining or deterring them surely?


This was my first practical answer to it. Buy the printer with cash, from a garage sale not too close to your house. Only ever use it on an airgapped computer, just to be paranoid.


We could just sell "tracking dotted white paper", with 150 different patterns on it. Or just release a PDF that has them embedded so you can print that first.



This information is the result of research by Robert Lee, Seth Schoen, Patrick Murphy, Joel Alwen, and Andrew "bunnie" Huang.

Does bunnie do any non-cool projects?? He inspires me more than any other developer/researcher today.


If your printer has tracking dots, this doesn't tell the people reading the tracking dots on the printout anything, unless they either

- already suspect you, or

- can trace the serial back to the purchase.

Conclusion: buy your printer second hand and don't get caught.


- in cash


My cursory research into this topic (this morning...) led me to believe inkjet printers may be uncompromised (not printing steganographic dots). The NYTimes believed this to be the case as well back in 2008 (http://www.nytimes.com/2008/07/24/technology/personaltech/24...) - anyone with more knowledge of the subject have information about this?


I wrote the original article, and that consistently matches what we've learned and heard. However, that doesn't mean that forensics can never identify the printer that printed a document (there's an entire lab at Purdue that's been studying possible ways to do so for over a decade!), just that the printers are not intentionally engineered to track their users.

Edit: It looks like the Purdue lab was only publishing research from 2003 to 2010, or hasn't updated its web site.


So they perfected it and received a gag order under the State Secrets Act.


Good thing this country doesn't have a State Secrets Act!

(Maybe https://en.wikipedia.org/wiki/Invention_Secrecy_Act if they filed for a patent.)


That's what I was referring to. Thanks for the clarification.


We need an open source / hardware 2d printer


I love the term "2d printer". What a time to live in.


Is it possible to add yellow dots or the opposite of yellow dots (assuming they are additive?) to the printed content such that the existing dots become noise?


Just a guess, but I would be willing to bet that some of these manufacturers ensure that the tracking appears outside of normal printable boundaries.


Buy two of the same printer and they'll scramble the code!


you could cut those ones off with scissors.


Is there an open source printing platform that allows the printer to be sure no codes are inserted?


I guess you can use a 3d printer as a 2d plotter...?


We're halfway to going back to tablets, in the literal sense.


So if we swap our yellow and (say) blue cartridges then these dots will become more apparent?


if you find a printer which accepts it... (haven't tried, but I fear it won't be this easy)


Time to start printing full yellow background instead of white. Con: lots of yellow toner needed. Pro: no tracking.

Or maybe never refill Yellow toner and then dots fail to appear.


maybe this is why some color printers will refuse to print black and white if they're out of the color ink.


Maybe printers don't just use yellow, but calculate an optimal color based on the background of where they need to print.

The person who wrote the list in the article mentions (at https://news.ycombinator.com/item?id=14502425, in this thread) that there is a second generation of this technology that doesn't produce microscopically-visible dots.

Perhaps the use of all available colors is involved.


There is the latent question of how this works with B&W printers which are still frequently found in offices. I assume that all of these have a similar technique, just sans the colour. I did notice some scattered black dots on a Kyocera printout (FS1000 or so), but with an older laser printer it's hard to tell whether these were actually made by the printing engine, or just spilt toner trickled down by vibrations. These were rather obvious.

However, I - and evidently many others in this thread - can think of many B&W ways to hide data in a printout.

By the way, if someone wants to take a stab at an older printer's firmware — many Kyocera printers from the late 90s and early 2000s used some small PowerPC with the firmware on a mask ROM on a SIMM-like module. Doubtful that there is anything protected there.


..or a return to dot matrix printers? Good thing I never threw those away.


I think that might make you easier to track. Even driving slowly round the neighbourhood listening carefully would ferret you out.


LOL I'm still not throwing them out. Nostalgia. They'll sit on a shelf next to my Nintendo and typewriters.


A daisy wheel printer would actually be less susceptible because a dot matrix could add small dots to the text.


Old manual typewriters were also uniquely identifiable just by virtue of being shitty.


The Lives of Others


> the source of documents produced with other printing technologies are also possible, but, as far as we know, other kinds of printers do not deliberately encode their serial numbers in their output.

To clarify, other here refers to anything other than a color laser?

Also, are color LED printers included as color laser? (I would think so).


> To clarify, other here refers to anything other than a color laser?

Right.

> Also, are color LED printers included as color laser? (I would think so).

Yes. We could also perhaps say "color printers other than inkjet or dot matrix".

Also color photocopiers (using the same codes).


For people wondering what the dots look like: http://static.snopes.com/app/uploads/2017/06/printer_EFF_dot...


Why not just use dot matrix printers?

Can be had for free from businesses throwing them out, ribbons are readily available, plus they can be used as generic printers (limited graphics capability, but supported both on Windows and CUPS) or by writing directly to /dev/lp0
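For the "writing directly to /dev/lp0" route, a minimal sketch of what that could look like. It assumes a Linux box where the printer shows up as /dev/lp0, a printer that accepts plain ASCII with CR/LF line endings and a form feed to eject the page, and a user with write permission on the device; the helper names are hypothetical.

```python
def encode_for_line_printer(text: str) -> bytes:
    """Convert text to plain ASCII with CR/LF line endings and a trailing form feed."""
    normalized = text.replace("\r\n", "\n").replace("\n", "\r\n")
    # Characters outside ASCII are replaced with '?' rather than raising an error.
    return normalized.encode("ascii", errors="replace") + b"\f"


def send_to_printer(text: str, device_path: str = "/dev/lp0") -> int:
    """Write the encoded text straight to the device node; returns bytes written."""
    data = encode_for_line_printer(text)
    with open(device_path, "wb") as device:
        device.write(data)
    return len(data)


if __name__ == "__main__":
    # Assumes /dev/lp0 exists and is writable; adjust the path for your system.
    send_to_printer("Hello from the dot matrix era\n")
```

No driver or spooler sits in between here, which is the appeal: the bytes you write are the bytes the printer gets.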


Someone from Purdue (I think) did research on being able to identify printers that used stepping motors due to unique characteristics in various models as well as individual printers.

I'm not sure if dot matrix was in the study (I want to say it was inkjet?) but the principle remains the same.

Not as precise as embedded serial numbers with watermarking, but it could get a whistle blower identified.


Time to start reverse engineering printer firmware :)


Are there any known forensic marking methods used by monochrome laser printers?


No, but it may still be possible to tell them apart in some ways.

It would be good to start with

https://engineering.purdue.edu/~prints/publications.shtml

and then see who's cited them in the last few years.


I'm hypothesizing, but you could make the slightest of changes to the print.

For example, if it's text, modify the font ever so slightly so it's invisible to the eye but measurable to someone looking for it.


It's basically impossible to rule out that an image (or anything rasterized) is watermarked.

The best way of ensuring this isn't to test it, but to simply ask the producer whether they would do this. If they say "no" or refuse to answer then don't buy that product. Not even if you had a printer from a manufacturer that had public firmware and driver code could you be sure by just inspecting the code and the printed output.

If they clearly say they don't watermark output then you probably have to trust them or simply not use printers.



this is why open-source printers are so essential.


So... Maybe people should start printing some documents directly in yellow paper?

I wonder if printing a blank page and then double printing the document on this page would destroy the pattern.

It seems that there is an opportunity here for creating a program able to print a random layer of light yellow points to a blank page.
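Such a program could start out as something like the sketch below, which writes an SVG page of randomly placed pale yellow dots using only the standard library. The page size (A4), dot count, dot radius and colour are all arbitrary assumptions, and nothing here establishes that an extra random layer would actually defeat a real tracking code.

```python
import random


def random_dot_layer_svg(n_dots=5000, width_mm=210.0, height_mm=297.0,
                         dot_radius_mm=0.05, color="#ffff66", seed=None):
    """Return an SVG document containing n_dots randomly placed small circles."""
    rng = random.Random(seed)  # pass a seed only if you want a reproducible page
    circles = []
    for _ in range(n_dots):
        x = rng.uniform(0, width_mm)
        y = rng.uniform(0, height_mm)
        circles.append(
            f'<circle cx="{x:.2f}" cy="{y:.2f}" r="{dot_radius_mm}" fill="{color}"/>'
        )
    header = (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width_mm}mm" height="{height_mm}mm" '
        f'viewBox="0 0 {width_mm} {height_mm}">'
    )
    return header + "\n" + "\n".join(circles) + "\n</svg>\n"


if __name__ == "__main__":
    with open("dot_layer.svg", "w", encoding="ascii") as out:
        out.write(random_dot_layer_svg())
```

Print the resulting page first, then feed the same sheet through again for the real document. Leaving the seed unset gives a different pattern on every run, which is presumably what you want here.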


I don't understand what could be done by governments with this kind of information. Please enlighten me.


An NSA contractor leaked information about the Russians meddling in the 2016 election to the press. She had been identified using these techniques.

https://www.nytimes.com/2017/06/06/us/politics/reality-leigh...


Note that the Times qualified this with: "And there may have been an even more glaring telltale that the F.B.I. did not mention in court filings." (So they were more cautious in that respect than other journalists, unfortunately.) In particular, the government has not acknowledged having used this evidence.


With the serial number of the printer, they can track it through retail channels to the buyer. If someone paid cash, they can review the surveillance footage at the point of sale at the time the sale was made. And when they find you in possession of the printer that emitted the printout, they can conclude you printed whatever it was.


I wonder what is the motivation for printer manufacturers to implement this kind of tracking?


I wonder what is the motivation for consumers to care?


Even without deliberate watermarking, it is probably provable that document x came from printer y, based on how the letters look under a microscope. Relevant, for example, if one were dumb enough to print secret documents at home.


Yes, it probably is. But there is a very big difference between someone proving that a specific printer in their possession was used by comparing printed pages, and someone simply looking at the paper and getting the serial number and the time the document was printed. (And if they have a database of who purchased that printer, they know where they should start to look.)


I wonder if printing the exact same document on the same paper might mess up watermarks


ICYMI this is linked to this thread from yesterday:

"Secret Dots from Printer Outed NSA Leaker"

https://news.ycombinator.com/item?id=14494818


Makes the following book pretty fascinating reading:

https://books.google.com.au/books?id=1XKqCAAAQBAJ


Why create such a list? It seems like things like this should just be in the background to be used by people who may need this information in the future.


Has anyone here tried building a laser printer following Horace Labadie's book Build Your Own Postscript Laser Printer and Save a Bundle?


The best bet is to assume that everything is traceable. The best you can do is run OCR on what you've printed and pass that on, I'd say.


Matrix printers may suddenly get back in fashion.


This seems to be only color printers. Does that mean B&W printers are too hard to add the tracking information to?


The ostensible reason for this tracking feature is to deter counterfeiters (of currency). Since no currency is B&W, there would be no defensible excuse to coerce the manufacturers into complying.


There might still be statistically significant alterations that allow tracking; I'd be very careful.


Do scanners also add unique identifiers to their output? That seems like the low-hanging fruit in this space.


Would printing on yellow paper solve the yellow-dot problem, or would they still be detectable?


Is there a reason that this would be a bad thing, pragmatically speaking?


Not necessarily. In fact the Stasi thought it was a swell idea!


Whether or not it's a good or bad thing relies entirely on each person's own definition of pragmatism itself.

It's practical for my Mom to be indifferent about it. Reality Winner, not so much.


can I just use up all my yellow ink to solve the problem?


Then your printer will stop printing & demand that it be refilled. (At least some are known to do this now.)


Thanks, handy to know for my ransom notes.


Privacy printer startup anyone?


Samsung printer business was just sold to HP, so there goes that option.


Any way to make it print someone else's dots?



