
My wife hit a wall trying to upload a hefty PDF - every “shrink” tool we tried barely reduced the size, and some even made the file larger! Frustrated by the state of PDF compressors (looking at you, Adobe), I turned to LLMs - Claude, DeepSeek, and Gemini came up short, but OpenAI’s o4-mini saved the day with a perfect solution. That inspired me to build pdfmini: a tiny, open‑source, client‑side HTML app that crushes PDF sizes right in your browser! No installs, no fees, zero privacy worries - all your data stays on your machine.

Try pdfmini now:

https://den-run-ai.github.io/pdfmini/

Source code for pdfmini:

https://github.com/den-run-ai/pdfmini



This gave me an idea. You seem to be the right person to talk to.

Here is my workflow. I have a bunch of PDFs and images I need to combine.

I go to tools.PDF24.org, merge the PDFs, compress them, compress them again because of size limits, then add or remove pages. Then I add page numbers.

These are multiple steps.

Could we have a way of defining these steps up front, either textual or no-code-like, where we could define stuff like

Take input, merge > compress with greyscale, Max size 1MB, add page numbers on bottom right

Or

Convert input to jpg with image size 8cm by 8cm
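
For the JPG case, the only arithmetic is turning a physical size into pixels: pixels = mm / 25.4 × dpi. A minimal sketch, assuming poppler's pdftoppm is installed; mm_to_px and pdf_to_jpg are hypothetical helper names, and 300 dpi is an assumed default, not something from this thread.

```shell
#!/bin/bash
# Sketch only: helper names and the 300 dpi default are assumptions.

# mm_to_px MM DPI: pixels for a physical length, rounded to nearest.
# Integer math: mm / 25.4 * dpi == mm * dpi * 10 / 254.
mm_to_px() {
  echo $(( ($1 * $2 * 10 + 127) / 254 ))
}

# pdf_to_jpg INPUT PREFIX SIZE_MM [DPI]
# Renders every page as JPEG; -scale-to fits the long side of each page
# into the given number of pixels and keeps the aspect ratio.
pdf_to_jpg() {
  local px
  px=$(mm_to_px "$3" "${4:-300}")
  pdftoppm -jpeg -scale-to "$px" "$1" "$2"
}

# Usage: pdf_to_jpg input.pdf page 80    # 8 cm at 300 dpi, writes page-1.jpg etc.
```

Note that this only guarantees the long side is 8 cm at the chosen dpi; a non-square page will not come out as exactly 8cm by 8cm unless it is cropped or padded afterwards.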

I know many people who simply fail at such stuff. They just throw their hands up in defeat.

I'm not saying we should have LLMs do the job, but it would help to have multiple actions chained together so that people could tell the software what they have in mind.

People don't just compress PDFs; they often merge and then compress.

I recently saw pdfux.com, but it is not as featureful as PDF24, and PDF24 crashes a lot.


#!/bin/bash

# Convert images to PDF
img2pdf *.jpg -o images.pdf

# Merge PDFs
pdfunite file1.pdf file2.pdf images.pdf merged.pdf

# Compress (the /ebook preset downsamples images to roughly 150 dpi)
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH -sOutputFile=compressed.pdf merged.pdf

# Remove unwanted pages (e.g., page 3)
pdftk compressed.pdf cat 1-2 4-end output final.pdf

# Add page numbers ('{}' clears the default empty page style so numbers print;
# --landscape assumes landscape pages, drop it for portrait documents)
pdfjam final.pdf --outfile final_numbered.pdf --pagecommand '{}' --landscape
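
The compress step above uses a single preset, which does not cover the "Max size 1MB" part of the request. One way to approximate it is to try progressively stronger Ghostscript presets until the output fits. This is a rough sketch: file_size and compress_to_limit are hypothetical names, and the preset order is an assumption.

```shell
#!/bin/bash
# Sketch only: function names and preset order are assumptions.

# file_size FILE: size in bytes (works with GNU and BSD wc).
file_size() {
  wc -c < "$1" | tr -d ' '
}

# compress_to_limit INPUT OUTPUT MAX_BYTES
# Tries /printer, then /ebook, then /screen; returns 0 as soon as the
# output is within MAX_BYTES, 1 if even /screen is too large.
compress_to_limit() {
  local in="$1" out="$2" max="$3" preset
  for preset in /printer /ebook /screen; do
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 "-dPDFSETTINGS=$preset" \
       -dNOPAUSE -dQUIET -dBATCH "-sOutputFile=$out" "$in" || return 1
    if [ "$(file_size "$out")" -le "$max" ]; then
      echo "fits with $preset"
      return 0
    fi
  done
  echo "still larger than $max bytes after /screen" >&2
  return 1
}

# Usage: compress_to_limit merged.pdf compressed.pdf 1000000
```

If /screen is still too large, the remaining options are lossy in other ways, for example greyscale conversion (commonly done with -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray) or splitting the file.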


You know what. I will share my script in the morning.

I used scantailor to scan a book. That gave me TIF files.

So I built a script to convert them to jpg, then merge into PDF. Then OCR and add the text layer on PDF. Then compress.
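
Until the real script shows up, here is a rough sketch of what those four steps could look like. It is not the actual script: it assumes ImageMagick's convert, img2pdf, ocrmypdf and Ghostscript are installed, and run and scan_to_pdf are hypothetical names. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch only: not the commenter's script. Tool choice and names are assumptions.

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# scan_to_pdf DIR OUT: DIR holds the .tif pages, OUT is the final PDF.
# Pages are taken in glob (lexical) order, so use zero-padded file names.
scan_to_pdf() {
  local dir="$1" out="$2" tif jpgs=()
  for tif in "$dir"/*.tif; do
    [ -e "$tif" ] || continue
    # 1. TIF -> JPG
    run convert "$tif" -quality 85 "${tif%.tif}.jpg" || return 1
    jpgs+=("${tif%.tif}.jpg")
  done
  if [ "${#jpgs[@]}" -eq 0 ]; then
    echo "no .tif files in $dir" >&2
    return 1
  fi
  # 2. JPGs -> one PDF, 3. OCR adds a text layer, 4. compress
  run img2pdf "${jpgs[@]}" -o "$dir/pages.pdf" &&
  run ocrmypdf "$dir/pages.pdf" "$dir/ocr.pdf" &&
  run gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
      -dNOPAUSE -dQUIET -dBATCH "-sOutputFile=$out" "$dir/ocr.pdf"
}

# Usage: DRY_RUN=1 scan_to_pdf ./scans book.pdf
```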

I know this is a niche automation... but on the web, OTOH, where normies reside and are scared of the terminal, it won't work.

I've been using pdftk for years now, but I'm the only person in my office who can use it.


I'll be adding compression support for BreezePDF, so this can be done in a click


Merge / compress with max size / color or greyscale / remove pages / multi-format import (PDF and images as input) / export options / export into multiple files if the file size exceeds a certain size.

And like my earlier comment, a way to define these multiple steps in a flow so that people can run multiple steps on a single file without having to learn commands.
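
To make the "flow" idea concrete, here is a minimal sketch of a textual spec such as "keep:1-2,4-end > compress:screen > number" being split into steps and dispatched. Everything in it is hypothetical (the spec syntax, run_flow, the step_* names); the underlying gs, pdftk and pdfjam invocations are borrowed from the script earlier in the thread. DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch only: spec syntax and all function names are assumptions.
CUR=""   # file the next step will read; each step updates it

run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

step_keep() {      # keep:1-2,4-end  -> page ranges for pdftk cat
  local out="${CUR%.pdf}.keep.pdf" ranges
  IFS=',' read -r -a ranges <<< "$1"
  run pdftk "$CUR" cat "${ranges[@]}" output "$out" && CUR="$out"
}

step_compress() {  # compress:screen -> Ghostscript preset, default ebook
  local out="${CUR%.pdf}.min.pdf"
  run gs -sDEVICE=pdfwrite "-dPDFSETTINGS=/${1:-ebook}" -dNOPAUSE -dQUIET \
      -dBATCH "-sOutputFile=$out" "$CUR" && CUR="$out"
}

step_number() {    # number -> page numbers via pdfjam
  local out="${CUR%.pdf}.num.pdf"
  run pdfjam "$CUR" --outfile "$out" --pagecommand '{}' && CUR="$out"
}

# run_flow INPUT "step > step:arg > ..."
run_flow() {
  local spec="$2" step name arg steps
  CUR="$1"
  IFS='>' read -r -a steps <<< "$spec"
  for step in "${steps[@]}"; do
    step="${step//[[:space:]]/}"          # drop whitespace around the step
    name="${step%%:*}"
    arg=""
    if [ "$step" != "$name" ]; then arg="${step#*:}"; fi
    if ! declare -F "step_$name" > /dev/null; then
      echo "unknown step: $name" >&2
      return 1
    fi
    "step_$name" "$arg" || return 1
  done
  echo "result: $CUR"
}

# Usage: DRY_RUN=1 run_flow in.pdf "keep:1-2,4-end > compress:screen > number"
```

A drag-and-drop builder could emit the same one-line spec, so the GUI and the command line would share one definition of a flow.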


This is very cool, are all these command-line tools open-source?


Yes


If you can define this as a feature request for pdfmini, please submit it on GitHub, e.g. a drag-and-drop flow builder.


Well so I glanced at what that project does.

Congratulations, you've managed to "compress" PDF files by rasterizing every page to JPEG, while destroying all the vector and textual information in it.

The resulting PDF is nothing like the input -- it's just a bunch of blurry JPEG images wrapped in a PDF format.

You can't search or copy the text, and trying to print it will just make a blurry mess of the text.
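
Anyone can check this on their own output. A minimal sketch, assuming poppler's pdftotext is installed; has_text_layer is a hypothetical helper name. It only tells you whether any extractable text survived, not whether vector graphics did.

```shell
#!/bin/bash
# Sketch only: the helper name is an assumption.

# has_text_layer FILE: succeeds if pdftotext extracts any non-whitespace
# characters, i.e. the PDF still contains searchable text.
has_text_layer() {
  local chars
  chars=$(pdftotext "$1" - 2>/dev/null | tr -d '[:space:]' | wc -c | tr -d ' ')
  [ "$chars" -gt 0 ]
}

# Usage:
#   has_text_layer original.pdf   && echo "original: text present"
#   has_text_layer compressed.pdf || echo "compressed: no text layer left"
```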


Nailed it. I requested a 50% compression for a 200MB PDF file that contained pictures, and the tool turned it into an illegible mess. I can't imagine using this tool for anything serious, like tax returns, that requires a machine-readable file.



