Pelican static site generator 4.5 (getpelican.com)
193 points by dragonsh on Oct 2, 2020 | 100 comments



Pelican is an underdog among static site generators, but it has a very simple mental model [1] of readers, writers and generators, so it is easy to extend and work with. Not sure why it’s not as popular, but after working with Jekyll, Hugo and Nikola, Pelican seems better designed, with a decent code structure. From the documentation:

“What Pelican does is take a list of files and process them into some sort of output. Usually, the input files are reStructuredText and Markdown files, and the output is a blog, but both input and output can be anything you want.”

[1] https://docs.getpelican.com/en/stable/internals.html#overall...
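A rough sketch of the reader/generator/writer split described above, in plain Python. This is not Pelican's code: `read_source`, `generate_context`, `write_output` and the `PAGE` template are made-up names, and the "Key: value" header format is only assumed for illustration.

```python
from string import Template

# Hypothetical page template; a real generator would use a template engine.
PAGE = Template("<h1>$title</h1>\n<p>$body</p>")


def read_source(text):
    """Reader: split 'Key: value' header lines from the body."""
    header, _, body = text.partition("\n\n")
    metadata = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        metadata[key.strip().lower()] = value.strip()
    return body.strip(), metadata


def generate_context(sources):
    """Generator: turn raw sources into a list of article dicts."""
    articles = []
    for text in sources:
        body, metadata = read_source(text)
        articles.append({"title": metadata.get("title", "Untitled"), "body": body})
    return articles


def write_output(articles):
    """Writer: render each article to an HTML string keyed by file name."""
    output = {}
    for article in articles:
        slug = article["title"].lower().replace(" ", "-")
        output[slug + ".html"] = PAGE.substitute(article)
    return output


site = write_output(generate_context(["Title: Hello World\n\nFirst post."]))
```

Each stage only knows about the shape of data it hands to the next one, which is roughly why swapping in a different input format or output target stays cheap. The sketch does no HTML escaping.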


Pelican maintainer here. I really appreciate all the kind words. Maintaining any open source project can often feel like a thankless task, and so all the positive vibes in this thread mean a great deal to me.

We put a lot of thought and work into Pelican, with the intention of making web publishing as easy and fun as we can make it. We welcome your participation as well, so please don't hesitate to help out and contribute! https://docs.getpelican.com/en/latest/contribute.html


I've gotten started with Pelican recently, and so far so good. It looks like it has enough flexibility to do what I can reasonably conceive of wanting to do.

However, its documentation is unfortunately weak for someone coming into it fresh. It jumps into writing content without first walking you through concepts, terminology, and how inputs ultimately get turned into outputs. This is too bad, since it seems like a really nice piece of software, and the kind of issue that might drive away new users.

Even the example link provided to the Internals section is at the wrong level of abstraction for a new user uninterested in the internals to understand the set of concepts they need to grasp to generate a whole site.


Pelican is more popular than Nikola, judging from their Github repos. As for Jekyll and Hugo, I'd surmise that Python is too mainstream for the hip crowd. Plus, Jekyll is the "default" generator for Github pages.


Hugo is written in Go, hippy enough for me. It was my first experience with static websites, and I was surprised how smooth it is: easy installation, easy testing, easy deployment.


I was disappointed how Gitbook, after going the SaaS way with v3, has broken compatibility with their own v2 cli. If you export markdown files from gitbook.io, gitbook-cli v2 is unable to import them and turn them into static HTML files because Gitbook now uses custom liquid tags which are not open-source.


There's a fork of gitbook v2 https://github.com/honkit/honkit

not sure how well maintained it is though

MdBook is another good one, as hip as it goes, written in Rust no less :-)


Jekyll is written in Ruby, but you're right that being used for Github pages makes it popular.


m000 is saying that the usage of Jekyll (Ruby) and Hugo (Go) is driven in part by hip trend-following programmers not wanting to use Python, not that Jekyll and Hugo are written in Python.


Right, I think the counter here is that Ruby isn’t trendier than Python is.


Not anymore, but it sure used to be, around the time Jekyll came into the mainstream.


Having worked with both Pelican and Jekyll, I couldn't agree more. Pelican is simple and easy to install, compared to what I had to go through with Jekyll.


IMHO, anything python is crazy complicated to install. rbenv / rvm are 100x better than anything python has attempted to produce.


I can't relate. Here is how I install pelican on my niche Linux distro:

    sudo xbps-install pelican

I've used rbenv before and found it to be a PITA.


In what ways is pyenv inferior to rbenv?


Another vote for Pelican. I've been using it since around 2012. Easy to use, and usually you can find a plugin for what you need. If you can't, it's not hard to write one, as long as you know Python. I wrote something to allow me to use org files as input, instead of md or rst. Many others have as well.


Sounds like make(1).


Yea I made a static site generator with about 50 lines of code using make and pandoc:

http://stephenbalaban.com/static-site-generation-in-50-lines...

Most of the work is just creating the index.


make would be a nice companion, but it does not process files, it handles the order in which files are processed (indeed, static site generators and make go together quite nicely)


Pelican does in fact ship with a makefile for running tasks.


Here comes the "I use org mode BTW" comment.

When it comes to my personal websites, I use org mode, because publishing is actually an exception in my day to day life. I use emacs for org mode only, but I use it all day long to keep notes, track todos, document projects, write prose and messages. Sometimes, I think something is worth publishing, so I mark it with the appropriate tags, run the publish command and let syncthing update the websites.

I have used hugo for a dozen significant projects and I like it. I have chosen hugo over the alternatives because I prefer go to javascript, ruby and python in almost all possible regards. I even teach classes to journalism students with hugo and gitlab.

Static site generators are great for "projects" but as an individual willing to share some mildly interesting content as a side effect of his daily activities, I still consider them too much work and maintenance.


Have you considered org-mode to markdown and ikiwiki? That seems to be the simplest single-user publishing system I’m aware of.


No, I have not. I have chosen org for its numerous features and publishing is just one of them.


For anybody that wants to start with pelican I had written a tutorial post about it at https://spapas.github.io/2013/10/07/pelican-static-windows/.

Also that blog is made with pelican and hosted with GitHub pages so you can see some of pelican's features. I've been using it for more than 7 years and am really happy with it (the fact that I am familiar with python and can debug some problems myself definitely helps)


I’d very much like to commend whoever posted this for saying what Pelican is right in the gorram title!


Thank you for the firefly reference


Does Pelican allow using external content files that are not written in Markdown? I'd like to load content from a JSON file to enrich certain pages or even create whole site structures dynamically. I couldn't find anything related in either the documentation or the plugin repo.

edit: It seems one might be able to write a custom reader for that according to https://docs.getpelican.com/en/stable/internals.html#how-to-...
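For what it's worth, a custom reader along the lines of the one in the linked docs might look roughly like this. The JSON layout (`title`, `date`, `body` keys) and the names `parse_json_document` and `JsonReader` are assumptions for illustration, not something Pelican defines. The `try`/`except` around the import is only there so the parsing helper can be tried without Pelican installed.

```python
import json

try:
    from pelican import signals
    from pelican.readers import BaseReader
except ImportError:  # lets the parsing helper run without Pelican installed
    signals = None
    BaseReader = object


def parse_json_document(text):
    """Split a JSON document into (html_content, raw_metadata).

    Assumed (hypothetical) layout: {"title": ..., "date": ..., "body": ...}.
    Everything except "body" is treated as metadata.
    """
    data = json.loads(text)
    content = data.pop("body", "")
    raw_metadata = {key: str(value) for key, value in data.items()}
    return content, raw_metadata


class JsonReader(BaseReader):
    enabled = True
    file_extensions = ["json"]

    def read(self, filename):
        with open(filename, encoding="utf-8") as handle:
            content, raw_metadata = parse_json_document(handle.read())
        # process_metadata turns strings such as dates into Pelican's own types.
        metadata = {
            key: self.process_metadata(key, value)
            for key, value in raw_metadata.items()
        }
        return content, metadata


def add_reader(readers):
    readers.reader_classes["json"] = JsonReader


def register():
    # Called by Pelican when the module is loaded as a plugin.
    signals.readers_init.connect(add_reader)
```

A reader like this still produces one article or page per file, so deeply nested or arbitrary JSON would need to be flattened into that shape first.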


Use the reader plugin pelican-json-reader [1] and modify for your own use case.

[1] https://github.com/jessaustin/pelican-json-reader


Thanks. It seems that Pelican is very restrictive in what content structures it allows. Do you know whether it can work with arbitrary JSON structures somehow?


In my other post I put a link [1] to pelican internal model. You can customise it the way you want as it’s a simple model. It’s not restrictive. You can write a plug-in to do what you would like and it’s easy with version 4.5.

[1] https://docs.getpelican.com/en/stable/internals.html#overall...


Write a short Python or JavaScript script to do what you want and put it in a build script.
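As a sketch of that build-script idea: convert each JSON record into a Markdown source file before the generator runs. The record keys (`slug`, `title`, `date`, `text`) are hypothetical; the `Title:` / `Date:` header lines follow the metadata style Pelican accepts at the top of Markdown files.

```python
import json
from pathlib import Path


def record_to_markdown(record):
    """Render one JSON record as Markdown with a metadata header."""
    header = [f"Title: {record['title']}", f"Date: {record['date']}"]
    return "\n".join(header) + "\n\n" + record["text"] + "\n"


def export_records(json_text, content_dir):
    """Write one .md file per record and return the created paths."""
    content_dir = Path(content_dir)
    content_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for record in json.loads(json_text):
        target = content_dir / f"{record['slug']}.md"
        target.write_text(record_to_markdown(record), encoding="utf-8")
        written.append(target)
    return written
```

Run something like `export_records(Path("data.json").read_text(), "content/generated")` (both paths made up here) as the first step of the build, then call the generator as usual.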


My blog (ahelwer.ca) uses Hugo hosted on gitlab pages; I notice that privacy badger flags two trackers on it, cloudflare CDN and google fonts. I'm assuming gfonts is because the Hugo theme I'm using uses a google-provided font and cloudflare CDN is there because it's automatically added by gitlab or something; anyone know how you can create a completely tracker-free experience? Do you basically have to stand up your own webserver on a cloud provider (or at home)? I guess without cloudflare or some other CDN you'd also have to hope your server can handle serving all the traffic directly. Or would a different static site generator like pelican help?


You should add a CNAME record for the www subdomain. As it stands now, appending "www." results in a server not found. Browsers append that automatically sometimes.

The cloudflare CDN is there because your theme includes "auto-render.min.js" and "katex.min.js" from cdnjs.cloudflare.com. They are in script tags in your homepage HTML. You can verify this with:

    curl -L ahelwer.ca | grep 'auto-render.min.js'

You can check my domain if you want an example of a CNAME record:

    dig CNAME www.jeremypoole.ca

I am also using Hugo, I modified a theme called "coder".

As far as avoiding trackers, it would depend on what you mean by tracker. In your case, the external requests are for fonts and a couple of JavaScript dependencies. You could easily host those yourself instead of using the CDN, and then there would be no third-party HTTP requests.

As a fellow .ca HNer, I would gladly help you out. Feel free to hit me up.
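If you want something a bit more systematic than grep, a small script can list every other host a page pulls scripts, stylesheets or images from. This is only a sketch using the standard library. `ExternalHostFinder` and `third_party_hosts` are made-up names, and it will not see requests made from inside CSS (for example fonts loaded by a stylesheet) or by JavaScript at runtime.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostFinder(HTMLParser):
    """Collect hosts referenced by src/href on script, link and img tags."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "link", "img"):
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).netloc
                # Relative URLs have an empty netloc and are same-site.
                if host and host != self.own_host:
                    self.hosts.add(host)


def third_party_hosts(html, own_host):
    """Return the sorted list of hosts other than own_host."""
    finder = ExternalHostFinder(own_host)
    finder.feed(html)
    return sorted(finder.hosts)
```

Feed it the homepage HTML and your own domain. An empty list means no third-party requests from those tags.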


GitLab Pages also apparently needs a TXT verification record for the subdomain, otherwise it'll serve a 401. (GitHub Pages does not seem to require this.)


I host my blog on netlify.com, works great and automatically builds on git push. Instead of cloudflare CDN they use their own CDN network.


I also use Netlify. On the free tier of Netlify a site is deployed to 6 points on the Netlify Edge CDN (Frankfurt, Singapore, San Francisco, New York, São Paulo, Sydney, Moscow as of August this year, but it might have changed).

On the free tier of Cloudflare a site is deployed to 194 locations. That can knock a couple of hundred ms off of a page load time. It might be worthwhile if you need something to be especially fast.

That said, Cloudflare is much more of a hassle to set up. Netlify is ludicrously easy. That's one of the many things I like about it.


Why is Cloudflare more of a hassle? Point your nameservers to them, done.

I use Cloudflare in front of Netlify.


Step 1: set up on Netlify

Step 2: add the domain on Cloudflare and let it grab the DNS entries Netlify made


Gitlab pages doesn't inherently use trackers - See link below showing Privacy Badger for my page that's also built with Hugo. Your trackers probably come from your Hugo template, I had to edit the template to remove them.

https://otekeng.gitlab.io/privacy_badger.PNG


You could get your own hosting, and not use Cloudflare. You could then host the font on your own site.

My site doesn't load any files from a different domain since I removed the Twitter share button javascript[0], except for one page.

I moved the mailchimp signup to a separate page to isolate their JS code to just that page.

[0] https://tacosteemers.com/articles/2020-08-30-twitter-share-b...


To add to my answer:

If we depend on other people's hosting and themes then we just kind of have to accept what they offer.

But getting our own hosting doesn't have to be difficult. There are many cheap providers that give sftp access, for example. Pelican comes with a script for uploading to sftp.


> You could then host the font on your own site.

Only if you have a licence for it — font licences are typically prohibitively expensive for individuals' use.


All Google Fonts are open-source.


That's ... interesting. I must admit, I think somewhere in the back of my mind I knew that, but I'm so used to font services requiring to be used remotely, I hadn't even really considered that. So there's absolutely no breach of any kind of law or terms if you host them locally? That could be a game-changer, thanks!


> So there's absolutely no breach of any kind of law or terms if you host them locally?

Open source doesn't necessarily mean free to use for whatever you want. Which licence a font on Google Fonts uses varies - common ones are Apache and Open Font Licence, but you should check the licence file in any font pack you download.


You can 'npm install' any of the google fonts, there are a variety of options like webpack plugins. The only thing you miss out on is apparently they serve slightly different variants to different OS/device types if you use google directly.

Personally, I avoid third-party requests. CDNs are so HTTP1.1.


The licenses used are quite liberal, so pretty much. (I'm not a lawyer)


From a quick look at what happens, it would seem cloudflare is used to load katex.js

You may be able to disable this in the theme config if you don't use it.


Ahhh thank you, looks like that's used to render MathML, which I do use. Perhaps I can add katex.js to the git repo and serve it from wherever it is gitlab hosts their websites.


I use pelican-katex plugin to render math once during website generation, and then not have to load the katex.js file at all. Only a katex.css file and fonts loaded from my own server.

https://github.com/cqql/pelican-katex


I'm no hugo expert, but that's likely possible. After all, the JS files are static content. You can probably modify the theme's template to load the js from the origin.

I guess the whole point of using cloudflare as the cdn is so that people can reuse this from cache.


That's an interesting trade-off. I guess static sites are small enough that downloading an extra script wouldn't be terrible, by modern heavyweight website standards. Still, maybe I should look into what cloudflare CDN actually does with its data.


The google font issue is obviously down to you, but GitHub pages doesn't add anything that gets flagged by Privacy Badger.


Thanks for this question, it allowed me to tune some of the cloudflare optimizations I didn't need (Rocket Loader) and remove the dependency for the script pull and thus the tracking flag with privacy badger! I hadn't even thought to check if my own site met my own personal bar/criteria for trackers and ads!


Your post seems to highlight why I'm wary about these extractable site generator things - it appears that people much smarter than I use them out of the box but don't really peer into that box, so end up with queries like yours.

Whereas dumb people like me hand code a site and keep small chunks of code or 'boot strap' type folders where we can re-use old assets quickly and with the knowledge of exactly what's in them.

I don't use Hugo, etc, but it seems odd to me that it's not abundantly clear how to nix CDN and Google asset references from the site generation code.


> I don't use Hugo, etc, but it seems odd to me that it's not abundantly clear how to nix CDN and Google asset references from the site generation code.

All static site generators do is take a bunch of inputs (markdown files, templates, config, images, etc.) and process them to output a directory of HTML and assets. It's usually trivial to do a search on the input side and identify the source of the problem code. Commenting it out and reprocessing results in it being removed.

You only run into complications when there is some odd dependency between the static site generator and the inputs. Generally, it'll delegate everything to the templates it uses (good separation of concerns), but sometimes there is a weird config option, or convention you are supposed to use.


Static site generators are really useful for ongoing projects with new pages being added all the time. I hand-roll my discrete "here's a page for a thing" sites, and use SSGs for things like blogs, classes, etc., stuff where I might just decide to add a page when I'm sitting on a train, and would like to just be able to type it out in markdown on my phone, push a commit, and have it appear.


No disrespect intended, I've just never enjoyed all the tinkering that comes with managing this stuff myself. I just want to write & publish my posts somehow; used to use medium but then they added all these popups and annoying things, so I switched to Hugo. Now, it might be a good stepping stone to eventually just setting this all up myself, since I've been frustrated with the limitations of markdown & hugo shortcodes and occasionally just want to write things in HTML (yes I know you can write HTML in hugo markdown pages). But I never would get to that stage straight from hosted blogs like medium, so it is good that static site generators exist I think.


No disrespect assumed! :) tbh, when I see people using things like Hugo, I tend to assume they're on a part of the web dev spectrum that is equal-to or further-along than mine. This is my failing. I made assumptions about your situation. I just tend to assume that people default to things like Wordpress when they need to exist in that hinterland between publication and bother.

Perhaps I need to look into these things a bit more!


I have tried a couple of them. Then I realised all I needed was some HTML, a few lines of CSS, and zero JS and definitely no static site generator.

I am not saying no one else needs it. However, the majority of sites I've seen that are generated via static site generators are using static site generators just because they can.


If it's more than two pages I'd use one just because I hate copy-pasting stuff for headers/footers. I know vanilla PHP allows templating but static hosts can't run that. If only HTML allowed for importing fragments.
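A tiny sketch of that "import a fragment" idea as a build step. The `<!-- include: name -->` marker syntax is invented here, not any standard, and `expand_includes` is a hypothetical helper.

```python
import re

# Hypothetical marker: <!-- include: header.html -->
INCLUDE = re.compile(r"<!--\s*include:\s*([\w.-]+)\s*-->")


def expand_includes(page, fragments):
    """Replace each include marker with the fragment of that name.

    Raises KeyError if a page asks for a fragment that does not exist.
    """
    def replace(match):
        return fragments[match.group(1)]

    return INCLUDE.sub(replace, page)
```

Load `header.html` and `footer.html` into the `fragments` dict once, run every page through `expand_includes`, and upload the result to the static host.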


Something like staticjinja makes simple edit-side templating quite reasonable. It even does live rebuilding. Nice simple lightweight way to do static site building if you don't need all the CMS-type stuff.

https://github.com/staticjinja/staticjinja


You can use PHP while hosting in static if you generate all the pages upfront. This is what I do for my website (3 pages, soon 4, for i18n). In addition to i18n, it allows me to generate most of the tedious context by factoring them in functions. I build with a 4-line script, and upload with another.


I found Sergey[0] a great solution for this one particular problem.

[0] https://sergey.cool/


FWIW, a leaderboard of SSGs:

https://www.staticgen.com/


Why is this space so bloated?


Probably because of bike shedding. It’s not exactly hard to write some simple code from scratch to read the input files, pass them through various parsers, templates, processors, and write out the result. The heavy lifting is done by the processors, which you typically don’t need to write yourself.

So an SSG is an attempt to put an abstraction on top of this, which may have some benefits but also introduces a lot of relatively uncalled for complexity and fragile dependency chains. And because the nature of the task is very heterogeneous, the API will either have to be very allowing (and less useful as an abstraction) or impose various assumptions and restrictions on how you do things. That’s very fertile ground for a lot of people to want to create their own take on it as they run into limitations of existing SSGs.


100% agree. Thus, the real question is not:

  "Why are there so many static site generators?"

It is rather:

  "Why isn't there a static site generator generator?"


> It’s not exactly hard to write some simple code from scratch to read the input files, pass them through various parsers, templates, processors, and write out the result.

I think this is it exactly. You could roll your own if you wanted and package it as a library without too much difficulty and I haven’t found compelling reasons why any one static generator is superior to another. It seems to come down to which generators are supported by default by your hosting service, or your language of preference if you’re building custom extensions.


I think Jekyll is only popular since it's the only official static site generator supported by github pages


I'd never heard of Pelican but I have to say their site is doing a really bad job at explaining what it does and how :)

I'd expect some screenshots and some points. All I learned from reading it is that it supports markdown but not really what kind of site it's for (e.g. blogs like wordpress etc).

I'm looking for a static generator to make a blog (static mainly for security and because I don't want to do anything complex anyway). But not sure if this will suffice.


That's a dev-blog, try: https://docs.getpelican.com/en/stable/

The site itself is sort of the screenshot.


I am a Pelican user and I love it. My blog is made with Pelican. I also tried Hugo, Gatsby, and Jekyll, but they require much more orchestration than Pelican. There are two areas that can be improved: documentation and themes. I had difficulties trying to customize my blog (syntax highlighting, integration with jupyter notebooks, latex support, and so on). Most of the time the answers were in another blog, not in the docs. Also, the default theme is not attractive for new users. That keeps them from even giving it a try.


A comprehensive list of static site generators can be found at https://www.staticgen.com/


It's a good list, but the best static site generator is a human person writing .html files. All this static site generator stuff is unneeded complexity for those that enjoy complexity.


> All this static site generator stuff is unneeded complexity for those that enjoy complexity.

I don't think so, and will give a few examples why. I like Pelican because when working manually it is easy to mess up some HTML tags, forget to update an article listing, category listing, or RSS feed. Creating and adding PlantUML diagrams is more work manually than it is when integrated with Pelican. Pelican even tells me when I accidentally mess up an internal link.

It simplifies my publishing such that I actually have enough time and energy to do so, and in that sense it is not unneeded complexity.
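To illustrate the kind of check meant by the broken-link point, here is a sketch that scans generated HTML for root-relative links pointing at pages that don't exist. This is not how Pelican does it; it is a separate post-build check, the names are made up, and it deliberately ignores relative links like `../foo.html` to stay short.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def broken_internal_links(pages):
    """pages maps site paths ('/about.html') to HTML.

    Returns (page, href) pairs for root-relative links with no target page.
    """
    broken = []
    for path, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            parts = urlparse(href)
            if parts.scheme or parts.netloc:
                continue  # external link or mailto:, not checked here
            if parts.path.startswith("/") and parts.path not in pages:
                broken.append((path, href))
    return broken
```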


These static site generators are mostly operating under the same conceptual model of what a website is and that's both very restrictive and brings in requirements that aren't actually requirements. Posts and categories and date dependent lists. Blogs aren't the only way. You can just make pages. You don't have to 'publish' and have every single page be exactly like every other page on your site.

Maybe they should be called static blog generators instead.


Pelican does have support for random pages. It also doesn't stop you from copying your own pages to the output directory after generating. In that sense it is more of a markdown or RST to HTML converter. It has great blog support but doesn't exclude other uses, such as documentation generation.

PS. I liked your post about the returned parrot.


I had the same frustration, but my experience with Gatsby and Hugo is that you just need to get past the how-tos and quick-starts. To get started quickly, you need a lot of structure in place, and most people just want a blog, so that's what the starter themes and tutorials cater to. If you get past the tutorials and get comfortable in the official documentation, the functionality turns out to be more general.


Partially agree with you.

When I tried Hugo the first time, it took me more than a day to have a website with the look and feel I wanted. The static generators are good for getting some initial website up and running, but they have so many hidden and poorly documented features that it takes an enormous amount of time to customize them.

On the other hand, when the site is up and running, changes are very fast and easy - mostly because of the markup language support.


Has anyone compared Pelican to Zola? I use Lektor myself (and love it), but I sometimes find myself pining for something fast and self-contained.


Author of Zola, who used to have sites in Pelican, here.

Pelican is way more customisable, you can write Python scripts to do whatever you want. The downsides: you need to have a virtualenv and it is pretty slow (especially syntax highlighting). It is awesome overall though, my favourite after Zola ;)


Zola [0] has a comparison table between Zola, Cobalt, Hugo and Pelican. I tried out Zola because I saw somewhere that it was single binary (had heard of Hugo, but didn't know it was single binary). I moved my blog from Jekyll to Zola as I found a theme I liked.

[0] https://github.com/getzola/zola


Ah, that's useful, thank you. I'm also looking for some personal experience, because, for example, while Hugo looks great on paper, its data model and template language frustrated me to no end the one time I tried to make something with it.


Zola is fast and simple, but there is a hard upper bound on what it can do, and the author has no desire to change this. One example is requests to support PostCSS instead of SASS. Zola's author doesn't want a dependency larger than the entire application.

I'm currently using Zola + python/js scripts to do things Zola can't and considering a move to something like Pelican.


> One example is requests to support PostCSS instead of SASS

In practice, to use PostCSS you need npm to install plugins, and maybe a `postcss.config.js`, so there is no advantage to having that PostCSS call in Zola's `config.toml` rather than as a script in `package.json`.

I don't really see how other SSGs would handle that. Hugo just shells out a command to `postcss-cli` that needs to be installed (https://gohugo.io/hugo-pipes/postcss/), along with whatever plugins you are using. It's not really something "built-in".


I have used Jekyll for years on a GitHub pages site. I like working with Jekyll overall, but the site has grown to a moderate size and now it takes about 8 seconds to render changes locally. When I started a new personal site last year I went with Hugo because one of its main selling points is speed.

Can anyone share how Pelican's rendering speed compares to Jekyll and Hugo?


I migrated from Pelican to Hugo. Build/render times are a fraction with Hugo. But my main issue with Pelican was git submodules for plugins. With Hugo I just find a theme I like and then everything works.

For the record, I use Python daily and I’ve never written a line in Go. But with these static site generators, most people won’t ever need to look under the hood.

Edit: I see the main feature for Pelican 4.5 is namespace plugins, which should prevent some headaches.


> but the site has grown to a moderate size and now it takes about 8 seconds to render changes locally

Have you tried the --incremental flag after your initial creation of the page?

My site has 300 blog posts, over 150 drafts and dozens of other pages but when I'm actively writing a post there's about a 1 second delay between saving the file and seeing it reload in the browser with live reload using --incremental and about 20 seconds without it.
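The basic idea behind incremental builds can be sketched in a few lines: only regenerate pages whose source is newer than their output. This is an illustration of the concept, not how Jekyll's `--incremental` is implemented, and `collect_mtimes` / `stale_pages` are made-up names. A real implementation also has to track templates, includes and config that a page depends on.

```python
from pathlib import Path


def collect_mtimes(directory, suffix):
    """Map page name (file stem) to modification time for files with suffix."""
    return {
        path.stem: path.stat().st_mtime
        for path in Path(directory).glob("*" + suffix)
    }


def stale_pages(source_mtimes, output_mtimes):
    """Return names whose output is missing or older than its source."""
    return sorted(
        name
        for name, changed in source_mtimes.items()
        if output_mtimes.get(name, float("-inf")) < changed
    )
```

Something like `stale_pages(collect_mtimes("content", ".md"), collect_mtimes("output", ".html"))` (directory names assumed) then gives the list of pages worth re-rendering after a save.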


Thank you, and to others as well who have mentioned --incremental! I can't believe I never thought to look at the documentation for `jekyll serve`. This will make my life much easier.

Another note on using Hugo. I have really enjoyed working with Hugo, both for the speed and for the overall experience. I ran into an issue where the table of contents was not rendering properly. I believe it's a known bug, perhaps just with some themes. Since I'm not familiar with Go yet, I came up with a quick fix by writing a quick Python script that converts all the incorrect TOC links to the correct links for my site. It's plenty fast for me, and I can always make it faster by using bash or sed, or learning a little Go and patching Hugo. At some point I'd like to do that just to contribute back to the project. So it's quite possible to use a static site generator in a language you're less familiar with, and create quick tooling scripts around it for any issues you run into that you don't want to deal with in the generator's native language.


Haven't used the others but pelican is rather slow. For my blog, it takes a couple of minutes to generate the site and around 10 seconds to render any changes. That's my main complaint about pelican.


I tested Hugo a few days ago on 10K posts and it took 40 seconds for a full build. It builds in a fraction of second if you've only got a few pages.

I don't love the template language but aside from basic template features like if/then/else/not/forEach I don't really find I need to do much fancy when jumping between different static site generators or even WordPress. Hugo's speed wins out for me a lot of the time.


Nothing comes close to Hugo in speed.

However, before migrating I suggest:

- using Jekyll's --incremental flag

- move your asset pipeline into a separate process (webpack --watch)

- reduce any Jekyll gems (each one adds considerable overhead)

If you want to leave Jekyll, I also recommend Eleventy. It's fast, requires very little work to migrate from Jekyll, and can create pages from data.


Pelican is slow. If 8 seconds bothers you, you won't be happy with Pelican. I think there are settings where it kind of detects what has changed and only renders that output to speed it up, but I've not experimented with it. I rarely make changes to my site, so waiting is not a concern for me.


Based on the language alone, hugo is probably the winner here.


In case it benefits others, I did a brief survey of static site generators and while Pelican looked great it had more functionality than I needed.

I ended up using staticjinja instead. And after sticking with it for over a year now, I'd still recommend it if Pelican features like RSS feeds and code syntax highlighting seem like overkill for your project.


Did your survey include Nikola [1]? It seems like the most direct competitor to Pelican, and I'd be curious to see how they stack up against each other.

[1] https://getnikola.com/


Pelican has a lot of features, but it's not as if you need to use all those features. If you want a plain site, it's trivial to use.


Is there any reason to not have or generate RSS/Atom feeds?


I use Pelican, and am happy with it, but I wish it had better options for simple templates/themes.



