[dupe] Busiest hour ever in the history of Gov.uk 17M page requests just 21 5xx errors (twitter.com/therealnooshu)
230 points by robtaylor on Jan 5, 2021 | 103 comments




I have lived in five countries now, but not one (local or state) government had an online presence that was nearly as good as the gov.uk infrastructure.

Their websites might not look the part, but they excel (again, in my experience) in their practicality, accessibility and ease of use. And that is exactly what you would expect from a government website.


>Their websites might not look the part, but they excel.... accessibility....

I remember they had a design document (something similar to [1] [2], but not the exact one I was looking for) where they list accessibility as their number one priority, along with a very long list of accessibility requirements, simply because a government website needs to be accessible to everyone. That means there are certain fancy things and colour schemes they can't use.

I wonder if the stack for Gov.UK is the same as their Petition Site [3] or was it based on something else? They have an open source GitHub account [4] for Government Digital Service which lists everything they have open sourced.

[1] https://www.gov.uk/guidance/government-design-principles

[2] https://design-system.service.gov.uk

[3] https://petition.parliament.uk

[4] https://github.com/alphagov


If there was any remaining doubt about the UK Gov's design chops, check out their Github projects. Just to highlight one... they've published one of the most accessible and intuitive reverse-engineering tools I've ever used: https://github.com/gchq/CyberChef


There's also huge amounts of UK Govt infrastructure code available online and open source. For all kinds of things, from all departments.

You could pretty much spin up the code needed to run the boring bits of a country out of open-source code. How about the application for booking secure prisoner moves? [0] Or some nice open source security guidance? [1]

And that's just one department, most others are similar!

[0] https://github.com/ministryofjustice [1] https://github.com/ministryofjustice/security-guidance


Or run your own version of government tax system https://github.com/hmrc



Slightly different group of people working on GCHQ's recruiting tools and gov.uk, I would imagine.


It's worth noting that GCHQ and GDS (Government Digital Service, responsible for gov.uk sites) are different.


How have I never seen this before? This is amazing! I've been looking for something like this for ages.


We have some form of this in Norway too where companies can be fined for having inaccessible web pages. We will also adopt the EU wide Web Accessibility Directive (WAD).

The department overseeing this has guidelines [1] and tests for web pages/apps/forms etc [2]

[1] https://www.uutilsynet.no/wcag-standarden/wcag-sortert-etter...

[2] https://www.uutilsynet.no/regelverk/testprosedyrar-nettstade...


https://www.gov.uk/service-manual

The service manual is an awesome piece of work.


Maybe you mean that Govt. jobs in the UK are good?


If you're looking for another example that does it well, I'm currently in Singapore. Their websites are fantastic: highly functional and relatively good looking [0].

They have SingPass[1] which is a centralized OAuth for government ID that lets me into things like taxes, identity, etc.

There's a fair argument someplace for too much reach/tracking, but it works really smoothly.

[0] https://www.gov.sg/

https://www.mom.gov.sg/newsroom/press-releases/2020/0624-pub...

[1] https://www.singpass.gov.sg


This doesn't strike me as functional as gov.uk. It's more like your standard wordpress stories site (there is even a section labelled "stories").

edit:

> The www.gov.sg Portal is the official online communication platform and repository of the Singapore Government, providing you with the latest policy announcements, information and news on Singapore. You can also translate Government terms using our translating tool, and search for contact information of public service agencies.

> The Portal is managed by the Public Communications Division of the Ministry of Communications and Information. We welcome your suggestions for the Portal, to help us to serve you better.

Indeed, it's more a branding website than an administrative tool.


I on the other hand find the design of gov.uk websites one of the best out there.


I agree UK services are ahead of other countries I've experienced but they are still terrible in my experience. That said, I know very well some contractors that worked on gov.uk and they're top notch. That's probably where the technical excellence comes from. The bureaucracy and process are the problems.

They're much worse than any UK private company or UK bank I had to work with.

HMRC is terrible: there are tons of errors on the payslips, they mess up tax codes all the time, part of the website uses a previous UI and not the cool modern one, and several functions need you to call and queue for 1hr or send letters and hope. Having an accountant helps as they have access to a special version of HMRC they can use to do things mere mortals can't.

Childcare services are embarrassing and require you to have a subaccount in your government account. I've spent 3 months without access because they asked me security questions for the childcare service on HMRC; I provided HMRC security answers and got locked out; the procedure to restore the account involved verifying my identity by providing certified copies of various documents via email. They were extremely slow in replying (sometimes not replying at all), they never returned any copies or originals I sent and they were extremely picky about certifications so I ended up having to send 3-4 certified and original documents.


Most of the problems tend to occur when you are forced to interact with legacy versions of tools that were built prior to the buildout of the new gov.uk.

The old sites are all over the place.

I think part of the reason gov.uk has been a success is exactly that they were not forced to replace old systems wholesale, but were allowed to build out a platform and methodology that works well and replace public facing services bit by bit.


I interviewed at a large UK gov Ministry for a technical architect role a couple of years ago and one of their interview questions was:

"How would you architect new features on top of a legacy system implemented by an external agency that is hard to work with and has many bugs but we are unable to change in any way for contractual reasons?"


I am curious to know how one would answer this. Is it related to DDD concepts like maintaining upstream/downstream dependencies, bounded contexts and migrating piecemeal?


How did you answer?

It isn’t a technical problem...


Sure, it is a technical problem too. It's a technical workaround to a political problem of not being able to touch the other platform.

My approach to this has been to embrace, extend, extinguish (yes, that was an intentional reference, sorry):

Insert a constrained API "around" the service you can't modify. Change all your services to go via that API instead of direct. Now you can apply the fixes you can to the new layer while finding ways to gradually replace it in parallel.

At the same time the new layer provides a constraint: It should be as narrow as you can get away with - creating a "new contract" with the API consumers about what they're actually allowed to do without negotiating with you, and ensure that any expansion of dependencies on the service you can't touch get documented and appropriate attempts made to avoid increasing complexity (it's good to do this in general, but realistically it's easier to get buy-in to enforce in limited cases like this).

This intersection of tech with procedures and politics lies at the heart of a lot of technical architecture work. A huge proportion of the work we do is not technically hard in and of itself - it becomes technically hard because of organisational considerations.
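A minimal sketch of the "constrained API around the service you can't modify" idea, for anyone who wants something concrete. Everything named here (`LegacyBenefitsSystem`, `ClaimantGateway`, the field names, the whitespace bug) is hypothetical and only stands in for whatever the untouchable system really looks like:

```python
from dataclasses import dataclass


class LegacyBenefitsSystem:
    """Hypothetical stand-in for the external system we are not allowed to change."""

    def fetch(self, raw_query: str) -> dict:
        # Pretend upstream bug we cannot fix: names come back with stray whitespace.
        return {"NAME": "  Ada Lovelace ", "STATUS_CD": "A", "query": raw_query}


@dataclass(frozen=True)
class Claimant:
    name: str
    active: bool


class ClaimantGateway:
    """The narrow "new contract": consumers may only look up a claimant by id."""

    def __init__(self, legacy: LegacyBenefitsSystem) -> None:
        self._legacy = legacy

    def get_claimant(self, claimant_id: int) -> Claimant:
        record = self._legacy.fetch(f"ID={claimant_id}")
        # Workarounds live here, in the layer we own, not in the legacy system.
        return Claimant(
            name=record["NAME"].strip(),
            active=record["STATUS_CD"] == "A",
        )


gateway = ClaimantGateway(LegacyBenefitsSystem())
claimant = gateway.get_claimant(42)
print(claimant)
```

Because consumers only ever see `ClaimantGateway`, the legacy backend can later be swapped out behind it without touching them, which is the "extinguish" step.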


i'd expect nothing less. only surprise is that there hasn't been any 'hypothetically' here.


I've been doing self assessment for 20 years now and it almost fills itself in these days - the P60 stuff and pensions etc do anyway. Getting your car taxed is embarrassingly simple and takes seconds as does checking your car MOT and insurance status. I don't have an accountant for SE and I do my wife's as well.

The amount of documentation available is amazing. This is a great example of what's available: https://www.gov.uk/transition . Scroll down to Announcements, hit the first link and you have the full withdrawal agreement in multiple freely open formats.

No, it's not perfect, but it is rather good, going by this user's anecdote.


I have always had to fill in pension and P60 by hand (not that it is particularly hard); has this changed in the last year? Which reminds me that I need to do the self assessment before the end of the month.


I think it (P60 and Pension) was filled in last year for me as well. The service is being changed quite quickly but I can imagine the pain behind the scenes. Screens that used to look like the old green/yellow thing are becoming rarer and the front screen has more options.

My overall impression is that we have a rather sensible GOV.UK thingie (not sure of the real brand)

Make sure you settle up before Jan 31 otherwise you'll be fined. HMRC are merciless. Get it done mate.


I started the return a couple of days ago and the P60 was indeed filled in; I'm pretty sure that it wasn't last year, but I might misremember. The P11 was not, but I don't think I have anything to report as everything is taxed at source for my current employer (I'll double check of course).

The pension (Assuming you refer to the private pension scheme) is not filled in and I already know I have to pay some tax on that as I overshoot my allowance.

I normally get it done a couple of weeks before the 31st and it doesn't take more than a couple of days even with triple checking everything, so I should be well in time.

And yes, TBH filling in the tax return has never been a big problem. It would be nice to automate a few more things (like computing exchange rates for dividends/capital gains instead of having to look them up) but it is fairly straightforward.


What's wrong with their aesthetics? I've always found their simple design to be quite nice.


I think he means they don't look particularly "high tech" or something. Which I'd say is a feature.


It's simple and clear. The aim is to convey information, not to be fancy and bloated, indeed.


And gov.uk's clear design looks better than most of the wild wild web out there, to be honest.


I believe it is: given the focus and care on accessibility for all users, there's a limit to how many whizbang widgets are ok to use.


Gov.UK had a good article a while back (appeared on HN) where they pointed out the importance of designing for no JS. All official gov.uk sites that I'm aware of will work without JavaScript.

They had an analysis showing how many % of their visits had broken JS support and therefore the need to build for the lowest common denominator.


https://www.gov.uk/service-manual/technology/using-progressi...

When a team builds a service using the service manual as a guide so many things become so much easier.

Need to explain something to a non technical manager? The service manual has got you covered:

> Progressive enhancement is a way of building websites and applications. It’s based on the idea that you should start by making your page work with just HTML, before adding anything else like Cascading Style Sheets (CSS) and JavaScript.


That's the one, progressive enhancement.

A great principle, and nice to see they even think about Lynx accessing the sites [0]

The post I was thinking about with the analysis of why JS/CSS shouldn't be assumed to work was [1].

[0] https://technology.blog.gov.uk/2016/09/19/why-we-use-progres...

[1] https://gds.blog.gov.uk/2013/10/21/how-many-people-are-missi...


I was about to ask this myself. I think their simple design is quite intentional, and it perfectly fits its purpose.

Addendum: plus being accessible (essential for any govt. service).


It all works with JS disabled. Their accessibility callouts are amazing, and they provide you callouts wherever accessibility-related concerns have not been closed (and a form to raise it as well).


UK only excels in static .gov.uk serving. They are great for informational purposes. The best one actually is the one serving Travel Guidance for any country in the world.

They are light years behind countries that have fully-fledged dynamic services provided by the government accessible through the ID Cards, as it is in the Baltics for example.


The lack of ID card based auth isn't really the fault of the team behind gov.uk, who provide the best service they can for government SSO in the form of Verify: https://www.gov.uk/government/publications/introducing-govuk...

National ID cards are a particularly thorny issue politically, there's been a few attempts to introduce them, all of which have been noisily shut down in parliament because of privacy concerns. There isn't really any single database of people in the UK, with different ministries having their own databases. In principle this seems like a pretty good approach, even if it does mean that things like authentication with web services is a bit clunky.


GDS are clearly aware of this problem and are working on this as well, with "GOV.UK Accounts" [0]

[0] https://gds.blog.gov.uk/2020/09/22/introducing-gov-uk-accoun...


> UK only excels in static .gov.uk serving

I'd put forward Gov.uk/pay and Gov.uk/notify as very successful transactional services that operate on a massive scale. They have been particularly good for local councils who just hook up to Notify and Pay. The parts where security is paramount are handled centrally. It stops councils having to "roll their own".

Universal Credit is a transactional system handling billions in benefits. When the conservative government shut down the economy in March it seems to have coped pretty well:

https://dwpdigital.blog.gov.uk/2020/12/14/dwps-agile-respons...


There's a nice episode of Yes, Minister showing why ID cards are a particularly thorny issue. One of the indicated reasons was how it's akin to Nazi armbands.


> I have lived in five countries now, but not one (local or state) government had an online presence that was nearly as good as the gov.uk infrastructure.

It is a neverending source of wonder to me that, whilst our government may be a shambles[0], our government services are so fantastic. They really are very impressive and, for the most part, very easy to use. Clearly a group of people have invested a lot of time and effort into this, and to me they demonstrate very clearly the value of investing heavily in user experience.

Also, the site is almost never down, which is good for somebody like me who tends to remember that they need to do thing X at random-o-clock at night.

[0] Not intended as an especially partisan statement as I have been known to vote Conservative (and over the last few years have criticised both Labour and Conservatives pretty equally), but I think it's tricky to argue that, as an example, sending 3 million kids back to school for 1 day before changing your mind demonstrates competent leadership.


This is the main advantage of western democracies, right? Institutions and the day to day running of the country are separate to the executive.


I believe they're known as "Unelected bureaucrats"


I think you're seeing differences between job cultures more than politicians vs civil service. Politicians are doing crazy stuff and reversing direction 24 hours later all over the world because that's what public health "experts" are doing. The WHO is notorious for it, they have routinely made contradictory announcements within days or weeks of each other in 2020 and this will probably continue in 2021. Public health is just making it up as they go along and this is reflected through politicians, who basically just do whatever they're told by their "expert" committees even if it makes them look foolish.

gov.uk is run by software engineers, presumably pretty good ones. They make decisions based on data and know how to interpret it.

One thing that has become super clear to me in 2020 is the huge quality gap between CS culture and public health culture, at least in research. Public health research papers and especially epidemiology papers contain not just severe errors all the time but are flat out deceptive in ways I just never see in the output of CS research or the software industry. For example I never encounter bogus citations in CS papers but that happens all the time in epidemiology papers. Public health research superficially looks and feels like science but when you scratch the surface the scientific method has gone AWOL, good data practices have gone AWOL, basic logic and integrity has gone AWOL. It's just a disaster zone.


In the US, 18F does a good job:

https://18f.gsa.gov


Agreed. I like the US design system more than anything else out there: https://designsystem.digital.gov/

It’s incredibly well thought out and designed.


If there is one thing that the UK government gets praise for by the general public, it has to be gov.uk (secretly).

Everything else, they're at the top of the Tower of Blame.


I know some of the people that worked on the gov.uk project. They are really good.


i m glad the greek govt copied their design: https://www.gov.gr


I have worked at a company where every 500 error is a reason to go check the logs and try to reproduce... We would literally page an engineer.

I've also worked at a company where "as long as the error rate stays under 0.1%, it's fine". As long as each app server doesn't have more than 1 crash per week it's fine.

I can tell you that engineers at the first company end up finding all kinds of super rare bugs, usually in the OS, platform, malloc library, load balancer, etc. They then commit fixes for those bugs which ends up helping the latter company...
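For a sense of scale, here is a rough comparison of the "under 0.1%" rule with the numbers in the submission title (17M page requests, 21 5xx errors). It is only a back-of-the-envelope sketch; the function and variable names are made up for illustration:

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that ended in an error."""
    return errors / requests


THRESHOLD = 0.001  # the "0.1% is fine" rule from the second company

rate = error_rate(errors=21, requests=17_000_000)
within_budget = rate < THRESHOLD
headroom = THRESHOLD / rate  # how many times more errors the rule would tolerate

print(f"error rate: {rate:.7%}, within 0.1% budget: {within_budget}")
print(f"the threshold allows roughly {headroom:.0f}x more errors")
```

So on those figures even the relaxed company would be hundreds of times inside its budget, and the strict company would have had 21 errors to chase that hour.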


It's an interesting take. We definitely don't have the capacity & skills to be in the first company, although it would be awesome.

I think most places I worked we had no choice but to be in the second company, although I think the rate is a bit below .1%. I'm grateful for all the people hunting down these bugs and making the world better for the rest of us.

I remember around 2006, we still found PHP segfaults that were critical enough I reported at least a dozen. Everything feels so much more stable today (at least on that layer).


While it's nice to get paged, and look at every 5xx error; it doesn't really scale all that well once you get past a certain point, particularly if your application is gracefully degrading.

That said, I love the wisdom in your comment that you find all sorts of super rare bugs, or conditions that could seriously affect performance or availability if they become more common (which they often do). Past a point, I've found that an approach which works well is to encourage engineers/operators to drive by metrics, and pay close attention over time to p100's (max), as you've suggested with your 500 errors. Lots of goodies can be hidden behind them, just like you've found with the 500 errors.


curious if you can share a little more on the first company you mentioned, what was their area of business?


When you serve traffic at this volume, it's no longer the requests per second that matter.

You can easily have static content on lots of metals, but the new problem is saturation of peer links on egress, or unintentionally triggering DDoS mitigations along the path that the traffic takes (or on your own or the CDN services).

That the content could be placed on enough metals to serve it is the easy part... but a nicely designed solution for serving the requests isn't as impressive as keeping the network solid and operating smoothly.

This is also why performance, optimisation of resources, etc. are called out explicitly in the tweet thread... fail to do this and you kill your network.

Fastly did great here, but the gov.uk people and GDS also did great in making the job for Fastly a lot easier.


Don't forget Varnish the real hero. Or the magic behind the curtain more like it.


Isn't this a little unimpressive, given that these are static pages served by a CDN?


Eh, it's the UK government. Any lack of fuckup is a surprise.


On the whole, the UK Gov is actually quite good at digital services. Much better than any of my experiences with services offered by the US, Canada or France.


> Much better than any of my experiences with services offered by the US, Canada or France.

It's not hard to be better in the case of France: half of the procedures require sending a physical letter, and the rest are only accessible on some atrocious websites.


You're absolutely right, there's still a very old-school mentality when it comes to French bureaucracy.


The UK government is often quoted as a leading example on how to run digital services well. A fuckup would be a surprise.


The UK civil service is great - although it strikes me that someone really good was given the power and budget to make it happen.

The UK government however...


You are spot on there. Here is Tom Loosemore (one of the originators of the Government Digital Service) explaining how it came to be:

https://parliamentlive.tv/event/index/abfe49d3-f24c-4b93-b8e...

Unfortunately the spend control that GDS had has been removed.


Cache Rules Everything Around Me (C.R.E.A.M.)


I didn't see in the Twitter thread--can anyone tell me what was driving this huge spike in traffic on Gov.uk? I'm not in the UK so I'm not up to the minute with political announcements.


UK Prime Minister announced new COVID England-wide lockdown https://www.theguardian.com/world/2021/jan/05/covid-lockdown...


I was one of the people whose first reaction was to look on gov.uk to understand the scope of the new lockdown, rather than some rehash of it in the media. It took me just two mouse clicks (open gov.uk favourite, click on a link [0] near the top of the page) to find this info. This is just awesome.

[0] https://www.gov.uk/guidance/national-lockdown-stay-at-home


> A support bubble is different to a childcare bubble and a Christmas bubble.


Even if it's just static pages, one must admit the website is blazing fast, even on mobile.


There is a new lockdown: https://www.bbc.com/news/uk-55538937

specifically, this would seem to coincide with the PM addressing the nation on TV.


The sort of thing that multicast was intended to solve, instead of everyone requesting their own stream of the same data. But MC never gained traction even though most ISPs in the UK are now IPv6-enabled.


That's because people want to start videos from the beginning and seek around, not just watch TV.

Every time I see this comment about multicast, I think people are missing the point of the internet.


IPv6 doesn't have anything to do with multicast - it works fine on v4. For some definition of works - the only industry where I've heard of it being used extensively is finance.

For live events like press conferences or sports this is already solved with broadcasting.


> the only industry where I've heard of it being used extensively is finance.

> For live events like press conferences or sports this is already solved with broadcasting.

16 million people watched that press conference on BBC TV; broadcasting is very efficient at getting live pictures out. The BBC did experiment with multicast in the past, but CDNs seem to be a more scalable solution, ironically.

However in this specific case, the pictures actually left Downing Street via multicast (SDI from camera into encoder, MPEG-TS over multicast to the studio, back to SDI to get encoded on whichever output chain it is - Terrestrial, Satellite, Cable, online)

Indeed the standard to replace SDI (2110) is built around multicast, broadcast certainly uses multicast a fair bit.


This wasn't any stream of the announcement driving this; this was people looking up statistics and information about COVID-19 and related restrictions.


Not only did the PM announce a new full lockdown in a TV address, as others have already said, but he also specifically gave a gov.uk URL as a way to check details and rules.

Huge audience plus that equalled massive traffic spike.


Yet, it seemed that the guidance the PM mentioned wasn't online at the moment that he mentioned it.

Like many, I was hitting F5 repeatedly on this page waiting for a document that didn't seem to be there, eager to know restrictions on various aspects of life. But the /coronavirus page just contained stale advice from December.

Eventually at 8:15pm, somebody found a link to a PDF of the new guidance and posted it on Twitter:

"This has all been done with such crashing urgency that they haven’t even been able to transpose the PDF version of the guidance into the website yet, it is here:"

https://twitter.com/AdamWagner1/status/1346188409300258816

[this is how it played out for me at least, did anyone else see this differently?]


Nice. I guess that this only increased the spike.


Prime Minister Boris Johnson announcing another lockdown.


from the top active pages table, it looks like it's due to the new lockdown in England.


gov.uk is a fantastic resource, the team behind it, GDS, should be proud of what they’ve accomplished with it


There are very few good things I have to say about the Conservative government, but the early days of GDS make it onto that short list. They intentionally structured GDS as much like a startup as possible, attracting a huge number of highly competent people, and giving them oversight of government digital projects.

They're the inspiration for many other countries' digital teams, including the USDS who are strongly modelled on GDS, and have spawned equivalent teams within other branches of the UK government such as the Ministry of Justice and NHS.


Many of the people who built GDS now work for public.digital who help governments around the world to build these services.


GDS is /government-digital-service [1]

[1] https://www.gov.uk/government/organisations/government-digit...


Well done for doing a victory lap but on the other hand they have unforced errors like this: https://www.independent.co.uk/news/uk/home-news/government-w...


And the UK Government is willingly sending details of every one of these page requests and website visitors to Google so Google can profit by better serving ads to the people of the UK. How nice.

Also their content security policy in production exposes all their development and staging infrastructure and their AWS service endpoints.


This is false. Gov and other entities have direct deals with google that restrict how the data is used.


Nice finds, good diligence.

Will you leave it as only a comment or try to fix it by contacting them or a news service?


So from the graph, approx. 250 pages per second.

Not a bad number, I wonder how it compares with a HN hug of death. Sure, caching helps but it's not the whole story (LB, configuring the cache, benchmarking, etc)

Also what dashboard is that?


17M page requests in a single one-hour block would suggest significantly more than 250 pages per second.

https://twitter.com/TheRealNooshu/status/1346419468151488512


You're right, it seems that was the ramp-up, this one shows a more realistic value https://twitter.com/TheRealNooshu/status/1346187935088054276...

17M per hour is ~ 4700 per second.
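Spelling out the arithmetic, as a quick sketch that assumes the 17M requests were spread evenly over the hour (they were not, so the real peak would have been higher):

```python
SECONDS_PER_HOUR = 60 * 60

page_requests = 17_000_000
average_per_second = page_requests / SECONDS_PER_HOUR

# For comparison: what the earlier 250 pages/second estimate implies per hour.
hourly_at_250 = 250 * SECONDS_PER_HOUR

print(f"average: {average_per_second:.0f} requests/second")
print(f"250/s sustained for an hour: {hourly_at_250:,} requests")
```

That gives an average of about 4,700 per second, while 250 per second would only add up to 900,000 requests in an hour.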


that's google analytics. specifically the "live" feature they added to commodify/undercut people who were moving to or splitting their spend with chartbeat (whose main thing was that live-on-site counter).

fwiw, the HN hug of death rarely breaks 5 figures for that metric. It's plenty to destroy a wordpress blog on a two core VPS, but it's nothing compared to something getting shared in a large facebook group or trending on twitter. News sites are several orders of magnitude larger than any other type of sites besides the mega-platforms (fb) and comms tools like gmail/slack. It's actually a core part of why the entire sector's business model failed together. Everyone chased "reach" (raw audience size) as their most important revenue generating metric (because it directly factored as higher CPMs). That's how we wound up with dozens of "news" sites all catering to the exact same 100M people clicking awful headlines in facebook shares and then closing the tab to go back to facebook. They (we) solved for scale, but in a way that turned everyone into a commodified copy of each other with no meaningful connection or relationship to the audience. Then facebook just captured all of the ad revenue by gatekeeping/aggregating/"curating".


Anecdotally, HN is probably ~1/100th of this. The "hug of death" is much more about bad servers and very low usage limits for things like shared hosting plans than it is about the volume of traffic.


Various hugs of death may be correlated. People re-post things found on one service to different ones. You can easily get a Reddit hug on top of HN hug.


> HN hug of death

Being on the front page of HN (or Reddit for that matter) doesn't bring in as much traffic as you would imagine. It's not insignificant but it's not a deluge like what you are picturing. Turns out most people don't bother to click into the article. They head straight for the comments section. Source: work for a company that has been on the front page of both HN and Reddit on many occasions and have seen the traffic.


Based on my experience, If you are on the front page of HN (top 20) you will get around 1 to 5k visitors in total. I think every site should be able to handle it. Reddit brings you less traffic than that. Twitter, Facebook and some Chinese sites can be much more effective.

I only experienced a hug of death once and it was a combination of factors. I had a website on a very particular topic, and this topic made big news (this was unexpected), so I got huge amounts of traffic from everywhere (forums, news sites, social media, and mainly Google searches...). I was down for hours; my Apache server just kept dying. One day later it was already old news and my traffic quickly returned to normal.


The dashboard is Google Analytics.


The few times I was near the top of the front-page, the google analytics realtime dashboard has been in the hundreds, never thousands.


Note that 770k concurrent users reported by GA are not really concurrent. Each user times out after about 30mins only AFAIK, so it most likely means 770k in the last 30mins, not currently browsing.
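A small sketch of why a timeout-based "active users" counter overstates true concurrency. It takes the roughly 30-minute figure above at face value (that is this comment's recollection, not a verified Google Analytics setting), and the user ids and timestamps are invented for illustration:

```python
from datetime import datetime, timedelta


def active_users(last_seen, now, timeout=timedelta(minutes=30)):
    """Count users whose most recent hit falls inside the timeout window."""
    return sum(1 for seen in last_seen.values() if now - seen <= timeout)


now = datetime(2021, 1, 4, 20, 16)  # illustrative timestamp
last_seen = {
    "user-a": now - timedelta(seconds=30),  # still on the page
    "user-b": now - timedelta(minutes=26),  # left long ago, not yet timed out
    "user-c": now - timedelta(minutes=36),  # already timed out
}

reported = active_users(last_seen, now)
browsing_now = active_users(last_seen, now, timeout=timedelta(minutes=1))

print(f"reported as active: {reported}, actually browsing: {browsing_now}")
```

With a long timeout, anyone who loaded a page once and left still counts until the window expires, so the headline number tracks recent visitors rather than simultaneous ones.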


You can follow the tweets - it was 770k at 20:16, 615k at 20:13, 384k at 20:10, 256k at 20:07, 117k at 20:04, 39k at 20:02, and 45k at 19:55.


yes yes GDS is legendary in web design/accessibility and influenced the US's 18f



