As a developer I immediately thought of the power of queues. Twenty people trying to submit the same form at once does not work for everyone, but a queue processing one person at a time might allow all twenty to submit within a short period. It is flattening the curve! If I were contracted to fix this ASAP, I would set up an nginx front-end proxy config that doesn't allow more than X sessions and suggest a time in the future when they could try again.
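As a very rough sketch of that stop-gap (the location path, upstream name, and connection cap here are made up, not anyone's real setup):

    # cap concurrent requests to the claim form; everyone over the cap
    # gets a friendly "come back later" page instead of a hung request
    limit_conn_zone $server_name zone=claimform:10m;

    server {
        location /claim {
            limit_conn claimform 20;              # no more than 20 in flight at once
            limit_conn_status 503;
            error_page 503 /try-again-later.html;
            proxy_pass http://claims_backend;
        }
    }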
Having worked on this type of application in the past, I'd say they should find a new company to work with if they can't handle this traffic. We were handling hundreds of requests per second with ease 10 years ago, with MySQL and the app running on the same server.
It doesn't take many resources to show the user a form, validate it, and save to a DB.
A bunch of armchair developers seem to have been summoned to tell the federal government how to handle form submissions for an extremely security- and privacy-sensitive application using their fancy modern techniques.
You are comparing a basic web form with an application for unemployment benefits, which must go into a federal tax database and be processed by what I assume is a garbage mainframe system.
It not only needs to be validated; it needs to securely store records, be able to compare them, hook up to the system that handles payments, etc.
They can't just circumvent it, dump the data into some silly Amazon or MySQL database, and call it a day. That would require employees to basically copy and paste that data into the actual warehouse, and considering they have 3+ million applications to go through as it is, making them easy to process is just as important as allowing people to submit.
For the time being the correct response is a queue gate.
Yep, USDS and 18F folks would have to agree with you here. The arcane crap that we have to deal with in payment and government information systems is beyond frustrating and makes it extremely tough. I read an article about USDS / 18F having to fix a decades-old Cisco router bug to get CI/CD and automated deployments working, and even then still needing to figure out how to deal with legacy stateful DB connections.
The reality of government paperwork systems on the back end is much, much closer to this hell, and is part of why so many like myself ran screaming from the public sector: when you see so many peers doing so well at FAANGs, why would you subject yourself to something that resists change and wants to keep things the same way? https://www.washingtonpost.com/news/federal-eye/wp/2014/03/2...
The point is that backend pain shouldn't stop you from accepting it on the front end and putting it into a queue. Making the problem of getting the application through backend systems the states' to deal with, not the applicants'.
Yeah, even in the SF/SJ locality, which has the highest Locality Pay Adjustment (41.44%)[1], the position would likely have to be GS-12/GS-13 to start being competitive.
There is the option of going to some area with a much lower cost of living and trying to hire there, but the problem might be getting enough people together to form a team. If you can easily get enough people with skill and experience, the area probably has jobs for them that pay better, and if those jobs don't exist, it might be hard to find the people.
Eh, USDS and 18F jobs are kind of contract-based and did go past six figures last I saw. However, they have been defunded a lot since then by POTUS45, so it's not clear what the state of comp is. DC-area tech is a mishmash of rather enterprise-centric businesses and can be challenging if you're in the wrong domains of expertise.
Unemployment services are run by the states. The entry-level Software Engineer salary offered by the state of California is around $64k, with senior-level salaries between about $75k and $105k in Sacramento. I do not know if this is normal, above average, or below average compared with other states.
Virginia, DC, and Maryland have similar costs of living, but VA, MD, and DC have drastically different governments, tax rates, rights, and laws, despite people working in roughly the same 40 square miles. Even a federal employee fresh out of school writing software should make more than that. Senior salaries are between $110k and $140k with not a lot of outliers on either end (the distribution matters more to me than a median when talking salary these days for white-collar jobs).
California is a huge state, and the Bay Area is going to have drastically different stats, even within the same industry, from San Diego, Los Angeles, Sacramento, and San Luis Obispo (yep, there are software jobs there too).
> The point is that backend pain shouldn't stop you from accepting it on the front end and putting it into a queue.
What if the backend rejects the form? The user's already moved on before their form made it through the queue. So then you're stuck re-implementing all the validations the backend needs in order to give the user feedback (which you may not even be able to do) or trying to get the user to come back later to try again.
> Making the problem of getting the application through backend systems the states' to deal with, not the applicants'.
Reducing permanent staff involved in processing applications is probably one of the main reasons the automated system was built in the first place. If they still have to do that, then you might as well just replace the frontend with a printable PDF.
You can pick a balance between some validations and 100%, and I don't think it's that hard unless you're invested in saying this is just UNPOSSIBLE.
There are already processes (a workforce and/or outbound written letters) to reach out to applicants in the case of, e.g., a dispute (terminated for cause vs. laid off).
> You can pick a balance between some validations and 100%, and I don't think it's that hard unless you're invested in saying this is just UNPOSSIBLE.
The point is that it's easy to say things should be easy when you don't know anything except the very surface details of the problem, and it's not your job to actually solve it.
Maybe the team that built the system in question were a bunch of dumb-dumbs who just needed a rockstar developer to show them how easy it is to scale, or maybe the problem is actually more complicated than it seems due to some hidden complexities or constraints none of us actually knows anything about (either technical or business).
Put it in a workflow where a form is filled out until it reaches a point where the back-end needs to do some heavy lifting, queue the form for processing, and then notify the user to continue to the next form in the workflow.
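A minimal sketch of that pattern, using only the Python standard library (the function names and form fields are hypothetical stand-ins):

    import queue
    import threading
    import time

    applications = queue.Queue()

    def submit_form(form_data):
        # Web tier: do only the cheap, synchronous checks, enqueue, return at once.
        if not form_data.get("ssn"):
            return {"status": "rejected", "reason": "missing field"}
        applications.put(form_data)
        return {"status": "queued"}  # the user is told the next step will come later

    def process_application(form):
        # Stand-in for the heavy lifting (mainframe / SOAP / payments hookup).
        time.sleep(1)

    def notify_user(email):
        # Stand-in for an email/SMS hook telling the user to continue the workflow.
        print("notified", email)

    def worker():
        # Drains the queue at whatever rate the legacy back end can tolerate.
        while True:
            form = applications.get()
            process_application(form)
            notify_user(form["email"])
            applications.task_done()

    threading.Thread(target=worker, daemon=True).start()
    print(submit_form({"ssn": "000-00-0000", "email": "applicant@example.com"}))
    applications.join()  # in this toy demo, wait for the worker to finish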
They could, they just don't want to pay for it. The government has no interest in being known for easily handling a huge spike of traffic during a crisis. They can just take the lower road, get by with less, and say 'try again later'. There are no repercussions here because it's the government.
Hence mainframe maintainers should really move to charging $1 million/year in a decade or two.
They aren't choosing to have crap infrastructure; their infrastructure is intentionally defunded as part of a political campaign to engender distrust in government functions and increase privatization. Government is incompetent because if it is, it's easy to justify selling off the country to the incredibly wealthy so they can get wealthier.
It's a bit worse than that. The infrastructure isn't actually defunded; there are huge funds allocated to projects, but they're being consumed by managers at Deloitte, Lockheed, Booz Allen, Accenture, etc. The times we see success are when enough funding trickles down to the few engineers who can make it work with what they get, or when there is enough public oversight by sufficiently independent stakeholders. I see this in many small local government agencies, with projects accountable to a city council, and so on.
So, legitimately, how do we make it so the government does face repercussions? I see a lot of people making jokes about guillotines and nooses, but is there no better way?
I suggest by campaigning to bring logic and critical thinking into early childhood education. Then philosophy, the classics. Science education.
Once you have more people who can understand that there are scientific and moral issues with manifest destiny, and religion isn't going to solve global warming, there will be some shifts in the public discourse and public policies.
It's the creation of pseudo-scientific explanations for the coincidental advantages Europeans had that created the extreme intellectual complacency and bias that is holding back progress.
Validation is the problem. If someone thinks they’ve successfully applied, rejecting them asynchronously is often worse than not letting them apply in the first place.
I’m sure, but there’s still, I’d wager, an order of magnitude difference between the amount of paperwork rejected now and what would be rejected if users were unable to receive immediate feedback to correct their input.
You need to think of all of these things and many, many more to run a robust online service that can handle spikes hundreds of times bigger than the usual level. It's really not straightforward or simple.
Or the explanation is much simpler: they didn't make it semi-fast because it didn't need to be semi-fast.
When "hundreds of times the usual level" is still only 50 page loads per second, and 10 milliseconds of CPU per page would be extreme overkill for anything written in a reasonable way, it actually is straightforward.
Even 5 seconds will work if the actions can overlap. If it can't do things in parallel then we have issues much more fundamental than "performance", and there's no defending it as a competent system.
(That is not to say it's necessarily the devs' fault.)
I don't mean to defend it too much, because realistically it should be possible with relative ease to handle much more traffic than that - but my point is that in the enterprise and government worlds, things are often not as simple as you think.
Aside from potentially having to interface with dozens of unreliable, painfully slow SOAP-based web services, everything is often hosted on creaking, over-subscribed VMWare hosts, in VMs that would be under-specced regardless.
There is also often a "governing body" that severely restricts your tech stack choices.
Want to use Postgres? Nope, our standard is SQL Server - 2008 edition, actually!
Want to use Python/Ruby/Elixir/Clojure/Kotlin? None of that hipster nonsense here, we use good ole Java/VB.NET here!
Message queue, you say? It's Windows Message Queue with distributed COM all the way down here!
"Containers"? What's a one of those? You'll get a crappy VM with 1 vCPU and 1GB of RAM, and you'll thank me for it! etc...
As a dev, it's horrible and soul-destroying to work under such limitations, but if you have no choice...
All of those items are manageable. Some are simple setup or programming errors, some require a bit of added complexity but are normal in modern web apps.
Completely agree with the sentiment. I think most often it is an inadequate default configuration that bottlenecks somewhere and never got tested with more than a handful of users at a time. Going to a hundred highlights some bugs, going to 1,000 others. On the other hand, I have worked on a project for the USDA where they had 10-year-old servers running 15-year-old software and did not allow any system administration, while the sysadmins were some unknown government employees who were completely inaccessible.
In some cases I have had to build a Python distribution entirely in home/user space while working on conservatively managed servers.
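For what it's worth, the user-space build looks roughly like this (the version number and install prefix are just examples):

    # build CPython from source into your home directory, no root needed
    wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz
    tar xzf Python-3.8.2.tgz && cd Python-3.8.2
    ./configure --prefix="$HOME/.local"
    make -j4 && make install
    export PATH="$HOME/.local/bin:$PATH"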
Usually it's not so much the form that causes things to fall down but some validation step that they are trying to do synchronously, one that might have to access an IBM mainframe, and things time out. When you're getting a few submissions an hour, it's not a big deal.
At this point introducing a new company could cause more problems than it solves, and I think it's understandable to not be prepared for a volume of jobless claims that is almost an order of magnitude more than at any point in US history.
Put the web form (plain static assets: JS/CSS/HTML) on a globally accessible CDN. Then use an SQS queue as the intake for each unemployment application form. Then fire-hose it out, wherever it needs to go, at a rate you can realistically handle.
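Something like this, as a sketch (the queue URL, region, and field names are assumptions, not anyone's real setup):

    import json
    import boto3

    sqs = boto3.client("sqs", region_name="us-east-1")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/unemployment-intake"

    def intake(form_data):
        # Web tier: accept the application and return as soon as SQS has it.
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(form_data))

    def drain(batch_size=10):
        # Back-office worker: pull applications at whatever rate the legacy system allows.
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=batch_size,
            WaitTimeSeconds=20,  # long polling
        )
        for msg in resp.get("Messages", []):
            application = json.loads(msg["Body"])
            # ... hand the application off to the legacy system here ...
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])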
Queuing access to the form itself and telling someone to wake up at 4:52 AM so they can then merely access the static assets is a less-than-desirable user experience.
It is more desirable than a 504, and it's the first thing I would do in 15 minutes with zero context. If I could get more context, of course something like your solution would be more desirable, depending on the issue. It would take some time to figure out whether it is necessary to bring in AWS or just a database connection pooler, or whatever.
Ocado (the IaaS-for-online-supermarkets company and, in the UK, an online-only supermarket itself) has done this in response to the increased demand, and makes you wait in a 'virtual queue' (virtual relative to what in America you call a 'line-up', but we call a 'queue', at a physical supermarket) before you can place or edit your order.
You’re assuming that the people who built it in the first place (or the people that may or may not be contracted to fix it later) know or care. Remember, this is government contracting we’re talking about - lowest bidder wins. How do you win the lowest bid? By doing it as cheap and quick as you can. That means hiring inexperienced/cheap developers who can build something that looks like it will work for far less money than you can build something that actually will.