Simulated Hospital (github.com/google)
191 points by axutio on May 17, 2023 | 100 comments



When they say "Most EHRs use a message format called HL7v2, which is ugly and tedious to type." they aren't kidding, here's an example of it:

MSH|^~\&|FROM_APP|FROM_FACILITY|TO_APP|TO_FACILITY|20180101000000||ADT^A01|20180101000000|P|2.5|
EVN|A01|20110613083617|
PID|1|843125^^^^MRN|21004053^^^^MRN~2269030303^^^^ORGNMBR||SULLY^BRIAN||19611209|M|||123 MAIN ST^^CITY^STATE^12345|
PV1||I|H73 RM1^1^^HIGHWAY 01 CLINIC||||5148^MARY QUINN|||||||||Y||||||||||||||||||||||||||||20180101000000|


I designed HL7v2 and v3 (xml based) parsers in Mirth. It was horrible, but boy did I learn a ton. Subtle bugs could lead to scary outcomes like drug doses being way off. Seeing the string “PID” still causes nightmares.


Pelvic Inflammatory Disease as a data format - a perfect description for HL7.


Mirth is so useful for debugging, though.


Don't forget the most ridiculous part of EDI-style protocols (X12, HL7, etc.), which is that they don't have an explicit looping construct; you have to look at each implementation spec and your trading partner's specific companion guide and determine where repeated or looped sequences are through context. Makes for some exciting issues in implementing parsers.
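
To make the "loops through context" point concrete, here is a minimal Python sketch. It assumes an ORU-style result message in which each OBR segment opens a new group and the segments that follow belong to it. Nothing on the wire says so; that rule has to come from the spec or companion guide, which is why the anchor segment is a parameter. The function name and the sample segments are made up for illustration.

```python
def group_by_anchor(segments, anchor="OBR"):
    """Split a flat list of segments into implicit loops.

    Each segment whose ID equals `anchor` starts a new group. Segments
    seen before the first anchor are returned separately as the preamble.
    """
    preamble, groups = [], []
    for segment in segments:
        segment_id = segment.split("|", 1)[0]
        if segment_id == anchor:
            groups.append([segment])      # a new loop iteration starts here
        elif groups:
            groups[-1].append(segment)    # belongs to the most recent anchor
        else:
            preamble.append(segment)      # MSH, PID, etc. before any anchor
    return preamble, groups


# Illustrative segments only, not taken from a real interface.
segments = [
    "MSH|^~\\&|LAB|FAC|EHR|FAC|20180101000000||ORU^R01|1|P|2.5",
    "PID|1||843125^^^^MRN",
    "OBR|1|||CBC",
    "OBX|1|NM|WBC||7.2",
    "OBX|2|NM|HGB||13.9",
    "OBR|2|||BMP",
    "OBX|1|NM|NA||140",
]
preamble, groups = group_by_anchor(segments)
print(len(preamble), [len(group) for group in groups])  # prints: 2 [3, 2]
```

A different message type, or a partner with different habits, means a different anchor and different membership rules, so this logic ends up being written per interface.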


Agreed.

On the positive side, being domain specific and well known, HL7v2 "interface" specs are simpler and more predictable than other IDLs I've used. (Just vibes. I haven't done a formal comparison.)

We had a stock human-readable "interface" template to use. It's basically literate programming. Paragraphs of descriptive text and well-formed tables.

I wrote a simple parser which turned those well-formed tables into actual code. Our HL7 tools made it trivial to subclass generated classes whenever a special case (difficult to describe in the spec's tables) had to be hard-coded.

Our HL7 tool stack allowed our "business analysts" (domain experts talking with customers) to do most of the implementation themselves, with speedy deploys, low cost of change, and super easy to verify. For instance, our team could make changes and deploy and verify while on the phone with customers / partners.

What could be easier?

Our customers loved us.

So of course the mega-corp which acquired our startup had to strangle our effort in the crib, dumping our corpse in a land fill. Mostly out of spite.


The wire format is a bit messy and hard to read. But developers mostly work through libraries like HAPI which have a decent API.

https://hapifhir.github.io/hapi-hl7v2/


HAPI is amazing.


Last Published: 2017-06-23...is it still active?


HL7 has a different lifetime from the JS-framework-of-the-week. It doesn't really change, since the software itself is updated at most once a year (not including small patches).


This message object seems to be missing the HIPAA^ENCRYPTED field (pun intended)...I am assuming the encryption is implemented above this layer? Is there a standard for HL7v2 encryption?


HL7v2 is just the schema - the mechanism for sending these messages is normally something called MLLP, which is just a simple framing protocol and has no built-in security. It is possible to send/receive HL7v2 over other protocols, but MLLP is the most common.

It's normal to secure the endpoints via network-level security - IPsec etc. HL7v3 transformed into FHIR, which is done over HTTPS instead.
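
For anyone who hasn't seen MLLP, a minimal Python sketch of the framing: the message is wrapped in a start byte (VT, 0x0B) and an end sequence (FS then CR, 0x1C 0x0D), and that is all there is to it. The function names are made up for illustration, and the ASCII default is an assumption; real interfaces agree on an encoding per connection.

```python
START_BLOCK = b"\x0b"    # VT, marks the start of an MLLP frame
END_BLOCK = b"\x1c\x0d"  # FS followed by CR, marks the end of the frame


def mllp_wrap(message: str, encoding: str = "ascii") -> bytes:
    """Frame an HL7v2 message for sending over a raw TCP socket."""
    return START_BLOCK + message.encode(encoding) + END_BLOCK


def mllp_unwrap(frame: bytes, encoding: str = "ascii") -> str:
    """Strip the MLLP framing bytes and return the HL7v2 message text."""
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(START_BLOCK):-len(END_BLOCK)].decode(encoding)


frame = mllp_wrap("MSH|^~\\&|FROM_APP|FROM_FACILITY\rEVN|A01|20110613083617\r")
print(frame[:1], frame[-2:])  # prints: b'\x0b' b'\x1c\r'
```

There is no authentication or encryption anywhere in the frame, which is why the securing happens at the network layer.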


or, hear me out, you also blast it unencrypted to the flex pagers your employees don't even use anymore


It's normal to not encrypt it, in my experience.


It's really not normal to not encrypt HL7 V2 messages. Every interface that I've seen uses a VPN.


I'm in the UK in the NHS. Perhaps that's the difference. They just do network lockdowns inside the hospital.


oh neat. i was aware of hl7, but didn't realize i'd seen it in the wild til today.

you can often find these messages flying unencrypted over flex pager channels in the us


Typically these messages are not encrypted. This is a late 80's spec based on a 70's era EDI spec.

IMHO, when these messages are transmitted outside the hospital, typically a VPN is used. There is a spec for posting these messages to a web service over HTTPS, but I haven't seen it in use.


Usually encrypted via a TLS connection right into an HL7 channel/listener. Or the entire connection is encrypted via a VPN connection between healthcare systems.


If there's encryption, it's generally further up the stack (eg. wrapping in a TLS connection or IPSec tunnel).


no shot anyone actually types this out, right? surely this must be generated by machines for machine ingestion.


They probably do, and if NOTAMs are any indication, it's probably also read by people too.


Not comparable. HL7 is a machine interchange format - virtually no docs or healthcare personnel except those involved in informatics (and then generally only superficially) are familiar with HL7.

These NOTAMs were designed for human consumption and brevity; they date back to the 40s (debate about their ergonomics notwithstanding).


No, not any more than anyone types out XML anyway.


Generally, it's machine generated. I've only ever typed it out manually if I needed something specific for a unit test and that only for small messages. It's tedious to do by hand.


yeah


I've worked a lot with DICOM which has plenty of warts (as expected with a 30 year old living standard), but HL7 is a bit much for me. At least FHIR and DICOMweb are considerably saner.


You haven't lived until you've tried to debug it. The array isn't zero based. The line breaks have different characters for breaking the lines, and you get to guess which ones are doing it.

Certain characters are OK (ASCII), but when a Mac gets a nice ' into the field, all hell breaks loose. Things like "lesion at 9 o'clock" then break the system, as does Mr O'Leary. The hours spent chasing down how some failure happened are countless. Pity those working with HL7.
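
Both complaints can be shown in a few lines. The standard segment terminator is a carriage return, but LF and CRLF turn up in practice, and field numbers are 1-based, with MSH shifted by one because MSH-1 is the field separator itself. A minimal Python sketch follows; the function names are made up, and it assumes `|` as the separator for every segment other than MSH.

```python
def split_segments(raw: str) -> list[str]:
    """Split a message into segments, tolerating CR, LF, or CRLF terminators."""
    normalized = raw.replace("\r\n", "\r").replace("\n", "\r")
    return [segment for segment in normalized.split("\r") if segment]


def get_field(segment: str, number: int) -> str:
    """Return field `number` using HL7's 1-based numbering (PID-5 -> 5)."""
    if number < 1:
        raise ValueError("HL7 field numbers start at 1")
    if segment.startswith("MSH"):
        separator = segment[3]   # MSH-1 is the separator character itself
        if number == 1:
            return separator
        position = number - 1    # so every later MSH field shifts by one
    else:
        separator = "|"          # assumption: the usual default separator
        position = number        # parts[0] is the segment ID, so PID-1 is parts[1]
    parts = segment.split(separator)
    return parts[position] if position < len(parts) else ""


# Same message as the example upthread, shortened, with mixed line endings.
raw = (
    "MSH|^~\\&|FROM_APP|FROM_FACILITY|TO_APP|TO_FACILITY|20180101000000||ADT^A01|20180101000000|P|2.5\r\n"
    "EVN|A01|20110613083617\n"
    "PID|1|843125^^^^MRN|21004053^^^^MRN||SULLY^BRIAN||19611209|M\r"
)
msh, evn, pid = split_segments(raw)
print(get_field(msh, 9), get_field(pid, 5))  # prints: ADT^A01 SULLY^BRIAN
```

A real parser should read the separator from MSH rather than hard-coding it, which is one more reason to lean on a library.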


Healthcare is the perfect storm of layered, ossified systems created across decades + input data that's normal just often enough... that everyone thinks bounds are actually handled.

And then Mr. O'Leary checks in.

Fuzz testing tools would be great, but I haven't seen many institutions that have functional 1:1 lower environments (i.e. including all system-system integrations working in test).
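
One cheap target for fuzzing, even without a full lower environment, is field escaping. HL7v2 reserves `|`, `^`, `~`, `&` and `\`, and defines the escape sequences `\F\`, `\S\`, `\R\`, `\T\` and `\E\` for them. A minimal Python sketch with a seeded round-trip check is below; the function names are made up, and it assumes the default delimiter set. Character encoding (the curly-apostrophe problem) is a separate issue that this does not cover.

```python
import random

# Default HL7v2 delimiters mapped to their standard escape sequences.
ESCAPES = {"\\": "\\E\\", "|": "\\F\\", "^": "\\S\\", "&": "\\T\\", "~": "\\R\\"}
UNESCAPES = {sequence: char for char, sequence in ESCAPES.items()}


def escape_field(value: str) -> str:
    """Replace reserved delimiter characters with their escape sequences."""
    return "".join(ESCAPES.get(char, char) for char in value)


def unescape_field(value: str) -> str:
    """Reverse escape_field by scanning for three-character escape sequences."""
    out, i = [], 0
    while i < len(value):
        token = value[i:i + 3]
        if token in UNESCAPES:
            out.append(UNESCAPES[token])
            i += 3
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


# Seeded fuzz loop: random strings biased toward the troublesome characters.
rng = random.Random(0)
alphabet = "O'Leary |^~&\\"
for _ in range(1000):
    original = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 12)))
    escaped = escape_field(original)
    assert not any(char in escaped for char in "|^~&")
    assert unescape_field(escaped) == original
```

A plain ASCII apostrophe needs no escaping at all, so if Mr O'Leary breaks something, the culprit is usually downstream: SQL built by string concatenation, or a curly quote arriving in an encoding the receiver did not expect.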


Add to that the fun bits like "let's attach a whole base64 encoded PDF as a field of the HL7 message". Which honestly is not a terrible solution given you want to keep it all together, but still... It would be nice if you could attach binaries after the ASCII content.


It is possible to include multiple binary attachments and text content in certain HL7 V2 message structures. Just use multiple OBX segments.
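
Roughly what that looks like, as a Python sketch: one OBX segment per attachment, with OBX-2 set to ED (encapsulated data) and OBX-5 carrying `source^type^subtype^encoding^data`. The function names are made up, and the `AP` and `PDF` codes and the `REPORT` labels are placeholders; a real interface takes those values from its implementation guide.

```python
import base64


def build_attachment_obx(set_id: int, label: str, payload: bytes,
                         data_type: str = "AP", subtype: str = "PDF") -> str:
    """Build one OBX segment that carries `payload` as Base64 in OBX-5."""
    encoded = base64.b64encode(payload).decode("ascii")
    # OBX-5 for ED: source application ^ type ^ subtype ^ encoding ^ data
    return f"OBX|{set_id}|ED|{label}||^{data_type}^{subtype}^Base64^{encoded}"


def extract_attachment(obx: str) -> bytes:
    """Decode the Base64 payload from an ED-typed OBX segment."""
    fields = obx.split("|")
    if fields[0] != "OBX" or fields[2] != "ED":
        raise ValueError("not an OBX segment with value type ED")
    components = fields[5].split("^")
    if components[3] != "Base64":
        raise ValueError("unsupported encoding: " + components[3])
    return base64.b64decode(components[4])


# Two attachments in one message means two OBX segments with increasing set IDs.
payloads = [b"%PDF-1.4 first document", b"%PDF-1.4 second document"]
obx_segments = [
    build_attachment_obx(i, f"REPORT{i}", payload)
    for i, payload in enumerate(payloads, start=1)
]
assert [extract_attachment(obx) for obx in obx_segments] == payloads
```

The Base64 alphabet (letters, digits, `+`, `/`, `=`) contains none of the HL7 delimiters, so the payload needs no further escaping.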


You can ask GPT to expand it for you without any context and it will fill in the headers and explain each field.


Good learning tool if you use it on generated data.

Don't paste actual patient data.


Why not? Can you articulate what could go wrong?


You're sharing someone's sensitive data without permission. On top of that, patient data has additional protections in many countries. You can get into legal troubles because of that.

Major providers like Azure or AWS have programs in place to handle sensitive patient data. I'm not sure if OpenAI has one too. Either way, you would have to enter into a special agreement to do so.

Bonus point: ask ChatGPT if it's ok to paste private patient data there.


What does HL7v3 look like?


The HL7 V3 standard is mostly dead. Only the Clinical Document Architecture (CDA) offshoot of V3 still survives in widespread use. You can find examples here.

https://hl7-c-cda-examples.herokuapp.com/

Most new HL7 work has moved on to FHIR which is essentially "HL7 V4".



R4 is where things get a bit kinder and gentler.

https://www.hl7.org/fhir/us/core/Condition-condition-duodena...


I think you're kind of mixing up version or release numbers there. HL7 has several major standards: V2 Messaging, V3 (including CDA), and FHIR. And each of those has gone through multiple releases. FHIR was on R4 for a couple years and R5 was recently published.


Yeah, I’m doing that thing where I speak sloppily. HL7 is the standards organization, so talking about “HL7 messages” is like talking about “ISO messages”. It’s a misnomer.

But as someone at the sharp end of the stick wrt integrations? If I hear someone talk to me about HL7 I’m thinking HL7v2, and if someone says FHIR I’m thinking US Core v2 + FHIR R4. That just seems to be lingua franca.


Ah that makes more sense.


Seems readable to me


I am frequently asked about why systems in institutional healthcare are so hard to modernize; this has come up a lot related to pricing transparency. While HL7 is in theory a standard, in practice it is a semi-parseable "email" between two parties that know each other.

Widely used systems like EPIC have bugs and quirks that have existed so long that the bugs themselves have become their own standard. Because HL7 relationships typically happen between long-term, consistent partners, both parties tend to evolve the format to suit localized needs, and this business logic and the reasons for it are lost to time. It isn't that rare to find HL7 interfaces that have been in use for 20+ years and have become vital black boxes. HL7 2.x, still widely used, originated in 1989.

In a lot of cases, modernization like FHIR is nothing more than taking the old garbage and putting it in a new fancier bag.

As wacky as HL7 may seem, it is really nothing compared to its much bigger, uglier older brother, X12 837/835, used for communication of billing information from performing entity to insurer.


I don't have enough fingers to count the times I was doing automation in healthcare, implemented the process to spec, and then had a bunch of test cases flagged as failing.

Turns out, for large known counterparties, there are indeed de facto processes that incorporate each party's eccentricities and are "known" on the floor (processors typically have long tenures in their jobs) but unknown above a certain level of management.

E.g. a large children's hospital that reliably spit out misformatted requests to the local payer, but which the payer papered over on their side by converting them to payable requests (naughty, but kept them from bouncing back and requiring resubmission)... and had been doing so for 10+ years.


Bizarro-world that doctors actually defend EPIC! They all complain about the ASP.NET 1.0 UI. It's just the convenience of viewing all patients from all hospitals in one virtual "chart" ;)


As a doctor, Epic is mediocre software that just happens to be less mediocre than most of the alternatives.

My biggest problem with Epic is that things are so heavily silo'ed by your job description (RN, MD, PharmD, etc.) and job context (without getting too deep into the weeds, it is a fact that the Epic implementation at my hospital does not allow anyone other than anesthesia personnel to see an intraoperative anesthetic record - not even the surgeon who performed the surgery (!); and that certain contexts do not allow a nurse who is running the schedule in one area to edit their case status board, which is a view of all cases in that area that allows them to see what's been done, what's left to be done, etc., - but if they switch context, they can change things, make their own boards, and then change back to the "proper" one and use the ones they've made).


It's more than just the job description, I believe many EHR vendors require people who use particular modules to be "certified" for that module. Obviously this leads to a silo effect where the information is available but one clinician may need to request another clinician to actually read the data to them.


Which is beyond moronic.

If you need the anesthesiologist on call to come back to the hospital just so the surgeon can see what drugs were given during the case and what the vital signs looked like, your EMR is less useful than paper.


A loved one was inpatient for a few weeks at a local epic hospital/medical system and then a more specialized/academic hospital with a federated collection of cerner systems.

It’s a night and day difference. The GUI may be ugly, but the Epic implementations tend to be soup to nuts. Everyone, from the transporters moving patients to the doctors to the primary care providers to the patients know what’s going on in real time.

At the teaching hospital, they had awesome medical capability, but nobody had a clue what was going on. That leads to risks if you need care from multiple specialties.

I’m sure that it’s garbage enterprise software of course.


Epic is the worst EMR system, except for all the others.


The baseline HL7 V2 Messaging standard is really intended as more of a toolkit. It's very generalized to allow for a wide range of uses anywhere in the world.

In order to establish real interoperability between two organizations, you generally need to follow an Implementation Guide which constrains and profiles the baseline standard. HL7 publishes some IGs itself and others are available from organizations like the CDC and DirectTrust.


Agreed. While FHIR is clearly more modern, in my experience it's the same data I'm seeing provided over HL7v2. It's easier to parse JSON and it's nice having the data labeled but values are still inconsistent.


It's worse than this. EPIC will have its own conventions, but then each hospital system will have its own as well. It is the least rigorous data environment I've ever worked in.


This has nothing to do with Theme Hospital https://en.wikipedia.org/wiki/Theme_Hospital


Fun link: Theme Hospital is the follow-up to Theme Park, Theme Park was co-created by Demis Hassabis, Demis Hassabis founded DeepMind, DeepMind was acquired by Google, DeepMind has been incorporated into Google Health, surely Simulated Hospital sits within Google Health.


That's crazy. I remember playing Theme Hospital too.


Demis Hassabis wrote Theme Park? Wow, I did not expect that.


From what I've heard he had a small role in it.


Only $885 to cure Bloaty Head!

https://www.youtube.com/watch?v=Le_znuXcP2M


for those who miss Theme Hospital, the recent Two Point Hospital is a (good) spiritual sequel


Thought it was a clone of that one too! I never played much of the "sim" games but Theme Hospital was a good one.


When I did pentesting I couldn’t tell you how many times I’ve reviewed apps that used real patient data in their UAT and dev environments. Absolutely a big HIPAA violation to anyone who isn’t a covered entity. This is going to be a huge suggestion in reports for that kind of work.


Interesting. Having just been discharged from hospital after a 22-day stay, I would hope an awful lot of data is in my record. I had blood taken daily, I think standard obs (observations) at least 4 times a day, electronic scans (CT, Xray, EEG, ECG), lots of medicine via IV, oral and subcut. Multiple discussions daily with doctors and nurses (usually with the Drs, there was one asking questions, one listening, and another directly transcribing on a PC on a mobile workstation/crash cart). As meds and other care changed daily, presumably back room consulting resulted in treatment plan updates. BTW I'm very happy with the direction my health is heading, and feeling a lot better (and there is data backing that up).

I haven't checked what my record is on our (Aussie) national health record yet [1]. I'll have a limited view for sure, but when I meet with my GP (general practitioner, MD, physician) to discuss my discharge summary on Monday, I might ask him to see what he has access to, and how complete it is. I know I have been disappointed at what seemed to be quite devoid of info, despite all the controversy that ensued when it was introduced a few years back.

[1] https://www.digitalhealth.gov.au/initiatives-and-programs/my...


So I work with Oz GP practices. It's not very likely you'll get a reasonable export of your data. It's doable, but may require some convincing. Your practice will likely get a copy of at least some of the hospital documents - that's common. You could ask for the same export of your data that they'd send when moving you to another clinic if you want a complete view (you may get it on a CD though...). It's not that there's really any issue, it's just that the staff likely never handled this request for the patient and wouldn't be sure how to do it.

There shouldn't be any issue to view all the received documents on the screen during your visit though.


Australia doesn't have a unified system. Across the ditch we don't either, but at least we have a patient identification system that allows the searching of different systems for an individual via the National Health Index number (NHI).


Every time I dip a little into healthcare tech I want to throw up. It's so bizarre, and even offensive, how convoluted and outdated these systems are. It's patches on top of patches on top of patches for decades. To the brave engineers working on maintaining these systems, I salute you.

Hospitals, banks, government... just think of the tech these heavily regulated industries could have had, and how it could have improved our experience, if they actually had a strong incentive to improve. And it goes far beyond the tech stack.

In the long term, regulation often kills competition, which inevitably kills innovation.


As someone working in this arena, I offer an alternative perspective for your consideration: healthcare was an early adopter of information technology and as a result many of its most core technologies come from a nearly unrecognizable time in computing. These systems are “outdated” as a result of success.

The current prevalence of these venerable technologies may be in part due to regulation, but more often has to do with their success.

HL7v2 is just token-delimited ASCII. Not unlike the similarly primitive but ubiquitous CSV. The fields within it are defined by standards documents and once you use it a little, you can read enough to get the gist of most messages. As you might guess, modules in your language of choice are used to parse and compose HL7v2 so its detail isn’t that important.
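
To illustrate the comparison, a minimal Python sketch of the nesting inside a single field: `~` separates repetitions and `^` separates components. The function name is made up, and the default delimiters are assumed. The sample value is the repeating patient identifier field (PID-3) from the example message at the top of the thread.

```python
def parse_field(field: str, repetition_sep: str = "~", component_sep: str = "^"):
    """Split one field into repetitions, and each repetition into components."""
    return [repetition.split(component_sep) for repetition in field.split(repetition_sep)]


pid_3 = "21004053^^^^MRN~2269030303^^^^ORGNMBR"
for components in parse_field(pid_3):
    # Component 1 is the ID and component 5 is the identifier type (1-based).
    print(components[0], components[4])
# prints:
# 21004053 MRN
# 2269030303 ORGNMBR
```

The difference from CSV is that the meaning of each position lives in the standard and the interface spec rather than in a header row.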

[Edit: looks like the project has written Synthea out in favor of an integrated data generator]

Something I’d like to point out about Google Hospital is that under the hood it uses MITRE’s Synthea to generate synthetic patient data.

https://www.healthcareittoday.com/2017/09/13/open-source-too...

https://synthetichealth.github.io/synthea/


That's a thoughtful perspective.

It's also useful for expecting similar in future industries. Ultimately, I think it doesn't even matter if the reason is regulatory complexity, industry rigidity, or technical debt.

The question is how do we dig out of such holes.

If we were starting from scratch, it's almost certain we'd have better software, better firms, norms, standards and such within a short period of time.

How do we do that when not starting from scratch?


> In the long term, regulation often kills competition, which inevitably kills innovation.

So does non-regulation, often. A lot of this carp is "emergent" bureaucracy between insurers, other insurers, hospitals, etc. Regulation and monopolies often support each other, merging into a nasty complex such as this... it's impossible to tell where one begins and the other ends.

I see financialization as a bigger factor currently than regulation. It's not a coincidence that "medical billing" is the epicentre.

I just don't buy the idea that less regulation means less mindless complexity.

The best of the early neoliberal/Austrian economists was IMO Schumpeter. Creative destruction. Complexes evolve into kludges over time.

Creative destruction is often presented as a feature of free markets... But that's theoretical. Irl, we see firms and complexes become effectively immortal... Immune to market forces and moated from competition. These tend to be the massively inefficient ones.

Regulated or not, insurance consortiums are much more like a government department than they are like a local restaurant. There are a lot more "private" regulators in the mix than public ones.


Nice effort, and it's something. But something may not be good enough to solve hard problems.

Can simulated hospital simulate efforts by staff to quickly add into many charts a medication Rx - by taking one patient note and copying 1 note, then pasting that into several?

Does it simulate staff forgetting the dosage was not edited after pasting into all those charts ?

Or staff not logging out of their userID, to avoid the need to log in every time someone needs to add a note/document into a chart? So that , in fact, the doc added by Dr Smith, was actually done by his assistant Tully?

Life is complicated. People use systems in wild and unexpected ways.

Real solutions require real data.


In your view, what was the purpose of your comment? Who are you lecturing?


To warn developers of potential blind spots and wasted time, hoping this helps them contribute meaningfully to fixing healthcare.


> Can simulated hospital simulate efforts by staff to quickly add into many charts a medication Rx - by taking one patient note and copying 1 note, then pasting that into several?

What CPOE system involves notes? None of the handful I've used in the US.

No modern EMR here involves any copy pasting for orders.


Presumably you could try to simulate these kinds of mistakes.


This is pretty cool, but the pathway logic is where the juiciest part is, and it looks like it has been gutted for public release.


Nice effort. Would love to see more open source work from tech companies in this space. I remember working on a healthcare product (which involved HL7) years ago and it was a pain. The lack of documentation and any supporting tooling was "mind blowing".


This would be super useful once you can set up your own pathways. It will also need FHIR and the ability to generate CCDs (which FHIR is capable of) to more fully represent current health care integration technologies.


The large EHR vendors have fairly open sandbox environments for “partner” (loosely defined, can be anyone) development purposes. The advantage of these is that they use real implementations instead of the standard as hypothetical implementation.


How can you know if the fake data is sufficiently similar to real data? I wonder what Google's use case for this is.


Google has a whole bunch of products/tools in the healthcare space, and it seems like their contribution there is only growing. I've been working with FHIR/EHR adjacent tooling lately for a personal project, and a good number of both open source resources and SAAS products I've seen have been from Google.

More broadly, all big 3 cloud providers (Azure, AWS, and Google) have offerings for FHIR data storage and API access, as well as common NLP based healthcare data analysis workflows. Many of these seem relatively new, or as if they have had a lot of recent attention focused on them. I'm definitely interested in how/why these companies (as well as some other VC funded ones, like Medplum), are entering this space with products that are not directly sellable, but are rather things that other tools would have to build upon. It seems like AWS works directly with end-customers to use their APIs to build products, but I'm not sure what Azure and Google are doing.

This one's probably too complex for my use case, but I thought the concept looked very neat and wanted to share.


Ultimately this is just an orchestration tool for publishing HL7v2 messages. You have to write your own pathways and segments, or push preconfigured HL7v2 messages. Take a look at the dashboard, you'll see what I mean[1].

Fairly sure they built this to test their HL7v2 store[2]. Also presuming it hasn't been very active given that Google surprisingly shut down their Healthcare division (and probably restarted it somewhere)[3].

[1] https://github.com/google/simhospital/blob/master/docs/dashb... [2] https://cloud.google.com/healthcare-api/docs/concepts/hl7v2#... [3] https://www.businessinsider.com/google-health-shutting-down-...


I'd probably use this to estimate the effect of staffing changes or the impact of modifying staff workflows.


Baiting healthcare-focused devs/startups for acquisition? Analyzing how people are building apps using this data to build their own? Free training data for their next product? Amazon is kinda growing in healthcare and maybe Google doesn't want to be behind.


Well it certainly was built for some future product, but it was probably released as open source solely so an engineer could have "community contributions" on their promo packet.


I don't believe this would count as a community contribution, which I believe is generally about contributions to Google's community.

While I don't think it's a bad career move, and that may have been part of their consideration, I find the trivialization of the work put in by someone here to open source something they thought would be a positive contribution hurtful.


Looks like this is related to their Care Studio product: https://health.google/caregivers/care-studio/


Is the patient data coherent? For example, given a certain history, some illnesses should have a high probability of occurring.


Are there any decent parsers in Python/Java/Ruby that someone can recommend?


https://hapifhir.io/ for open source.

Mirth is the commercial offering many people use - it is nice because it handles the MLLP protocol, gives you a mapping system, lets you transform and call your own endpoints. That’s also why it sucks: it’s a big heavy system.


Well, there is the FOSS Mirth option and the NextGen Mirth option, which is the "premium" corporate-backed version. Mirth is like the glue for healthcare in the US. Most of the companies I've worked with use it. What are the other options? Redox? HAPI? Qvera?

It might not be great, but I think it's better than most offerings.


Redox has already priced themselves so high, it's unclear what value they are providing. A big reason to go with Redox is that you don't have to set up a VPN with the target hospital, have staff monitoring the VPN to ensure it's up, or parse your own HL7v2 messages.

IMHO Redox is now so expensive that you have to wonder if you might be better off doing all of this work yourself. As others have mentioned, there is a free Mirth offering and parsing HL7v2 is not that bad.


I think it really depends on your scale. If I'm a small healthtech startup and I can interchange with people hooked up to the candidate QHINs like eHealth and CommonWell I just saved thousands of independent VPN setups and HL7 mappings.

But I already have thousands of independent VPN setups and API integrations, so the cost is harder to justify at scale.


That is exactly how my previous employer (HIE) operated. Mirth was fairly low-cost, and if we wanted, we could bootstrap our own Mirth servers. But managing that many VPNs was a giant pain.



This would be such a great use case for an LLM.


Maybe for one shots, but not for continuous integration testing. Don't LLMs generate different responses for each request?
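
What CI needs is the same messages on every run. A minimal Python sketch of that property using a seeded generator is below. This is not how Simulated Hospital works internally; the function name, the value pools, and the fixed timestamp are all made up for illustration.

```python
import random

# Hypothetical value pools, purely for illustration.
FAMILY_NAMES = ["SULLY", "QUINN", "OLEARY", "NGUYEN"]
GIVEN_NAMES = ["BRIAN", "MARY", "ALEX", "SAM"]


def generate_adt_a01(seed: int, timestamp: str = "20180101000000") -> str:
    """Generate a small synthetic ADT^A01 message that depends only on `seed`."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    mrn = rng.randint(100000, 999999)
    family = rng.choice(FAMILY_NAMES)
    given = rng.choice(GIVEN_NAMES)
    birth_date = f"{rng.randint(1940, 2005)}{rng.randint(1, 12):02d}{rng.randint(1, 28):02d}"
    sex = rng.choice(["M", "F"])
    segments = [
        f"MSH|^~\\&|FROM_APP|FROM_FACILITY|TO_APP|TO_FACILITY|{timestamp}||ADT^A01|{seed}|P|2.5",
        f"EVN|A01|{timestamp}",
        f"PID|1||{mrn}^^^^MRN||{family}^{given}||{birth_date}|{sex}",
        "PV1||I",
    ]
    return "\r".join(segments) + "\r"


# The same seed always yields the same message, so test expectations stay stable.
assert generate_adt_a01(42) == generate_adt_a01(42)
assert generate_adt_a01(42) != generate_adt_a01(43)
```

The timestamp is passed in rather than read from the clock for the same reason: anything that varies between runs has to be an explicit input.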


Why?


Probably could also train an LLM on anonymized patient data and have it generate.


> Disclaimer: This is not an officially supported Google product.

It's okay Google, we know you'd abandon it in a few months anyway.



