WatchMe – Watch for Changes on a Page (github.com/vsoch)
182 points by Bilters on April 26, 2019 | 35 comments



A couple of weeks ago I was listening to a podcast and the guest worked at a data integration company. To integrate with external APIs, they had to guard against a vendor changing an endpoint without backwards compatibility. They would monitor the API docs for any change so they could see if something had to be updated on their side. I thought it was kind of "brute" but then I realized there's no good way of doing this today.


Hey, Tim here. I was the one talking about this on the podcast. As I mentioned there, the monitoring of APIs was a method that we quickly found out didn't really scale or help. It suffered from a few issues, mainly that vendors were updating their APIs but not their docs. The way that we tackled it in the end was to let things fail. We have actually taken this approach with many things, i.e. fail, but have a method which cleans up after the new changes have been deployed.

The way we handle it now is that our integration will start throwing serialisation errors. This will then have our platform send a probe to get a RAW response from the system and then we let the admin see side by side, the old data and the new data. This allows the developers to schedule a new deployment to make these fixes. The good thing is that when the new deployment is made, the orchestration around it will handle fixing the data that it couldn't resolve while the serialisation was failing.

We do get other benefits out of this, including the ability to better handle integrations where you have absolutely no idea what to expect from the API, e.g. old Oracle and IBM products that don't have discovery endpoints like Dynamics, Salesforce etc. do.

Our recommendation after managing so many integrations is "let things fail". Embrace a data integration pattern that allows things to fail.
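
For anyone wondering what "let things fail" could look like in code, here is a minimal Python sketch of just the keep-the-raw-response-and-replay part. It is not CluedIn's implementation; `Ingestor`, `parse_v1`, `parse_v2`, and the `full_name` field are all hypothetical names made up for illustration.

```python
import json


def parse_v1(raw):
    """Hypothetical parser for the payload shape the integration expects."""
    data = json.loads(raw)
    return {"id": int(data["id"]), "name": str(data["name"])}


def parse_v2(raw):
    """Hypothetical fixed parser, deployed after the vendor renamed a field."""
    data = json.loads(raw)
    name = data["name"] if "name" in data else data["full_name"]
    return {"id": int(data["id"]), "name": str(name)}


class Ingestor:
    def __init__(self, parser):
        self.parser = parser
        self.records = []  # successfully parsed data
        self.failed = []   # raw payloads kept for later replay

    def ingest(self, raw):
        """Parse a payload; on failure keep the raw response instead of dropping it."""
        try:
            self.records.append(self.parser(raw))
            return True
        except (KeyError, ValueError, TypeError):
            self.failed.append(raw)
            return False

    def redeploy(self, parser):
        """Swap in a fixed parser and replay everything that failed before.

        Returns the number of previously failed payloads that now parse.
        """
        self.parser = parser
        pending, self.failed = self.failed, []
        return sum(self.ingest(raw) for raw in pending)
```

The stored raw payloads are also what an admin would look at next to the last good record to see what the vendor changed.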


I think you mentioned in the podcast that managing integrations is one of the toughest parts of your whole operation. (Apologies if I am misrepresenting what you said.) I can 100% believe that embracing failure is the only realistic way to handle this at scale.

But as someone responsible for producing API docs, it really pains me to say this! I know my team goes to great lengths to ensure that our API docs are up to date; we even maintain a changelog of every single doc modification, in large part because we have people integrating directly with our Swagger/OpenAPI based docs.

After a bit of Googling, it looks like CluedIn has an integration with us (Zuora). In case it can help you in any way, our changelog is available at [0].

Full disclosure: I work at Zuora, but am speaking for myself only.

[0]: https://community.zuora.com/t5/Developers/API-Changelog/gpm-...


Swagger/OpenAPI goes a fair way towards solving this problem by recommending semver, but it's far from widely adopted...
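
If a vendor does follow semver, a client-side check for a breaking release can be tiny. A rough Python sketch, assuming plain `MAJOR.MINOR.PATCH` version strings; `parse_semver` and `is_breaking` are hypothetical helper names, not part of any OpenAPI tooling.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH', ignoring pre-release and build suffixes."""
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_breaking(old: str, new: str) -> bool:
    """True if the major version increased.

    Caveat: under semver, 0.x releases may break at any time,
    so this check is only meaningful from 1.0.0 onwards.
    """
    return parse_semver(new)[0] > parse_semver(old)[0]
```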


I'd be really interested in listening to that podcast. Are you able to share a link?


Sure. Here you go: https://www.dataengineeringpodcast.com/cluedin-data-fabric-e.... He talks about that around minute 40.


API versioning is the contract by which this is supposed to be managed...


Breaking changes to production APIs with little to no notice occur more often than you'd think. Fun at scale!


Haha I never revealed how much I think this happens, I just stated what the theoretical solution to this problem is.


This looks cool, but it seems like a great example of when a gui would be better than a command line.


I disagree.

All GUI tools are limited and soon you get out of the comfort zone if you use them for more than 2 things.

That is the main reason I created the AU framework[^1] for Chocolatey, which the main choco repo uses to watch and update 250+ packages daily[^2]. Previously they used the Ketarin tool, which is a GUI and was very limited and hard to onboard to. It's basically this, but you program simple PowerShell functions that can check for changes using whatever is available in .NET.

The same goes for any declarative tool - you simply need a good programming language all the time.

[^1] https://github.com/majkinetor/au [^2] https://gist.github.com/choco-bot/a14b1e5bfaf70839b338eb1ab7...


I completely agree, would be interesting to see if a Qt GUI can be made with this as the underlying engine.


I use Distill for this which is implemented as a browser extension. Works pretty well: https://distill.io/


Anything in the open source realm with a similar use case of watching for something specific to change on a web site?

Would be nice to run this from a home server/raspberry pi.


I don't know about OSS server side software, but I use the free version of Distill on my own computer in the browser, so it's run locally. My browser is always open, so it's a good platform for checking changes.


Change Detection on F-Droid is pretty great, not sure if you wanted it on a PC or not.


Very much like kibitzr https://kibitzr.github.io/ I like your use cases though.


I use the similar (in concept) android app Web Alert https://play.google.com/store/apps/details?id=me.webalert which pops up notifications when a chosen part of a web page changes; I've configured it to watch for updates to all of Randy Harmelink's newly free kindle ebook lists at Ogre's Crypt.


How is this different from urlwatch?


Have you looked at the README? It's not just for web page changes.

https://vsoch.github.io/watchme/watchers/index.html


Wasn't there a website that would send you diffs of a webpage too? I remember using something like this to keep track of updates to a blog without RSS...


DiffBot does something like this. https://www.diffbot.com/


I was wondering if someone would mention RSS (or Atom).

What I'd like to see is a site that generates diffs from webpage changes, and makes them available as an RSS feed you can subscribe to.

Ideally this would be a free service, and I wouldn't mind if the RSS entries were prepended with adverts (or required you to click through to a page hosted by the service in order to see the diff, surrounded by adverts).
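
The core of such a service is small. A minimal Python sketch using only the standard library, assuming you already have two text snapshots of the page; `page_diff` and `rss_item` are hypothetical names, and fetching, storage, and the surrounding feed document are left out.

```python
import difflib
import xml.etree.ElementTree as ET


def page_diff(old: str, new: str, url: str) -> str:
    """Return a unified diff between two snapshots, or '' if nothing changed."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"{url} (previous)",
        tofile=f"{url} (current)",
        lineterm="",
    )
    return "\n".join(lines)


def rss_item(url: str, diff: str) -> str:
    """Wrap a diff in a single RSS <item> element, serialized as a string."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = f"Change detected: {url}"
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "description").text = diff
    return ET.tostring(item, encoding="unicode")
```

Each non-empty diff would become one item appended to the feed's channel.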


Was it Kimono Labs? I remember years ago that everyone was raving about them, so I decided to give it a try. It was simple, nice, and a pleasure to work with. Fortunately for them, they got acquired a couple of years ago.


Another option is https://visualping.io/ which I've used several times.


I liked changedetection.com which they acquired(?)... it was to-the-point and possibly the fastest-loading non-trivial site I knew.


Yeah, I was sad about that. The old changedetection.com was vastly superior for my use case and, not to mention, free (or at least much much more generous with its free tier).


> much much more generous with its free tier

And thus became the buy-ee instead of the buy-er...the price of altruism!


Before GitHub made it possible to watch for releases only, I used blogtrottr [1] for a few repositories (just another alternative).

[1]: https://blogtrottr.com/


Ah nice. I was just going to ask what tool would be good for checking that my wife's blog is still showing the right posts (and not something a hacker has modified). I'll check these things out.


Sometimes I am surprised how much time some people invest in building tools for use-cases that can easily be achieved with ubiquitous cli tools. I have a handful of scripts which are triggered by cron jobs / systemd timers and notify me via XMPP.

The only thing that is not present on a normal Linux system is the xmppsend [1] command for which I use a simple go binary that can easily be deployed.

[1] https://github.com/arendtio/xmppsend
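
To give an idea of how little is needed, here is the change-detection part of such a job as a Python sketch rather than shell. These are not my actual scripts; `has_changed` and the JSON state file are assumptions for illustration, and the notification step is left out.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the fetched content."""
    return hashlib.sha256(content).hexdigest()


def has_changed(name: str, content: bytes, state_file: Path) -> bool:
    """Compare content against the stored fingerprint and update the state.

    Returns False the first time a name is seen, since there is
    nothing to compare against yet.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    new = fingerprint(content)
    old = state.get(name)
    state[name] = new
    state_file.write_text(json.dumps(state))
    return old is not None and old != new
```

A cron job or systemd timer would fetch the page (for example with `urllib.request.urlopen(url).read()`), call `has_changed`, and send a message only when it returns True.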


Yes, and you can just chain together ftp, curlftpfs, and svn to make Dropbox.

https://news.ycombinator.com/item?id=9224


So you think that my post reflects the common HN Meme?

I see the similarity in making a critical comment regarding a presented project but I think it is different. One of the biggest things Dropbox tackled was to make it a plug and play experience (e.g. you do not have to own a server). With WatchMe I have more the impression that you have to learn a new tool (incl. configuration) and have a more or less limited system afterwards (limited to watching urls or psutils).

I don't really see any advantage there over using shell scripts unless you want to limit what the job creator is allowed to do (which might be a valid use-case). Maybe I am missing something, but from my perspective, it looks like learning to use the existing system tools is of higher value than learning how WatchMe works (even if their documentation looks nice).


Very interesting!


piece of work!



