WatchMe – Watch for Changes on a Page (github.com/vsoch)
182 points by Bilters on April 26, 2019 | 35 comments



A couple of weeks ago I was listening to a podcast whose guest worked at a data integration company. To provide integration with external APIs, they needed to detect when a vendor changed an endpoint without backwards compatibility, so they monitored the API docs for any change to see if something had to be updated on their side. I thought it was kind of "brute force," but then I realized there's no good way of doing this today.


Hey, Tim here. I was the one talking about this on the podcast. As I mentioned there, monitoring the APIs was a method we quickly found didn't really scale or help. It suffered from a few issues, mainly that vendors were updating their APIs but not their docs. The way we tackled it in the end was to let things fail. We have actually taken this approach with many things: fail, but have a mechanism that cleans up after the new changes have been deployed.

The way we handle it now is that our integration starts throwing serialisation errors. Our platform then sends a probe to get a raw response from the system, and we let the admin see the old data and the new data side by side. This allows the developers to schedule a new deployment to make the fixes. The good thing is that when the new deployment is made, the orchestration around it handles fixing the data that couldn't be resolved while serialisation was failing.
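A rough sketch of that fail-then-review pattern (the names `ingest` and `old_schema` are hypothetical for illustration, not CluedIn's actual code), assuming a strict deserializer that raises when the payload no longer matches the expected schema:

```python
import json

def ingest(raw_payload, deserialize):
    """Try to deserialize a vendor payload; on failure, park the raw
    response for side-by-side admin review instead of dropping it."""
    try:
        return {"status": "ok", "data": deserialize(raw_payload)}
    except (ValueError, KeyError) as exc:
        # Serialisation failed: keep the raw payload so an admin can
        # compare it against the last known-good schema and schedule a fix.
        return {"status": "needs_review", "raw": raw_payload, "error": repr(exc)}

# Hypothetical strict deserializer for the old schema.
def old_schema(payload):
    record = json.loads(payload)
    return {"id": record["id"], "name": record["name"]}

ok = ingest('{"id": 1, "name": "Acme"}', old_schema)
# Vendor silently renamed a field: lands in the review queue, not the bin.
broken = ingest('{"id": 1, "customer_name": "Acme"}', old_schema)
```

The point of the pattern is that the failed payloads are retained, so the orchestration can re-run them through the fixed deserializer after the next deployment.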

We do get other benefits out of this, including the ability to better handle integrations where you have absolutely no idea what to expect from the API, e.g. old Oracle or IBM products that don't have discovery endpoints the way Dynamics, Salesforce, etc. do.

Our recommendation after managing so many integrations is "let things fail". Embrace a data integration pattern that allows things to fail.


I think you mentioned in the podcast that managing integrations is one of the toughest parts of your whole operation. (Apologies if I am misrepresenting what you said.) I can 100% believe that embracing failure is the only realistic way to handle this at scale.

But as someone responsible for producing API docs, it really pains me to hear this! I know my team goes to great lengths to ensure that our API docs are up to date; we even maintain a changelog of every single doc modification, in large part because we have people integrating directly with our Swagger/OpenAPI-based docs.

After a bit of Googling, it looks like CluedIn has an integration with us (Zuora). In case it can help you in any way, our changelog is available at [0].

Full disclosure: I work at Zuora, but am speaking for myself only.

[0]: https://community.zuora.com/t5/Developers/API-Changelog/gpm-...


Swagger/OpenAPI goes a fair way towards solving this problem by recommending semver, but it's far from widely adopted...


I'd be really interested in listening to that podcast. Are you able to share a link?


Sure, here you go: https://www.dataengineeringpodcast.com/cluedin-data-fabric-e.... He talks about that around minute 40.


API versioning is the contract by which this is supposed to be managed...


Breaking changes to production APIs with little to no notice occur more often than you'd think. Fun at scale!


Haha I never revealed how much I think this happens, I just stated what the theoretical solution to this problem is.


This looks cool, but it seems like a great example of when a GUI would be better than a command line.


I disagree.

All GUI tools are limited, and you soon get out of your comfort zone if you use them for more than two things.

That is the main reason I created the AU framework [^1] for Chocolatey, which the main choco repo uses to watch and update 250+ packages daily [^2]. Previously they used Ketarin, a GUI tool that was very limited and hard to on-board with. It's basically this, but you program simple PowerShell functions that can check for changes using whatever is available in .NET.

The same goes for any declarative tool - you simply need a good programming language all the time.

[^1] https://github.com/majkinetor/au [^2] https://gist.github.com/choco-bot/a14b1e5bfaf70839b338eb1ab7...


I completely agree. It would be interesting to see if a Qt GUI could be made with this as the underlying engine.


I use Distill for this which is implemented as a browser extension. Works pretty well: https://distill.io/


Anything in the open source realm with a similar use case of watching for something specific to change on a web site?

Would be nice to run this from a home server/raspberry pi.


I don't know about OSS server side software, but I use the free version of Distill on my own computer in the browser, so it's run locally. My browser is always open, so it's a good platform for checking changes.


Change Detection on F-Droid is pretty great, not sure if you wanted it on a PC or not.


Very much like kibitzr (https://kibitzr.github.io/). I like your use cases, though.


I use the similar (in concept) android app Web Alert https://play.google.com/store/apps/details?id=me.webalert which pops up notifications when a chosen part of a web page changes; I've configured it to watch for updates to all of Randy Harmelink's newly free kindle ebook lists at Ogre's Crypt.


How is this different from urlwatch?


Have you looked at the README? It's not just for web page changes.

https://vsoch.github.io/watchme/watchers/index.html


Wasn't there a website that would send you diffs of a webpage too? I remember using something like this to keep track of updates to a blog without RSS...


DiffBot does something like this. https://www.diffbot.com/


I was wondering if someone would mention RSS (or Atom).

What I'd like to see is a site that generates diffs from webpage changes, and makes them available as an RSS feed you can subscribe to.

Ideally this would be a free service, and I wouldn't mind if the RSS entries were prepended with adverts (or required you to click through to a page hosted by the service in order to see the diff, surrounded by adverts).


Was it Kimono Labs? I remember years ago that everyone was raving about them, so I decided to give it a try. It was simple, nice, and a pleasure to work with. Fortunately for them, they got acquired a couple of years ago.


Another option is https://visualping.io/ which I've used several times.


I liked changedetection.com, which they acquired(?)... it was to-the-point and possibly the fastest-loading non-trivial site I knew.


Yeah, I was sad about that. The old changedetection.com was vastly superior for my use case and, not to mention, free (or at least much much more generous with its free tier).


> much much more generous with its free tier

And thus became the buy-ee instead of the buy-er...the price of altruism!


Before GitHub made it possible to watch for releases only, I used blogtrottr [1] for a few repositories (just another alternative).

[1]: https://blogtrottr.com/


Ah nice. I was just going to ask what tool would be good for checking that my wife's blog is still showing the right posts (and not something a hacker has modified). I'll check these things out.


Sometimes I am surprised how much time some people invest in building tools for use-cases that can easily be achieved with ubiquitous cli tools. I have a handful of scripts which are triggered by cron jobs / systemd timers and notify me via XMPP.

The only thing that is not present on a normal Linux system is the xmppsend [1] command, for which I use a simple Go binary that can easily be deployed.

[1] https://github.com/arendtio/xmppsend
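The cron-driven approach above can be sketched in a few lines. This is a minimal illustration, assuming you persist the state and swap in xmppsend (or mail, etc.) for notification; `check` and `STATE` are hypothetical names, not part of any tool mentioned in this thread:

```python
import hashlib
import urllib.request

STATE = {}  # last seen digest per URL; a real script would persist this to disk

def check(url, fetch=None):
    """Fetch a page, hash the body, and report whether it changed since
    the last check. `fetch` is injectable so the logic can be tested offline."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u).read()
    digest = hashlib.sha256(fetch(url)).hexdigest()
    changed = STATE.get(url) not in (None, digest)
    STATE[url] = digest
    return changed

# The first run only records a baseline; subsequent runs (e.g. from a
# cron job or systemd timer) report changes, at which point you would
# shell out to your notifier of choice.
```

Hashing the whole body is deliberately crude; pages with timestamps or ads would need the body filtered (or a specific element extracted) before hashing, which is where tools like WatchMe or kibitzr start to earn their keep.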


Yes, and you can just chain together ftp, curlftpfs, and svn to make Dropbox.

https://news.ycombinator.com/item?id=9224


So you think that my post reflects the common HN meme?

I see the similarity in making a critical comment about a presented project, but I think it is different. One of the biggest things Dropbox tackled was making it a plug-and-play experience (e.g. you do not have to own a server). With WatchMe, my impression is more that you have to learn a new tool (incl. configuration) and end up with a more or less limited system afterwards (limited to watching URLs or psutil).

I don't really see any advantage there over using shell scripts, unless you want to limit what the job creator is allowed to do (which might be a valid use-case). Maybe I am missing something, but from my perspective, learning to use the existing system tools is of higher value than learning how WatchMe works (even if their documentation looks nice).


Very interesting!


piece of work!



