Monitors.txt - lazy webapp monitoring (monitorstxt.org)
101 points by eliot_sykes on Nov 5, 2011 | 78 comments


I don't think I like this idea.

* Exposing a "monitors.txt" is a potential security hole that reveals more about your infrastructure than you meant to.

* Restricting monitors.txt to a given service (by htaccess IP restriction?) is brittle and gets you out of "write feature"-land and into "do sysadmin"-land, which you're trying to avoid anyhow.

* Cucumber is an unnecessary extra layer that doesn't do much besides just frustrate the user who has to write just the right English syntax to get what he wants. See also: Monit config definitions. The files read nicely once you get them configured, but it can be a downright bear to find out what just the right magic combination of keywords and phrases is to get it to do what you want.

* If you want client-side managed monitoring, why not have some interface to your chosen monitoring service that you invoke via your Cap deploy? That's something that you can make reality today, rather than waiting on service providers to implement it.

The concept of being able to define your monitoring as a part of your application and then just deploying it with the rest of the package is really appealing, but you still have to go tell monitoring services where to find your monitoring file, you still have to wait on monitoring services to implement support for it, and monitoring just isn't that difficult of a problem. How often do you change your monitoring rules?


> monitoring just isn't that difficult of a problem.

Monitoring is a very difficult problem when you consider false positives, notifications (repetition, escalation etc), and reporting.

Monitoring isn't just about whether a service is up or down. It's about the quality of that service right now and over time.


I should have been more clear. Monitoring configuration isn't a very difficult problem. It doesn't change very often, and isn't difficult to set up with a decent UI. It's certainly not hard enough to require the establishment of a new standard.


> Exposing a "monitors.txt" is a potential security hole that reveals more about your infrastructure than you meant to.

Everything you put in monitors.txt has to be publicly accessible anyhow. What could it reveal that the actual website doesn't?


Not true. I can have a service monitor "http://foo:bar@mysite.com/password/protected/endpoint/" or "http://mysite.com/protected?auth_token=2caf5e77cfa057ad2f36e..." just fine.

Passwords are security by obscurity, sure, but putting them in a world-readable file at a common known location is the height of ridiculous.


Then you can put monitors.txt behind that password-protected endpoint too. My point is that whoever you need to give access to the file also has access to the actual website, which has all the information (it has to, since it's supposed to be able to test against it).


Just because it's all public does not mean I want one single source that in plain English (err.. cucumber?) defines key areas of my website that I want to ensure are always up.

Seems to me this just gives potential hackers a map to the public endpoints of my most important areas.


English is an incredibly crappy language for writing tight specs in. The only appeal of english as a programming language is that people who cut big checks speak it. It's completely inappropriate for expressing formal logic.

The problem is when people think "Hey, I have a computer science problem. I know, I'll express that problem in English!" and now they have two problems.


Only until they "parse" the English using regexes. Then they have three problems. Four if you count "there are lots of trivial variations on the correct English phrasing, but none of them work, so you have to refer to the exact grammar anyway, this just made it bulkier."


That's my exact gripe with Cucumber. It's basically just regex soup that's designed to make other people think that you just feed raw intent to your computer and it divines the correct behavior. I honestly don't see how it adds actual value, when Ruby is already so damn readable.


I've tried using cucumber a couple of times and failed, and I knew there was something wrong with it. Thanks for pointing that out.


Lots of people like it, so I'm sure that there's an argument to be made for it, but it feels unnecessary and crufty. Personally, I think that the strongest case for it is that it lets your manager without any technical ability feel like he's a useful part of the process, but it just seems like a solution in search of a problem.

Something like Steak (https://github.com/cavalle/steak) feels far more comfortable and readable to me.


I thought this was going to be as easy as 'if you can't reach /monitors.txt, then the site is down'

but seriously, does nobody else see a problem with your monitoring configuration being hosted on the site being monitored? It is a bit like asking a hospital patient to keep an eye on his own charts and to let you know if anything goes wrong ..

do people really switch monitoring companies often enough to justify having a standard configuration format?

ps. cucumber syntax is horrible because it perpetuates an 'everybody speaks english' view of the world


> but seriously, does nobody else see a problem with your monitoring configuration being hosted on the site being monitored?

So, if the file is unreachable, something is probably very very wrong.

Malicious changing of the file could be addressed by firing an alarm when the file changes.
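A rough sketch of that alarm, assuming the provider keeps a digest of the last copy it fetched. The `ChangeDetector` class and its return values are hypothetical names for illustration, not part of any monitors.txt proposal:

```ruby
require 'digest'

# Hypothetical helper: remembers the digest of the last fetched monitors.txt
# and reports whether a freshly fetched copy differs from it.
class ChangeDetector
  def initialize
    @last_digest = nil
  end

  # Returns :first_seen, :unchanged, or :changed for the given file body.
  def check(body)
    digest = Digest::SHA256.hexdigest(body)
    status =
      if @last_digest.nil?
        :first_seen
      elsif digest == @last_digest
        :unchanged
      else
        :changed
      end
    @last_digest = digest
    status
  end
end
```

A provider could fire the alarm whenever `check` returns `:changed`. Whether that is too noisy right after a deploy is a separate question.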

> It is a bit like asking a hospital patient to keep an eye on his own charts and to let you know if anything goes wrong ..

No, it's a bit like leaving the chart with the patient, trusting that he won't destroy it.

> do people really switch monitoring companies often enough to justify having a standard configuration format?

Do people really switch monitoring companies rarely enough to justify keeping a myriad of incompatible configuration formats?

This also has the benefit of you owning all of your own data.

> ps. cucumber syntax is horrible because it perpetuates an 'everybody speaks english' view of the world

Cucumber syntax is great because a lot of people are already used to it. Oh, and it's completely language independent too. https://github.com/cucumber/cucumber/tree/master/examples/i1...


Seems like a service would want/need to cache older versions of monitors.txt, and continue running them for some configurable persistence period before accepting a missing/new/possibly-malicious monitors.txt as giving a true read.
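Something along these lines, as a sketch only. `MonitorsCache` and `grace_seconds` are made-up names, and the rule that a differing or missing file must persist for the whole grace period before it replaces the active copy is one possible reading of "configurable persistence period":

```ruby
# Hypothetical provider-side cache. A new or missing monitors.txt is only
# accepted once it has been observed consistently for `grace_seconds`.
class MonitorsCache
  attr_reader :active

  def initialize(grace_seconds)
    @grace_seconds = grace_seconds
    @active = nil          # the version checks are currently run from
    @candidate = nil       # a differing version waiting out the grace period
    @candidate_since = nil
  end

  # `fetched` is the file body, or nil when the file could not be retrieved.
  # `now` is passed in (seconds or Time) so the logic can be tested without sleeping.
  def observe(fetched, now)
    if @active.nil? && !fetched.nil?
      @active = fetched               # first successful fetch is taken as-is
    elsif fetched == @active
      @candidate = nil                # back to normal, forget any candidate
      @candidate_since = nil
    elsif fetched != @candidate || @candidate_since.nil?
      @candidate = fetched            # start (or restart) the grace period
      @candidate_since = now
    elsif now - @candidate_since >= @grace_seconds
      @active = fetched               # candidate survived the grace period
      @candidate = nil
      @candidate_since = nil
    end
    @active
  end
end
```

The provider would keep running checks from `active` and could raise a separate alert as soon as a candidate appears, rather than waiting for the grace period to end.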


Lol yes I'll add something on this to the site...

My thoughts are the monitoring provider would keep a copy of monitors.txt and work from that, and check for changes regularly. If they can't get to monitors.txt that would trigger an alert.


It is meant to be used mostly as a configuration tool for the monitoring host. If the developer has recently updated/deployed then they are probably working on the site and can ensure monitors.txt is available.


I'd much rather have something akin to an appcache.manifest file:

    # monitors.txt - see http://monitorstxt.org for more info

    GET:
        http://monitors.txt
        http://otherservice.monitors.txt
    
    POST:
        http://monitors.txt/service q:lalala user:trololol
        http://monitors.txt/service2 id:250
    
    RESOLVE:
        monitorstxt.org 207.97.227.245
    
    PERFORMANCE:
        http://duckduckgo < 2s
        http://duckduckgo/images < 3s

IMO feature testing should be part of your test suite pre-deploy, not the monitoring service.
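For what it's worth, a format like this is cheap to parse. A minimal sketch, assuming the grammar is "upper-case section header ending in a colon, then whitespace-separated entries". That grammar is inferred from the example above and `parse_manifest` is a hypothetical name:

```ruby
# Parses the manifest-style format into a hash of section name => entries,
# where each entry is the list of whitespace-separated fields on one line.
# The grammar is an assumption inferred from the example, not a spec.
def parse_manifest(text)
  sections = {}
  current = nil
  text.each_line do |line|
    stripped = line.strip
    next if stripped.empty? || stripped.start_with?('#')

    if stripped =~ /\A([A-Z]+):\z/
      current = Regexp.last_match(1)   # e.g. "GET", "RESOLVE"
      sections[current] = []
    elsif current
      sections[current] << stripped.split(/\s+/)
    else
      raise ArgumentError, "entry before any section: #{stripped}"
    end
  end
  sections
end
```

Each section's meaning (what to do with the fields after the URL) would still have to be defined by whoever standardises the format.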


Thanks for the suggestion, I like how succinct it is.

monitors.txt's purpose isn't to replace testing new features during development. Monitors.txt replaces the process of logging on to a 3rd party's app to configure the monitoring for the new feature.


This syntax seems a lot less confusing and more powerful than the subset of English presented.

Another bit of syntax that I think is missing is something like:

    GAUGE:
        http://domain/stats/logins 60s

    COUNTER:
        http://domain/stats/logins 60s

    AVG:
        http://domain/stats/logins 24h

Something like this would allow you to set this up not only with simple external monitoring systems, but also with internal systems like Nagios, Ganglia, ZenOSS, etc.

Of course, I do have to agree with many people that I am a bit uneasy PUBLISHING all of this data for all to see.


Interesting, incorporated this into the YAML sample: https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...


Really love the idea of text file based monitoring, hate the cucumber interface. I stomach it in rails but I'm not a big fan of English in my code


I'm curious about this -- don't you use APIs which use largely, if not universally, English identifiers? Or is there a version of, say, the .NET Framework where the API itself is written in French, German, or Spanish? Or what language has a custom parser for each language so that you can write keywords ("if", "for", "do") in your native language?


When discussing code in English, there's a huge difference between:

  result = object.doSomething(someString.toUpper(), someInt);

and:

  with the instance named object, call the doSomething method
  passing as parameters the result of calling the toUpper
  method on someString and someInt storing the method return
  value in the result variable


Believe it or not, VisualBasic (and every built in function) in Excel is translated in localized versions, and of course the English keywords are not kept. It's a mess.


When you specify an equation, do you use English or mathematical notation?

five times twenty divided by three to the power of twenty eight is a lot harder to understand, much less defined than:

(5*20/3)^28

Also, most imperative languages are crap because of the very tight name binding and difficulty in aliasing as well as insanely large number of unnecessary keywords.

In many functional languages if you speak spanish simply put:

let si = if

and voila you don't have silly keyword issues. (Yes, you can't redefine let easily, but it's a far better situation than the keyword smorgasbord that most imperative languages have)

Cucumber solves the problem of: "My manager with an MBA can't read my unit tests" and exchanges it for the problem of "Programmers don't know what the fuck the test does"


He means "English, the natural language", dummy.


Downvotes?

1) Friends of the parent poster? 2) People thinking "dummy" might hurt one's feelings? 3) People willing to accept needless pedantry when what was meant is obvious? 4) People that also think that the original comment was against english keywords?


Well, most people around here seem to think that throwing around invectives lowers the quality of discourse, so yes, you probably got downvoted for that. But actually, and, yes, I may be stupid and slow, but I actually thought the author was a non-native English speaker who didn't appreciate blocks of English in his code. I made that jump after considering his statement in light of other large blocks of natural language that often exist in code -- comments.


Fantastic idea. I see humans.txt (http://humanstxt.org/) was an influence. Why not stay in a concise format like that?


I've not been able to figure out an alternative to cucumber that has the same flexibility and readability.

For example, I've not a clue yet on how to improve on this example:

  Feature: DuckDuckGo Search
    It should continue to kick ass
    And I should be able to search

    Scenario: Homepage performance
      When I go to http://www.duckduckgo.com
      Then I should see the page downloaded in less than 0.5 seconds
      And I should see the page assets downloaded in less than 2 seconds

    Scenario: Search over SSL
      Given I go to https://www.duckduckgo.com
      When I fill in "q" with "site:news.ycombinator.com"
      And I press submit
      Then I should be on https://duckduckgo.com/?q=site%3Anews.ycombinator.com
      And I should see "Hacker News" within "h2"


The problem is that this looks tempting, but what if I change:

    Then I should see the page downloaded in less than 0.5 seconds

to

    Then I want the page to download in under a second.

The natural English looks very tempting, but the actual terms you can use are very restrictive.
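To make that concrete, here is a sketch of what a regex-backed step tends to look like underneath. `DOWNLOAD_STEP` and `parse_download_step` are hypothetical names, and a real provider's phrase list may well differ:

```ruby
# A single hypothetical step definition, in the style of regex-backed
# Cucumber steps. Only this exact phrasing is recognised.
DOWNLOAD_STEP = /\AThen I should see the page downloaded in less than (\d+(?:\.\d+)?) seconds\z/

# Returns the threshold in seconds, or nil when the sentence is not understood.
def parse_download_step(sentence)
  match = DOWNLOAD_STEP.match(sentence)
  match && Float(match[1])
end
```

The first sentence above yields a threshold of 0.5 and the second yields nil, even though a human reads both as the same kind of requirement.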


Thinking monitors.txt would need a validator, plus the monitoring providers can alert customers when they've entered something invalid.


Brilliant.

Given the target audience, I think a more concise description language than pseudo natural language would work better. Regardless, such a service can bring basic alerting to the huge number of sites which currently have nothing.


Thanks for the feedback. I'd love to hear any ideas on language alternatives.


I agree with the parent. Something very simplistic in json or yaml would be preferred.


Interesting idea. The proliferation of convention-dictated top-level URLs offends the design sense of many web architects. Possible alternatives are to:

• shoehorn a pointer to a varying URL inside an existing convention-dictated place – as for example with sitemap pointers inside robots.txt files.

• let the resource live anywhere but specify its location to consuming services via some out-of-band mechanism. That is, you still have to tell any monitoring provider where your particular monitoring-spec lives, probably via its signup interface, but you can still use the same format and unmoving file with multiple providers.



I'm keen on your idea for an out-of-band mechanism, and it would be needed for the obfuscated URL idea:

"For the more gung-ho, obfuscate the URL and use SSL (e.g. https://yoursite.tld/something-unguessable/monitors.txt) and tell your provider where to find your monitors.txt"


Another approach could be to support HTTP Basic auth.


I like the idea, but not the URL. How about moving it to "/.well-known/monitors.txt" in conformance with RFC 5785?
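A provider could look in both places during a transition. A small sketch, assuming the well-known location is tried first and the top-level one second. `candidate_urls` is a hypothetical helper and the ordering is my assumption:

```ruby
require 'uri'

# Locations a provider might probe, in order, for a given site root.
# Trying /.well-known/ first and the top-level file second is an assumption.
MONITORS_PATHS = ['/.well-known/monitors.txt', '/monitors.txt'].freeze

def candidate_urls(site_root)
  MONITORS_PATHS.map { |path| URI.join(site_root, path).to_s }
end
```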


Thanks for this, looks like a step in the right direction, do you know what successful applications there have been to this RFC?

http://tools.ietf.org/html/rfc5785


The biggest one I know of is "host-meta" (RFC 6415), which is used by webfinger (properly supported by gmail), and a few other "social"-type protocols.


Nice idea. The monitors.txt should/could also contain selenium or imacros scripts for transaction monitoring, for use by monitoring services like alertfox or browsermob.


Ugh, natural language. Never works well.


I'm open to implementing alternative formats to Gherkin.

If you've got an idea of some pseudocode you could write that'd make the monitors.txt idea more attractive, please write some, I'm keen for feedback.


"Should see the page assets" is kind of vague. You don't want to get 3 AM alerts because somebody else's transcluded widget is slow, if you can't do anything about it and the page is usable without it anyway. You probably want to distinguish your resources and third parties which may or may not have their own monitors.txt, as well as your resources that are expected to be slow because they're not static.

And is this presuming a headless browser? There are a lot of poorly-authored documents which blow up if their js can't get certain resources, even though the URLs don't appear in the markup where a scraper would see them. I'm all for having a pure-HTML mode that always works, because that's basic competence, but I'd want my progressive enhancements monitored as well at a lower priority.



Love the idea of using natural language to write all the specs! Also, got amused by reading the HN comments, where all jedis, ninjas and rockstars are arguing against using English... kind of expected.


Big thanks to HNers for the encouraging feedback. Based on it I'm going to:

- Put some monitors.txt examples in JSON, YAML, Gherkin and XML up on http://monitorstxt.org to help get the syntax right and satisfy the majority of language preferences. Feedback/contributions welcome, fork on github here: https://github.com/eliotsykes/monitorstxt

- Continue work on prototype monitoring app, with support for the above formats

- Open up the prototype to interested devs


I like the idea. What about if the definition was in the form of HTTP requests, something like:

    GET /

    GET /login

    POST /login
    user={{CUSTID}}&pass={{PWORD}}

    GET /search?q={{QUERY}}

The monitors.txt file could contain pointers to the definition files (one file per usecase), ideally with arbitrary metadata.
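Filling in the placeholders is the easy part. A minimal sketch, assuming `{{NAME}}` always means "substitute a form-encoded value supplied by the provider". `fill_template` is a hypothetical name:

```ruby
require 'uri'

# Fills {{NAME}} placeholders in a request template with form-encoded values.
# Raises KeyError when a placeholder has no value, so a misconfigured check
# fails loudly instead of sending a literal "{{NAME}}".
def fill_template(template, values)
  template.gsub(/\{\{([A-Z_]+)\}\}/) do
    URI.encode_www_form_component(values.fetch(Regexp.last_match(1)))
  end
end
```

Where the values themselves live (in the provider's UI, most likely, so that credentials stay out of the public file) is left open here.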

If there's some kind of consensus, now or later on, I'll implement it in our monitoring service (blamestella.com).


Thanks for the pseudocode and the consideration of implementing it if there's consensus, much appreciated.

I'm working on implementing monitors.txt for my own sites and will open it up to a few devs to get the API sculpted just right. After that I hope there'll be other monitoring services as keen as blamestella to have a go at it.


So far people seem to like the idea, but not so keen on cucumber for the language. Please contribute any pseudo code ideas below as I want to get this right


Honestly, I like the cucumber format, but what you ought to do is publish an API of phrases so that monitoring providers can target a standard set of monitoring operations. Right now, I don't see how they can execute a generic spec as described.


Cucumber devs might not like this, but at the moment I'm taking cues for the API phrases from cucumber-rails-training-wheels (web_steps.rb for capybara) - I see why the devs wanted it removed, but it works great for something like this that needs wide support.


Something Javascript-based might work. Perhaps the site-wanting-monitoring supplies a collection of JS scripts to run, each of which will be provided a standard object providing assertions/conclusions/reporting about specific features/URLs/scenarios.


Wouldn't sending raw Javascript upstream limit adoption due to the increased burden of securing the provider against malicious or buggy scripts? For this use case, a declarative solution seems like a requirement.


It's a burden, sure, but is the burden that large? Accepting and sandboxing remote Javascript seems relatively well understood, with lots of open reusable code already available. And perhaps the burden is offset by the familiarity and flexibility.


Why make it human readable?

Especially if you want lots of providers to support it, pick something super easy to parse.. JSON is the obvious choice there:

   [{ "method": "GET",
      "url": "index.html",
      "tests": [ { "contains": "nyaruka" },
                 { "response_in_ms": 500 },
                 { "status": 200 } ] },
    { "method": "POST",
      "url": "login.html",
      "params": { "foo": "bar" },
      "tests": [ { "status": 200 } ] }]


I'd rather use YAML:

    - method: GET
      url: index.html
      tests:
        - contains: nyaruka
        - response_in_ms: 500
        - status: 200
    - method: POST
      url: login.html
      params:
        foo: bar
      tests:
        - status: 200
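Consuming that on the provider side is not much code either. A sketch, assuming the response has already been fetched and is handed over as a plain hash. The `failed_tests` helper and the `:status`/`:body`/`:elapsed_ms` keys are hypothetical:

```ruby
require 'yaml'

# Evaluates the `tests` of one check against an already-fetched response and
# returns the tests that did not pass. The response is a plain hash here, an
# assumption made so the sketch needs no network access.
def failed_tests(check, response)
  check.fetch('tests').reject do |test|
    name, expected = test.first
    case name
    when 'status'         then response[:status] == expected
    when 'contains'       then response[:body].include?(expected)
    when 'response_in_ms' then response[:elapsed_ms] <= expected
    else false # unknown test names are reported as failures
    end
  end
end

CHECKS = YAML.safe_load(<<~DOC)
  - method: GET
    url: index.html
    tests:
      - contains: nyaruka
      - response_in_ms: 500
      - status: 200
DOC
```

Treating unknown test names as failures is a deliberate choice in this sketch: a typo in the file then shows up as an alert instead of a check that silently never runs.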


JSON is harder to read than XML, to my human eyes.


Wow, first time I've ever heard someone say that. Most people I know think XML is horrid to read, myself included. I'd take json or yaml over XML any day


I'm probably in the minority on this one yes.


depends if it's pretty printed or not (like the example above)


Maybe need to support JSON, XML, and YAML.


Ah, that sort of monitoring... I was hoping it'd be something like a sitemap with pages' timestamps, so that monitoring sites like http://www.followthatpage.com wouldn't need to poll actual pages and needlessly pollute my access.log (completely disrespecting robots.txt along the way).


People can't even agree on one programming language even though they all manipulate the same bits. Now we have a monitors.txt. So you are telling me that the site simply existing and being up is not enough to satisfy its state of being up? Keep making extra crap people and you dig your own hole.


Cool, but you won't get anywhere without a formal grammar or, at the very least, a reference implementation...


Agreed, working on it...


I think there is some point in this. We deploy the same web application to multiple servers. In some cases different versions of the app should be monitored in different ways. With monitors.txt I could include the information in the web app and the monitoring tool would automatically pick it up.


I like this! I wish there was a way to authenticate though, so that you could go deeper and check if services are really working. But of course, if the file is public, that throws a spanner in the plans.


Talking about monitoring deeper in the app... Monitoring cron jobs is one thing that I've never done in a way I'm comfortable with, or enough. That realization is what led to monitors.txt. I'm working on something like this along with a rails plugin for checking jobs run on time:

  Feature: Jobs should run on time

    Scenario: Search Subscription Job ran on time
      Given the job started and ended at:
        | start                   | end                     |
        | 2011-10-30 05:30:01 GMT | 2011-10-30 05:37:42 GMT |
        | 2011-10-29 05:30:01 GMT | 2011-10-29 05:37:42 GMT |
      Then I should see the job took less than 10 minutes
      And I should see the job started every day at "05:30 GMT"

The rails plugin generates the text above and start/end run times with something like this:

  class SearchSubscriptionJob

    def perform
      # Fire off the search subscription emails
      ...
    end

    add_job_monitor :job_method => :perform,
      :should_run => [:daily, "05:30 GMT"],
      :max_duration => 10.minutes,
      :alerts => {:email => 'admin@yoursite.tld',
                  :sms => '+441234567890',
                  :twitter => 'twitteruser'}
  end
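On the monitoring side, the two assertions in that scenario boil down to something like the sketch below. It is plain Ruby with no Rails, `job_problems` is a hypothetical name, and `max_duration` is given in seconds:

```ruby
# Checks recorded runs against the two assertions from the scenario:
# each run must finish within `max_duration` seconds and must have started
# at `expected_start` ("HH:MM", interpreted as GMT/UTC).
# `runs` is a list of [started, ended] Time pairs.
def job_problems(runs, max_duration:, expected_start:)
  runs.flat_map do |started, ended|
    day = started.utc.strftime('%Y-%m-%d')
    problems = []
    if ended - started > max_duration
      problems << "#{day}: ran longer than #{max_duration}s"
    end
    unless started.utc.strftime('%H:%M') == expected_start
      problems << "#{day}: did not start at #{expected_start}"
    end
    problems
  end
end
```

Called with the two runs from the table, `max_duration: 600` and `expected_start: '05:30'`, it returns an empty list. It does not detect a day on which the job never ran at all, which would need the expected schedule as an input too.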


Like monitors.txt? Henchmon Beta wants you http://news.ycombinator.com/item?id=3202535


So how many of you like the cucumber/gherkin format over anything else mentioned? Would be helpful to know to gauge interest.


If I ran a monitoring service I would honestly never bother implementing it. Parsing natural language can be really, really difficult. I realize this is just pseudo natural language as it has a very restricted subset of terms, however, monitoring is a complex field and to fully support most features it'll eventually become really complex and really error prone.


Interesting idea, I'm not so sure I agree about "For the more gung-ho, obfuscate the URL and use SSL (e.g. https://yoursite.tld/something-unguessable/monitors.txt) and tell your provider where to find your monitors.txt".

http://en.wikipedia.org/wiki/Security_through_obscurity


If "something unguessable" is a 64 bit number from /dev/random, and it's disclosed only to monitoring providers, and indexes are turned off on the server, it's not "security through obscurity"; it's a key.

Virtually every web app in the world relies on a similar security system. You just don't notice, because we call the "key" in those systems a "cookie", not a "dynamic URL path component".

There are reasons why the URL key is inferior to other keys used by web applications, but they are fiddly. If monitors are unlikely to have extremely sensitive information in them (and you'd hope they wouldn't given their intent), it's fine to use URL keys.
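Generating such a key is a one-liner in most languages. A sketch in Ruby, assuming 8 random bytes (the 64 bits mentioned above) from the standard library's `SecureRandom`. The path layout and the `secret_monitors_path` name are illustrative only:

```ruby
require 'securerandom'

# Builds a monitors.txt path with a random, unguessable component.
# 8 random bytes give the 64 bits mentioned above; the layout is an assumption.
def secret_monitors_path(bytes = 8)
  "/#{SecureRandom.hex(bytes)}/monitors.txt"
end
```

The result has the shape `/<16 hex characters>/monitors.txt`. Passing a larger `bytes` value is cheap if 64 bits feels thin.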


That's why I added "gung-ho"



