I guess the issue is that the real world does smell terrible. I wish I could just have the perfect world my side projects always have, but that's not the case with the commercial ones that make money.
What if a user sends some sort of auth token or other data that you yourself can't validate, and the third party gives you a 4xx for it?
You won't know ahead of time whether that token or data is valid, only after making a request to the third party.
- info - when this was expected and the system/process is prepared for it (automatic retry, fallback to a local copy, offline mode, event-driven with a persistent queue, etc.)
- warning - when the system/process was able to continue but in a degraded manner, maybe leaving the decision to retry to the user or another part of the system, or maybe just relying on someone checking the logs for unexpected events; this of course depends on whether that external system is required for some action or is in some way optional
- error - when the system/process is not able to continue and the particular action has been stopped immediately; this includes the situation where no retry mechanism is implemented for a step required to complete that action
- fatal - you need to restart something, either manually or via an external watchdog; you don't expect this kind of log for a simple 5xx (see the sketch below for how these might map to code)
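A rough sketch of how the first three levels might fall out in code, assuming a hypothetical `fetch_from_provider` wrapper around the third-party call and the standard `logging` module (all names and exception types are illustrative, not from the thread); fatal is left to whatever restarts the process:

```python
import logging
import time

log = logging.getLogger("orders.enrichment")

class TransientProviderError(Exception):
    """Hypothetical: 5xx or timeout from the third party."""

class InvalidTokenError(Exception):
    """Hypothetical: the third party returned 4xx for the user-supplied token."""

def fetch_from_provider(token):
    """Hypothetical wrapper around the third-party API call."""
    raise TransientProviderError("provider unavailable")

def enrich_order(order, token, attempts=3):
    for attempt in range(1, attempts + 1):
        try:
            return fetch_from_provider(token)
        except InvalidTokenError:
            # warning: we can continue, but in a degraded manner, leaving the
            # decision to retry to the user or another part of the system
            log.warning("provider rejected token (4xx); continuing without enrichment")
            return order
        except TransientProviderError:
            # info: expected, the retry mechanism is prepared for this
            log.info("provider call failed, retry %d/%d", attempt, attempts)
            time.sleep(2 ** attempt)
    # error: retries exhausted, this particular action stops here
    log.error("provider unavailable after %d attempts; enrichment aborted", attempts)
    raise RuntimeError("enrichment aborted")
```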
You are not the OP, but I was just trying to point out this example case in relation to their descriptions of Error/Warning.
This scenario may or may not result in data/state loss, and it may also be something that you yourself can't immediately fix. And if it's temporary, what is the point of creating an issue and prioritizing it?
I guess my point is that for any such categorization of errors or warnings there are way too many counterexamples to describe them like that.
So I'd usually think that Errors are something I would heuristically want to react to and investigate quickly (e.g. being paged), while Warnings are something I would periodically check in on (e.g. weekly).
Like so many things in this industry, the point is establishing a shared meaning for all the humans involved, regardless of what uninvolved people think.
That being said, I find tying the level to expected action a more useful way to classify them.
But what I also see frequently is people trying to do impossible, idealistic things because they read somewhere that something should mean X, when things are never so clear-cut. So either it is not such a simplistic issue and should be understood as such, or there might be a better, more practical definition for it. We should first start from what we are using logs for: are we using them for debugging, for alerting, or both?
If for debugging, the levels seem relevant in the sense of how quickly we are able to use that information to understand what is going wrong. Out of a potential sea of logs, we want to see first what the most likely culprits were. So the higher the log level, the higher the likelihood that this event caused something to go wrong.
If for alerting, they should reflect how bad this particular thing is for the business, and help us set a threshold for when we page or have to react to something.
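One minimal way to serve both uses at once, sketched with Python's standard `logging` (the handler targets are placeholders): keep everything down to DEBUG for later debugging, and let only WARNING and above reach whatever channel the alerting side watches.

```python
import logging

root = logging.getLogger()
root.setLevel(logging.DEBUG)

# Everything, down to DEBUG, goes to a file for post-hoc debugging.
debug_file = logging.FileHandler("app.log")
debug_file.setLevel(logging.DEBUG)
root.addHandler(debug_file)

# Only WARNING and above reaches the stream an alerting system watches.
alert_stream = logging.StreamHandler()
alert_stream.setLevel(logging.WARNING)
root.addHandler(alert_stream)
```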
Well, the GP's criteria are quite good. But what you should actually do depends on a lot more things than the ones you wrote in your comment. It could be so irrelevant that it only deserves a trace log, or so important that it gets a warning.
Also, you should have event logs you can look at to make administrative decisions. That information surely fits into those; you will want to know about it when deciding to switch to another provider or renegotiate something.
For service A, a 500 error may be common and you just need to try again, and a descriptive 400 error indicates the original request was actually handled. In these cases I'd log as a warning.
For service B, a 500 error may indicate the whole API is down, in which case I'd log a warning and not try any more requests for 5 minutes.
For service C, a 500 error may be an anomaly, in which case I treat it as a hard error and log it as an error.
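A sketch of how those three policies might look in code; `call_a`/`call_b`/`call_c` stand in for the actual HTTP calls (assumed to return a status code), and the retry count and back-off window are made-up numbers:

```python
import logging
import time

log = logging.getLogger("third_party")

def handle_service_a(call_a, request):
    """500s are common and harmless: retry a couple of times, warn, move on."""
    for attempt in range(3):
        status = call_a(request)
        if status != 500:
            return status
        log.warning("service A returned 500, retrying (attempt %d)", attempt + 1)
    return status

_b_muted_until = 0.0

def handle_service_b(call_b, request):
    """A 500 usually means the whole API is down: warn once, back off for 5 minutes."""
    global _b_muted_until
    if time.time() < _b_muted_until:
        return None  # still inside the back-off window, don't even try
    status = call_b(request)
    if status == 500:
        log.warning("service B looks down, pausing requests for 5 minutes")
        _b_muted_until = time.time() + 300
    return status

def handle_service_c(call_c, request):
    """A 500 is an anomaly: treat it as a hard error."""
    status = call_c(request)
    if status == 500:
        log.error("service C returned 500 (unexpected)")
        raise RuntimeError("service C failed")
    return status
```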
What's the difference between B and C? API being down seems like an anomaly.
Also, you can't know how frequently you'll get 500s at the time you're doing the integration, so you'll have to go back after some time and revisit the log severities, which doesn't sound optimal.
Exactly. What’s worse is that if you have something like a web service that calls an external API, when that API goes down your log is going to be littered with errors and possibly even tracebacks which is just noise. If you set up a simple “email me on error” kind of service you will get as many emails as there were user requests.
In theory, some sort of internal API status tracker would be better: something with a heuristic for whether the API is up or down and what the error rate is. It should warn you when the API goes down and when it comes back up. Logging could still show an error or a warning for each request, but you don't need to get an email about each one.
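Such a tracker could be as small as a sliding-window error rate that alerts only on up/down transitions; a rough sketch (the window, threshold, and `alert` callback are all placeholders):

```python
import logging
import time

log = logging.getLogger("api_status")

class ApiStatusTracker:
    """Decide up/down from a sliding error rate; alert once per transition,
    not once per failed request."""

    def __init__(self, window=60, down_threshold=0.5, alert=None):
        self.window = window                  # seconds of history to keep
        self.down_threshold = down_threshold  # error rate that counts as "down"
        self.alert = alert or (lambda msg: log.warning(msg))  # e.g. send one email
        self.samples = []                     # (timestamp, ok) pairs
        self.is_down = False

    def record(self, ok):
        now = time.time()
        self.samples.append((now, ok))
        self.samples = [(t, o) for t, o in self.samples if now - t <= self.window]
        errors = sum(1 for _, o in self.samples if not o)
        rate = errors / len(self.samples)
        if not self.is_down and rate >= self.down_threshold:
            self.is_down = True
            self.alert("third-party API looks DOWN (error rate %.0f%%)" % (rate * 100))
        elif self.is_down and rate < self.down_threshold:
            self.is_down = False
            self.alert("third-party API recovered")
```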
I forgot to mention that for service B, the API being down is a common, daily occurrence and does not last long. The behavior of services A-C is from my real world experience.
I do mean revisiting the log severities as the behavior of the API becomes known. You start off treating every error as a hard error. As you learn the behavior of the API over time, you adjust the logging and error handling accordingly.
This might be controversial, but I'd say if it's fine after a retry, then it doesn't need a warning.
Because what I'd want to know is how often does it fail, which is a metric not a log.
So expose <third party api failure rate> as a metric not a log.
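For example, with a Prometheus-style client (the metric name and labels are illustrative): count every call by outcome, and derive the failure rate in the dashboard instead of grepping logs.

```python
from prometheus_client import Counter

# Hypothetical counter; names and labels are made up for the sketch.
THIRD_PARTY_CALLS = Counter(
    "third_party_api_calls_total",
    "Calls made to the third-party API",
    ["provider", "outcome"],
)

def call_with_metrics(provider, send):
    """Wrap a call so every attempt is counted by outcome."""
    try:
        response = send()
        THIRD_PARTY_CALLS.labels(provider=provider, outcome="success").inc()
        return response
    except Exception:
        THIRD_PARTY_CALLS.labels(provider=provider, outcome="failure").inc()
        raise
```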
If feeding logs into Datadog or similar is the only way you're collecting metrics, then you aren't treating your observability with the respect it deserves. Put in real counters so you're not just reacting to what catches your eye in the logs.
If the third party being down has a knock-on effect to your own system functionality / uptime, then it needs to be a warning or error, but you should also put in the backlog a ticket to de-couple your uptime from that third-party, be it retries, queues, or other mitigations ( alternate providers? ).
By implementing a retry you planned for that third party to be down, so it's just business as usual if it succeeds on retry.
> If the third party being down has a knock-on effect to your own system functionality / uptime, then it needs to be a warning or error, but you should also put in the backlog a ticket to de-couple your uptime from that third-party, be it retries, queues, or other mitigations ( alternate providers? ).
How do you define uptime? What if e.g. it's a social login / data linking and that provider is down? You could have multiple logins and your own e-mail and password, but you still might lose users because the provider is down. How do you log that? Or do you only put it as a metric?
You may log that or count failures in some metric, but the correct answer is to have a health check on the third-party service and an alert when that service is down. Logs may help you understand the nature of the incident, but they are not the channel through which you are informed about such problems.
A different issue is when the third party breaks the contract, so suddenly you get a lot of 4xx or 5xx responses, likely unrecoverable. Then you get ERROR-level messages in the log (because it's an unexpected problem) and an alert when there's a spike.
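A health check along those lines can be as small as a periodic probe that alerts only on transitions (the status URL and interval are placeholders; this is the active-probe counterpart of the error-rate tracker mentioned upthread):

```python
import time
import urllib.request

# Hypothetical status endpoint; in practice, whatever lightweight call the provider exposes.
STATUS_URL = "https://status.example-provider.com/ping"

def probe():
    try:
        with urllib.request.urlopen(STATUS_URL, timeout=5) as resp:
            return resp.status == 200
    except Exception:
        return False

def watch(alert, interval=30):
    """Alert once per down/up transition, not on every failed probe."""
    was_up = True
    while True:
        is_up = probe()
        if was_up and not is_up:
            alert("third-party provider looks DOWN")
        elif not was_up and is_up:
            alert("third-party provider is back UP")
        was_up = is_up
        time.sleep(interval)
```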
> This might be controversial, but I'd say if it's fine after a retry, then it doesn't need a warning.
>
> Because what I'd want to know is how often does it fail, which is a metric not a log.
It’s not controversial; you just want something different. I want the opposite: I want to know why/how it fails; counting how often it does is secondary. I want a log that says "I sent this payload to this API and I got this error in return", so that later I can debug if my payload was problematic, and/or show it to the third party if they need it.
My main gripe with metrics is that they are not easily discoverable like logs are. Even if you capture a list of all the metrics emitted from an application, they often have zero context and so the semantics are a bit hard to decipher.
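For the "I sent this payload and got this error" kind of log, a minimal sketch (the field names are illustrative, and in real code you would redact secrets/PII from the payload before logging it):

```python
import json
import logging

log = logging.getLogger("third_party.debug")

def log_failed_call(endpoint, payload, status, body):
    """Keep enough context to debug later or hand to the provider."""
    log.warning(
        "third-party call failed: %s",
        json.dumps({
            "endpoint": endpoint,
            "request_payload": payload,   # redact sensitive fields first
            "response_status": status,
            "response_body": body[:2000], # cap size so logs stay readable
        }),
    )
```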
> * Is it possible for humans to get a vague impression of other humans' thoughts via this mechanism? Not via body language, but "telepathy" (it'd obviously only work over very short ranges). If it is possible, maybe it is what some people supposedly feel as "auras"
If any of it were possible, it would be easily provable scientifically by very simple experiments. The fact that it hasn't been proven, while people would have very high motivation to prove it, suggests it's very probably not happening.
How can you tell it is not a placebo? I guess it's just weird for me to think that it seems to do absolutely nothing to me, yet some people claim effects?
Even if it's a placebo, the fact that it got the job done is what matters. But how would I even test if it was a placebo effect? I already have the experience of not being able to go to the lengths I did with it, without it. Like, I really drove myself to meet some bad deadlines, and paid for it for several days after; I couldn't have driven myself like that otherwise (I don't drink coffee, energy drinks, etc.).
Aren't we having major issues right now with there being too many small libraries and dependency chains that grow exponentially? I've thought LLMs will actually benefit us a lot here, by not having to use a lib for every little thing (leftpad, etc.).
That's primarily a culture problem, mostly with Javascript (you don't really see the same issue in most language ecosystems). Having lots of tiny libraries is bad, but writing things yourself that are covered by _sensible_ libraries, instead of using them, is also bad.
(IMO Javascript desperately needs an equivalent to Boost, or at the very least something like Apache Commons.)
That was probably a Node/npm thing: because they had no stdlib, it was quite common to have many small libraries.
I consider it an absolute golden rule of coding to not write unnecessary code, and to not write collections.
I still see a lot of C that ought not to have been written.
I'm a greybeard, and I don't fear for my job. But not relying on AI when it's faster is as silly as refusing a correct autocomplete and typing it by hand. The bytes don't come out better.
Both are taken into account. Potential profitability is taken into account with growth companies; circular funding has no effect on that. With unprofitable companies, the case is made on how risky the company is and what the potential profit will be in the future.
I would disagree, at least in the short term. Exhibit A: AMD's stock rose 36% at the announcement of their OpenAI circular deal. If 1+1 = 3 and there is potential profit to be gleaned from such a deal, then it isn't circular, and is just plain good business. But the fact that AMD's stock collapsed back to where it was shortly afterwards suggests otherwise.
This has nothing to do with the deal being circular. It's more that AMD is thought to be falling behind in the AI race, and OpenAI doing a deal with them is a strong indicator that they might have the potential to come back.
The deal allows OpenAI to purchase up to 6GW of AMD GPUs, while AMD grants OpenAI warrants for up to 10% equity tied to performance milestones, creating a closed-loop of compute, equity, and potential self-funding hardware purchases. Circular.
From the announcement alone, AMD's stock rose to a level that effectively canceled out whatever liabilities they were committing to as part of the deal, so it was all gravy, despite it being a press release.
Why is that generous? This is clearly showing OpenAI's belief in AMD, which in turn would give investors a large amount of confidence. A lot of that market cap came from Nvidia, which lost around 50B that day while AMD gained 70B in market cap. It all makes sense to me.
Where do you see the 70B being erased? In any case, it is also plausible that confidence changes given a constant stream of new information, so I don't see how it would be problematic if the stock did lose value given new information.
If you think this comment was not written by the account holder or that account sales are occurring, please do as asked by the guidelines in such cases and contact the mods to report that, instead of posting about it in a discussion.
I wouldn't know if account sales were occurring; my first thought was that they sincerely wanted to have AI critique this article and post it here. I don't think it is a good enough reason for me personally to flag.
I think if someone were trying to use AI to farm, they wouldn't post these types of critiques, but rather something safer.
But I did want to see what their reasoning for posting those AI critiques was, and they answered, so my curiosity was satisfied to an extent.
Irrelevant; “If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.” and especially “Don't feed egregious comments by replying; flag them instead.” both apply.