Hacker News
Show HN: I built a service to help companies reduce AWS spend by 50% (usage.ai)
123 points by kavehkhorram on Feb 3, 2022 | hide | past | favorite | 60 comments
Hey HN: Kaveh here, the founder of https://www.usage.ai/

We help companies drive down AWS EC2 spend. Why? Because the way it's done now is a pain. DevOps and Software Engineers end up spending time managing costs rather than focusing on business problems.

Before founding Usage, I worked on high-performance computing research at JP Morgan Chase and as a software engineer at a number of smaller startups.

Here's how it works: we're typically brought in by a DevOps manager to cut AWS EC2 costs. The app is entirely self-service and the savings are generated automatically; we typically do this live on a call. On average, we reduce AWS EC2 spend by 50% for about 5 minutes of work.

To reduce spend by 50%+, we don't touch your instances, don't require any code changes, and don't change the performance of your instances. We buy Reserved Instances on your behalf (a billing-layer change only) and bundle them with a guaranteed buyback. So you get the steep 57% savings of 3-year no-upfront RIs with none of the commitment (you can sell them back to us any time after 30 days).

We make money off a 20% Savings Fee. Happy to chat directly: kaveh@usage.ai
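To make the arithmetic above concrete, here's a quick sketch of how the discount and the fee could combine. The spend figure is hypothetical, and it assumes the fee is taken as 20% of the realized savings:

```python
# Illustrative math only: a 57% 3-year no-upfront RI discount,
# netted against a fee equal to 20% of the realized savings.
on_demand_monthly = 100.00   # hypothetical on-demand spend ($/month)
ri_discount = 0.57           # 3-yr no-upfront RI discount cited above
savings_fee_rate = 0.20      # fee taken as a share of savings

gross_savings = on_demand_monthly * ri_discount   # 57.00
fee = gross_savings * savings_fee_rate            # 11.40
net_savings = gross_savings - fee                 # 45.60
net_spend = on_demand_monthly - net_savings       # 54.40

print(round(net_savings, 2), round(net_spend, 2))
```

Under those assumptions, every $100 of on-demand spend becomes roughly $54.40, i.e. about 45.6% net savings after the fee.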

Have you experienced any issues with managing your company or organization's AWS expenses? We'd love to hear your feedback and ideas!




The IAM policy shown on the website allows sts:AssumeRole without any restriction on resources or conditions, which will be a deal breaker for many. Presumably you can restrict this to certain AWS principals?


Hi FujiApple,

We use the sts:AssumeRole policy to create temporary, short-lived credentials that let us access the AWS APIs on your behalf. The assume-role permission is constrained to the policy we've defined on our landing page and in our app, which is read-only plus the ability to manage your reservations on your behalf.
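For reference, a cross-account role's trust policy can be restricted to a specific principal and an ExternalId condition, which addresses the concern above. A minimal sketch (the account ID and ExternalId are placeholders, not Usage's actual values):

```python
import json

# Sketch of a cross-account IAM trust policy that limits who may call
# sts:AssumeRole and requires an ExternalId (mitigates the
# confused-deputy problem). Account ID and ExternalId are placeholders.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"sts:ExternalId": "example-external-id"}
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```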


Yeah this should be addressed.

Also kudos I guess for landing kik as a customer.


A good place to start with cloud savings is just knowing what is out there. I built CloudOptimizer.io [1] for this purpose, aggregating 10 cloud providers in one place.

Running it as a free, hobby project.

[1] https://cloudoptimizer.io


Nice. Are you calculating Alibaba $/month correctly? Multi-million $ per month? Maybe a currency thing?


This is cool, and anything in this space is useful, but everywhere I've been with significant AWS spend had already negotiated something directly. Caching and proper autoscaling policies usually take care of the rest; I've found the tricky thing to be RDS.

On the other hand... can I buy time on abandoned RIs directly from you for extra savings?


Yes - you can! Feel free to shoot me a note and we can chat more about this: kaveh@usage.ai


RIs like this are great, but the biggest savings we've found is moving everything we can to Spot Instances. We're hoping Aurora on spot becomes a thing as that's really our only remaining RI/on-demand cost.


Yeah, spot is where it's at. The problem is that to leverage spot, the application in question needs to be 'cloud native'. Many companies moving to the cloud are simply picking up legacy app servers, dropping them on EC2 instances, and declaring success. Those will simply not survive the properties of spot.


Relevant: https://github.com/cloudutil/AutoSpotting

I've seen some third party services that automate migration to / replacement with spot instances, but haven't used them yet personally.

Going serverless, in many places, has been the most effective cost optimization for me.


Why is serverless cheap? How can Amazon offer serverless at a lower cost than instances? What I mean is, Amazon still needs to run instances and build serverless on top of them. So where does the cost reduction happen for Amazon with serverless?


The amount of waste we experience can be really high. At Remind we started tracking this using a metric we call OUCH (Overprovisioned Underutilized CPU Hours).

Consider a k8s/ECS installation that autoscales.

Each application will have a CPU target. In order to prevent spiky traffic patterns from overwhelming the running containers, we target 70% CPU. As usage goes above 70% CPU, we will launch more containers.

The k8s cluster will have a reservation target. In order to allow fast launching of new containers, we want the cluster to have only an 80% occupancy rate. If more than 80% of the cluster is reserved, we launch more instances and expand the cluster.

So to run my autoscaling container-based applications in a way that will be reactive and respond to incoming load, I have to leave 44% (1 - .7*.8) of my hardware idle. If we also factor in that AWS itself doesn't target a 100% occupancy rate (because then nobody could launch new instances), each unit of CPU I actually use requires a significant amount of idle infrastructure. Easily double, possibly triple in the larger scheme of things. All of that is either directly paid for (k8s nodes) or indirectly paid for (ec2 pricing inevitably has idle capacity costs baked in).
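The headroom arithmetic above can be sketched directly:

```python
# The compounding headroom described above: a 70% per-container CPU
# target combined with an 80% cluster reservation target means only
# 0.7 * 0.8 = 56% of provisioned CPU does useful work.
cpu_target = 0.70          # autoscaler adds containers above this utilization
reservation_target = 0.80  # cluster adds instances above this occupancy

useful_fraction = cpu_target * reservation_target  # 0.56
idle_fraction = 1 - useful_fraction                # 0.44

print(f"{idle_fraction:.0%} idle")  # 44% idle
```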

With serverless, we would eliminate many of those inefficiencies. With ultra fast launch times, we don't need to give containers headroom to handle spikes. By not running our compute cluster, we can instantly launch everything we need, eliminating k8s overprovisioning. We're left with AWS idle capacity as the only waste, and AWS mostly has solved that with the spot market.

The math doesn't always come out ahead, but there's a lot of opportunity for serverless to be cheaper in many cases.


> Aurora on spot

How would that work for a database? And have you considered or tried Aurora Serverless?


I guess the same way as Aurora Serverless. In general, RDS uses a separate storage layer from the actual instances, so you can do vertical scaling/upgrades with zero downtime (either the read replica goes down or the replica becomes master).


We run 3 nodes 24/7 (writer + 2 readers) and have RIs for them. But during daytime hours we autoscale and run an additional 8 or 9 readers. Some of these run for just a few hours and could easily run on spot (especially with minimum duration).

Serverless couldn't provide enough capacity for us (at peak we use up to 300 vCPUs in this cluster). That was on v1; v2 might change that when it adds Postgres support.


I started an MVP a little along these lines.

The idea was a single page that showed all your AWS resources across all regions and all accounts.

The good thing was it ran purely in a browser via the AWS JavaScript APIs, so you did not need to create users or roles or give access to any third party - you just put the AWS key into the browser and it ran locally.

It's still there but effectively abandoned.

https://www.singlepagecloud.com


Part of a recent project I worked on involved this very issue, and significant savings are definitely possible. As an idea this is great, there's a gap in the market for this and very few addressing the issue, least of all AWS.

However, there's a couple of little things that may block its wider adoption.

1. It's a big ask for some companies to create any sort of IAM role for an external company or contractor. Even though they send and receive sensitive data from any number of 3rd party APIs, most will be uneasy about IAM access. It's just a hang up more than a concern, but still.

2. Engineering managers either don't understand or don't care about cloud spend. They get their budget at the start of the year, and they grow it based on the previous year. They usually don't have anywhere to put savings later on in the year, and don't want to reduce spend, and hence budget targets, for the following year.

3. I'd half expect your idea to be bought out by Amazon and shuttered. Kudos to you if that's what happens! But it's costing Jeff Bezos another yacht, so he may not like that.


There are dozens upon dozens of companies doing exactly this sort of service for AWS. And they all use the describe API calls requiring IAM permissions.


About 3: they didn't buy any of the other companies that already do RI cost optimization, so he's probably safe.


If it’s not getting bought out what’s the point? There’s faster ways to make money.


I previously worked in this space and have a fair bit of experience optimising AWS spend for customers.

Cool idea! I see the pain point you are addressing here is friction in the marketplace, rather than simply cost optimisation? Traditionally the use of standard RIs was a pain, due to the inflexibility of moving between instance families, and the fact that the marketplace requires a US bank account made it a no-go for non-US customers.

However, these problems were addressed first through the use of convertible RIs, which allow exchange of instance types but can't be sold on the marketplace (from memory). To be honest, though, they were still a pain to manage: you needed a good cost person, or a good TAM, to keep on top of the required conversions. So, secondly, savings plans were introduced. I generally recommend compute savings plans these days, as they are much more set-and-forget, though I acknowledge they provide less discount than standard RIs. My personal opinion is that EC2-based RIs will probably be deprecated by AWS at some point in the future. For this reason I don't think it's likely they'll release this automated marketplace as a feature.

I work almost exclusively with enterprise customers and see very little use of standard RIs these days, which, given the marketplace angle, I'm assuming is the only purchase option you are working with? Are you doing zonal or regional scope? But if you can find an angle that lets customers make more efficient use of standard RIs, all the power to you; that is a win!

Recommendation wise, the native tools (Cost Explorer and the CUR) can deliver reasonable recommendations that are good enough for most customers. Especially when using more flexible purchase options like compute savings plans the need to be super accurate just isn't there anymore.


How do you handle the risk of not being able to resell instances that you've bought back? What if I buy 10,000 instances and sell them back to you after 30 days? Seems like a competitor could do something like that to intentionally sabotage your business. Though maybe there's so much liquidity in the market that this isn't much of a risk, and in the worst case you could probably find someone you could sell to at a loss.


Excellent question!


I guess self-service is good most of the time, but how is it going to handle a temporary increase in resource usage? Think of a big infra refactoring project where a team may create lots of temporary instances and terminate them after a couple of months.

What's your view on Savings Plans vs Reserved Instances? Savings Plans seem to be much more flexible overall. Why only RIs then?


Usage refreshes its recommendations on a daily basis. When your instance count increases, Usage buys RIs. When it decreases or changes, Usage sells RIs.

Our RIs are actually more flexible than SPs. There is no commitment, and if you want to change region or instance type, Usage will buy back the old type and sell you the new type.

We chose RIs because AWS allows us to buy and sell RIs. There is no marketplace for SPs at the moment.
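A hypothetical sketch of that daily reconciliation loop (the function and data layout are illustrative, not Usage's actual implementation):

```python
from collections import Counter

def reconcile(running: Counter, reserved: Counter) -> dict:
    """Return the RI delta per (region, instance_type) key:
    positive = buy that many RIs, negative = sell back."""
    keys = set(running) | set(reserved)
    return {k: running[k] - reserved[k] for k in keys
            if running[k] != reserved[k]}

# Hypothetical fleet snapshot vs. currently held RIs.
running = Counter({("us-east-1", "m5.large"): 12})
reserved = Counter({("us-east-1", "m5.large"): 10,
                    ("us-west-2", "c5.xlarge"): 3})

print(reconcile(running, reserved))
# buys 2 m5.large RIs in us-east-1, sells back the 3 idle c5.xlarge RIs
```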


That feels like something that AWS would want to shut down if the business ever gets large enough. AWS has its own partners / AWS distribution program, which usage.ai doesn't seem to be a part of.

Do you believe you'll be able to continue running this once someone high enough in AWS "notices" you?


This doesn’t make sense; there is no reason AWS would want this shut down. They are buying reserved instances, which AWS sells because it benefits them to do so. They are charging the customer 20% of the savings to help them. They are buying the reserved instances back themselves if needed.

From an AWS perspective this is simple market usage of AWS RIs: cost savings for the customer, and easy/reliable usage predictability for AWS's CPU forecasting. It’s a win. And as below, it looks like they have a healthy relationship with AWS.


They're effectively a kind of bulk reseller. AWS may choose to be happy about it (happy customers keen to spend more on services) or not (less AWS income). It really depends on how the management sees it.


I don't think it would result in less AWS income -- AWS knows exactly how much on-demand, reserved, and spot instances cost them and they price them accordingly.


I don't think this is true, if it shifts usage from spot instances to the reseller's spot instances backed by AWS reserved instances they'll be making less money.


They shift to reserved instances, not spot instances.

It'd be hard for a service provider to shift their customers instances to spot instances unless the customer could tolerate the spot instances being shut down on short notice, and if they can, that customer may as well just use the spot instances themselves.

Worst case, this will increase usage of reserved instances and reduce on-demand usage, but AWS priced them accordingly, so they don't care.


That’s only true in a nominal sense. Money right now is worth more than money in the future. Money spent on reserved instances is money Amazon has right now, whereas net AWS spend from spot instances is money in the future.

How much money right now is worth more varies, but Amazon knows best here and prices accordingly.


We have a strong positive relationship with AWS and will be partners with them in the coming months!


like how Hollywood has a strong positive relationship with China until the censors suddenly deny all films access? Last year was pretty tough.. for example.

well, hope you get a few months of nice payouts! individuals don't need ARR :) one or two nice paychecks is good enough for lifelong success, so you only have to be right once or solve a market need once!


Does it break TOS?

Did Amazon shut down Snowflake despite losing a bunch of Redshift dough to them?

I'm not sure why you feel AWS would shut down a company who is using their resources in a clever manner.


Interesting idea however in my experience EC2 is generally not where I need to start optimizing my AWS bills. RDS & other state [1] are by far the largest line items on my bills.

[1] RDS / Aurora / Elasticache / OpenSearch


We have optimization features early in R&D for RDS, ElastiCache, and OpenSearch. If you'd like to try them out at some point, feel free to shoot me a note: kaveh@usage.ai


EBS and data egress are the big ones IMO.


ottertune.com optimizes RDS & Aurora, just fyi


Do users also get the reservation? If there is a capacity constraint in an AZ, are the instances reserved to the user’s account or to usage.ai’s account? (This is important to a minority of users, probably.)


The user's account!


Isn’t this one of the features of CloudHealth?

[0] https://www.cloudhealthtech.com


I wonder what percentage of Amazon's AWS revenue derives from designing its interface in a way that maximizes unnecessary spend?


it'll be interesting one day when cloud computing units become fungible and futures trading starts on a commodities exchange alongside an actual spot market for excess capacity.


Has anyone used Cast.ai or Spot by netapp as a comparison?


Spot has an almost identical service to this called Eco: https://spot.io/products/eco/

It reads your cost and usage report + AWS APIs and offers RI options for you to purchase or ignore. What Eco is lacking is being a reseller like usage.ai and buying your unused RIs.


We use cast.ai and it has certainly saved us a LOT of money. Unfortunately I'm not on the implementation/devops side of this and can't speak to exact figures, but it has been worth using and was definitely a significant difference in expenditures.


Do you support EKS or ECS on EC2?


Yes -- as long as it's backed by EC2, we support cost reduction for EKS and ECS.


this is like energyogre.com but for EC2 RIs. really nice.


Honest question, who is still on bare ec2 anymore?


Are you asking that cuz serverless is the new marketing hotness? There's a big world of tech beyond your bubble and it's not running lambdas.


Everyone running profitable companies on the cloud.

"Serverless" is basically AWS making money off of your technical debt and uncertainty.

It is cheaper to run your prototype with lambdas and "managed services" but as soon as it scales to a certain point, you're much better off with bare VMs reserved and orchestrated as per your business needs.


What about containers that can be freely moved between clouds and on premises?


Huh? What is wrong with "bare" EC2 instances?


Not sure what you mean by bare, but my ECS clusters are backed by a mix of on-demand and spot instances.


I have found ECS Fargate, just running containers directly, to be more convenient for most of my services and workloads. But I do occasionally miss some of the features that don't have analogues in ECS Fargate (yet).


We are, AMA.


Many of our customers are on EKS or ECS backed by EC2!


I cut costs of AWS by 99% by not using AWS at all



