Hacker News | pkiv's comments

Congrats on the launch guys!

Really love the decoupling of the logic and the runtime for the actual tool calls.


Browserbase | Multiple Roles | San Francisco | ONSITE

We're building infrastructure that enables developers and LLMs to programmatically interact with the web using our hosted headless browsers.

A headless browser is just like the browser you're using right now, but running on a server. Running a single one isn't too bad, but running many of them becomes a complex exercise in stateful, distributed systems. We handle that, and we also provide great observability and other helpful features (like cookie management) to make developers' lives easier.

While the infrastructure product is how we make money, we also maintain Stagehand (https://github.com/browserbase/stagehand), the AI-powered successor to Playwright. We built it to show how you can use LLMs to build dynamic web automations that don't depend on deterministic code.

We're looking for developers who are interested in working on products that cater to other developers. You're an especially good fit if you're experienced in cloud infrastructure or distributed systems. We're a team of 10, full-time and in person in SF. You can learn more about our work culture here: https://x.com/pk_iv/status/1860762063490158642. We have product-market fit, and have raised $27M from Kleiner Perkins, CRV, and Okta Ventures.

I've hired several people from HN and I'm excited to continue meeting great people like yourself!

Open roles are here: https://browserbase.com/careers. You can also email us at jobs@browserbase.com


I’d recommend checking out Stagehand if you want to use something that’s more AI-first! It’s like the AI-powered successor to Playwright: https://github.com/browserbase/stagehand

(I am one of the authors!)


If you're open to it, I'd love to hear what you think of what we're building at https://browserbase.com/ - you can run a Chrome extension on a headless browser, so you can generate the semantic markdown within the browser before pulling anything off the page.

We even have an iFrame-able live view of the browser, so your users can get real-time feedback on the XPaths they're generating: https://docs.browserbase.com/features/session-live-view#give...

Happy to answer any questions!


This is super neat and I think I've seen your site before :)

Do you handle authentication? We have lots of users that want to automate some part of their daily workflow but the pages are often behind a login and/or require a few clicks to reach the desired content.

Happy to chat: username@gmail.com


You must get a lot of test emails to that FANTASTIC gmail address. Funny how it might even be worth some decent money.


That's not literally his e-mail :D. He means that you have to replace it with his HN username. It would have been better to write it like this: [HN username]@gmail.com


Personally, I thought it was an LLM reply to an LLM marketing post to fake engagement. Lol


Instructions unclear, here's a haiku about faking engagement:

Beneath the deep waves,

False likes in shadows do dance,

Submarine ploys drift.


Hahaha okay I feel dumb now.


Well if you're dumb then we're dumb.


I'm also curious about this! I've been learning about scraping, but I've had a hard time finding good info about how to deal with user auth effectively.


You log in once, grab the session (its cookies), and save it. Then you attach the saved session to your subsequent requests.
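
A minimal sketch of that save-and-reuse flow, using only the Python standard library. It assumes the cookies arrive as dicts with `name`, `value`, and `domain` keys (the shape Selenium's `get_cookies()` returns); the helper names and the file path are made up for illustration, not taken from any particular tool.

```python
import http.cookiejar
import urllib.request


def jar_from_browser_cookies(cookies):
    """Copy cookie dicts (name, value, domain, ...) into a cookie jar."""
    jar = http.cookiejar.LWPCookieJar()
    for c in cookies:
        domain = c["domain"]
        expiry = c.get("expiry")  # None means a session-only cookie
        jar.set_cookie(http.cookiejar.Cookie(
            version=0, name=c["name"], value=c["value"],
            port=None, port_specified=False,
            domain=domain, domain_specified=True,
            domain_initial_dot=domain.startswith("."),
            path=c.get("path", "/"), path_specified=True,
            secure=c.get("secure", False),
            expires=expiry, discard=expiry is None,
            comment=None, comment_url=None, rest={},
        ))
    return jar


def save_session(jar, path):
    """Write every cookie to disk, including session-only ones."""
    jar.save(path, ignore_discard=True, ignore_expires=True)


def load_session(path):
    """Read a previously saved cookie jar back from disk."""
    jar = http.cookiejar.LWPCookieJar()
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar


def opener_for(jar):
    """Build an opener that sends the jar's cookies with each request."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

After a manual or automated login you would call `save_session(jar_from_browser_cookies(driver.get_cookies()), "session.lwp")` once, then use `opener_for(load_session("session.lwp")).open(url)` for later fetches. Sites that tie a session to a browser fingerprint or IP address may still reject the reused cookies.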


Am I correct that the use cases for this are 1. scale and 2. defeating Cloudflare et al.?

I do scraping, but I struggle to see what these tools are offering; maybe I'm just not the target audience. If the websites don't have much anti-scraping protection to speak of, and I only do a few pages per day, is there still something I can get out of using a tool like Browserbase? With all this talk about semantic markdown and LLMs, I wonder: what's the benefit over writing (or even having an AI write) standard fetching and parsing code using playwright/beautifulsoup/cheerio?


Awesome product!

I was just a bit confused that the sign-up buttons for the Hobby and Scale plans are grey. I thought they were disabled until I happened to hover over them.


Good feedback! We'll take a look.


I don't see any difference from browserless?


The price and the dashboard are a great start :)


Romania is missing from the list of phone number countries on signup, not sure if on purpose or not.


Congrats on the launch!! Collaborative browsing is something I've been looking for, for a few use cases of mine. Excited to try it out.


The supabase team always delivers. Excited to give this a try!


If you want to build it yourself, you could try using https://browserbase.com/. We offer managed headless browsers that work everywhere, every time. It costs $0.10 per browser session-hour (billed by the minute). Feel free to shoot me an email if you want access! paul@browserbase.com


They recently increased their pricing quite a bit. We're looking to offer a much more affordable pay-as-you-go pricing model at https://browserbase.com/

Feel free to shoot me an email if you're interested in trying it out! paul@browserbase.com


In the end, the best way to avoid being blocked is to be a good actor. All of these hacks won't stop someone who's determined to prevent access (e.g., LinkedIn).

That's actually one of the reasons why I started https://browserbase.com/. Maintaining headless browser infrastructure can be such a pain. I've spent a lot of time managing headless Chrome fleets at scale, so happy to answer any questions.
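
One possible reading of "being a good actor" is to honor robots.txt and pace your requests. A rough sketch of that check with Python's standard `urllib.robotparser`; the function name, the bot name, and the one-second default delay are assumptions for illustration, not anything Browserbase prescribes.

```python
import urllib.robotparser


def allowed_and_delay(robots_txt, user_agent, url, default_delay=1.0):
    """Return (may_fetch, seconds_to_wait) for url under the given robots.txt text."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() returns None when the file sets no Crawl-delay for this agent.
    delay = parser.crawl_delay(user_agent)
    wait = float(delay) if delay is not None else default_delay
    return parser.can_fetch(user_agent, url), wait
```

A crawler would call this before each fetch, skip URLs where the first value is false, and sleep for the returned number of seconds between requests.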


Are there any stories you're willing to share, any tough nuts you've had to crack to improve some aspect of operations, whether it be reliability, performance, bot detection evasion, or something else completely?

I've only dealt with scraping on a small scale, and I quickly realized that running "browsers as a service" is a pain in the ass. They're not exactly lightweight, and they like to get "stuck", balloon in memory, or some such.

I imagine your business will be quite successful if reliability is good and the price is right!


I gave a lightning talk on headless Chrome that is worth checking out!

https://www.youtube.com/watch?v=vs-qzlW9M50&t=726s


If I understand correctly, a lot of the issues you can run into with regards to blocking come from the fact that you're using a headless browser. Past a certain point, wouldn't it be less work to use a regular browser and drive with Selenium or similar solutions? Or does that not address the kind of problems you're facing?


I used to semi-automate access to some sites by using Selenium with a non-headless browser. These were sites where there were just one or two pages where I wanted some automation to fill out a form or scrape some data, and they frequently made changes to the home page that made it hard to automate navigating from the home page to the pages I wanted to automate.

The idea was to have a script use Selenium to launch non-headless Chrome and then wait:

  from selenium.webdriver import Chrome

  driver = Chrome()  # opens a visible (non-headless) window
  driver.get("https://example.org")
  input("Press enter when ready")  # block until the manual steps are done

I could then manually deal with logging in, answer any CAPTCHA that came up, and navigate to the page where I wanted to run my automation. Then I could press "enter" in my terminal and my script would continue.

That used to work fine, but then on sites using Cloudflare's CAPTCHA it stopped working. Solving the CAPTCHA would just result in another CAPTCHA.

I tried an alternative Selenium Chrome driver that was supposed to be more stealthy, and tried setting various flags that were supposed to make it so JavaScript could not tell that Selenium was there, and those worked for a while, but then they stopped working.

The results were similar using Selenium with Firefox.

I also tried Puppeteer, with Chromium and Firefox, and they too could not get past the CAPTCHA loops.

I then tried Playwright. With Chromium and WebKit I got the same CAPTCHA loops. With Firefox it actually worked. I didn't even see the CAPTCHA. The non-interactive check for not being a bot passed.

Still, the whole approach seems fragile. I don't know if Firefox/Playwright working was due to some fundamental difference between Firefox and the others or just Cloudflare having not yet gotten around to dealing with it.


The newest version of headless Chrome actually runs the same code as a "regular browser": https://developer.chrome.com/docs/chromium/new-headless


I created a dedicated Chrome profile (--user-data-dir), signed in to a few sites, and then drive it from scripts with a visible window.

It does all my crawling. It goes very slowly, but it has never triggered the bot detectors.
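
A rough sketch of that setup with Selenium, assuming Selenium 4+ and a local Chrome install. The helper names, the profile path, and the 30-second pause are illustrative guesses, not the commenter's actual script.

```python
import time


def profile_args(profile_dir):
    """Chrome flags for a dedicated, persistent profile.

    No headless flag is added, so the browser window stays visible.
    """
    return [f"--user-data-dir={profile_dir}"]


def crawl_slowly(urls, profile_dir, delay_seconds=30.0):
    """Visit each URL in a visible Chrome window, pausing between pages."""
    # Imported lazily so profile_args() works without Selenium installed.
    from selenium.webdriver import Chrome, ChromeOptions

    options = ChromeOptions()
    for arg in profile_args(profile_dir):
        options.add_argument(arg)

    driver = Chrome(options=options)
    pages = {}
    try:
        for url in urls:
            driver.get(url)
            pages[url] = driver.page_source
            time.sleep(delay_seconds)  # deliberately slow
    finally:
        driver.quit()
    return pages
```

You would sign in to the relevant sites by hand once in a Chrome window started with the same `--user-data-dir`, after which `crawl_slowly(urls, "/path/to/crawl-profile")` reuses those logins. Chrome generally refuses to share one profile directory between two running instances, so close the manual window first.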

