
How does this compare to fine tuning something like BERT?



I would say they're similar, since the building block for both is the Transformer. In this blog post, the fine-tuning strategy used is Adapters: small learnable layers added inside each Transformer block, which are the only parts trained during fine-tuning.
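A minimal sketch of what an adapter looks like in PyTorch (module name and dimensions are my own, just to illustrate the idea, not taken from the post): a small bottleneck MLP with a residual connection, inserted after a Transformer sub-layer.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
        def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_dim, bottleneck_dim)
            self.act = nn.GELU()
            self.up = nn.Linear(bottleneck_dim, hidden_dim)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Residual connection keeps the original representation intact,
            # so the adapter only learns a small correction on top of it.
            return x + self.up(self.act(self.down(x)))

In practice only these adapter parameters (plus usually the layer norms and the task head) are marked trainable while the pretrained Transformer weights stay frozen, which is why adapter fine-tuning is much cheaper than full fine-tuning.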



