It really is good. Surprisingly, it seems to answer instruct-like prompts well! I’ve been using it with Ollama (https://github.com/jmorganca/ollama) with prompts like:
ollama run phind-codellama "write c code to reverse a linked list"
To run this on an M1 Mac or similar machine, you'll need around 32GB of memory for the 4-bit quantized version: it's a 34B-parameter model, so even quantized the weights alone come to roughly 20GB (34B parameters at about 4.5 bits per weight), and you want some headroom beyond that.
Yes, you'll need the latest version (0.0.16) to run the 34B model. It should run great on that machine!
The download (both as a Mac app and a standalone binary) is available here: https://github.com/jmorganca/ollama/releases/tag/v0.0.16. And I'll work on getting the brew formula updated as well! Sorry to see you hit an error!
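Once you're on 0.0.16, pulling and running it from the terminal is just this (same model tag as in the command above; the initial pull fetches the full ~20GB of weights, so give it a while):

ollama pull phind-codellama
ollama run phind-codellama "write c code to reverse a linked list"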
Is there any chance you can add the phind-codellama-34b-v2 model? Or if there's already a way to run that with ollama, can you tell me how or point me to some docs? Thanks a ton!
This feature was surprisingly hard to find, but you can wrap multi-line input in """. So start off with """, then type anything you want across as many lines as you like (Enter just adds a newline), then close with """ and it'll run the whole prompt. (""" is three double-quote characters in a row; the formatting from HN is making it look funny.)
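For example, an interactive session looks roughly like this (illustrative only; the exact prompt characters may render a bit differently in your version):

ollama run phind-codellama
>>> """
write c code to reverse a linked list,
then explain its time complexity
"""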
This project is sweet, and I fortunately have an M1 Mac laptop with 64GB of RAM to play with it.
But I'd also like to be able to run these models on my Linux desktop with two GPUs (a 2080 Ti and a 3080 Ti) and a Threadripper. How difficult would it be to get some of these set up on that machine?
I personally use llama.cpp as the driver since I run CPU-only, but another backend may be better suited for GPU use. Beyond that, it's as simple as downloading the model and placing it in the right directory.
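If you'd rather drive llama.cpp directly on that box, the rough shape is something like this (commands as of mid-2023; the model filename, thread count, and layer count are placeholders, not exact values):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make    # plain CPU-only build
./main -m ./models/phind-codellama-34b.q4_0.bin -t 16 -n 256 -p "write c code to reverse a linked list"

For the 2080 Ti / 3080 Ti, build with cuBLAS instead (make LLAMA_CUBLAS=1) and pass -ngl <n> to offload that many layers onto the GPUs.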
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
Command '. "/Users/pmarreck/Downloads/oobabooga_macos/installer_files/conda/etc/profile.d/conda.sh" && conda activate "/Users/pmarreck/Downloads/oobabooga_macos/installer_files/env" && python -m pip install -r requirements.txt --upgrade' failed with exit status code '1'. Exiting...
My Threadripper has 64 cores/128 threads, and I'm wondering whether any of these models can take advantage of that CPU concurrency to at least mitigate some of the performance loss from not using a GPU.
In my experience llama.cpp doesn't take as much advantage of parallelism as it could. I tested this on an HPC cluster: increasing the thread count certainly increased CPU usage, but it didn't meaningfully improve tok/s past 6-8 cores. Same behavior with whisper.cpp. :( I wonder if there's another backend that scales better.
I'm guessing the problem is that you're constrained by memory bandwidth rather than compute, and that this is inherent to the algorithm, not an artifact of any one implementation. Generating each token streams essentially all of the weights through memory, so once a few cores can saturate the memory bus, extra threads stop helping.
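If anyone wants to sanity-check that on their own hardware, a crude way to sweep thread counts with llama.cpp's ./main (flags as of mid-2023; the model path is a placeholder) and compare the reported eval speed:

for t in 2 4 8 16 32 64; do
  echo "threads: $t"
  ./main -m ./models/model.q4_0.bin -p "hello" -n 64 -t "$t" 2>&1 | grep "eval time"
done

Past the point where the active cores saturate memory bandwidth, the tokens-per-second figure in that line stops improving, which matches the 6-8 core plateau described above.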
The number of times I've had to do that in real production code amounts to zero.
The number of times I've had to piece together code from poorly documented external services, conflicting product requirements, and complex, weird environments has been ... well ... multiple times a day for the past 20+ years.
"instruct-like prompts" is what you give a very junior engineer out of college, and then you have to carefully review their code.