Setting up your own local AI assistant

In this article we will explore how to set up your own local AI assistant and interact with it, leveraging freely available tools. Let’s start by assuming that we’d like to set up a local LLM to help us write great musical lyrics. To streamline the walkthrough and to avoid having to install Ollama locally, we’ll leverage Docker; however, should you prefer, you can install Ollama manually. Installation guides can be found at https://github.com/ollama/ollama/blob/main/docs/linux.md.
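
If you do go the manual route on Linux, the guide linked above essentially boils down to running the official install script (always worth reviewing a script before piping it into your shell):

> curl -fsSL https://ollama.com/install.sh | sh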

Now you might be wondering what Ollama is. In short, it is a piece of software that allows anyone to spin up local LLMs, supporting several models out of the box in an intuitive and easy-to-use manner.

So let’s start by pulling the latest image (for more information on the image, you can check https://hub.docker.com/r/ollama/ollama):

 

> docker pull ollama/ollama

 

For the walkthrough we will leverage the Phi-3 model by Microsoft, which is a small but highly capable LLM. If you want to learn more about it, you can check https://ollama.com/library/phi3, or if you are used to reading papers, https://arxiv.org/abs/2404.14219.

Although we picked Phi-3 for being small and highly capable, you can check the full list of supported models at https://ollama.com/library. At this point, let’s spin up a new container running Ollama:

 

> docker run -d -p 11434:11434 --name ollama ollama/ollama

 

Note that:

  • The new container will run in detached mode (-d).
  • We are not mounting any volume; consider doing so depending on your needs once you have completed this walkthrough (see the sketch right after this list).
  • We are binding container port 11434 to local port 11434, which is the port on which Ollama will listen for new requests!
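
For reference, a minimal sketch of what a more persistent setup could look like: mounting a named volume (here called ollama, an arbitrary name) onto /root/.ollama, which is where Ollama stores its models inside the container. You would need to remove the existing container first (docker rm -f ollama) before re-creating it like this:

> docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama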

Now let’s ask Ollama to pull and serve the phi3 model (a 2.3 GB download):

 

> docker exec -it ollama ollama run phi3

 

A new prompt will appear, allowing us to interact with the model directly. Since we want to talk with it from within our own tools, let’s just quit for the time being.

 

>>> (CTRL+D)

 

At this point, our own local AI assistant is already up & running!
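
If you want to double-check what has been pulled so far, you can either ask the Ollama CLI inside the container or hit the /api/tags endpoint, which lists the locally available models:

> docker exec -it ollama ollama list

> curl http://127.0.0.1:11434/api/tags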

We can interact with it through HTTP on port 11434. Ollama offers some standard APIs that are the same regardless of the model being served; one such API is /api/generate.

For example, assuming you have cURL installed, one could invoke the model by doing:

 

> curl http://127.0.0.1:11434/api/generate -d '{
"model": "phi3",
"stream": false,
"prompt": "Hey there, welcome aboard!"
}'

{"model":"phi3","created_at":"2024-05-13T14:55:34.953477497Z","response":"Hello and thank you for joining! If you have any questions or need assistance, feel free to ask. I'm here to help make your experience as enjoyable and productive as possible. Welcome aboard!","done":true,"done_reason":"stop","context":[32010,13,29950,1032,727,29892,12853,633,29877,538,29991,32007,13,32001,13,10994,322,6452,366,363,22960,29991,960,366,505,738,5155,470,817,18872,29892,4459,3889,304,2244,29889,306,29915,29885,1244,304,1371,1207,596,7271,408,13389,519,322,3234,573,408,1950,29889,21829,633,29877,538,29991,32007,13],"total_duration":4535904662,"load_duration":1913666,"prompt_eval_count":13,"prompt_eval_duration":704920000,"eval_count":45,"eval_duration":3786004000}

 

If you don’t have cURL available, don’t worry: you can mimic the calls above with any other similar tool (e.g. Postman). As you may have noticed, the model is somewhat slow in responding. This is because we are executing the model using the CPU only; it is also one of the key reasons why Phi-3 was selected, as it is a smaller model and way quicker than the bigger alternatives!

After completing this walkthrough, consider learning more about GPU support if you want a better experience; a good place to start is https://github.com/ollama/ollama/blob/main/docs/gpu.md.
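
As a teaser, assuming you have an NVIDIA card and the NVIDIA Container Toolkit installed, the GPU-enabled variant of our first command would look roughly like this (a sketch based on the Ollama image docs; your exact setup may differ):

> docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama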

At this point, we have our very own local AI assistant LLM up and running after executing just 3 commands! However, interacting through direct HTTP calls is not the best user experience you can have…

For that reason let’s explore a couple of ways to simplify the user experience.

The first one that is available out of the box is the CLI interaction, which we previously saw in the walkthrough. If we just re-run:

 

> docker exec -it ollama ollama run phi3

 

The prompt will appear allowing us direct interaction through the CLI:

 

>>> I want to write a folk song on strawberries, any catchy title recommendations?

"Strawberry Symphony: A Berry Ballad of the Fields"

"Fields Fresh and Fair: The Strawberry Serenade"

"Bask in Blush: The Tale of Tangy Treasures"

These titles are catchy, evoke imagery related to strawberries, and suggest a folk song style. Each title hints at the joyfulness and natural setting that could accompany such a tune.

 

However, this is still not a good user experience. One simple recommendation would be to leverage some web-based UI (e.g. Open WebUI https://github.com/open-webui/open-webui), or if you are an avid VS Code user like me, you might be interested in checking out some great plugins like Continue (https://marketplace.visualstudio.com/items?itemName=Continue.continue).

Let’s try out Open WebUI, which looks simple and straightforward.

Once more, let’s leverage Docker to spin up a second container:

 

> docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway --name open-webui ghcr.io/open-webui/open-webui:main

 

Note that:

  • The new container will run in detached mode (-d).
  • The UI will be exposed on local port 3000 (mapped to container port 8080).
  • The --add-host flag lets the container reach the Ollama instance running on the host.
  • We are not mounting any volume; consider doing so for the long run (see the sketch right after this list).
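
If you do decide to keep it around, a persistent variant could look like this (open-webui is an arbitrary volume name; /app/backend/data is where Open WebUI keeps its data according to its docs):

> docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main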

Now point your browser to localhost:3000 and let’s start trying out the new UI:

The very first thing to do is to create a new account (Sign Up link!). Notice that the first account you create will automatically be an Admin account, as explained at https://docs.openwebui.com/getting-started/.

Since we are just trying out the UI feel free to use any random email & password.

Then select the model in the dropdown above to start interacting with it.

And voilà, time to write an awesome song!

After playing for a while with some ideas I ended up with the following: the AI recommended the lyrics and I added a chord progression around A major. Feel free to play along, and let me know if you like the results! (It certainly sounds like a strawberry commercial 😅).

 

A                                      D
Fields fresh and fair, ripe berry's glow in the sunlight's kiss,
          Bm7                                            E
Straw-berries gleam like rubies under-neath nature's bliss.
Em                                   Bm                A9sus4
Bountiful harvest calls us home, as we dance in delight,
A                                        D                            A9sus4 A
Fields fresh and fair where strawberry dreams take     flight.

 

As we have seen, setting up your own local AI assistant model is pretty straightforward. As we enter a new ‘augmented intelligence’ era, one can’t help but wonder what AI will do to our everyday life, but I guess time will tell!

For the time being, I hope this walkthrough has spun up some interesting ideas in you. I certainly now have a new assistant to help me out with lyrics production, and within Folder IT we have also internally enabled our own code companion to increase the productivity of our development community.
