Llama 2 AI ChatBot
In this example we will create a chatbot that uses the Llama 2 7B chat model. All the code for this example can be found on GitHub.


Our application will:
- Download the model from remote storage in a dependency, ensuring the model is ready before we receive traffic and can be shared across requests
- Expose an endpoint that lets users chat with the model
- Send each request to the endpoint through the model and return its response (a minimal sketch follows this list)
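To make that flow concrete, here is a minimal sketch of the shape of the application. It is not the repo's exact code: the bucket and file names are placeholders, BuildFlow's own decorators are omitted, and we assume the model is loaded with llama-cpp-python and fetched with google-cloud-storage (the GCS path).

```python
# Minimal sketch, not the repo's exact code. Bucket/blob names are
# placeholders; BuildFlow's endpoint/dependency decorators are omitted
# and would wrap these functions in the real app.
import os

from google.cloud import storage  # assumes the GCS (USE_GCP=true) path
from llama_cpp import Llama

LOCAL_MODEL_PATH = "llama-2-7b-chat.bin"  # hypothetical local cache path


def load_model(bucket_name: str, blob_name: str) -> Llama:
    """Dependency: download the model once so it is ready before traffic
    arrives and can be shared across requests."""
    if not os.path.exists(LOCAL_MODEL_PATH):
        client = storage.Client()
        client.bucket(bucket_name).blob(blob_name).download_to_filename(
            LOCAL_MODEL_PATH
        )
    return Llama(model_path=LOCAL_MODEL_PATH)


llm = load_model("my-model-bucket", "llama-2-7b-chat.bin")  # placeholders


def chat(prompt: str) -> str:
    """Endpoint body: send the request through the model, return the reply."""
    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}]
    )
    return result["choices"][0]["message"]["content"]
```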
If all goes well, you will have a working chatbot with a simple web chat interface.

Before starting the example, ensure you have installed BuildFlow with all extra dependencies.
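If you haven't installed it yet, BuildFlow is available on PyPI; the extras group shown here is an assumption, so check the BuildFlow docs for the exact name:
pip install "buildflow[all]"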
Our basic model is hosted on a public bucket, but you can also host your own model on a private bucket.
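If you do host the model yourself, copying it into a private GCS bucket is a single gsutil command; both bucket names below are placeholders, not the example's real paths:
gsutil cp gs://public-model-bucket/llama-2-7b-chat.bin gs://your-private-bucket/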
Clone the GitHub Repo
git clone git@github.com:launchflow/launchflow-model-serving.git
cd launchflow-model-serving
Install your requirements
pip install -r requirements.txt
Run your project
Run your project with:
buildflow run
If you want to experiment with loading the model from S3 instead of GCS, simply set USE_GCP=false in the .env file.
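For example, your .env would contain the flag below (the repo's .env file may include additional variables):
USE_GCP=false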
Once it's running, you can visit http://localhost:8000 to begin chatting with the AI!
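You can also exercise the model from the command line; the route and payload shape below are assumptions for illustration, so check the repo for the actual API:
curl -X POST http://localhost:8000/chat -H "Content-Type: application/json" -d '{"message": "Hello!"}'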
What's next?
Now that you have a working chatbot, you can start customizing it to your needs, such as adding Google auth for user authentication, a Postgres database for permanent storage, or even hosting your own model on a private bucket.