Learning LLMs and GenAI for Dev, Sec, Ops
What is this Project About?
This repository is all about learning and understanding the use of Large Language Models (LLMs) and Generative AI (GenAI) through a structured lesson plan. The aim is to make this high-tech subject accessible for those familiar with traditional software engineering, particularly in the realms of development, operations, and security. By drawing on existing materials, this project crafts a coherent narrative that saves the time spent repeating the same explanations.
The educational journey within this repository predominantly leverages the Langchain framework. Therefore, a basic understanding of the Python programming language is recommended. Many examples that illustrate key concepts have been inspired by Langchain's documentation, with due credit given at appropriate places.
Lessons Overview
Developer
For developers, the lessons cover a wide range of topics including:
- How to communicate with a simple LLM using OpenAI.
- Debugging within the Langchain framework.
- Engaging with OpenAI models in a conversational manner.
- Utilizing prompt templates effectively.
- Employing Docloader to read local files and prepare them for LLM processing.
- Understanding and calculating embeddings.
- The importance of splitting and chunking data.
- Loading embeddings and documents into a vector database.
- Using a chain for Questions and Answers to implement the Retrieval Augmented Generation (RAG) pattern.
- Making LLM generate calls for real-time data by referencing OpenAI documentation.
- Crafting an Agent and equipping it with tools for acquiring real-time information.
Operations
In operational contexts, it explores:
- Token usage and cost estimation.
- Caching LLM calls through exact matching or embeddings.
- Local embedding calculations and caching.
- Hosting a local LLM utilizing Ollama.
- Monitoring and logging LLM interactions using a callback handler.
- Structuring outputs as JSON and validating correctness.
Security
Safety is another pillar, including lessons on:
- Understanding the OWASP top 10 vulnerabilities for LLMs.
- Demonstrating simple prompt injection and ways to mitigate it.
- Detecting prompt injections with third-party models from Huggingface.
- Using prompts to uncover project injections.
- Reviewing LLM-provided answers for acceptability.
- Applying Huggingface models to detect potentially toxic LLM outputs.
- Prompting the LLM for opinions on Kubernetes and Trivy vulnerabilities.
Project History
The project's initial lessons were drafted during a GenAI hackathon hosted by Techstrong/MediaOps. These lessons were subsequently polished for a presentation at the London Devops Meetup group. Others have expressed interest in conducting their own iterations of these lessons as seen here.
How Can You Help?
Community engagement is encouraged in various ways:
- Suggest new topics or lessons by creating a GitHub issue.
- Contribute new lessons or corrections to refine the repository's content.
- Organize meetups or hackathons based on this repository and share your experiences. Photos or videos are welcomed!
- Express gratitude with a tweet to @patrickdebois.
Requirements to Run the Repo
Using a Devcontainer
The project includes a devcontainer for running the repo locally. Alternatively, Google Colab can be used for the notebooks.
Running Locally
- Microsoft VSCode is the preferred environment for demonstrations.
- Python & Jupyter notebooks are run locally.
- Poetry is used as the virtual environment and package manager for Python.
poetry init
poetry install --no-root
Configuring VSCode to Use Poetry
- Install Python 3.11 through pyenv (examples mostly work with 3.12).
- Obtain the Python path through
pyenv which python
. - Set the Poetry Python version using
poetry env use <python path>
. - Locate the Poetry environment path via
poetry env info --path
. - In VSCode, navigate to view -> command palette -> Python: select interpreter -> enter interpreter path and add the path
/Users/patrick.debois/Library/Caches/pypoetry/virtualenvs/london-devops-VW7lFx7f-py3.11
plus/bin/python
. - Add the
ipykernel
using Poetry.
Configuring Jupyter Notebooks
- Install the VSCode plugin for Jupyter.
- Ensure
ipykernel
is installed.
Changelog
The project's evolution includes:
- Version 0.1 focused on initial Langchain syntax.
- Version 0.2 featured adaptations to new Langchain-community, Langchain-OpenAI, and updated syntax.