Introduction to LaVague
LaVague is an innovative open-source framework designed for developers eager to create AI Web Agents. These agents are capable of automating processes and performing tasks on behalf of end users. The core objective of LaVague is to provide a streamlined method for creating web agents that can understand and execute complex tasks.
What is LaVague?
At its essence, LaVague enables developers to build Web Agents that take a given objective, like "Print installation steps for Hugging Face's Diffusers library," and autonomously generate and perform the necessary actions to accomplish that goal. This is made possible through two main components:
- World Model: This component takes an objective and the current state of affairs (such as the website being visited) and provides a relevant set of instructions based on this input.
- Action Engine: It converts the provided instructions into action codes that tools like Selenium or Playwright can execute.
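The interplay between the two components can be pictured as a plan-then-act loop. The sketch below is a toy illustration only and is not LaVague's implementation: `Observation`, `ToyWorldModel`, `ToyActionEngine`, and `run_agent` are hypothetical names, and the fixed rules stand in for the LLM calls and driver code that a real agent would use.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Simplified, hypothetical view of the current state (the page being visited)."""
    url: str
    page_text: str


class ToyWorldModel:
    """Stand-in for a World Model: objective + current state -> next instruction."""

    def next_instruction(self, objective: str, obs: Observation) -> str:
        # A real World Model would query an LLM; this stub uses a fixed rule.
        if "quicktour" in obs.url:
            return "STOP"
        return f"Click on the link that helps to: {objective}"


class ToyActionEngine:
    """Stand-in for an Action Engine: instruction -> code a driver could execute."""

    def to_action_code(self, instruction: str) -> str:
        # A real engine would emit Selenium or Playwright code for the live page.
        return f"# action code for: {instruction}"


def run_agent(objective, obs, world_model, action_engine, max_steps=5):
    """Alternate between planning and acting until the model answers STOP."""
    actions = []
    for _ in range(max_steps):
        instruction = world_model.next_instruction(objective, obs)
        if instruction == "STOP":
            break
        actions.append(action_engine.to_action_code(instruction))
        # Pretend the action navigated to the target page.
        obs = Observation(url=obs.url + "/quicktour", page_text="")
    return actions


if __name__ == "__main__":
    start = Observation(url="https://huggingface.co/docs", page_text="")
    for code in run_agent("Go on the quicktour of PEFT", start,
                          ToyWorldModel(), ToyActionEngine()):
        print(code)
```

The point of the split is that planning (what to do next) stays independent of execution (how to do it with a given driver).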
LaVague QA: A Tool for Test Automation
Built on the foundation of LaVague, LaVague QA is tailored for QA engineers. It automates test creation by turning Gherkin specifications into easy-to-use tests. The project uses the LaVague framework to make web testing more efficient, potentially by as much as tenfold.
For more in-depth information and setup guidelines, refer to the LaVague QA documentation.
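To give a feel for the kind of input involved, here is a minimal sketch of splitting a Gherkin scenario into its steps. It does not use or reproduce LaVague QA: `parse_steps` and the example feature are hypothetical, and only the basic step keywords (Given, When, Then, And, But) are handled.

```python
GHERKIN_KEYWORDS = ("Given", "When", "Then", "And", "But")


def parse_steps(feature_text):
    """Return (keyword, step text) pairs for each step line in a Gherkin document."""
    steps = []
    for raw_line in feature_text.splitlines():
        line = raw_line.strip()
        for keyword in GHERKIN_KEYWORDS:
            if line.startswith(keyword + " "):
                steps.append((keyword, line[len(keyword) + 1:]))
                break
    return steps


# Hypothetical feature file, written only for this illustration.
EXAMPLE_FEATURE = """
Feature: Documentation navigation
  Scenario: Open the PEFT quicktour
    Given the user is on the Hugging Face docs page
    When the user opens the PEFT documentation
    Then the quicktour page is displayed
"""

if __name__ == "__main__":
    for keyword, text in parse_steps(EXAMPLE_FEATURE):
        print(f"{keyword:>5}: {text}")
```

Each extracted step is the sort of plain-language instruction that a tool could then turn into concrete test code.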
Getting Started with LaVague
Demo
For a practical demonstration of LaVague in action, consider the scenario where it tackles the multi-step task "Go on the quicktour of PEFT".
Hands-on Experience
Begin your experience with LaVague by following these steps:
- Install LaVague via:

      pip install lavague

- Use the framework to build and execute a Web Agent according to your objective:

      from lavague.core import WorldModel, ActionEngine
      from lavague.core.agents import WebAgent
      from lavague.drivers.selenium import SeleniumDriver

      # Set up the driver and the two core components
      selenium_driver = SeleniumDriver(headless=False)
      world_model = WorldModel()
      action_engine = ActionEngine(selenium_driver)

      # Build the agent, open the start page, and run the objective
      agent = WebAgent(world_model, action_engine)
      agent.get("https://huggingface.co/docs")
      agent.run("Go on the quicktour of PEFT")

      # Launch Gradio Agent Demo
      agent.demo("Go on the quicktour of PEFT")
For a detailed example or to see how to use LaVague further, check out the quick tour.
Note: These examples use OpenAI's API by default, so an OPENAI_API_KEY must be set in your local environment.
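One way to fail early when the key is missing is to check the environment before building the agent. This is a small sketch using only the standard library; `require_api_key` is a hypothetical helper and not part of LaVague.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the API key from the environment, or raise a clear error if absent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent.")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

Calling the helper at the top of a script gives a readable message instead of an authentication error midway through a run.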
Key Features
LaVague boasts a variety of features to enhance user experience and functionality:
- Built-in Contexts: Pre-defined configurations
- Customizable Configuration: Tailor the setup to specific needs
- Test Runner: For testing and evaluating LaVague's performance
- Token Counter: Estimates token usage and related costs
- Logging Tools: Provides transparency during operation
- Gradio Interface: An optional, interactive UI
- Debugging Tools: For efficient troubleshooting
- Chrome Extension: An additional usability feature
Supported Drivers
LaVague supports various driver options to cater to diverse needs:
- Selenium Webdriver
- Playwright Webdriver
- Chrome Extension Driver
Not all drivers support all functionalities. Here’s a quick comparison:
Feature | Selenium | Playwright | Chrome Extension |
---|---|---|---|
Headless agents | ✅ | ⏳ | N/A |
Handle iframes | ✅ | ✅ | ❌ |
Open several tabs | ✅ | ⏳ | ✅ |
Highlight elements | ✅ | ✅ | ✅ |
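The comparison can also be encoded as data to pick a driver programmatically. The sketch below simply transcribes the table; `DRIVER_SUPPORT` and `drivers_supporting` are hypothetical names, and reading ⏳ as "not yet available" is an assumption, since the table does not define the marker.

```python
# Transcription of the comparison table.
# "pending" stands for the ⏳ marker (assumed to mean "not yet available").
DRIVER_SUPPORT = {
    "Headless agents": {"Selenium": "yes", "Playwright": "pending", "Chrome Extension": "n/a"},
    "Handle iframes": {"Selenium": "yes", "Playwright": "yes", "Chrome Extension": "no"},
    "Open several tabs": {"Selenium": "yes", "Playwright": "pending", "Chrome Extension": "yes"},
    "Highlight elements": {"Selenium": "yes", "Playwright": "yes", "Chrome Extension": "yes"},
}

DRIVERS = ["Selenium", "Playwright", "Chrome Extension"]


def drivers_supporting(*features):
    """Return the drivers marked as supporting every requested feature."""
    return [
        driver
        for driver in DRIVERS
        if all(DRIVER_SUPPORT[feature][driver] == "yes" for feature in features)
    ]


if __name__ == "__main__":
    print(drivers_supporting("Handle iframes", "Open several tabs"))
```

Since the support matrix may change between releases, the dictionary should be checked against the current documentation before being relied on.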
Support and Community
To assist users and developers, LaVague provides several support channels:
- A troubleshooting guide for resolving common problems
- A platform to raise issues on GitHub
- A support channel on the Discord server
Contribution Opportunities
LaVague welcomes contributions from the community to help build a more robust framework. The contribution process works as follows:
- Tasks are posted as GitHub issues.
- Interested contributors can comment on an issue to express interest.
- Tasks are assigned with a community assigned label.
- Submit completed work through a Pull Request (PR).
- The contributions are reviewed and merged, or feedback is provided.
More details are available in the contributing guide.
Project Roadmap and Costs
LaVague's development and progress can be tracked through its project backlog.
Cost Considerations
Running an agent using LaVague involves costs based on:
- The specific LLMs used (the default is OpenAI's GPT-4)
- The complexity of objectives
- The particular websites interacted with
Further cost details are provided in dedicated documentation on token usage and cost.
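As a rough illustration of how these factors combine, cost can be approximated from token counts and per-token prices. The sketch below is not LaVague's Token Counter: `Pricing` and `estimate_cost` are hypothetical, and the prices are placeholders that must be replaced with the provider's current rates.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Price in USD per 1,000 tokens (placeholder values, not real rates)."""
    input_per_1k: float
    output_per_1k: float


def estimate_cost(input_tokens, output_tokens, pricing):
    """Approximate the cost of a run from its input and output token counts."""
    return (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )


if __name__ == "__main__":
    placeholder = Pricing(input_per_1k=0.01, output_per_1k=0.03)  # hypothetical prices
    print(f"Estimated cost: ${estimate_cost(2000, 1000, placeholder):.2f}")
```

More complex objectives and heavier pages tend to raise both token counts, which is why the same agent can cost noticeably more on one site than on another.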