LaVague - Framework for Building Automated Web Solutions

Introduction to LaVague

LaVague is an open-source framework for developers who want to build AI Web Agents. These agents automate processes and perform tasks on behalf of end users. LaVague's core objective is to give developers a streamlined way to create web agents that can understand and execute complex, multi-step tasks.

What is LaVague?

At its essence, LaVague enables developers to build Web Agents that take a given objective, like "Print installation steps for Hugging Face's Diffusers library," and autonomously generate and perform the necessary actions to accomplish that goal. This is made possible through two main components:

World Model: Takes an objective and the current state (such as the web page being visited) and produces the next instruction to carry out.
Action Engine: Converts that instruction into action code that tools like Selenium or Playwright can execute.
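
The division of labour between the two components can be illustrated with a toy loop. This is a minimal sketch, not LaVague's implementation: `ToyWorldModel`, `ToyActionEngine`, and `run_agent` are hypothetical names, the world model uses fixed rules where a real one would query an LLM, and the "page state" is just a string.

```python
from typing import List, Optional


class ToyWorldModel:
    """Hypothetical stand-in: maps (objective, page state) to one instruction."""

    def next_instruction(self, objective: str, page_state: str) -> Optional[str]:
        # A real world model would ask an LLM; this toy walks a fixed path.
        for target in ("PEFT", "Quicktour"):
            if target.lower() in objective.lower() and target not in page_state:
                return f"Click on the '{target}' link"
        return None  # nothing left to do: objective reached


class ToyActionEngine:
    """Hypothetical stand-in: turns an instruction into driver code."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, instruction: str, page_state: str) -> str:
        target = instruction.split("'")[1]
        # The generated code is only recorded here, never run against a browser.
        code = f"driver.find_element('link text', {target!r}).click()"
        self.executed.append(code)
        return f"{page_state} > {target}"  # pretend the click navigated


def run_agent(objective, start_state, world_model, action_engine, max_steps=5):
    """Alternate between the two components until no instruction remains."""
    state = start_state
    for _ in range(max_steps):
        instruction = world_model.next_instruction(objective, state)
        if instruction is None:
            break
        state = action_engine.execute(instruction, state)
    return state


engine = ToyActionEngine()
final_state = run_agent(
    "Go on the quicktour of PEFT", "Docs home", ToyWorldModel(), engine
)
print(final_state)
print(engine.executed)
```

The `max_steps` cap is a design choice for the sketch: it keeps the loop bounded even if the world model never reports that the objective is reached.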

LaVague QA: A Tool for Test Automation

Built on LaVague, LaVague QA is a tool tailored for QA engineers. It automates test writing by turning Gherkin specifications into ready-to-use tests, with the aim of making web testing considerably more efficient, potentially by as much as tenfold.
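
For readers unfamiliar with Gherkin, the sketch below shows what such a specification looks like and how its steps can be pulled out as plain text. It is illustrative only and does not reflect how LaVague QA parses or consumes specifications: the `FEATURE` text and the `extract_steps` helper are made up for this example.

```python
GHERKIN_STEP_KEYWORDS = ("Given", "When", "Then", "And", "But")

# Hypothetical specification, written for this example.
FEATURE = """\
Feature: Documentation navigation
  Scenario: Open the PEFT quicktour
    Given I am on the Hugging Face docs home page
    When I open the PEFT documentation
    And I click on the quicktour link
    Then I should see the PEFT quicktour page
"""


def extract_steps(feature_text):
    """Return (keyword, step text) pairs for each Gherkin step line."""
    steps = []
    for line in feature_text.splitlines():
        keyword, _, rest = line.strip().partition(" ")
        if keyword in GHERKIN_STEP_KEYWORDS and rest:
            steps.append((keyword, rest))
    return steps


for keyword, text in extract_steps(FEATURE):
    print(f"{keyword:<5} {text}")
```

Each extracted step is a natural-language instruction, which is the kind of input an agent-driven test generator can work from.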

For more in-depth information and setup guidelines, refer to the LaVague QA documentation.

Getting Started with LaVague

Demo

For a practical demonstration of LaVague in action, consider the scenario where it tackles the multi-step task "Go on the quicktour of PEFT".

Hands-on Experience

Begin your experience with LaVague by following these steps:

Install LaVague via:
```
pip install lavague
```

Use the framework to build and execute a Web Agent according to your objective:

```python
from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver

selenium_driver = SeleniumDriver(headless=False)
world_model = WorldModel()
action_engine = ActionEngine(selenium_driver)
agent = WebAgent(world_model, action_engine)
agent.get("https://huggingface.co/docs")
agent.run("Go on the quicktour of PEFT")

# Launch Gradio Agent Demo
agent.demo("Go on the quicktour of PEFT")
```

For a more detailed walkthrough of what LaVague can do, check out the quick tour.

Note: These examples use OpenAI's API by default, requiring an OPENAI_API_KEY in your local environment for functionality.
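
Since a missing key only surfaces once the agent makes its first model call, it can help to check the environment up front. The following is a small sketch using only the standard library; the `has_openai_key` helper is a hypothetical name and not part of LaVague.

```python
import os


def has_openai_key(env=None):
    """Return True if OPENAI_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_openai_key():
    print("OPENAI_API_KEY is not set; export it before running the agent.")
```

Passing a dictionary as `env` makes the check easy to exercise without touching the real environment.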

Key Features

LaVague includes the following features:

Built-in Contexts: Pre-defined configurations
Customizable Configuration: Tailor settings to specific needs
Test Runner: For testing and evaluating LaVague's performance
Token Counter: Estimates token usage and related costs
Logging Tools: Provides transparency during operation
Gradio Interface: An optional, interactive UI
Debugging Tools: For efficient troubleshooting
Chrome Extension: An additional usability feature

Supported Drivers

LaVague supports various driver options to cater to diverse needs:

Selenium Webdriver
Playwright Webdriver
Chrome Extension Driver

Not all drivers support all functionalities. Here’s a quick comparison:

| Feature | Selenium | Playwright | Chrome Extension |
| --- | --- | --- | --- |
| Headless agents | ✅ | ⏳ | N/A |
| Handle iframes | ✅ | ✅ | ❌ |
| Open several tabs | ✅ | ⏳ | ✅ |
| Highlight elements | ✅ | ✅ | ✅ |
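
When a project depends on several of these capabilities at once, the comparison can be encoded as data and queried. This is a sketch for illustration only: `SUPPORT` simply transcribes the table above, `drivers_supporting` is a hypothetical helper, and only ✅ is treated as supported.

```python
DRIVERS = ("Selenium", "Playwright", "Chrome Extension")

# Transcribed from the comparison table; symbols kept as they appear there.
SUPPORT = {
    "Headless agents": ("✅", "⏳", "N/A"),
    "Handle iframes": ("✅", "✅", "❌"),
    "Open several tabs": ("✅", "⏳", "✅"),
    "Highlight elements": ("✅", "✅", "✅"),
}


def drivers_supporting(*features):
    """Return the drivers marked ✅ for every requested feature."""
    return [
        driver
        for index, driver in enumerate(DRIVERS)
        if all(SUPPORT[feature][index] == "✅" for feature in features)
    ]


print(drivers_supporting("Handle iframes"))
print(drivers_supporting("Headless agents", "Open several tabs"))
```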

Support and Community

To assist users and developers, LaVague provides several support channels:

A troubleshooting guide for resolving common problems
A platform to raise issues on GitHub
A support channel on the Discord server

Contribution Opportunities

LaVague welcomes contributions from the community to help build a more robust framework. The outlined contribution process ensures clarity and collaboration:

Tasks are posted as GitHub issues.
Interested contributors can comment on an issue to express interest.
Assigned tasks are marked with the "community assigned" label.
Contributors submit completed work through a Pull Request (PR).
The contributions are reviewed and merged, or feedback is provided.

More details are available in the contributing guide.

Project Roadmap and Costs

LaVague's development and progress can be tracked through its project backlog.

Cost Considerations

Running an agent using LaVague involves costs based on:

The specific LLMs used (the default is OpenAI's GPT-4)
The complexity of objectives
The particular websites interacted with
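
How these factors combine can be shown with a back-of-the-envelope estimate. The sketch below is not LaVague's token counter: `estimate_run_cost` is a hypothetical helper, and the token counts and per-1,000-token prices are placeholder values chosen for the arithmetic, not real pricing for any model.

```python
def estimate_run_cost(steps, usd_per_1k_input, usd_per_1k_output):
    """Estimate the cost of one agent run.

    `steps` is a list of (input_tokens, output_tokens) pairs, one per agent
    step. More complex objectives mean more steps; heavier web pages mean
    more input tokens per step.
    """
    total_input = sum(step[0] for step in steps)
    total_output = sum(step[1] for step in steps)
    return (
        total_input / 1000 * usd_per_1k_input
        + total_output / 1000 * usd_per_1k_output
    )


# Placeholder numbers: three steps, with made-up token counts and prices.
run = [(4000, 300), (6500, 250), (5000, 400)]
cost = estimate_run_cost(run, usd_per_1k_input=0.01, usd_per_1k_output=0.03)
print(f"Estimated cost: ${cost:.4f}")
```

Swapping in the prices of the model actually used, together with measured token counts, turns the same arithmetic into a usable estimate.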

Further cost details are provided in dedicated documentation on token usage and cost.