Introducing Bricks
Bricks is an open-source project that gives developers ready-to-use natural language processing (NLP) modules. It simplifies text enrichment in any project: you copy a code snippet from the online platform and paste it into your own code.
Why Choose Bricks?
Bricks aims to be a comprehensive collection of NLP enrichments that can be integrated into a wide range of projects. It is not a typical Python library that you install; it is a repository of readily available code snippets. These snippets can be used directly in your projects or together with Kern AI's main product, Refinery, to enhance data processing workflows.
Demo Availability
A demo of Bricks in action is available. It shows how the modules can be used within your own projects.
Understanding the Modules: Classifiers, Extractors, and Generators
Bricks groups its modules into three primary types:
- Classifiers: These modules assign text to predefined categories such as 'news' or 'blogs'. They can also enrich data by identifying the language and other attributes of a text.
- Extractors: Aimed at retrieving specific pieces of information from large text bodies, these modules work effectively in isolating details such as the author's name from a document.
- Generators: These modules are designed to produce new content from existing text data. They include tools like language translators and pre-defined content filters for Refinery.
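The difference between the three types is easiest to see in the shape of the output. The following is a minimal sketch, not code from the Bricks repository: the function names, the word-count threshold, and the email pattern are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical examples of the three module types.
# None of these functions are taken from the Bricks repository.

# Simplified pattern for illustration; it is not a full email validator.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def classify_length(text: str) -> str:
    """Classifier: assign a single label to the whole text."""
    return "short" if len(text.split()) < 10 else "long"


def extract_emails(text: str) -> list[str]:
    """Extractor: return the pieces of the text that match a pattern."""
    return EMAIL_PATTERN.findall(text)


def generate_normalized(text: str) -> str:
    """Generator: produce new text derived from the input."""
    return " ".join(text.lower().split())
```

A classifier returns one label per text, an extractor returns zero or more spans taken from the text, and a generator returns new text.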
Structure of Modules
Each Bricks module follows the same layout, which keeps modules easy to implement and to use. A module typically comprises the following files:
- __init__.py: The script's entry point, if it is executable.
- README.md: An informative description of the module.
- code_snippet_refinery.md: A sample code snippet using spaCy, displayed on the module's detail page.
- code_snippet_common.md: A generic Python code snippet available on the module's detail page.
- config.py: A configuration script for synchronizing with the online platform.
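When adding or reviewing a module, it can help to check that all of these files are present. The sketch below assumes that every module lives in its own directory and contains exactly the file names listed above; the helper name missing_module_files is hypothetical and is not part of Bricks.

```python
from pathlib import Path

# File names taken from the component list above.
EXPECTED_FILES = (
    "__init__.py",
    "README.md",
    "code_snippet_refinery.md",
    "code_snippet_common.md",
    "config.py",
)


def missing_module_files(module_dir: str) -> list[str]:
    """Return the expected files that are absent from module_dir."""
    root = Path(module_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means the directory contains every expected file.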
Getting Started with Bricks
Accessing and utilizing the modules from Bricks is straightforward. Here’s how you can get started:
- Clone the repository.
- Optionally, create a virtual environment for your project.
- Install necessary dependencies using: pip install -r requirements.txt
- Launch the FastAPI server with: uvicorn api:api
- Explore the API documentation at: http://localhost:8000/docs
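Once the server is running, the available endpoints can also be listed from a script instead of the browser. This sketch relies on FastAPI's default behavior of serving the OpenAPI schema at /openapi.json and assumes the server listens on localhost:8000, as in the steps above. The function names are hypothetical, and the actual endpoint paths depend on the repository.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "delete", "patch", "options", "head", "trace"}


def fetch_openapi_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema (FastAPI serves it at /openapi.json by default)."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)


def list_endpoints(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema."""
    return sorted(
        (method.upper(), path)
        for path, operations in schema.get("paths", {}).items()
        for method in operations
        if method in HTTP_METHODS  # skip non-method keys such as "parameters"
    )
```

With the server started, calling list_endpoints(fetch_openapi_schema()) returns one pair per exposed route.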
Contributing to Bricks
The Bricks platform welcomes contributions from the developer community. New modules can be added by following the project's contribution guidelines. Community members are encouraged to engage through Bricks' Discord channel for any queries or contributions.
Link to Refinery
Alongside Bricks, Kern AI offers Refinery, an open-source tool for managing training data at a larger scale. Bricks modules are designed to integrate with Refinery, so the two can be combined for handling and analyzing textual information.
Regular Updates and Newsletter
Bricks is continually evolving, with new modules released frequently. Developers can stay informed about the latest additions and updates by subscribing to Kern AI's newsletter through the official website.
License Information
Bricks is distributed under the Apache 2.0 License, ensuring its open-source availability and encouraging community participation and innovation.
Bricks provides a robust foundation for developers looking to enhance their projects with sophisticated NLP capabilities, making text analysis and enrichment straightforward and efficient.