#Data Augmentation
tiger
The Tiger toolkit is an open-source solution for developing reliable AI models. Its components include TigerRAG for retrieval-augmented generation and TigerTune for model fine-tuning. It bridges the gap between general-purpose LLMs and data sources, with an emphasis on safety in AI applications, and is suited to organizations that want to integrate proprietary data while maintaining safety standards.
awesome-imbalanced-learning
A curated, non-exhaustive collection of imbalanced learning materials, including papers, code, and libraries. The repository classifies frameworks and libraries by programming language, from Python to Julia, and organizes research papers by domain, such as ensemble learning and deep learning. Recent updates highlight the 'imbalanced-ensemble' package, and the listed libraries cover a range of algorithms for multi-class imbalanced learning, with features such as parallel execution and compatibility with popular libraries like scikit-learn. The collection is intended to support learning unbiased models from imbalanced datasets.
nlpcda
nlpcda is a tool for augmenting Chinese text data. Features such as random entity replacement and synonym swaps help improve the generalization and robustness of NLP models. It also supports NER data augmentation and character replacement designed to keep the meaning of the text intact. The tool installs via pip and helps generate larger datasets while preserving the original meaning of the text.
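The synonym-swap idea mentioned above can be illustrated with a minimal Python sketch. This is not nlpcda's API: the `SYNONYMS` table, the `synonym_swap` and `augment` functions, and the assumption that the input is already split into tokens are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical synonym table; a real tool would load a much larger lexicon.
SYNONYMS = {
    "快速": ["迅速", "飞快"],
    "提高": ["提升", "改善"],
}


def synonym_swap(tokens, synonyms, rate=0.5, rng=None):
    """Return a copy of tokens where each token that has synonyms
    is replaced by a random synonym with probability `rate`."""
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            # Tokens without synonyms are never changed.
            out.append(tok)
    return out


def augment(tokens, synonyms, n=3, rate=0.5, seed=0):
    """Generate n augmented sentences from one pre-tokenized sentence."""
    rng = random.Random(seed)  # fixed seed for reproducible output
    return ["".join(synonym_swap(tokens, synonyms, rate, rng)) for _ in range(n)]


# Assumes the sentence has already been segmented into words.
tokens = ["我们", "需要", "快速", "提高", "模型", "性能"]
variants = augment(tokens, SYNONYMS, n=3)
for v in variants:
    print(v)
```

Only tokens present in the table can change, so the sentence structure and the remaining words are preserved. Setting `rate=0.0` returns the input unchanged, and `rate=1.0` replaces every token that has a synonym.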
Awesome-LLM4IE-Papers
A curated collection of academic papers on generative information extraction with Large Language Models (LLMs). It includes recent studies on named entity recognition, relation extraction, and event extraction. The papers cover techniques such as supervised fine-tuning, few-shot learning, and zero-shot learning, along with data augmentation and constrained decoding. The repository also offers a detailed survey of LLMs for generative information extraction, lists useful datasets, and welcomes contributions from researchers.