#Data Augmentation

The Tiger toolkit is an open-source solution for developing reliable AI models. Its components include TigerRAG for retrieval-augmented generation and TigerTune for model fine-tuning. It bridges the gap between general-purpose LLMs and data sources with an emphasis on safety, and it suits organizations that want to integrate proprietary data while maintaining safety standards.

awesome-imbalanced-learning

A curated collection of imbalanced learning materials, including papers, code, and libraries. The repository groups frameworks and libraries by programming language, from Python to Julia, and organizes research papers by topic, such as ensemble learning and deep learning. Recent updates feature the 'imbalanced-ensemble' package, which covers a range of algorithms for multi-class imbalanced learning and offers parallel execution and compatibility with popular libraries like scikit-learn. The collection is selective rather than exhaustive and supports learning unbiased models from imbalanced datasets.

The tool provides efficient solutions for augmenting Chinese text data, with features like random entity replacement and synonym swaps to improve NLP model generalization and resilience. It includes advanced functionality such as NER data augmentation and character replacement to maintain semantic integrity. Easy to install via pip, this tool helps generate extensive datasets while preserving the text's original meaning, enhancing model capacity and stability.
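The two techniques named above, synonym swaps and random entity replacement, can be illustrated with a minimal sketch. This is not the tool's actual API: the `SYNONYMS` and `ENTITIES` dictionaries, the function names, and the pre-tokenized input with per-token labels are all hypothetical and exist only to show the idea.

```python
import random

# Hypothetical toy resources; a real tool would ship far larger dictionaries.
SYNONYMS = {"喜欢": ["喜爱", "爱好"], "漂亮": ["美丽", "好看"]}
ENTITIES = {"LOC": ["北京", "上海", "广州"]}


def swap_synonyms(tokens, rng, prob=1.0):
    """Replace each token that has a synonym entry with probability `prob`."""
    out = []
    for tok in tokens:
        if tok in SYNONYMS and rng.random() < prob:
            out.append(rng.choice(SYNONYMS[tok]))
        else:
            out.append(tok)
    return out


def replace_entities(tokens, labels, rng):
    """Swap each labelled entity for a different entity of the same type."""
    out = []
    for tok, label in zip(tokens, labels):
        candidates = [e for e in ENTITIES.get(label, []) if e != tok]
        out.append(rng.choice(candidates) if candidates else tok)
    return out


rng = random.Random(0)  # fixed seed so runs are repeatable
tokens = ["我", "喜欢", "北京"]
labels = ["O", "O", "LOC"]  # assumed per-token entity labels
augmented = replace_entities(swap_synonyms(tokens, rng), labels, rng)
print("".join(augmented))
```

Because each entity is replaced by another of the same type, the label sequence stays valid for the augmented sentence, which is what makes this kind of replacement usable for NER training data.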

Awesome-LLM4IE-Papers

Explore a comprehensive range of academic papers on generative information extraction using Large Language Models (LLMs). This curated collection includes recent studies on topics such as named entity recognition, relation extraction, and event extraction. Access innovative methodologies like supervised fine-tuning, few-shot, and zero-shot learning, along with data augmentation and constrained decoding. The repository invites contributions from academics and offers a detailed survey of LLMs in generative information extraction. Keep current with the latest papers and access useful datasets to advance research in the information extraction domain.

Awesome-Knowledge-Distillation-of-LLMs

A detailed survey of knowledge distillation for large language models (LLMs), highlighting methods for transferring skills from models like GPT-4 to open-source alternatives such as LLaMA and Mistral. The survey examines techniques for compressing models and for using data augmentation for self-improvement. It offers structured coverage of algorithms, skill refinement, and practical implementations across various domains. The collection is updated regularly with recent research.

The survey paper explores the developing field of AI-generated images used as data sources, emphasizing the methodologies and various uses of synthetic visual data. It categorizes the content comprehensively, focusing on generative models and neural rendering, applied across 2D and 3D visual perception and medical data synthesis. By reviewing diverse methods such as generative adversarial networks and diffusion models, the paper examines new applications in image classification, segmentation, and self-supervised learning, providing insights into the future potential of AI-generated content across different industries.
