en

#instruction dataset

open-korean-instructions

This repository assembles a wide array of Korean language datasets essential for training language models, encompassing translated and GPT-generated data. It includes datasets like KoAlpaca and ShareGPT DeepL translations, supporting both single and multi-turn formats. Contributions for new data via PR are encouraged. This collection is a valuable asset for building instruction models, utilizing sources like Wikipedia data, ethical Q&A, and language feedback.

Terms of Use Privacy Policy Advertising Services

Feedback Email: [email protected]