
#dataset generation

YuzuMarker.FontDetection

YuzuMarker.FontDetection introduces a model for recognizing fonts in Chinese, Japanese, and Korean (CJK) text. It uses an open-source dataset available on Hugging Face, and the project includes instructions for data preparation and model training. With options for online demos and Docker deployment, the tool helps developers and researchers with text detection and font analysis across CJK scripts.

speech-dataset-generator

The tool creates multilingual datasets for training text-to-speech and speech recognition models by transcribing audio and improving its quality. It segments audio, identifies speaker gender, and uses pyannote embeddings to name speakers automatically, including in recordings with multiple speakers. Audio can be enhanced with deepfilternet, resembleai, or mayavoz. The tool accepts input from local files, YouTube, LibriVox, and TED Talks, and stores the resulting data in a Chroma database.
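
To illustrate what automatic speaker naming from embeddings can look like, here is a minimal sketch. It is not taken from speech-dataset-generator: the function names (`cosine_similarity`, `assign_speaker`), the `speaker_N` naming scheme, and the 0.75 threshold are all assumptions made for this example, and short toy vectors stand in for real pyannote embeddings. The idea is to compare each new segment's embedding with those of already-known speakers and to register a new speaker when nothing is similar enough.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_speaker(embedding, known_speakers, threshold=0.75):
    """Return the name of the closest known speaker, or register a new one.

    known_speakers maps a name to a reference embedding and is updated
    in place. The threshold is an arbitrary value chosen for illustration.
    """
    best_name, best_score = None, -1.0
    for name, reference in known_speakers.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score

    if best_name is not None and best_score >= threshold:
        return best_name

    # No sufficiently similar speaker: create a hypothetical new label.
    new_name = f"speaker_{len(known_speakers) + 1}"
    known_speakers[new_name] = list(embedding)
    return new_name


known = {}
first = assign_speaker([1.0, 0.0], known)   # no speakers yet, so a new one is registered
second = assign_speaker([0.9, 0.1], known)  # close to the first embedding
third = assign_speaker([0.0, 1.0], known)   # orthogonal to the first embedding
```

In a real pipeline the reference embeddings would more likely live in a vector store such as the Chroma database mentioned above, queried by nearest neighbour rather than scanned in a Python loop.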
