DdddOcr: A Universal Offline Local CAPTCHA Recognition SDK
DdddOcr is an open-source project created by sml2h3 and kerlomz. It offers a simple way to recognize CAPTCHAs, using deep learning models trained to handle a wide range of text styles. It does not target any specific CAPTCHA provider, so results vary: it may recognize some CAPTCHAs and fail on others.
DdddOcr emphasizes minimal dependencies to reduce setup and usage costs, aiming to provide a comfortable experience for all users.
The project repository is at https://github.com/sml2h3/ddddocr.
Sponsorship Partners
- YesCaptcha: Offers commercial-grade recognition interfaces for Google reCAPTCHA, hCaptcha, and FunCaptcha.
- 超级鹰: A leading global vendor in intelligent image classification and recognition.
- Malenia: An enterprise-grade proxy IP gateway platform.
- 雨云VPS: Offers budget-friendly, high-bandwidth servers on a Zhejiang node.
Getting Started
Supported Environments
Operating System | CPU | GPU | Max Python Version | Notes |
---|---|---|---|---|
Windows 64-bit | √ | √ | 3.12 | Some versions may require the Visual C++ runtime library. |
Windows 32-bit | × | × | - | |
Linux 64-bit / ARM64 | √ | √ | 3.12 | |
Linux 32-bit | × | × | - | |
macOS x64 | √ | √ | 3.12 | For M1/M2/M3 machines, refer to the project's separate guidance. |
Installation Steps
Install from PyPI
pip install ddddocr
Install from Source
git clone https://github.com/sml2h3/ddddocr.git
cd ddddocr
python setup.py install
Note: Do not import ddddocr from inside the project's root directory, where the local ddddocr folder can conflict with the installed package.
Directory Structure
ddddocr
├── MANIFEST.in
├── LICENSE
├── README.md
├── /ddddocr/
│ ├── __init__.py Main library file
│ ├── common.onnx New OCR model
│ ├── common_det.onnx Object detection model
│ ├── common_old.onnx Old OCR model
│ ├── logo.png
│ ├── README.md
│ └── requirements.txt
├── logo.png
└── setup.py
Project Base Support
The models are trained with dddd_trainer, which is built on PyTorch, and inference runs on onnxruntime. The Python versions this project supports are therefore determined largely by what onnxruntime supports.
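Because compatibility hinges on onnxruntime, it can help to check the local interpreter and the installed package versions before debugging anything else. The following is a minimal sketch: runtime_report is a hypothetical helper name, and it relies only on the standard library's importlib.metadata, not on anything DdddOcr provides.

```python
import platform
import sys
from importlib import metadata


def runtime_report(packages=("onnxruntime", "onnxruntime-gpu", "ddddocr")):
    """Collect interpreter details and installed versions of relevant packages.

    runtime_report is a hypothetical helper for illustration, not part of DdddOcr.
    """
    report = {
        "python": "%d.%d.%d" % sys.version_info[:3],
        "machine": platform.machine(),
        "is_64bit": sys.maxsize > 2**32,  # 32-bit systems are listed as unsupported
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in runtime_report().items():
        print(f"{key}: {value}")
```

A value of None for onnxruntime means the package is missing from the current environment, which is worth resolving before looking at DdddOcr itself.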
User Guide
Basic OCR Recognition Ability
DdddOcr is capable of recognizing single-line text, particularly CAPTCHAs involving English letters and numbers, and some special characters.
import ddddocr

ocr = ddddocr.DdddOcr()  # load the default OCR model
with open("example.jpg", "rb") as f:  # read the CAPTCHA image as raw bytes
    image = f.read()
result = ocr.classification(image)
print(result)
DdddOcr includes two OCR models. To use the second model, pass beta=True when initializing:
import ddddocr

ocr = ddddocr.DdddOcr(beta=True)  # select the second OCR model
with open("example.jpg", "rb") as f:
    image = f.read()
result = ocr.classification(image)
print(result)
Note: Create the DdddOcr instance once and reuse it. Initializing it again for every image is slow and hurts performance.
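One way to follow this advice is to build the recognizer through a cached factory and pass the same object to every call. The sketch below assumes nothing beyond the DdddOcr constructor and classification method shown above; get_ocr and classify_files are hypothetical helper names, not part of the library.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_ocr(beta=False):
    """Create one recognizer per configuration and reuse it on later calls."""
    import ddddocr  # lazy import: nothing is loaded until the first call

    return ddddocr.DdddOcr(beta=beta)


def classify_files(paths, ocr=None):
    """Run classification over many image files with a single shared recognizer.

    get_ocr and classify_files are hypothetical helpers, not part of DdddOcr.
    """
    if ocr is None:
        ocr = get_ocr()
    results = {}
    for path in paths:
        with open(path, "rb") as f:
            results[path] = ocr.classification(f.read())
    return results
```

Calling classify_files(["a.jpg", "b.jpg"]) constructs the recognizer on the first call only; later calls reuse the cached instance.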
Object Detection Ability
Detects potential target locations in an image using bounding boxes.
import ddddocr
import cv2

det = ddddocr.DdddOcr(det=True)  # enable the object detection model
with open("test.jpg", 'rb') as f:
    image = f.read()
bboxes = det.detection(image)  # list of [x1, y1, x2, y2] boxes
print(bboxes)

# Draw each detected box on the image and save the result
im = cv2.imread("test.jpg")
for bbox in bboxes:
    x1, y1, x2, y2 = bbox
    im = cv2.rectangle(im, (x1, y1), (x2, y2), color=(0, 0, 255), thickness=2)
cv2.imwrite("result.jpg", im)
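Detected boxes are often post-processed, for example to find a click point inside each target. The sketch below works on plain [x1, y1, x2, y2] lists like those unpacked above; the helper names and the sample coordinates are made up for illustration.

```python
def bbox_center(bbox):
    """Return the integer center point of an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) // 2, (y1 + y2) // 2)


def sort_left_to_right(bboxes):
    """Order boxes by their left edge, which matches reading order for one row."""
    return sorted(bboxes, key=lambda b: b[0])


boxes = [[120, 40, 160, 80], [10, 42, 50, 82]]  # made-up sample boxes
centers = [bbox_center(b) for b in sort_left_to_right(boxes)]
print(centers)  # [(30, 62), (140, 60)]
```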
Slider Detection
This feature locates the slider's target position using OpenCV algorithms. Two algorithms are provided, and which one to use depends on the form of the images you have.
Algorithm 1
# Use when the slider piece and the background are separate images
import ddddocr

det = ddddocr.DdddOcr(det=False, ocr=False)  # disable both models; only slider matching is used
with open('target.png', 'rb') as f:
    target_bytes = f.read()
with open('background.png', 'rb') as f:
    background_bytes = f.read()
res = det.slide_match(target_bytes, background_bytes)
print(res)
Algorithm 2
# Use when there is no separate slider piece and two full images are compared
import ddddocr

slide = ddddocr.DdddOcr(det=False, ocr=False)  # disable both models; only comparison is used
with open('bg.jpg', 'rb') as f:
    target_bytes = f.read()
with open('fullpage.jpg', 'rb') as f:
    background_bytes = f.read()
res = slide.slide_comparison(target_bytes, background_bytes)
print(res)
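To drive a slider you usually need a single horizontal distance. The sketch below is built on an assumption about the result shape that the text above does not state: that both methods return a dictionary whose "target" entry is a list starting with the x coordinate ([x1, y1, x2, y2] for slide_match, [x, y] for slide_comparison). Print the result in your own setup to confirm this before relying on it. slide_distance is a hypothetical helper name.

```python
def slide_distance(result):
    """Extract a horizontal position from a slider result dictionary.

    Assumed shapes (verify against your installed version):
      slide_match      -> {"target": [x1, y1, x2, y2], ...}
      slide_comparison -> {"target": [x, y]}
    In both cases the first entry is taken as the x position of the gap.
    """
    target = result.get("target")
    if not target or len(target) not in (2, 4):
        raise ValueError(f"unexpected slider result: {result!r}")
    return int(target[0])
```

Raising on an unexpected shape is deliberate: a silent default would hide a mismatch between this assumption and the actual output.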
OCR Probability Output
For finer control over results, classification can return per-character probabilities, and set_ranges restricts the set of candidate characters.
import ddddocr

ocr = ddddocr.DdddOcr()
with open("test.jpg", "rb") as f:
    image = f.read()
ocr.set_ranges("0123456789+-x/=")  # restrict output to these characters
result = ocr.classification(image, probability=True)
# For each position, take the character with the highest probability
s = ""
for i in result['probability']:
    s += result['charsets'][i.index(max(i))]
print(s)
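The same per-position argmax can also report how confident each choice was, which is useful for rejecting doubtful results. The sketch below assumes only the structure used in the loop above (a "probability" list of per-position lists and a "charsets" list indexed by position in each row); decode_with_confidence is a hypothetical helper, and the sample data is synthetic rather than real model output.

```python
def decode_with_confidence(result):
    """Pick the most likely character at each position and keep its probability.

    Expects result["probability"] to be a list of per-position probability
    lists and result["charsets"] to map row indices to characters.
    """
    chars, confidences = [], []
    for row in result["probability"]:
        best = max(range(len(row)), key=row.__getitem__)  # argmax of this position
        chars.append(result["charsets"][best])
        confidences.append(row[best])
    return "".join(chars), confidences


# Synthetic data for illustration only, not real model output.
sample = {
    "charsets": ["", "1", "+", "2"],
    "probability": [
        [0.1, 0.7, 0.1, 0.1],
        [0.05, 0.05, 0.8, 0.1],
        [0.9, 0.05, 0.03, 0.02],
        [0.1, 0.1, 0.2, 0.6],
    ],
}
text, conf = decode_with_confidence(sample)
print(text, min(conf))  # 1+2 0.6
```

A caller could, for instance, discard a result whenever min(conf) falls below a threshold chosen for the application.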
Custom OCR Model Import
DdddOcr supports importing custom models trained using dddd_trainer.
import ddddocr

# Load a custom ONNX model together with its character set file
ocr = ddddocr.DdddOcr(det=False, ocr=False, import_onnx_path="myproject.onnx", charsets_path="charsets.json")
with open('test.jpg', 'rb') as f:
    image_bytes = f.read()
res = ocr.classification(image_bytes)
print(res)
Version Control
DdddOcr uses Git for version management, and the latest versions are available in the repository.
Related Articles and Projects
Submissions for tutorials and articles are welcome via an issue titled "[Submission]".
Author
For queries and discussions, please use GitHub issues.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Donations
Support the project by visiting the repository.