PaddleOCR-VL-0.9B
0.9Bby PaddlePaddle
Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas across 16K context — optimized for efficient on-device document parsing.
Specifications
Technical details and pricing.
Frequently Asked Questions
What is PaddleOCR-VL-0.9B?
Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas across 16K context — optimized for efficient on-device document parsing.
What input formats does PaddleOCR-VL-0.9B support?
PaddleOCR-VL-0.9B accepts text, image as input and produces text output.
What is the context length of PaddleOCR-VL-0.9B?
PaddleOCR-VL-0.9B supports up to 16,384 tokens of context.
Is PaddleOCR-VL-0.9B open source?
PaddleOCR-VL-0.9B is available under the Apache 2.0 license.
Specifications are based on publicly available model documentation.