Vision-OCR Project: In-Depth Analysis and Optimization Report

Analysis date: 2026-01-08. Scope: code review of the entire project.

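1. Architecture and Design Issues

1.1 API request parameters have no effect (Critical 🔴)

Problem

Request parameters such as lang, use_gpu, and drop_score are accepted by the API but are never used during actual OCR processing.

Location: api/routes/ocr.py, lines 78-95

def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    # lang, use_gpu, and drop_score are neither passed in nor used!

Impact

The language, GPU acceleration, and confidence threshold set by the user are silently ignored.
The API documentation does not match the actual behavior.
Users assume the parameters took effect, which causes confusion.

Solution

Option 1: create or update an OCRConfig dynamically inside _process_ocr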

def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    params: OCRRequestParams,  # new parameter
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    # Build a per-request configuration
    ocr_config = OCRConfig(
        lang=params.lang,
        use_gpu=params.use_gpu,
        drop_score=params.drop_score,
    )
    # Process with the new configuration...

Option 2: let OCRPipeline.process() accept runtime parameters

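def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    drop_score: Optional[float] = None,  # runtime override
) -> OCRResult:
    # Compare against None explicitly: `drop_score or default` would discard a valid 0.0
    effective_drop_score = (
        drop_score if drop_score is not None else self._ocr_config.drop_score
    )
    # ...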
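
A drop_score of 0.0 (keep every detection) is a legitimate runtime override, so the way the fallback is written matters. The following is a minimal sketch contrasting the two idioms; resolve_with_or, resolve_with_none_check, and DEFAULT_DROP_SCORE are illustrative names and not part of the project.

```python
from typing import Optional

DEFAULT_DROP_SCORE = 0.5  # stand-in for the pipeline's configured default


def resolve_with_or(override: Optional[float], default: float = DEFAULT_DROP_SCORE) -> float:
    """Naive fallback: treats every falsy override (including 0.0) as missing."""
    return override or default


def resolve_with_none_check(override: Optional[float], default: float = DEFAULT_DROP_SCORE) -> float:
    """Falls back to the default only when no override was supplied."""
    return override if override is not None else default
```

With an override of 0.0, the first helper silently returns the default, while the second returns 0.0 as requested.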

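1.2 Temporarily swapping the pipeline configuration is not thread-safe (Critical 🔴)

Problem

api/routes/ocr.py mutates the shared pipeline configuration object directly, which creates a race condition under concurrent requests.

Location: api/routes/ocr.py, lines 102-113

# Temporarily replace the pipeline configuration
original_config = pipeline._pipeline_config
pipeline._pipeline_config = pipeline_config  # ⚠️ not thread-safe!

try:
    result = pipeline.process(image)
finally:
    pipeline._pipeline_config = original_config  # ⚠️ may restore the wrong configuration under concurrency

Impact

Concurrent requests from different users interfere with each other's configuration.
User A's ROI settings may be applied to user B's request.
The resulting bugs are unpredictable and hard to reproduce.

Solution

Option 1: pass the configuration as an argument to process() (recommended)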

def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    pipeline_config: Optional[PipelineConfig] = None,  # new parameter
) -> OCRResult:
    config = pipeline_config if pipeline_config is not None else self._pipeline_config
    # Use the resolved configuration from here on...

Option 2: use contextvars for per-request isolation

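from contextvars import ContextVar

_request_config: ContextVar[PipelineConfig] = ContextVar('request_config')

# Set at the start of request handling; keep the token so the value can be reset
token = _request_config.set(pipeline_config)
try:
    result = pipeline.process(image)
finally:
    _request_config.reset(token)

# Inside process(): read the per-request value, falling back to the instance default
config = _request_config.get(self._pipeline_config)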
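
To illustrate why contextvars gives per-request isolation, here is a small self-contained sketch. It assumes requests are handled as asyncio tasks; request_config, handle_request, and the string values are hypothetical stand-ins for PipelineConfig and the real route handler.

```python
import asyncio
from contextvars import ContextVar
from typing import List

# Hypothetical per-request setting; a plain string stands in for PipelineConfig.
request_config: ContextVar[str] = ContextVar("request_config", default="default-roi")


def process() -> str:
    """Stand-in for OCRPipeline.process(): reads the config of the current context."""
    return request_config.get()


async def handle_request(roi: str, delay: float) -> str:
    token = request_config.set(roi)      # visible only within this task's context
    try:
        await asyncio.sleep(delay)       # yield so the other request can interleave
        return process()
    finally:
        request_config.reset(token)


async def main() -> List[str]:
    # Each coroutine passed to gather() runs as its own Task with a copied context.
    return await asyncio.gather(
        handle_request("roi-A", 0.02),
        handle_request("roi-B", 0.01),
    )
```

Although the second request finishes first, each one reads back its own value. One caveat: if process() is later moved to a worker thread with loop.run_in_executor, the context is not copied automatically, whereas asyncio.to_thread does propagate it.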

Option 3: use a thread lock (serializes requests and hurts throughput; not recommended)

import threading

class OCRPipeline:
    _lock = threading.Lock()  # class-level: shared by every OCRPipeline instance

    def process_with_config(self, image, config):
        with self._lock:
            original = self._pipeline_config
            self._pipeline_config = config
            try:
                return self.process(image)
            finally:
                self._pipeline_config = original

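1.3 No logging system (Medium 🟡)

Problem

The whole project writes messages with print(), so log level, format, and output destination cannot be controlled.

Location: spread across multiple files

# main.py
print("[INFO] 正在初始化 OCR 系统...")  # "Initializing the OCR system..."

# input/loader.py
print(f"[ERROR] 文件不存在: {path}")  # "File not found: {path}"

# api/main.py
print("[INFO] 正在加载 OCR 模型...")  # "Loading the OCR model..."

Impact

Logs cannot be filtered by level (development vs. production).
Logs cannot be routed to a file or a logging service.
Context such as timestamps and call sites is missing.
Unstructured print output is hard to aggregate and analyze.

Solution

Adopt Python's standard logging module: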

# utils/logger.py
import logging
import sys

def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(level)

    # getLogger() returns the same object per name, so attach the handler only once
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        ))
        logger.addHandler(handler)

    return logger

# Usage
logger = setup_logger(__name__)
logger.info("Initializing the OCR system...")
logger.error("File not found: %s", path)

Alternatively, use loguru (more concise):

from loguru import logger

logger.info("Initializing the OCR system...")
logger.error(f"File not found: {path}")

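2. Performance Optimizations

2.1 No batch-processing API (Medium 🟡)

Current state

The API processes only one image per request, so batch recognition requires many separate calls.

Impact

High network round-trip overhead.
The GPU's batching capability cannot be fully used.
Client implementations become more complex.

Solution

Add a batch endpoint: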

@router.post("/recognize/batch")
async def recognize_batch(
    files: List[UploadFile] = File(..., max_length=10),
    params: OCRRequestParams = Depends(parse_multipart_params),
    pipeline: OCRPipeline = Depends(get_ocr_pipeline),
) -> BatchOCRResponse:
    results = []
    for file in files:
        image_bytes = await parse_multipart_image(file)
        result, _ = _process_ocr(image_bytes, pipeline, params.get_roi(), False)
        results.append(_convert_ocr_result_to_response(result))
    
    return BatchOCRResponse(success=True, data=results)
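
A batch endpoint that runs blocking OCR calls one after another inside an async handler keeps the event loop busy for the whole batch. One possible mitigation, sketched here under the assumption that the OCR call is a plain blocking function, is to run each call in a worker thread with a concurrency cap. fake_ocr, process_batch, and max_concurrency are hypothetical names for illustration.

```python
import asyncio
from typing import Callable, List


def fake_ocr(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the blocking, CPU/GPU-bound OCR call."""
    return f"{len(image_bytes)} bytes"


async def process_batch(
    images: List[bytes],
    ocr: Callable[[bytes], str] = fake_ocr,
    max_concurrency: int = 2,
) -> List[str]:
    semaphore = asyncio.Semaphore(max_concurrency)  # cap simultaneous OCR calls

    async def run_one(image_bytes: bytes) -> str:
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays responsive.
            return await asyncio.to_thread(ocr, image_bytes)

    # gather() returns results in input order, regardless of completion order.
    return await asyncio.gather(*(run_one(image) for image in images))
```

This only helps if the pipeline itself is safe to call from several threads at once, which ties back to the configuration-swapping issue in section 1.2.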

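2.2 Visualizer is re-created on every request (Low 🟢)

Problem

api/routes/ocr.py creates a new OCRVisualizer for every request.

Location: api/routes/ocr.py, lines 117-120

if return_annotated_image and result.text_count > 0:
    visualizer = OCRVisualizer(VisualizeConfig())  # created on every request
    annotated = visualizer.draw_result(image, result)

Impact

The Chinese font file is reloaded each time.
PIL/OpenCV initialization overhead is paid on every request.

Solution

Make the visualizer an application-level singleton: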

# api/dependencies.py
_visualizer: Optional[OCRVisualizer] = None

def get_visualizer() -> OCRVisualizer:
    global _visualizer
    if _visualizer is None:
        _visualizer = OCRVisualizer(VisualizeConfig())
    return _visualizer
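
An alternative to the module-level global is functools.lru_cache, which memoizes a zero-argument factory. The sketch below uses a hypothetical Visualizer class as a stand-in for OCRVisualizer so that the number of constructions can be observed.

```python
from functools import lru_cache


class Visualizer:
    """Hypothetical stand-in for OCRVisualizer; counts how often it is constructed."""

    instances_created = 0

    def __init__(self) -> None:
        Visualizer.instances_created += 1  # imagine font loading happening here


@lru_cache(maxsize=None)
def get_visualizer() -> Visualizer:
    # The first call constructs the object; later calls return the cached instance.
    return Visualizer()
```

Repeated calls return the same object. Under concurrent first calls the constructor may still run more than once, so a lock is the safer choice if construction has side effects.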

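2.3 Image encoding format is fixed (Low 🟢)

Current state

Annotated images are always returned as JPEG.

Location: api/dependencies.py, line 119

def encode_image_base64(image: np.ndarray, format: str = ".jpg") -> str:

Suggestion

Let callers choose the output format; PNG suits cases that need transparency or lossless compression: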

def encode_image_base64(
    image: np.ndarray,
    format: str = ".jpg",
    quality: int = 95,  # JPEG quality (0-100)
) -> str:
    params = []
    if format in (".jpg", ".jpeg"):
        params = [cv2.IMWRITE_JPEG_QUALITY, quality]
    elif format == ".png":
        params = [cv2.IMWRITE_PNG_COMPRESSION, 3]  # compression level 0-9

    success, encoded = cv2.imencode(format, image, params)
    # ...

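3. Code Quality Issues

3.1 Duplicated code in the OCR routes (Medium 🟡)

Problem

The express-waybill parsing logic in express_multipart and express_base64 is duplicated verbatim, more than 30 lines in each.

Location: api/routes/ocr.py, lines 204-242 and 322-360

# In express_multipart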
express_info = result.parse_express()
merged_text = result.merge_text()
return ExpressResponse(
    success=True,
    data=ExpressResultData(
        processing_time_ms=result.processing_time_ms,
        express_info=ExpressInfoData(
            tracking_number=express_info.tracking_number,
            sender=ExpressPersonData(
                name=express_info.sender_name,
                phone=express_info.sender_phone,
                address=express_info.sender_address,
            ),
            receiver=ExpressPersonData(
                name=express_info.receiver_name,
                phone=express_info.receiver_phone,
                address=express_info.receiver_address,
            ),
            courier_company=express_info.courier_company,
            confidence=express_info.confidence,
            extra_fields=express_info.extra_fields,
            raw_text=express_info.raw_text,
        ),
        merged_text=merged_text,
        annotated_image_base64=annotated_base64,
    ),
)

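Impact

A change to one copy must be mirrored in the other.
A missed update easily leads to inconsistent behavior between the two endpoints.

Solution

Extract a shared helper function:

def _convert_express_result_to_response(
    result: OCRResult,
    annotated_base64: Optional[str] = None,
) -> ExpressResultData:
    """Convert an OCRResult into express-waybill response data."""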
    express_info = result.parse_express()
    merged_text = result.merge_text()
    
    return ExpressResultData(
        processing_time_ms=result.processing_time_ms,
        express_info=ExpressInfoData(
            tracking_number=express_info.tracking_number,
            sender=ExpressPersonData(
                name=express_info.sender_name,
                phone=express_info.sender_phone,
                address=express_info.sender_address,
            ),
            receiver=ExpressPersonData(
                name=express_info.receiver_name,
                phone=express_info.receiver_phone,
                address=express_info.receiver_address,
            ),
            courier_company=express_info.courier_company,
            confidence=express_info.confidence,
            extra_fields=express_info.extra_fields,
            raw_text=express_info.raw_text,
        ),
        merged_text=merged_text,
        annotated_image_base64=annotated_base64,
    )

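3.2 Exception handling is too broad (Medium 🟡)

Problem

Many handlers use a catch-all except Exception and swallow every error.

Location: api/routes/ocr.py, lines 165-172, and several other places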

except Exception as e:
    return OCRResponse(
        success=False,
        error=ErrorDetail(
            code=type(e).__name__,
            message=str(e),
        ),
    )

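Impact

The actual error details, such as the traceback, are lost.
Root causes are hard to track down.
Serious system errors may be masked.

Solution

Catch the expected exceptions explicitly and let unexpected ones propagate:

from api.exceptions import OCRAPIException, InvalidImageError, OCRProcessingError

try:
    # ...
except OCRAPIException as e:
    # Business error: return a user-friendly message
    return OCRResponse(
        success=False,
        error=ErrorDetail(code=type(e).__name__, message=e.message),
    )
except Exception:
    # Unexpected error: log it with the traceback, then re-raise
    logger.exception("Unexpected error during OCR processing")
    raise  # let the global exception handler deal with it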
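
The split between expected and unexpected errors can be shown with a small self-contained sketch. The exception classes and run_safely below are hypothetical stand-ins; the contents of the project's api.exceptions module are not shown in this report, so the message attribute is an assumption.

```python
from typing import Callable, Dict


class OCRAPIException(Exception):
    """Hypothetical base class for expected, user-facing errors."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message


class InvalidImageError(OCRAPIException):
    """Raised when the uploaded bytes cannot be decoded as an image."""


def run_safely(handler: Callable[[], str]) -> Dict[str, object]:
    try:
        return {"success": True, "data": handler()}
    except OCRAPIException as exc:
        # Expected failure: report it to the caller in a structured form.
        return {"success": False, "error": {"code": type(exc).__name__, "message": exc.message}}
    # Anything else is deliberately not caught, so it reaches the global handler.


def bad_image() -> str:
    raise InvalidImageError("image decoding failed")


def buggy() -> str:
    raise ZeroDivisionError("unexpected bug")
```

Calling run_safely(bad_image) yields a structured error response, while run_safely(buggy) lets the ZeroDivisionError propagate with its traceback intact.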

3.3 Incomplete type annotations (Low 🟢)

Problem

Some functions lack complete type annotations for their parameters and return values.

Location: ocr/pipeline.py, line 185

def _apply_roi(self, image: np.ndarray) -> tuple:
    # Should be: -> Tuple[np.ndarray, Tuple[int, int], Optional[Tuple[int, int, int, int]]]

Suggestion

Add complete type annotations:

from typing import Tuple, Optional

def _apply_roi(
    self,
    image: np.ndarray
) -> Tuple[np.ndarray, Tuple[int, int], Optional[Tuple[int, int, int, int]]]:
    """
    Returns:
        (cropped image, ROI offset, ROI rectangle)
    """

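4. Security Issues

4.1 No rate limiting (Critical 🔴)

Problem

The API places no limit on request frequency, leaving it open to denial-of-service (DoS) attacks and abuse.

Impact

A malicious client can send an unlimited number of requests.
OCR is CPU/GPU-intensive, so the service is easily overloaded.
Compute costs can become very high.

Solution

Use slowapi for rate limiting:

# Add to requirements.txt
slowapi>=0.1.9

# api/main.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# api/routes/ocr.py
from fastapi import Request
from api.main import limiter  # reuse the shared instance; move it to its own module if this import is circular

@router.post("/recognize")
@limiter.limit("10/minute")  # at most 10 requests per minute per client IP
async def recognize_multipart(request: Request, ...):  # slowapi requires the request argument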
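
To make the meaning of a limit such as "10/minute" concrete, here is a minimal sliding-window limiter. It is an illustration of the general technique only and is not how slowapi is implemented; SlidingWindowLimiter and its parameters are hypothetical names.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allows at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit: int, window: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Keys here would be client IP addresses. State kept in process memory is not shared between workers, so a multi-process deployment would need a shared store.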

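4.2 CORS configuration is too permissive (Medium 🟡)

Problem

A production deployment should not allow requests from every origin.

Location: api/main.py, lines 106-112

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # ⚠️ Dangerous: allows any origin
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

Impact

Scripts on any website can call the API from a visitor's browser.
It may facilitate cross-site attacks such as CSRF, particularly in combination with allow_credentials=True.
Sensitive data may leak to third parties.

Solution

Configure the allowed origins through an environment variable:

import os

# Comma-separated list; strip whitespace and drop empty entries
ALLOWED_ORIGINS = [
    origin.strip()
    for origin in os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(",")
    if origin.strip()
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)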

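4.3 No image-dimension check after Base64 decoding (Medium 🟡)

Problem

A malicious user can craft an image with an extreme compression ratio (a decompression bomb, similar to a zip bomb) that occupies a large amount of memory once decoded.

Location: api/dependencies.py, lines 94-116

Impact

A single request may consume several GB of memory.
The service may crash or run out of memory (OOM).

Solution

Add a dimension check after the image is decoded. Note that this check runs after cv2.imdecode has already allocated the full pixel buffer, so it limits the cost of later processing rather than the cost of the decode itself:

def decode_image_bytes(content: bytes, max_dimension: int = 10000) -> np.ndarray:
    """
    Decode image bytes into a numpy array.

    Args:
        content: raw image bytes
        max_dimension: maximum allowed image dimension (width or height)
    """
    try:
        nparr = np.frombuffer(content, np.uint8)
        image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
        if image is None:
            raise InvalidImageError("Failed to decode image")

        # New: check the image dimensions
        height, width = image.shape[:2]
        if width > max_dimension or height > max_dimension:
            raise InvalidImageError(
                f"Image too large ({width}x{height}); maximum is {max_dimension}x{max_dimension}"
            )

        return image
    except InvalidImageError:
        raise
    except Exception as e:
        raise InvalidImageError(f"Failed to decode image: {e}") from e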
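
A check made on the decoded array runs only after the pixel buffer exists, so one complementary option is to read the declared dimensions from the file header before decoding. The sketch below does this for PNG only, using the fixed layout of the PNG signature and IHDR chunk; png_dimensions and is_within_limit are hypothetical helpers, and other formats would need their own header parsing or an image library that reads headers lazily.

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(content: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from a PNG header, or None if this is not a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(content) < 24 or not content.startswith(PNG_SIGNATURE):
        return None
    if content[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", content[16:24])
    return width, height


def is_within_limit(content: bytes, max_dimension: int = 10000) -> bool:
    """Reject oversized PNGs before decoding; other formats are left to other checks."""
    dims = png_dimensions(content)
    if dims is None:
        return True  # not a PNG: this sketch makes no claim about it
    width, height = dims
    return width <= max_dimension and height <= max_dimension
```

Calling is_within_limit(content) before cv2.imdecode lets an oversized PNG be rejected after reading only its first 24 bytes.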

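5. Configuration Management Issues

5.1 Hard-coded configuration (Medium 🟡)

Problem

Many settings are hard-coded and cannot be adjusted through environment variables.

Location

# api/security.py:19
MAX_FILE_SIZE = 10 * 1024 * 1024  # hard-coded

# api/main.py:48-56
return OCRConfig(
    lang="ch",           # hard-coded default
    use_angle_cls=True,
    use_gpu=False,
    drop_score=0.5,
)

Impact

Different environments (development/test/production) cannot use different settings.
Changing a setting requires a code change and a redeployment.

Solution

Manage settings centrally with pydantic-settings: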

# utils/settings.py
from pydantic_settings import BaseSettings, SettingsConfigDict
from typing import List

class Settings(BaseSettings):
    # pydantic-settings v2 style; replaces the deprecated inner `class Config`
    model_config = SettingsConfigDict(env_prefix="VISION_OCR_", env_file=".env")

    # File upload limits
    max_file_size: int = 10 * 1024 * 1024
    max_image_dimension: int = 10000

    # OCR defaults
    ocr_default_lang: str = "ch"
    ocr_use_gpu: bool = False
    ocr_drop_score: float = 0.5

    # API settings
    api_rate_limit: str = "10/minute"
    cors_origins: List[str] = ["http://localhost:3000"]  # set via env as a JSON array

    # Logging
    log_level: str = "INFO"

settings = Settings()

Usage example:

from utils.settings import settings

MAX_FILE_SIZE = settings.max_file_size

ocr_config = OCRConfig(
    lang=settings.ocr_default_lang,
    use_gpu=settings.ocr_use_gpu,
    drop_score=settings.ocr_drop_score,
)

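5.2 API version number is scattered (Low 🟢)

Problem

The version number is defined in several places, which can drift out of sync.

Location

# api/main.py:98
version="1.0.0",

# api/routes/health.py:14
API_VERSION = "1.0.0"

Solution

Read the version from a single source:

# api/__init__.py
__version__ = "1.0.0"

# In other files
from api import __version__

Or read it dynamically from the installed package metadata (which is generated from pyproject.toml):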

from importlib.metadata import version
__version__ = version("vision-ocr")

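6. Test Coverage Issues

6.1 Core modules lack unit tests (Medium 🟡)

Current state

Only API integration tests exist; the core business logic has no unit tests.

Missing tests

Module	Test coverage	Risk
`ocr/engine.py`	❌ None	High - core OCR logic
`ocr/express_parser.py`	❌ None	High - complex regex matching
`ocr/pipeline.py`	❌ None	High - processing flow
`input/loader.py`	❌ None	Medium - file loading
`visualize/draw.py`	❌ None	Low - visualization
`utils/config.py`	❌ None	Low - configuration classes

Suggestion

Add unit tests for ExpressParser first (highest priority):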

# tests/test_express_parser.py
import pytest
from ocr.express_parser import ExpressParser
from ocr.engine import TextBlock

class TestExpressParser:
    @pytest.fixture
    def parser(self):
        return ExpressParser()
    
    def test_extract_tracking_number(self, parser):
        text_blocks = [
            TextBlock(
                text="运单号：SF1234567890",
                confidence=0.95,
                bbox=[[0, 0], [100, 0], [100, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.tracking_number == "SF1234567890"
    
    def test_extract_phone_number(self, parser):
        text_blocks = [
            TextBlock(
                text="收件人：张三 13800138000",
                confidence=0.95,
                bbox=[[0, 0], [200, 0], [200, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.receiver_phone == "13800138000"
    
    def test_detect_courier_company(self, parser):
        text_blocks = [
            TextBlock(
                text="顺丰速运",
                confidence=0.95,
                bbox=[[0, 0], [100, 0], [100, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.courier_company == "顺丰速运"
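
Regex-based extraction is where edge cases tend to hide, so small pure functions are a natural first target for unit tests. The sketch below is not the project's parser; MOBILE_PATTERN and extract_mobile_numbers are hypothetical, and the pattern assumes mainland-China mobile numbers of 11 digits starting with 1 followed by 3-9.

```python
import re
from typing import List

# Assumed pattern for mainland-China mobile numbers: 11 digits, starting with 1
# followed by 3-9. The lookarounds stop it matching inside a longer digit run.
MOBILE_PATTERN = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def extract_mobile_numbers(text: str) -> List[str]:
    """Return every mobile-number-like token found in the text, in order."""
    return MOBILE_PATTERN.findall(text)
```

Cases worth covering include a tracking number that contains a long digit run, a 12-digit string that must not yield a partial match, and a line holding both the sender's and the receiver's numbers.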

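6.2 Mock-based tests produce false positives (Medium 🟡)

Problem

The tests use a mock pipeline throughout, so real OCR behavior is never verified.

Location: tests/conftest.py, lines 25-67

@pytest.fixture(scope="session")
def mock_ocr_pipeline():
    mock_pipeline = MagicMock()
    mock_pipeline.process.return_value = mock_result  # always returns the same fixed result

Impact

Problems in the OCR engine go undetected.
Tests may keep passing after an interface change.
The end-to-end flow is never exercised.

Solution

Add integration tests that can be run selectively: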

# tests/test_integration.py
import pytest
import os

# An environment variable controls whether the integration tests run (skipped by default)
SKIP_INTEGRATION = os.getenv("SKIP_INTEGRATION_TESTS", "true").lower() == "true"

@pytest.mark.skipif(SKIP_INTEGRATION, reason="integration tests skipped")
class TestOCRIntegration:
    @pytest.fixture(scope="class")
    def real_pipeline(self):
        """Use the real OCR pipeline."""
        from ocr.pipeline import OCRPipeline
        from utils.config import OCRConfig, PipelineConfig

        pipeline = OCRPipeline(OCRConfig(), PipelineConfig())
        pipeline.initialize()
        return pipeline

    def test_real_ocr_recognition(self, real_pipeline):
        """Smoke-test real OCR recognition."""
        import cv2
        import numpy as np

        # Create a test image containing text
        image = np.ones((100, 300, 3), dtype=np.uint8) * 255
        cv2.putText(image, "Hello OCR", (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 2)

        result = real_pipeline.process(image)

        assert result is not None
        assert result.text_count >= 0  # text may or may not be detected; this only checks that the run completes

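7. Feature Enhancement Suggestions

7.1 Support dynamic language switching

Current state

Even once the lang parameter takes effect, switching languages requires re-initializing the OCR engine, which is slow.

Suggestion

Preload models for multiple languages, or implement a model pool: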

from typing import Dict

class OCREnginePool:
    """Pool of OCR engines, one per language."""

    def __init__(self):
        self._engines: Dict[str, OCREngine] = {}

    def get_engine(self, lang: str) -> OCREngine:
        # Lazily create and cache one engine per language (not thread-safe as written)
        if lang not in self._engines:
            config = OCRConfig(lang=lang)
            engine = OCREngine(config)
            engine.initialize()
            self._engines[lang] = engine
        return self._engines[lang]
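
When several requests ask for a not-yet-loaded language at the same time, a check-then-create pool can initialize the same engine more than once. A lock around the lookup avoids that; the sketch below uses a hypothetical FakeEngine in place of OCREngine and an injectable factory so the behavior can be observed.

```python
import threading
from typing import Callable, Dict


class FakeEngine:
    """Hypothetical stand-in for OCREngine; real initialization would load a model."""

    def __init__(self, lang: str) -> None:
        self.lang = lang


class EnginePool:
    def __init__(self, factory: Callable[[str], FakeEngine] = FakeEngine) -> None:
        self._factory = factory
        self._engines: Dict[str, FakeEngine] = {}
        self._lock = threading.Lock()

    def get_engine(self, lang: str) -> FakeEngine:
        # Holding the lock across the check and the creation guarantees that
        # two concurrent requests for a new language build only one engine.
        with self._lock:
            engine = self._engines.get(lang)
            if engine is None:
                engine = self._factory(lang)
                self._engines[lang] = engine
            return engine
```

The trade-off is that a slow first-time initialization blocks lookups for every language; a per-language lock would relax that at the cost of more bookkeeping.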

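7.2 Add result caching

Scenario

When the same image is recognized repeatedly, the cached result can be returned directly, saving compute.

Suggestion

Implement a cache keyed by an image hash: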

import hashlib
from typing import Dict

def get_image_hash(image_bytes: bytes) -> str:
    # MD5 serves only as a cache key here, not as a security measure
    return hashlib.md5(image_bytes).hexdigest()

# Use Redis or an in-memory cache; this plain dict is unbounded and grows without limit
_cache: Dict[str, OCRResult] = {}

def process_with_cache(image_bytes: bytes, pipeline: OCRPipeline) -> OCRResult:
    cache_key = get_image_hash(image_bytes)

    if cache_key in _cache:
        return _cache[cache_key]

    image = decode_image_bytes(image_bytes)
    result = pipeline.process(image)

    _cache[cache_key] = result
    return result
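
Two refinements are worth considering for an in-memory cache: bounding its size, and including request options in the key so that the same image processed with a different language or threshold is not served a stale result. The following is a sketch under those assumptions; BoundedResultCache is a hypothetical class, and strings stand in for OCRResult.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Tuple


class BoundedResultCache:
    """Small in-memory LRU cache keyed by image content plus request options."""

    def __init__(self, max_entries: int = 128) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[Tuple[str, str, float], str]" = OrderedDict()

    @staticmethod
    def make_key(image_bytes: bytes, lang: str, drop_score: float) -> Tuple[str, str, float]:
        # Options are part of the key: the same image with another language
        # or threshold is a different result.
        return hashlib.sha256(image_bytes).hexdigest(), lang, drop_score

    def get_or_compute(self, image_bytes: bytes, lang: str, drop_score: float,
                       compute: Callable[[], str]) -> str:
        key = self.make_key(image_bytes, lang, drop_score)
        if key in self._entries:
            self._entries.move_to_end(key)      # mark as most recently used
            return self._entries[key]
        result = compute()
        self._entries[key] = result
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # evict the least recently used entry
        return result
```

The class is not thread-safe as written; concurrent use would need a lock or an external cache such as Redis.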

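7.3 Support asynchronous processing

Scenario

When processing large batches of images, waiting synchronously takes too long.

Suggestion

Offer a task queue with webhook callbacks: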

import uuid

@router.post("/recognize/async")
async def recognize_async(
    file: UploadFile,
    callback_url: str = Form(..., description="Callback URL invoked when processing completes"),
) -> dict:
    # 1. Save the image to temporary storage
    task_id = str(uuid.uuid4())
    save_to_storage(task_id, await file.read())

    # 2. Submit the task to the queue
    queue.enqueue(process_ocr_task, task_id, callback_url)

    # 3. Return the task ID immediately
    return {"task_id": task_id, "status": "pending"}

@router.get("/task/{task_id}")
async def get_task_status(task_id: str) -> dict:
    # Look up the task status
    return {"task_id": task_id, "status": get_status(task_id)}
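
The submit-then-poll lifecycle can be sketched with the standard library alone. TaskStore below is a hypothetical in-memory registry built on ThreadPoolExecutor; it only illustrates the pending/done/failed states, and a real deployment would more likely use a persistent queue so that tasks survive restarts.

```python
import threading
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional


class TaskStore:
    """In-memory task registry; a real deployment would use a persistent queue."""

    def __init__(self, max_workers: int = 2) -> None:
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures: Dict[str, Future] = {}
        self._lock = threading.Lock()

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Schedule fn(*args) in the background and return a task ID immediately."""
        task_id = str(uuid.uuid4())
        with self._lock:
            self._futures[task_id] = self._executor.submit(fn, *args)
        return task_id

    def status(self, task_id: str) -> str:
        with self._lock:
            future = self._futures.get(task_id)
        if future is None:
            return "not_found"
        if not future.done():
            return "pending"
        return "failed" if future.exception() is not None else "done"

    def result(self, task_id: str, timeout: Optional[float] = None) -> Any:
        """Block until the task finishes; re-raises the task's exception if it failed."""
        return self._futures[task_id].result(timeout=timeout)
```

A route handler would call submit() and return the ID at once, and the status endpoint would map directly onto status().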

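7.4 Strengthen express-waybill parsing

Current state

Regex coverage is limited, and the formats of some courier companies are not recognized.

Suggestion

Expand the regex pattern library: collect more waybill samples and add rules for them.
Introduce an NER model: use named-entity recognition to extract names, addresses, and similar fields.
Add confidence scoring: rate the reliability of each parsed result.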

class ExpressParser:
    def parse(self, text_blocks: List[TextBlock]) -> ExpressInfo:
        info = self._extract_by_regex(text_blocks)

        # Fall back to NER when the regex result is not usable
        if not info.is_valid:
            info = self._extract_by_ner(text_blocks)

        # Score the reliability of the parsed result
        info.parse_confidence = self._evaluate_confidence(info)

        return info

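8. Optimization Priority Summary

By urgency

P0 - Fix immediately 🔴

Issue	Impact	Effort
API parameters have no effect	Feature is non-functional; user-supplied parameters are meaningless	Medium
Pipeline thread safety	Data mix-ups under concurrent requests; risk of production incidents	Medium
No rate limiting	Service is exposed to DoS attacks; stability risk	Low

P1 - Address soon 🟡

Issue	Impact	Effort
No logging system	Production problems cannot be diagnosed	Low
CORS too permissive	Security risk	Low
Missing image-dimension check	Risk of memory-exhaustion attacks	Low
Duplicated code	Higher maintenance cost	Low
Insufficient test coverage	Regression risk	Medium

P2 - Can be scheduled 🟢

Issue	Impact	Effort
Hard-coded configuration	Inflexible deployment	Medium
Overly broad exception handling	Problems are hard to locate	Low
Incomplete type annotations	Code readability	Low
Visualizer re-created per request	Minor performance cost	Low

P3 - Long-term improvements

Issue	Impact	Effort
Batch API	User experience	Medium
Result caching	Performance	Medium
Asynchronous processing	Support for large-batch workloads	High
Enhanced waybill parsing	Product competitiveness	High

Suggested fix order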

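1. [P0] Fix API parameter passing
2. [P0] Resolve the pipeline thread-safety issue
3. [P0] Add rate limiting
4. [P1] Introduce a logging framework
5. [P1] Fix the CORS configuration
6. [P1] Add image-dimension validation
7. [P1] Extract the duplicated code
8. [P2] Externalize configuration
9. [P1] Add unit tests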

Appendix: Quick-Fix Code Snippets

A. Fix API parameter passing

# api/routes/ocr.py

def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    params: OCRRequestParams,  # new parameter
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    image = decode_image_bytes(image_bytes)
    pipeline_config = build_pipeline_config(roi)

    # Key change: pass the parameters through to process()
    result = pipeline.process(
        image,
        pipeline_config=pipeline_config,
        drop_score=params.drop_score,
    )

    # ...

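B. Fix the thread-safety issue

# ocr/pipeline.py

def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    pipeline_config: Optional[PipelineConfig] = None,  # new parameter
    drop_score: Optional[float] = None,  # new parameter
) -> OCRResult:
    config = pipeline_config if pipeline_config is not None else self._pipeline_config
    # Explicit None check so that a drop_score of 0.0 is honored
    effective_drop_score = (
        drop_score if drop_score is not None else self._ocr_config.drop_score
    )

    # Use the passed-in configuration instead of mutating instance attributes
    # (_apply_roi must be changed to accept the ROI as an argument)
    cropped_image, roi_offset, roi_rect = self._apply_roi(image, config.roi)
    # ...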

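C. Add rate limiting

# requirements.txt
slowapi>=0.1.9

# api/main.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
# Register the handler so that exceeding the limit produces an HTTP 429 response
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# api/routes/ocr.py
from fastapi import Request
from api.main import limiter  # move limiter into its own module if this import is circular

@router.post("/recognize")
@limiter.limit("10/minute")
async def recognize_multipart(request: Request, ...):
    # ...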
