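# Vision-OCR Project In-Depth Analysis and Optimization Report

> Analysis date: 2026-01-08
> Scope: full-project code review

---

## Table of Contents

- [1. Architecture Design Issues](#1-architecture-design-issues)
- [2. Performance Optimizations](#2-performance-optimizations)
- [3. Code Quality Issues](#3-code-quality-issues)
- [4. Security Issues](#4-security-issues)
- [5. Configuration Management Issues](#5-configuration-management-issues)
- [6. Test Coverage Issues](#6-test-coverage-issues)
- [7. Feature Enhancement Suggestions](#7-feature-enhancement-suggestions)
- [8. Optimization Priority Summary](#8-optimization-priority-summary)

---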
## 1. Architecture Design Issues

### 1.1 API Request Parameters Have No Effect (Critical 🔴)

**Problem**

The request parameters `lang`, `use_gpu`, and `drop_score` are accepted by the API but are **never used** during OCR processing.

**Location**: `api/routes/ocr.py`, lines 78-95

```python
def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    # lang, use_gpu, and drop_score are neither passed in nor used!
    ...
```

**Impact**

- The language, GPU acceleration, and confidence threshold set by the user are silently ignored
- The API documentation does not match the actual behavior
- Users assume the parameters take effect, which causes confusion

**Solution**

Option 1: create or update an `OCRConfig` dynamically inside `_process_ocr`

```python
def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    params: OCRRequestParams,  # new parameter
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    # Build a per-request config
    ocr_config = OCRConfig(
        lang=params.lang,
        use_gpu=params.use_gpu,
        drop_score=params.drop_score,
    )
    # Process with the new config...
```

Option 2: let `OCRPipeline.process()` accept runtime parameters

```python
def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    drop_score: Optional[float] = None,  # runtime override
) -> OCRResult:
    # Compare against None: `drop_score or default` would discard a valid 0.0
    effective_drop_score = (
        drop_score if drop_score is not None else self._ocr_config.drop_score
    )
    # ...
```
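Option 2 falls back to the configured default when the request supplies no override. A fallback written with `or` also replaces a threshold of `0.0`, because `0.0` is falsy in Python even though it is a legitimate "keep everything" value. The sketch below contrasts the two forms. The helper names `resolve_override` and `resolve_with_or` are hypothetical and not part of the project, and the default of `0.5` is assumed from the `OCRConfig` defaults quoted later in this report.

```python
from typing import Optional

DEFAULT_DROP_SCORE = 0.5  # assumed default, mirroring the OCRConfig value quoted in this report


def resolve_override(override: Optional[float], default: float) -> float:
    """Return the per-request value when one was supplied, else the default."""
    return override if override is not None else default


def resolve_with_or(override: Optional[float], default: float) -> float:
    """Buggy variant for comparison: `or` treats 0.0 the same as None."""
    return override or default


if __name__ == "__main__":
    print(resolve_override(0.0, DEFAULT_DROP_SCORE))  # keeps the explicit 0.0
    print(resolve_with_or(0.0, DEFAULT_DROP_SCORE))   # silently falls back to the default
```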

---

### 1.2 Temporary Pipeline Config Swap Is Not Thread-Safe (Critical 🔴)

**Problem**

`api/routes/ocr.py` mutates the shared pipeline's config object directly, which creates a race condition under concurrent requests.

**Location**: `api/routes/ocr.py`, lines 102-113

```python
# Temporarily swap the pipeline config
original_config = pipeline._pipeline_config
pipeline._pipeline_config = pipeline_config  # ⚠️ not thread-safe!

try:
    result = pipeline.process(image)
finally:
    pipeline._pipeline_config = original_config  # ⚠️ may restore the wrong config under concurrency
```

**Impact**

- Concurrent requests from different users interfere with each other's config
- User A's ROI settings may be applied to user B's request
- The resulting bugs are unpredictable and hard to reproduce

**Solution**

Option 1: pass the config as an argument to `process()` (recommended)

```python
def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    pipeline_config: Optional[PipelineConfig] = None,  # new
) -> OCRResult:
    config = pipeline_config or self._pipeline_config
    # Use the passed-in config...
```

Option 2: use `contextvars` for per-request isolation

```python
from contextvars import ContextVar

_request_config: ContextVar[PipelineConfig] = ContextVar('request_config')

# Set at the start of request handling
_request_config.set(pipeline_config)

# Read inside process()
config = _request_config.get(self._pipeline_config)
```

Option 3: use a thread lock (serializes requests, so throughput suffers; not recommended)

```python
import threading

class OCRPipeline:
    _lock = threading.Lock()

    def process_with_config(self, image, config):
        with self._lock:
            original = self._pipeline_config
            self._pipeline_config = config
            try:
                return self.process(image)
            finally:
                self._pipeline_config = original
```

---

### 1.3 No Logging System (Medium 🟡)

**Problem**

The whole project writes messages with `print()`, so log level, format, and output destination cannot be controlled.

**Location**: spread across multiple files

```python
# main.py
print("[INFO] Initializing OCR system...")

# input/loader.py
print(f"[ERROR] File not found: {path}")

# api/main.py
print("[INFO] Loading OCR model...")
```

**Impact**

- Logs cannot be filtered by level (development vs. production)
- Logs cannot be sent to a file or a logging service
- Context such as timestamps and call sites is missing
- Log aggregation and analysis are not possible

**Solution**

Adopt Python's standard `logging` module:

```python
# utils/logger.py
import logging
import sys

def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(level)

    # Attach the handler only once, otherwise repeated calls duplicate every message
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        ))
        logger.addHandler(handler)

    return logger

# Usage
logger = setup_logger(__name__)
logger.info("Initializing OCR system...")
logger.error("File not found: %s", path)
```

Or use `loguru` (more concise):

```python
from loguru import logger

logger.info("Initializing OCR system...")
logger.error(f"File not found: {path}")
```

---

## 2. Performance Optimizations

### 2.1 No Batch API (Medium 🟡)

**Current state**

The API processes only one image per request, so batch recognition requires multiple calls.

**Impact**

- High network round-trip overhead
- GPU batching cannot be fully exploited
- Client implementations become more complex

**Solution**

Add a batch endpoint:

```python
@router.post("/recognize/batch")
async def recognize_batch(
    files: List[UploadFile] = File(..., max_length=10),
    params: OCRRequestParams = Depends(parse_multipart_params),
    pipeline: OCRPipeline = Depends(get_ocr_pipeline),
) -> BatchOCRResponse:
    results = []
    # Images are still processed one by one here; this removes the network
    # round trips, while true GPU batching would need support in the engine
    for file in files:
        image_bytes = await parse_multipart_image(file)
        result, _ = _process_ocr(image_bytes, pipeline, params.get_roi(), False)
        results.append(_convert_ocr_result_to_response(result))

    return BatchOCRResponse(success=True, data=results)
```

---

### 2.2 Visualizer Is Re-created on Every Request (Low 🟢)

**Problem**

`api/routes/ocr.py` creates a new `OCRVisualizer` for every request.

**Location**: `api/routes/ocr.py`, lines 117-120

```python
if return_annotated_image and result.text_count > 0:
    visualizer = OCRVisualizer(VisualizeConfig())  # created on every request
    annotated = visualizer.draw_result(image, result)
```

**Impact**

- The Chinese font file is reloaded each time
- PIL/OpenCV initialization overhead is paid repeatedly

**Solution**

Make the visualizer an application-level singleton:

```python
# api/dependencies.py
_visualizer: Optional[OCRVisualizer] = None

def get_visualizer() -> OCRVisualizer:
    global _visualizer
    if _visualizer is None:
        _visualizer = OCRVisualizer(VisualizeConfig())
    return _visualizer
```

---

### 2.3 Fixed Image Encoding Format (Low 🟢)

**Current state**

Annotated images are always returned as JPEG.

**Location**: `api/dependencies.py`, line 119

```python
def encode_image_base64(image: np.ndarray, format: str = ".jpg") -> str:
```

**Suggestion**

Let the caller choose the output format; PNG suits cases that need transparency or lossless compression:

```python
def encode_image_base64(
    image: np.ndarray,
    format: str = ".jpg",
    quality: int = 95,  # JPEG quality
) -> str:
    params = []
    if format in (".jpg", ".jpeg"):
        params = [cv2.IMWRITE_JPEG_QUALITY, quality]
    elif format == ".png":
        params = [cv2.IMWRITE_PNG_COMPRESSION, 3]

    success, encoded = cv2.imencode(format, image, params)
    # ...
```

---

## 3. Code Quality Issues

### 3.1 Duplicated Code in OCR Routes (Medium 🟡)

**Problem**

The logic that parses an express waybill is duplicated verbatim in `express_multipart` and `express_base64`, roughly 30+ lines each.

**Location**: `api/routes/ocr.py`, lines 204-242 and lines 322-360

```python
# Inside express_multipart
express_info = result.parse_express()
merged_text = result.merge_text()
return ExpressResponse(
    success=True,
    data=ExpressResultData(
        processing_time_ms=result.processing_time_ms,
        express_info=ExpressInfoData(
            tracking_number=express_info.tracking_number,
            sender=ExpressPersonData(
                name=express_info.sender_name,
                phone=express_info.sender_phone,
                address=express_info.sender_address,
            ),
            receiver=ExpressPersonData(
                name=express_info.receiver_name,
                phone=express_info.receiver_phone,
                address=express_info.receiver_address,
            ),
            courier_company=express_info.courier_company,
            confidence=express_info.confidence,
            extra_fields=express_info.extra_fields,
            raw_text=express_info.raw_text,
        ),
        merged_text=merged_text,
        annotated_image_base64=annotated_base64,
    ),
)
```

**Impact**

- A change in one place must be mirrored in the other
- A missed update leads to inconsistent behavior between the two endpoints

**Solution**

Extract a shared helper function:

```python
def _convert_express_result_to_response(
    result: OCRResult,
    annotated_base64: Optional[str] = None,
) -> ExpressResultData:
    """Convert an OCRResult into express waybill response data."""
    express_info = result.parse_express()
    merged_text = result.merge_text()

    return ExpressResultData(
        processing_time_ms=result.processing_time_ms,
        express_info=ExpressInfoData(
            tracking_number=express_info.tracking_number,
            sender=ExpressPersonData(
                name=express_info.sender_name,
                phone=express_info.sender_phone,
                address=express_info.sender_address,
            ),
            receiver=ExpressPersonData(
                name=express_info.receiver_name,
                phone=express_info.receiver_phone,
                address=express_info.receiver_address,
            ),
            courier_company=express_info.courier_company,
            confidence=express_info.confidence,
            extra_fields=express_info.extra_fields,
            raw_text=express_info.raw_text,
        ),
        merged_text=merged_text,
        annotated_image_base64=annotated_base64,
    )
```

---

### 3.2 Exception Handling Is Too Broad (Medium 🟡)

**Problem**

Broad `except Exception` handlers are used in several places and swallow every exception.

**Location**: `api/routes/ocr.py`, lines 165-172, among others

```python
except Exception as e:
    return OCRResponse(
        success=False,
        error=ErrorDetail(
            code=type(e).__name__,
            message=str(e),
        ),
    )
```

**Impact**

- The actual error details are hidden
- The root cause is hard to locate
- Serious system errors may be masked

**Solution**

Catch the expected exceptions explicitly and let unexpected ones propagate:

```python
from api.exceptions import OCRAPIException, InvalidImageError, OCRProcessingError

try:
    # ...
except OCRAPIException as e:
    # Business exception: return a friendly message
    return OCRResponse(
        success=False,
        error=ErrorDetail(code=type(e).__name__, message=e.message),
    )
except Exception:
    # Unexpected exception: log it with its traceback, then re-raise so the
    # global exception handler returns a generic error
    logger.exception("Unexpected error during OCR processing")
    raise
```

---

### 3.3 Incomplete Type Annotations (Low 🟢)

**Problem**

Some functions lack complete type annotations for their parameters and return values.

**Location**: `ocr/pipeline.py`, line 185

```python
def _apply_roi(self, image: np.ndarray) -> tuple:
    # Should be: -> Tuple[np.ndarray, Tuple[int, int], Optional[Tuple[int, int, int, int]]]
```

**Suggestion**

Add the full type annotations:

```python
from typing import Tuple, Optional

def _apply_roi(
    self,
    image: np.ndarray
) -> Tuple[np.ndarray, Tuple[int, int], Optional[Tuple[int, int, int, int]]]:
    """
    Returns:
        (cropped image, ROI offset, ROI rectangle)
    """
```

---

## 4. Security Issues

### 4.1 No Rate Limiting (Critical 🔴)

**Problem**

The API has no request-rate limit at all, which leaves it open to request floods (DoS) and abuse.

**Impact**

- A malicious user can send an unlimited number of requests
- OCR is CPU/GPU intensive, so the service is easily overloaded
- Compute costs can become very high

**Solution**

Use `slowapi` for rate limiting:

```python
# requirements.txt: add
#   slowapi>=0.1.9

# api/main.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# api/routes/ocr.py
from fastapi import Request

@router.post("/recognize")
@limiter.limit("10/minute")  # at most 10 requests per minute per client IP
async def recognize_multipart(request: Request, ...):  # slowapi needs the `request` argument
    ...
```
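For readers unfamiliar with what a limit such as `"10/minute"` means in practice, the following standard-library sketch shows one common way to enforce it: a sliding window of timestamps per client key. This only illustrates the idea and is not how `slowapi` is implemented. The class name `SlidingWindowLimiter` and its injectable `clock` parameter are assumptions made for the example. It keeps state in process memory and is not thread-safe, so it would not coordinate across multiple workers.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key (e.g. a client IP)."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock  # injectable so the behavior can be checked without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have left the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: the caller would answer HTTP 429
        hits.append(now)
        return True
```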

---

### 4.2 CORS Configuration Is Too Permissive (Medium 🟡)

**Problem**

A production deployment should not allow requests from every origin.

**Location**: `api/main.py`, lines 106-112

```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # ⚠️ dangerous: allows any domain
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```

**Impact**

- Any website can call your API
- The API may be abused for CSRF attacks
- Sensitive data may leak to third parties

**Solution**

Configure the allowed origins through an environment variable:

```python
import os

# Comma-separated list; strip whitespace and skip empty entries
ALLOWED_ORIGINS = [
    origin.strip()
    for origin in os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(",")
    if origin.strip()
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```
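Since the origin list now comes from deployment configuration, it may be worth validating it at startup. The sketch below is one possible approach under stated assumptions: the helper `parse_allowed_origins` is hypothetical, and refusing `*` when credentials are enabled is a policy chosen for this example (it matches the concern raised above, not a requirement taken from the project).

```python
import os
from typing import List


def parse_allowed_origins(raw: str, allow_credentials: bool = True) -> List[str]:
    """Split a comma-separated origin list, trimming blanks and dropping duplicates."""
    origins: List[str] = []
    for item in raw.split(","):
        # Browsers send the Origin header without a trailing slash, so normalize it away
        origin = item.strip().rstrip("/")
        if origin and origin not in origins:
            origins.append(origin)
    if allow_credentials and "*" in origins:
        raise ValueError("wildcard origin '*' must not be combined with credentials")
    return origins


if __name__ == "__main__":
    print(parse_allowed_origins(os.getenv("ALLOWED_ORIGINS", "http://localhost:3000")))
```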

---

### 4.3 No Image Dimension Check After Base64 Decoding (Medium 🟡)

**Problem**

A malicious user can craft an image with an extreme compression ratio (a decompression bomb, similar to a zip bomb) that occupies a huge amount of memory once decoded.

**Location**: `api/dependencies.py`, lines 94-116

**Impact**

- A single request may consume several GB of memory
- The service may crash or be killed by the OOM killer

**Solution**

Add a dimension check after the image is decoded. Note that this stops oversized images from reaching the rest of the pipeline, but the decode itself has already allocated the pixel buffer by the time the check runs:

```python
def decode_image_bytes(content: bytes, max_dimension: int = 10000) -> np.ndarray:
    """
    Decode image bytes into a numpy array.

    Args:
        content: raw image bytes
        max_dimension: maximum allowed image size (width or height)
    """
    try:
        nparr = np.frombuffer(content, np.uint8)
        image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
        if image is None:
            raise InvalidImageError("Failed to decode image")

        # New: check the image dimensions
        height, width = image.shape[:2]
        if width > max_dimension or height > max_dimension:
            raise InvalidImageError(
                f"Image too large ({width}x{height}); maximum allowed is {max_dimension}x{max_dimension}"
            )

        return image
    except InvalidImageError:
        raise
    except Exception as e:
        raise InvalidImageError(f"Failed to decode image: {e}") from e
```
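To reject a decompression bomb before any pixel memory is allocated, the declared dimensions can be read from the file header first. The sketch below does this for PNG only, using the fixed layout of the signature and the IHDR chunk. The function names `read_png_dimensions` and `exceeds_limit` are hypothetical. Other formats (JPEG, WebP, and so on) need their own header parsing or a library that reads headers lazily, so treat this as an illustration of the idea, not a complete guard.

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_dimensions(content: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from a PNG's IHDR chunk, or None if `content` is not a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then 4-byte big-endian width and 4-byte big-endian height
    if len(content) < 24 or not content.startswith(PNG_SIGNATURE):
        return None
    if content[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", content[16:24])
    return width, height


def exceeds_limit(content: bytes, max_dimension: int = 10000) -> bool:
    """True when the declared PNG size is over the limit; False for non-PNG input."""
    dims = read_png_dimensions(content)
    return dims is not None and max(dims) > max_dimension
```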

---

## 5. Configuration Management Issues

### 5.1 Hard-Coded Configuration (Medium 🟡)

**Problem**

Several settings are hard-coded and cannot be adjusted through environment variables.

**Location**

```python
# api/security.py:19
MAX_FILE_SIZE = 10 * 1024 * 1024  # hard-coded

# api/main.py:48-56
return OCRConfig(
    lang="ch",  # hard-coded default
    use_angle_cls=True,
    use_gpu=False,
    drop_score=0.5,
)
```

**Impact**

- Different environments (development/testing/production) cannot use different settings
- Changing a setting requires a code change and a redeploy

**Solution**

Manage all settings in one place with `pydantic-settings`:

```python
# utils/settings.py
from pydantic_settings import BaseSettings, SettingsConfigDict
from typing import List

class Settings(BaseSettings):
    # pydantic-settings v2 style; replaces the older inner `class Config`
    model_config = SettingsConfigDict(env_prefix="VISION_OCR_", env_file=".env")

    # File upload limits
    max_file_size: int = 10 * 1024 * 1024
    max_image_dimension: int = 10000

    # OCR defaults
    ocr_default_lang: str = "ch"
    ocr_use_gpu: bool = False
    ocr_drop_score: float = 0.5

    # API settings
    api_rate_limit: str = "10/minute"
    # List values are read from the environment as JSON,
    # e.g. VISION_OCR_CORS_ORIGINS='["https://example.com"]'
    cors_origins: List[str] = ["http://localhost:3000"]

    # Logging
    log_level: str = "INFO"

settings = Settings()
```

Usage example:

```python
from utils.settings import settings

MAX_FILE_SIZE = settings.max_file_size

ocr_config = OCRConfig(
    lang=settings.ocr_default_lang,
    use_gpu=settings.ocr_use_gpu,
    drop_score=settings.ocr_drop_score,
)
```

---

### 5.2 API Version Number Defined in Multiple Places (Low 🟢)

**Problem**

The version number is defined in more than one place, so the copies can drift apart.

**Location**

```python
# api/main.py:98
version="1.0.0",

# api/routes/health.py:14
API_VERSION = "1.0.0"
```

**Solution**

Read the version from a single source:

```python
# api/__init__.py
__version__ = "1.0.0"

# Used from other files
from api import __version__
```

Or read it at runtime from the installed package metadata (the version declared in `pyproject.toml`; this requires the package to be installed):

```python
from importlib.metadata import version

__version__ = version("vision-ocr")
```

---

## 6. Test Coverage Issues

### 6.1 Core Modules Lack Unit Tests (Medium 🟡)

**Current state**

Only API integration tests exist; the core business logic has no unit tests.

**Missing tests**

| Module | Test coverage | Risk |
|------|----------|------|
| `ocr/engine.py` | ❌ None | High - core OCR logic |
| `ocr/express_parser.py` | ❌ None | High - complex regex matching |
| `ocr/pipeline.py` | ❌ None | High - processing flow |
| `input/loader.py` | ❌ None | Medium - file loading |
| `visualize/draw.py` | ❌ None | Low - visualization |
| `utils/config.py` | ❌ None | Low - config classes |

**Suggestion**

Add unit tests for `ExpressParser` first (highest priority):

```python
# tests/test_express_parser.py
import pytest
from ocr.express_parser import ExpressParser
from ocr.engine import TextBlock

class TestExpressParser:
    @pytest.fixture
    def parser(self):
        return ExpressParser()

    def test_extract_tracking_number(self, parser):
        text_blocks = [
            TextBlock(
                text="运单号:SF1234567890",  # "Waybill No.: SF1234567890"
                confidence=0.95,
                bbox=[[0, 0], [100, 0], [100, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.tracking_number == "SF1234567890"

    def test_extract_phone_number(self, parser):
        text_blocks = [
            TextBlock(
                text="收件人:张三 13800138000",  # "Recipient: Zhang San 13800138000"
                confidence=0.95,
                bbox=[[0, 0], [200, 0], [200, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.receiver_phone == "13800138000"

    def test_detect_courier_company(self, parser):
        text_blocks = [
            TextBlock(
                text="顺丰速运",  # SF Express
                confidence=0.95,
                bbox=[[0, 0], [100, 0], [100, 20], [0, 20]],
            )
        ]
        result = parser.parse(text_blocks)
        assert result.courier_company == "顺丰速运"
```

---

### 6.2 Mock-Based Tests Can Pass Falsely (Medium 🟡)

**Problem**

The tests use a mock pipeline throughout, so real OCR behavior is never verified.

**Location**: `tests/conftest.py`, lines 25-67

```python
@pytest.fixture(scope="session")
def mock_ocr_pipeline():
    mock_pipeline = MagicMock()
    mock_pipeline.process.return_value = mock_result  # always returns a fixed result
```

**Impact**

- Problems in the OCR engine go undetected
- Tests may keep passing after an interface change
- The end-to-end flow is never exercised

**Solution**

Add integration tests that can be run selectively:

```python
# tests/test_integration.py
import pytest
import os

# An environment variable controls whether integration tests run
SKIP_INTEGRATION = os.getenv("SKIP_INTEGRATION_TESTS", "true").lower() == "true"

@pytest.mark.skipif(SKIP_INTEGRATION, reason="Skipping integration tests")
class TestOCRIntegration:
    @pytest.fixture(scope="class")
    def real_pipeline(self):
        """Use the real OCR pipeline."""
        from ocr.pipeline import OCRPipeline
        from utils.config import OCRConfig, PipelineConfig

        pipeline = OCRPipeline(OCRConfig(), PipelineConfig())
        pipeline.initialize()
        return pipeline

    def test_real_ocr_recognition(self, real_pipeline):
        """Run real OCR recognition end to end."""
        import cv2
        import numpy as np

        # Create a test image that contains text
        image = np.ones((100, 300, 3), dtype=np.uint8) * 255
        cv2.putText(image, "Hello OCR", (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 2)

        result = real_pipeline.process(image)

        assert result is not None
        # Text may or may not be recognized; this only checks that the pipeline runs
        assert result.text_count >= 0
```
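The "tests keep passing after an interface change" risk can also be reduced inside the mocked tests themselves by building the mock from the real class. The sketch below uses `unittest.mock.create_autospec` from the standard library. The `OCRPipeline` class here is a hypothetical stand-in with the `process` signature quoted in this report; in the real test suite the actual class would be imported instead.

```python
from unittest.mock import MagicMock, create_autospec


class OCRPipeline:
    """Hypothetical stand-in with the `process` signature used in this report."""

    def process(self, image, image_path=None):
        raise NotImplementedError


loose_mock = MagicMock()
strict_mock = create_autospec(OCRPipeline, instance=True)

# A plain MagicMock accepts any call, even one the real class would reject
loose_mock.process("img", no_such_argument=True)

# The spec'd mock enforces the real method signature
try:
    strict_mock.process("img", no_such_argument=True)
    signature_checked = False
except TypeError:
    signature_checked = True
```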

---

## 7. Feature Enhancement Suggestions

### 7.1 Support Dynamic Language Switching

**Current state**

Even once the `lang` parameter takes effect, switching languages requires re-initializing the OCR engine, which is slow.

**Suggestion**

Preload models for multiple languages, or implement a model pool:

```python
from typing import Dict

class OCREnginePool:
    """OCR engine pool with multi-language support."""

    def __init__(self):
        self._engines: Dict[str, OCREngine] = {}

    def get_engine(self, lang: str) -> OCREngine:
        if lang not in self._engines:
            config = OCRConfig(lang=lang)
            engine = OCREngine(config)
            engine.initialize()
            self._engines[lang] = engine
        return self._engines[lang]
```
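Under concurrent requests, two threads asking for the same new language at the same time could both initialize an engine. A lock with a second check inside it avoids the duplicate work. The following is a sketch under stated assumptions: `LockedEnginePool` and its `factory` callable are hypothetical, with `factory` standing in for "create and initialize an `OCREngine` for this language". A single lock also means that initializing one language briefly blocks first-time requests for another.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

E = TypeVar("E")


class LockedEnginePool(Generic[E]):
    """Create at most one engine per language, even under concurrent first requests."""

    def __init__(self, factory: Callable[[str], E]) -> None:
        self._factory = factory  # hypothetical: builds and initializes an engine for `lang`
        self._engines: Dict[str, E] = {}
        self._lock = threading.Lock()

    def get_engine(self, lang: str) -> E:
        engine = self._engines.get(lang)
        if engine is None:
            with self._lock:
                # Re-check inside the lock: another thread may have created it meanwhile
                engine = self._engines.get(lang)
                if engine is None:
                    engine = self._factory(lang)
                    self._engines[lang] = engine
        return engine
```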

---

### 7.2 Add Result Caching

**Scenario**

When the same image is recognized repeatedly, the cached result can be returned directly to save compute.

**Suggestion**

Implement a cache keyed by the image hash:

```python
import hashlib
from typing import Dict

def get_image_hash(image_bytes: bytes) -> str:
    return hashlib.md5(image_bytes).hexdigest()

# Use Redis or an in-memory cache
# (this dict is unbounded, and the key ignores request parameters such as lang or ROI)
_cache: Dict[str, OCRResult] = {}

def process_with_cache(image_bytes: bytes, pipeline: OCRPipeline) -> OCRResult:
    cache_key = get_image_hash(image_bytes)

    if cache_key in _cache:
        return _cache[cache_key]

    image = decode_image_bytes(image_bytes)
    result = pipeline.process(image)

    _cache[cache_key] = result
    return result
```
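Two practical concerns with an in-memory cache are unbounded growth and returning a result computed with different request parameters. The sketch below addresses both with a size-limited LRU cache whose key combines the image hash with the parameters. `BoundedResultCache` and its method names are hypothetical, the choice of SHA-256 is this example's and not the project's, and the class is not thread-safe as written.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Generic, Hashable, Tuple, TypeVar

R = TypeVar("R")


class BoundedResultCache(Generic[R]):
    """LRU cache keyed by (image hash, request parameters)."""

    def __init__(self, max_entries: int = 256) -> None:
        self._max = max_entries
        self._data: "OrderedDict[Tuple[str, Hashable], R]" = OrderedDict()

    @staticmethod
    def make_key(image_bytes: bytes, params: Hashable) -> Tuple[str, Hashable]:
        # Include the parameters so a result for lang="ch" is never served for lang="en"
        return hashlib.sha256(image_bytes).hexdigest(), params

    def get_or_compute(self, image_bytes: bytes, params: Hashable,
                       compute: Callable[[], R]) -> R:
        key = self.make_key(image_bytes, params)
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        result = compute()
        self._data[key] = result
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry
        return result
```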

---

### 7.3 Support Asynchronous Processing

**Scenario**

When processing large batches of images, waiting synchronously takes too long.

**Suggestion**

Offer a task queue with a webhook callback:

```python
import uuid

# save_to_storage, queue, process_ocr_task, and get_status are placeholders
# for the storage and task-queue layer

@router.post("/recognize/async")
async def recognize_async(
    file: UploadFile,
    callback_url: str = Form(..., description="Callback URL invoked when processing completes"),
) -> dict:
    # 1. Save the image to temporary storage
    task_id = str(uuid.uuid4())
    save_to_storage(task_id, await file.read())

    # 2. Submit the task to the queue
    queue.enqueue(process_ocr_task, task_id, callback_url)

    # 3. Return the task ID immediately
    return {"task_id": task_id, "status": "pending"}

@router.get("/task/{task_id}")
async def get_task_status(task_id: str) -> dict:
    # Look up the task status
    return {"task_id": task_id, "status": get_status(task_id)}
```

---

### 7.4 Improve Express Waybill Parsing

**Current state**

Regex coverage is limited, and the formats of some courier companies are not recognized.

**Suggestion**

1. **Extend the regex pattern library**: collect more waybill samples and add rules for them
2. **Introduce an NER model**: use named entity recognition to extract names, addresses, and similar fields
3. **Add confidence scoring**: rate how reliable each parse result is

```python
class ExpressParser:
    def parse(self, text_blocks: List[TextBlock]) -> ExpressInfo:
        info = self._extract_by_regex(text_blocks)

        # If the regex pass does not yield a valid result, try NER
        if not info.is_valid:
            info = self._extract_by_ner(text_blocks)

        # Score the reliability of the parse result
        info.parse_confidence = self._evaluate_confidence(info)

        return info
```
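The report does not specify how `_evaluate_confidence` should compute its score. One simple possibility, shown here purely as an illustration, is a weighted share of the fields that were successfully extracted. The `ExpressInfo` dataclass below is a hypothetical subset of the fields seen elsewhere in this report, and the weights in `FIELD_WEIGHTS` are made-up values that would need tuning against real waybills.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpressInfo:
    """Hypothetical subset of the waybill fields shown elsewhere in this report."""
    tracking_number: Optional[str] = None
    sender_name: Optional[str] = None
    sender_phone: Optional[str] = None
    receiver_name: Optional[str] = None
    receiver_phone: Optional[str] = None
    receiver_address: Optional[str] = None


# Assumed weights (summing to 1.0): how much each field contributes to the score
FIELD_WEIGHTS = {
    "tracking_number": 0.3,
    "receiver_name": 0.2,
    "receiver_phone": 0.2,
    "receiver_address": 0.2,
    "sender_name": 0.05,
    "sender_phone": 0.05,
}


def evaluate_confidence(info: ExpressInfo) -> float:
    """Weighted share of non-empty fields, between 0.0 and 1.0."""
    score = sum(weight for name, weight in FIELD_WEIGHTS.items() if getattr(info, name))
    return round(score, 2)
```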

---

## 8. Optimization Priority Summary

### By Urgency

#### P0 - Fix Immediately 🔴

| Issue | Impact | Effort |
|------|------|--------|
| API parameters have no effect | Feature is non-functional; user-supplied parameters are meaningless | Medium |
| Pipeline thread safety | Data mix-ups under concurrent requests; risk of production incidents | Medium |
| No rate limiting | Service can be flooded with requests; stability risk | Low |

#### P1 - Address Soon 🟡

| Issue | Impact | Effort |
|------|------|--------|
| No logging system | Production issues cannot be diagnosed | Low |
| CORS too permissive | Security risk | Low |
| Missing image dimension check | Memory exhaustion attack risk | Low |
| Code duplication | Higher maintenance cost | Low |
| Insufficient test coverage | Regression risk | Medium |

#### P2 - Can Be Planned 🟢

| Issue | Impact | Effort |
|------|------|--------|
| Hard-coded configuration | Poor deployment flexibility | Medium |
| Broad exception handling | Hard to locate problems | Low |
| Incomplete type annotations | Code readability | Low |
| Visualizer re-created per request | Minor performance cost | Low |

#### P3 - Long-Term Optimization

| Issue | Impact | Effort |
|------|------|--------|
| Batch API | User experience | Medium |
| Result caching | Performance | Medium |
| Asynchronous processing | Support for large-batch scenarios | High |
| Waybill parsing improvements | Product competitiveness | High |

---

### Suggested Fix Order

```
1. [P0] Fix API parameter passing
2. [P0] Resolve the pipeline thread-safety issue
3. [P0] Add rate limiting
4. [P1] Introduce a logging framework
5. [P1] Fix the CORS configuration
6. [P1] Add image dimension validation
7. [P1] Extract duplicated code
8. [P1] Add unit tests
9. [P2] Externalize configuration
```

---

## Appendix: Quick-Fix Code Snippets

### A. Fix API Parameter Passing

```python
# api/routes/ocr.py

def _process_ocr(
    image_bytes: bytes,
    pipeline: OCRPipeline,
    params: OCRRequestParams,  # new
    roi: Optional[ROIParams] = None,
    return_annotated_image: bool = False,
) -> tuple[OCRResult, Optional[str]]:
    image = decode_image_bytes(image_bytes)
    pipeline_config = build_pipeline_config(roi)

    # Key point: pass the parameters to process()
    result = pipeline.process(
        image,
        pipeline_config=pipeline_config,
        drop_score=params.drop_score,
    )

    # ...
```

### B. Fix the Thread-Safety Issue

```python
# ocr/pipeline.py

def process(
    self,
    image: np.ndarray,
    image_path: Optional[str] = None,
    pipeline_config: Optional[PipelineConfig] = None,  # new
    drop_score: Optional[float] = None,  # new
) -> OCRResult:
    config = pipeline_config or self._pipeline_config
    # Compare against None so that an explicit 0.0 is honored
    effective_drop_score = (
        drop_score if drop_score is not None else self._ocr_config.drop_score
    )

    # Use the passed-in config instead of mutating instance attributes
    cropped_image, roi_offset, roi_rect = self._apply_roi(image, config.roi)
    # ...
```

### C. Add Rate Limiting

```python
# requirements.txt
#   slowapi>=0.1.9

# api/limiter.py (suggested new module: importing `limiter` from api.main inside
# the routes may create a circular import if api.main imports the routers)
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

# api/main.py
from slowapi import _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from api.limiter import limiter

app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# api/routes/ocr.py
from fastapi import Request
from api.limiter import limiter

@router.post("/recognize")
@limiter.limit("10/minute")
async def recognize_multipart(request: Request, ...):
    # ...
```

---