On April 1, a patent for a “Broad Data Collection Method and System,” filed by DeepSeek affiliate Hangzhou DeepSeek AI Fundamental Technology Research Co., Ltd., was officially published.
The patented method offers several key advantages:
- It maximizes the discovery of web links while minimizing traffic impact on target websites.
- It analyzes downloaded content and estimates the quality of undiscovered links to optimize bandwidth allocation, which reduces low-quality and redundant downloads and improves both data quality and collection efficiency.
- It uses a dedicated information backfill queue to keep webpage metadata updates atomic and stable.
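
The patent text itself is not reproduced here, so the following is only a rough Python sketch of how the three ideas in the list might fit together, not the patented implementation. Every name in it (`Frontier`, `estimate_quality`, `MetadataStore`, `per_host_budget`) and the equal weighting in the quality estimate are assumptions made for illustration.

```python
import heapq
from collections import deque
from urllib.parse import urlsplit


def estimate_quality(parent_score, host_scores):
    """Guess an unseen link's quality from its parent page and its host's history.

    Hypothetical heuristic: average the linking page's score with the mean
    score of pages already downloaded from the same host.
    """
    if not host_scores:
        return parent_score
    host_avg = sum(host_scores) / len(host_scores)
    return 0.5 * parent_score + 0.5 * host_avg


class Frontier:
    """Toy crawl frontier: best-estimated links first, with a per-host cap."""

    def __init__(self, per_host_budget=2):
        self.per_host_budget = per_host_budget  # max URLs per host in one batch
        self._heap = []      # entries: (-estimated_quality, insertion_order, url)
        self._seen = set()   # URLs already queued, to avoid redundant downloads
        self._order = 0

    def add(self, url, estimated_quality):
        """Queue a URL once; return False if it was already known."""
        if url in self._seen:
            return False
        self._seen.add(url)
        heapq.heappush(self._heap, (-estimated_quality, self._order, url))
        self._order += 1
        return True

    def next_batch(self, size):
        """Pop up to `size` URLs, highest estimate first, respecting the host cap."""
        batch, deferred, per_host = [], [], {}
        while self._heap and len(batch) < size:
            entry = heapq.heappop(self._heap)
            host = urlsplit(entry[2]).netloc
            if per_host.get(host, 0) >= self.per_host_budget:
                deferred.append(entry)  # over budget for this host: try again later
                continue
            per_host[host] = per_host.get(host, 0) + 1
            batch.append(entry[2])
        for entry in deferred:
            heapq.heappush(self._heap, entry)
        return batch


class MetadataStore:
    """Metadata updates go through a backfill queue with a single writer."""

    def __init__(self):
        self._records = {}
        self._backfill = deque()

    def enqueue(self, url, fields):
        self._backfill.append((url, dict(fields)))

    def drain(self):
        """Apply queued updates in order; return how many were applied."""
        applied = 0
        while self._backfill:
            url, fields = self._backfill.popleft()
            merged = {**self._records.get(url, {}), **fields}
            # The merged record replaces the old one in a single assignment,
            # so a reader sees either the old record or the new one.
            self._records[url] = merged
            applied += 1
        return applied

    def get(self, url):
        return self._records.get(url)
```

In this sketch the per-host cap stands in for limiting traffic to any one site, the priority heap and the seen-set stand in for steering bandwidth away from low-quality and duplicate links, and the queue-plus-single-writer pattern stands in for atomic metadata updates. A production crawler would also need time-based rate limiting, robots.txt handling, and durable storage, none of which are shown.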
The patented method is intended to make web data collection more efficient while using fewer resources, marking another step forward in DeepSeek’s AI-driven data processing capabilities.