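Kongfz (孔夫子旧书网) is a major Chinese marketplace and data platform for used books and antiquarian titles. Its open API exposes metadata for ancient books, rare editions, and used books. Unlike a typical e-commerce API, its data model includes fields specific to antiquarian books, such as era, condition, and edition, and it enforces strict rules on call frequency and request signing. This article walks through the full call flow from a practical angle and covers four scenarios: authentication and signing, ancient-book search, shop integration, and handling of the specialized data. It provides reusable Python code and a list of pitfalls for projects in ancient-book digitization, academic research, and used-book dealer management.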
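I. Prerequisites for calling the API (with a focus on ancient books)
1. Core parameters and API conventions (must know)
Before calling the Kongfz API, be clear about the basic parameters and platform limits, and pay particular attention to the fields that exist only for ancient books: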
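Category | Parameter | Description | Required |
Authentication | appKey | Unique application ID assigned by the platform (available on the open platform after registration) | Yes |
Authentication | appSecret | Signing secret (keep it safe; store it in an environment variable rather than hard-coding it) | Yes |
Authentication | timestamp | Timestamp in milliseconds (must be within 5 minutes of the platform server time, otherwise the signature is rejected) | Yes |
Authentication | signature | Signature string generated by the platform's rule (MD5, 32 lowercase hex characters) | Yes |
Ancient-book search | era | Era, e.g. "清代" (Qing), "民国" (Republican period), "明代" (Ming); ordinary e-commerce APIs have no such field | No |
Ancient-book search | bookCondition | Condition code (1 = 全新, brand new, through 8 = 八五品以下, below the 85% grade); the key filter for ancient-book workflows | No |
Ancient-book search | categoryId | Ancient-book category ID (e.g. "经部", "史部"); obtain it from the /v2/categories endpoint | No |
General control | page / pageSize | Pagination (pageSize is capped at 20; larger values are truncated) | No |
General control | sort | Sort order (supports price_asc and publish_time_desc; era descending is common for ancient books) | No |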
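Because pageSize is capped at 20, collecting a full result set means walking the pages one by one. A minimal sketch, assuming a caller-supplied `fetch_page(page, page_size)` function (hypothetical; it would wrap whatever search call you use) that returns a dict with a `search_info.total_page` count and an `ancient_books` list, the shape used by the client code in this article. `MAX_PAGE_SIZE`, `iter_all_books`, and `fake_fetch` are illustrative names.

```python
from typing import Callable, Dict, Iterator, Optional

MAX_PAGE_SIZE = 20  # platform cap stated in the parameter table


def iter_all_books(
    fetch_page: Callable[[int, int], Optional[Dict]],
    page_size: int = MAX_PAGE_SIZE,
    max_pages: int = 50,
) -> Iterator[Dict]:
    """Yield every book across pages, stopping at the last page or on failure."""
    # Clamp locally instead of relying on server-side truncation
    page_size = min(page_size, MAX_PAGE_SIZE)
    page = 1
    while page <= max_pages:
        result = fetch_page(page, page_size)
        if not result:
            break  # request failed or returned nothing
        yield from result.get("ancient_books", [])
        total_page = result.get("search_info", {}).get("total_page", 0)
        if page >= total_page:
            break
        page += 1


# Usage with a fake fetcher standing in for the real API call
def fake_fetch(page: int, page_size: int) -> Dict:
    data = [{"book_id": str(i)} for i in range(45)]
    start = (page - 1) * page_size
    return {
        "search_info": {"total_page": 3},
        "ancient_books": data[start:start + page_size],
    }


books = list(iter_all_books(fake_fetch, page_size=100))
print(len(books))  # 45
```

The `max_pages` guard keeps a runaway loop from burning through the daily quota if the reported page count is wrong.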
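2. Core endpoints (grouped by business scenario)
Endpoint | Path | Function | Typical use |
Book search | /v2/books/search | Search ancient and used books by keyword, author, era, and condition | Bulk screening of ancient books, academic sample collection |
Book detail | /v2/books/detail | Fetch detailed metadata for one book (edition, page count, description) | Detail pages, digital archiving |
Shop search | /v2/shops/search | Search used-book shops by region and main category | Partner screening, cross-shop price comparison |
Shop detail | /v2/shops/detail | Fetch shop information plus a list of items on sale (the item count can be specified) | Shop data integration, supply monitoring |
Category list | /v2/categories | Fetch the book category tree (including ancient-book categories) | Building category filters, category management |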
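3. Signature rule (the main source of pitfalls)
Kongfz signs requests with "sorted parameters + MD5". A mistake in any step returns a 401 authentication failure. The steps are:
- Filter: remove empty-valued parameters and signature itself; keep the non-empty business and common parameters.
- Sort: order by parameter name in ascending ASCII order (appKey comes before bookCondition, era before timestamp).
- Concatenate: join the sorted parameters as key=value&key=value (e.g. appKey=xxx&era=清代&timestamp=1719000000000).
- Append the secret: append appSecret directly to the end of the string with no separator (e.g. the string above + abc123def).
- Hash: encode the final string as UTF-8 and take its MD5 digest as a 32-character lowercase hex string. This is the signature.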
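The five steps can be traced on a small input. The sketch below is a standalone illustration with placeholder values (the appKey, appSecret, and timestamp are made up, and `build_sign_string` and `sign` are illustrative names). It also shows why the values are joined raw: passing them through `urllib.parse.urlencode` percent-encodes the Chinese characters and produces a different string to hash. That the platform expects raw rather than percent-encoded values is an assumption drawn from the example in the concatenation step.

```python
import hashlib
from urllib.parse import urlencode


def build_sign_string(params: dict) -> str:
    """Steps 1-3: filter, sort by key, join as key=value pairs with '&'."""
    kept = {
        k: v for k, v in params.items()
        if v is not None and v != "" and k != "signature"
    }
    return "&".join(f"{k}={v}" for k, v in sorted(kept.items()))


def sign(params: dict, app_secret: str) -> str:
    """Steps 4-5: append the secret with no separator, then MD5 over UTF-8 bytes."""
    raw = build_sign_string(params) + app_secret
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


params = {
    "timestamp": 1719000000000,
    "era": "清代",
    "appKey": "xxx",
    "keyword": "",         # empty value: dropped in step 1
    "signature": "stale",  # the signature field is never signed
}

sign_string = build_sign_string(params)
print(sign_string)  # appKey=xxx&era=清代&timestamp=1719000000000

signature = sign(params, "abc123def")
print(len(signature), signature == signature.lower())  # 32 True

# For comparison: urlencode() percent-encodes non-ASCII values
encoded = urlencode(sorted({"appKey": "xxx", "era": "清代"}.items()))
print(encoded)  # appKey=xxx&era=%E6%B8%85%E4%BB%A3
```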
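II. Core implementation (with ancient-book handling)
1. Shared authentication helper (works for every endpoint)
This class wraps signature generation and timestamp retrieval so every endpoint can reuse them: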
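import hashlib
import time
from typing import Optional


class KongfzAuthUtil:
    """Authentication helper for the Kongfz open platform (signs requests for every endpoint)."""

    @staticmethod
    def generate_sign(params: dict, app_secret: str) -> Optional[str]:
        """
        Generate the request signature following the Kongfz MD5 rule.

        :param params: parameters to sign (common and business parameters)
        :param app_secret: application secret
        :return: 32-character lowercase signature, or None on failure
        """
        try:
            # 1. Drop empty values and the signature field itself
            valid_params = {
                k: v for k, v in params.items()
                if v is not None and v != "" and k != "signature"
            }
            # 2. Sort by parameter name in ascending ASCII order
            sorted_params = sorted(valid_params.items(), key=lambda x: x[0])
            # 3. Join as "key=value&key=value". Values are joined raw, matching the
            #    signing rule's example (era=清代); urlencode() would percent-encode
            #    non-ASCII values and change the signature.
            param_str = "&".join(f"{k}={v}" for k, v in sorted_params)
            # 4. Append appSecret and hash with MD5
            sign_str = f"{param_str}{app_secret}"
            return hashlib.md5(sign_str.encode("utf-8")).hexdigest().lower()
        except Exception as e:
            print(f"Signature generation failed: {e}")
            return None

    @staticmethod
    def get_timestamp() -> int:
        """Return the current timestamp in milliseconds."""
        return int(time.time() * 1000)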
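2. Ancient-book API client (parsing of the specialized fields)
The client parses the era, condition, and edition fields and returns structured data, which reduces the work left to business code: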
import time
from datetime import datetime
from threading import Lock
from typing import Dict, Optional

import requests


class KongfzBookClient:
    """Kongfz book API client with parsing for ancient-book fields."""

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.app_secret = app_secret
        self.base_url = "https://open.kongfz.com/api"
        self.timeout = 15  # seconds; ancient-book payloads can be large, 15-20 is suggested
        self.qps_limit = 3  # platform QPS limit (max 3 per app; exceeding it returns 429)
        self.last_request_time = 0.0
        self.request_lock = Lock()  # serializes the QPS check across threads

    def _get_common_params(self) -> Dict:
        """Build the common parameters shared by every endpoint."""
        return {
            "appKey": self.app_key,
            "timestamp": KongfzAuthUtil.get_timestamp(),
            "format": "json",  # always request JSON
        }

    def _control_qps(self) -> None:
        """Throttle requests to stay under the QPS limit and avoid 429 errors."""
        with self.request_lock:
            min_interval = 1.0 / self.qps_limit  # minimum gap between requests (about 0.33 s)
            elapsed = time.time() - self.last_request_time
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)
            # Record the time after any sleep so the next gap is measured correctly
            self.last_request_time = time.time()

    def search_ancient_books(self, **kwargs) -> Optional[Dict]:
        """
        Search ancient books with era, condition, and category filters.

        :param kwargs: search parameters (era, bookCondition, keyword, ...)
        :return: structured result (pagination info plus book list), or None on failure
        """
        self._control_qps()
        # 1. Build the URL and parameters
        url = f"{self.base_url}/v2/books/search"
        params = self._get_common_params()
        # Keep only the business parameters the endpoint supports
        valid_params = [
            "keyword", "author", "era", "bookCondition", "categoryId",
            "minPrice", "maxPrice", "page", "pageSize", "sort",
        ]
        for param in valid_params:
            if param in kwargs and kwargs[param] is not None:
                params[param] = kwargs[param]
        # 2. Sign the request
        params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
        # 3. Send the request and handle the response
        try:
            response = requests.post(
                url,
                json=params,
                headers={
                    "Content-Type": "application/json;charset=utf-8",
                    "User-Agent": "KongfzAncientBookClient/1.0",
                },
                timeout=self.timeout,
            )
            response.raise_for_status()  # raise on 4xx/5xx
            result = response.json()
            # 4. Business-level error check (code == 200 means success)
            if result.get("code") != 200:
                raise Exception(
                    f"Search failed: {result.get('message', 'unknown error')} "
                    f"(code: {result.get('code')})"
                )
            # 5. Parse the ancient-book data
            return self._parse_ancient_book_result(result.get("data", {}))
        except Exception as e:
            print(f"Ancient-book search error: {e}")
            return None

    def get_book_detail(self, book_id: str) -> Optional[Dict]:
        """Fetch the detail of one book (edition, content description, shop info)."""
        self._control_qps()
        url = f"{self.base_url}/v2/books/detail"
        params = self._get_common_params()
        params["id"] = book_id
        params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
        try:
            response = requests.post(
                url,
                json=params,
                headers={"Content-Type": "application/json;charset=utf-8"},
                timeout=self.timeout,
            )
            response.raise_for_status()
            result = response.json()
            if result.get("code") != 200:
                raise Exception(
                    f"Detail request failed: {result.get('message')} "
                    f"(code: {result.get('code')})"
                )
            return self._parse_book_detail(result.get("data", {}))
        except Exception as e:
            print(f"Ancient-book detail error (ID: {book_id}): {e}")
            return None

    def _parse_ancient_book_result(self, raw_data: Dict) -> Optional[Dict]:
        """Parse search results, with special handling for era and condition."""
        if not raw_data or "items" not in raw_data:
            return None
        # Pagination info
        search_info = {
            "total": raw_data.get("total", 0),
            "page": raw_data.get("page", 1),
            "page_size": raw_data.get("pageSize", 20),
            "total_page": raw_data.get("totalPage", 0),
        }
        # Individual books
        ancient_books = []
        for item in raw_data["items"]:
            # Fall back to the local code-to-text mapping (e.g. 2 -> 九五品)
            book_condition_desc = (
                item.get("bookConditionDesc", "")
                or self._map_condition_code(item.get("bookCondition", 0))
            )
            ancient_books.append({
                "book_id": item.get("id", ""),
                "title": item.get("title", ""),
                "author": item.get("author", ""),
                "era": item.get("era", "unknown era"),  # key ancient-book field
                "book_condition": {
                    "code": item.get("bookCondition", 0),
                    "desc": book_condition_desc,
                },
                "price": float(item.get("price", 0)),
                "publisher": item.get("publisher", "unknown publisher"),
                "publish_time": item.get("publishTime", "unknown time"),
                "cover_img": self._complete_img_url(item.get("coverImg", "")),
                "shop_info": {
                    "id": item.get("shopId", ""),
                    "name": item.get("shopName", ""),
                },
                "tags": item.get("tags", []),  # e.g. ["儒家经典", "清代刻本"]
                "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            })
        return {"search_info": search_info, "ancient_books": ancient_books}

    def _parse_book_detail(self, raw_data: Dict) -> Optional[Dict]:
        """Parse book detail, including edition and content description."""
        if not raw_data:
            return None
        return {
            "book_id": raw_data.get("id", ""),
            "title": raw_data.get("title", ""),
            "subtitle": raw_data.get("subtitle", ""),
            "author": raw_data.get("author", ""),
            "translator": raw_data.get("translator", ""),
            "era": raw_data.get("era", "unknown era"),
            "version": raw_data.get("edition", "unknown edition"),  # e.g. "清代刻本"
            "binding": raw_data.get("binding", "unknown binding"),  # thread-bound or paperback
            "pages": raw_data.get("pages", 0),  # ancient books often count in 卷 (juan); convert downstream
            "price": float(raw_data.get("price", 0)),
            "book_condition": {
                "code": raw_data.get("bookCondition", 0),
                "desc": raw_data.get("bookConditionDesc", "")
                or self._map_condition_code(raw_data.get("bookCondition", 0)),
            },
            "description": raw_data.get("description", "no description"),  # preservation state, defects
            "content_desc": raw_data.get("contentDesc", "no summary"),  # content summary for research
            "images": [self._complete_img_url(img) for img in raw_data.get("images", [])],
            "shop_info": {
                "id": raw_data.get("shopId", ""),
                "name": raw_data.get("shopName", ""),
                "score": float(raw_data.get("shopScore", 0)),
                "location": raw_data.get("shopLocation", "unknown region"),  # for regional supply analysis
            },
            "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        }

    def _map_condition_code(self, code: int) -> str:
        """Map a condition code to its text label."""
        condition_map = {
            1: "全新", 2: "九五品", 3: "九品", 4: "八五品",
            5: "八品", 6: "七品", 7: "六品", 8: "八五品以下",
        }
        return condition_map.get(code, "unknown condition")

    def _complete_img_url(self, url: str) -> str:
        """Turn relative or protocol-relative image paths into full HTTPS URLs."""
        if not url:
            return ""
        if url.startswith(("http://", "https://")):
            return url
        if url.startswith("//"):
            return f"https:{url}"
        return f"https://img.kongfz.com{url}"
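3. Shop integration client (including items on sale)
For dealer management and cross-shop price comparison, this client wraps the shop search and shop detail endpoints and can return the ancient books a shop has on sale: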
import time
from datetime import datetime
from threading import Lock
from typing import Dict, Optional

import requests

# KongfzAuthUtil from section 1 is expected to be defined in the same module.


class KongfzShopClient:
    """Kongfz shop API client (can return the ancient books a shop has on sale)."""

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.app_secret = app_secret
        self.base_url = "https://open.kongfz.com/api"
        self.timeout = 15
        self.qps_limit = 3
        self.last_request_time = 0.0
        self.request_lock = Lock()

    def _get_common_params(self) -> Dict:
        return {
            "appKey": self.app_key,
            "timestamp": KongfzAuthUtil.get_timestamp(),
            "format": "json",
        }

    def _control_qps(self) -> None:
        with self.request_lock:
            min_interval = 1.0 / self.qps_limit
            elapsed = time.time() - self.last_request_time
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)
            # Record the time after any sleep so the next gap is measured correctly
            self.last_request_time = time.time()

    def search_shops(self, **kwargs) -> Optional[Dict]:
        """Search used-book shops by region, main category, and score."""
        self._control_qps()
        url = f"{self.base_url}/v2/shops/search"
        params = self._get_common_params()
        # Supported shop search parameters
        valid_params = [
            "keyword", "categoryId", "location", "minScore",
            "minSales", "isVip", "page", "pageSize", "sort",
        ]
        for param in valid_params:
            if param in kwargs and kwargs[param] is not None:
                params[param] = kwargs[param]
        params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
        try:
            response = requests.post(
                url,
                json=params,
                headers={"Content-Type": "application/json;charset=utf-8"},
                timeout=self.timeout,
            )
            response.raise_for_status()
            result = response.json()
            if result.get("code") != 200:
                raise Exception(
                    f"Shop search failed: {result.get('message')} "
                    f"(code: {result.get('code')})"
                )
            return self._parse_shop_search_result(result.get("data", {}))
        except Exception as e:
            print(f"Shop search error: {e}")
            return None

    def get_shop_detail(self, shop_id: str, goods_count: int = 5) -> Optional[Dict]:
        """Fetch shop detail plus items on sale (5 by default, 20 at most)."""
        self._control_qps()
        url = f"{self.base_url}/v2/shops/detail"
        params = self._get_common_params()
        params["id"] = shop_id
        params["goodsCount"] = min(goods_count, 20)  # cap at the platform maximum
        params["signature"] = KongfzAuthUtil.generate_sign(params, self.app_secret)
        try:
            response = requests.post(
                url,
                json=params,
                headers={"Content-Type": "application/json;charset=utf-8"},
                timeout=self.timeout,
            )
            response.raise_for_status()
            result = response.json()
            if result.get("code") != 200:
                raise Exception(
                    f"Shop detail failed: {result.get('message')} "
                    f"(code: {result.get('code')})"
                )
            return self._parse_shop_detail(result.get("data", {}))
        except Exception as e:
            print(f"Shop detail error (ID: {shop_id}): {e}")
            return None

    def _parse_shop_search_result(self, raw_data: Dict) -> Optional[Dict]:
        """Parse shop search results."""
        if not raw_data or "items" not in raw_data:
            return None
        search_info = {
            "total": raw_data.get("total", 0),
            "page": raw_data.get("page", 1),
            "page_size": raw_data.get("pageSize", 20),
            "total_page": raw_data.get("totalPage", 0),
        }
        shops = []
        for item in raw_data["items"]:
            shops.append({
                "shop_id": item.get("id", ""),
                "name": item.get("name", ""),
                "location": item.get("location", "unknown region"),  # e.g. "北京潘家园"
                "score": float(item.get("score", 0)),  # shop rating
                "sales": item.get("sales", 0),  # total sales, a rough trust signal
                "goods_count": item.get("goodsCount", 0),  # items on sale
                "is_vip": item.get("isVip", False),  # VIP shops offer stronger service guarantees
                "main_category": item.get("mainCategory", "unknown category"),  # e.g. "古籍善本"
                "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            })
        return {"search_info": search_info, "shops": shops}

    def _parse_shop_detail(self, raw_data: Dict) -> Optional[Dict]:
        """Parse shop detail, including the list of books on sale."""
        if not raw_data:
            return None
        # Keep only the core fields of each book on sale
        on_sale_books = []
        for p in raw_data.get("products", []):
            on_sale_books.append({
                "book_id": p.get("id", ""),
                "title": p.get("title", ""),
                "price": float(p.get("price", 0)),
                "book_condition": p.get("bookConditionDesc", "unknown condition"),
                "cover_img": self._complete_img_url(p.get("coverImg", "")),
                "publish_time": p.get("publishTime", "unknown time"),
            })
        return {
            "shop_id": raw_data.get("id", ""),
            "name": raw_data.get("name", ""),
            "logo": self._complete_img_url(raw_data.get("logo", "")),
            "location": raw_data.get("location", "unknown region"),
            "score": float(raw_data.get("score", 0)),
            "score_detail": raw_data.get("scoreDetail", {}),  # service / logistics / description
            "sales": raw_data.get("sales", 0),
            "month_sales": raw_data.get("monthSales", 0),  # monthly sales (recent activity)
            "goods_count": raw_data.get("goodsCount", 0),
            "business_scope": raw_data.get("businessScope", "no business scope"),  # e.g. "主营清代古籍、民国期刊"
            "on_sale_books": on_sale_books,
            "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        }

    def _complete_img_url(self, url: str) -> str:
        """Same image URL completion logic as the book client."""
        if not url:
            return ""
        if url.startswith(("http://", "https://")):
            return url
        if url.startswith("//"):
            return f"https:{url}"
        return f"https://img.kongfz.com{url}"
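4. Data manager (caching and batch processing)
Ancient-book data changes slowly and tends to be requested repeatedly, so this class adds caching and batch retrieval to cut the number of API calls: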
import hashlib
import json
import os
import sqlite3
import time
from datetime import datetime, timedelta
from typing import Dict, List, Optional


class KongfzDataManager:
    """Kongfz data manager: caching, batch retrieval, and expiry cleanup."""

    # Primary-key column of each cache table. The original derivation from the
    # table name produced "ancient_book_id", which does not exist in the schema.
    _ID_COLUMNS = {
        "ancient_book_cache": "book_id",
        "search_cache": "cache_key",
        "shop_cache": "shop_id",
    }

    def __init__(self, app_key: str, app_secret: str, cache_dir: str = "./kongfz_cache"):
        self.book_client = KongfzBookClient(app_key, app_secret)
        self.shop_client = KongfzShopClient(app_key, app_secret)
        self.cache_dir = cache_dir
        self.db_path = os.path.join(cache_dir, "kongfz_cache.db")
        self._init_cache_db()

    def _init_cache_db(self) -> None:
        """Create the book, search, and shop cache tables."""
        os.makedirs(self.cache_dir, exist_ok=True)
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        # Book cache (long TTL, since ancient-book data rarely changes)
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS ancient_book_cache (
                book_id TEXT PRIMARY KEY,
                data TEXT,
                fetch_time TEXT
            )
        ''')
        # Search cache (short TTL, to avoid stale results)
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS search_cache (
                cache_key TEXT PRIMARY KEY,
                data TEXT,
                fetch_time TEXT,
                keyword TEXT
            )
        ''')
        # Shop cache
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS shop_cache (
                shop_id TEXT PRIMARY KEY,
                data TEXT,
                fetch_time TEXT
            )
        ''')
        conn.commit()
        conn.close()

    def batch_get_ancient_books(self, book_ids: List[str], cache_ttl: int = 86400) -> List[Dict]:
        """Fetch book details in bulk, using the cache first (default TTL: 1 day)."""
        books = []
        for book_id in book_ids:
            cached = self._get_cached("ancient_book_cache", book_id, cache_ttl)
            if cached:
                books.append(cached)
                continue
            # Cache miss: call the API
            detail = self.book_client.get_book_detail(book_id)
            if detail:
                books.append(detail)
                self._update_cache("ancient_book_cache", book_id, detail)
            time.sleep(0.5)  # extra gap on top of the QPS limiter
        return books

    def search_ancient_books_with_cache(self, keyword: str, cache_ttl: int = 3600, **kwargs) -> Optional[Dict]:
        """Search ancient books with a cache (results are cached for 1 hour by default)."""
        # The cache key covers the keyword and every other search parameter
        cache_key = self._generate_cache_key(keyword, **kwargs)
        cached = self._get_cached("search_cache", cache_key, cache_ttl)
        if cached:
            print(f"Cache hit: ancient-book search (keyword: {keyword}, page: {kwargs.get('page', 1)})")
            return cached
        # Call the API and update the cache
        result = self.book_client.search_ancient_books(keyword=keyword, **kwargs)
        if result:
            self._update_cache("search_cache", cache_key, result, keyword=keyword)
        return result

    def _generate_cache_key(self, keyword: str, **kwargs) -> str:
        """Build a deterministic cache key for a search."""
        sorted_params = sorted(kwargs.items(), key=lambda x: x[0])
        params_str = "&".join(f"{k}={v}" for k, v in sorted_params)
        return hashlib.md5(f"ancient_search_{keyword}_{params_str}".encode()).hexdigest()

    def _get_cached(self, table: str, key: str, ttl: int) -> Optional[Dict]:
        """Read an entry from a cache table, returning None if missing or expired."""
        id_col = self._ID_COLUMNS[table]
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute(f"SELECT data, fetch_time FROM {table} WHERE {id_col} = ?", (key,))
        record = cursor.fetchone()
        conn.close()
        if not record:
            return None
        data_str, fetch_time = record
        # Expiry check
        fetch_dt = datetime.strptime(fetch_time, "%Y-%m-%d %H:%M:%S")
        if (datetime.now() - fetch_dt).total_seconds() > ttl:
            return None
        try:
            return json.loads(data_str)
        except json.JSONDecodeError:
            return None

    def _update_cache(self, table: str, key: str, data: Dict, keyword: str = "") -> None:
        """Insert or replace an entry in a cache table."""
        id_col = self._ID_COLUMNS[table]
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        data_str = json.dumps(data, ensure_ascii=False)
        fetch_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        if table == "search_cache":
            # The search cache also stores the keyword
            cursor.execute(
                f"INSERT OR REPLACE INTO {table} ({id_col}, data, fetch_time, keyword) VALUES (?, ?, ?, ?)",
                (key, data_str, fetch_time, keyword),
            )
        else:
            cursor.execute(
                f"INSERT OR REPLACE INTO {table} ({id_col}, data, fetch_time) VALUES (?, ?, ?)",
                (key, data_str, fetch_time),
            )
        conn.commit()
        conn.close()

    def clean_expired_cache(self, max_age: int = 86400 * 7) -> Dict:
        """Delete cache entries older than max_age seconds (default: 7 days)."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        expire_time = (datetime.now() - timedelta(seconds=max_age)).strftime("%Y-%m-%d %H:%M:%S")
        deleted = {}
        cursor.execute("DELETE FROM ancient_book_cache WHERE fetch_time < ?", (expire_time,))
        deleted["ancient_book"] = cursor.rowcount
        cursor.execute("DELETE FROM search_cache WHERE fetch_time < ?", (expire_time,))
        deleted["search"] = cursor.rowcount
        cursor.execute("DELETE FROM shop_cache WHERE fetch_time < ?", (expire_time,))
        deleted["shop"] = cursor.rowcount
        conn.commit()
        conn.close()
        print(
            f"Cache cleanup done: {deleted['ancient_book']} books, "
            f"{deleted['search']} searches, {deleted['shop']} shops"
        )
        return deleted
III. Worked examples (core business scenarios)
1. Bulk search and detail retrieval for ancient books (academic research)
def ancient_book_research_demo():
    """Example: search Qing-era books and fetch their details (academic sampling)."""
    # 1. Replace with your own appKey and appSecret from the Kongfz open platform
    APP_KEY = "your_app_key"
    APP_SECRET = "your_app_secret"

    # 2. Create the data manager (with caching)
    data_manager = KongfzDataManager(APP_KEY, APP_SECRET)

    # 3. Search parameters: Qing era, condition 九品 or better, keyword "论语" (the Analects)
    search_params = {
        "keyword": "论语",
        "era": "清代",
        "bookCondition": 3,  # 3 = 九品 (research work needs good condition)
        "minPrice": 100,
        "maxPrice": 5000,
        "page": 1,
        "pageSize": 10,
        "sort": "publish_time_desc",  # newest publication first (late printings come first)
    }

    # 4. Cached search (avoids repeated calls)
    print("=== Searching Qing-era ancient books ===")
    search_result = data_manager.search_ancient_books_with_cache(**search_params)
    if not search_result:
        print("Ancient-book search failed")
        return

    # 5. Print a summary
    search_info = search_result["search_info"]
    print(
        f"Result: {search_info['total']} Qing-era books related to the Analects, "
        f"page {search_info['page']}/{search_info['total_page']}"
    )

    # 6. Print each book
    for i, book in enumerate(search_result["ancient_books"], 1):
        print(f"\n{i}. Title: {book['title']}")
        print(f"   Author: {book['author']} | Era: {book['era']} | Condition: {book['book_condition']['desc']}")
        print(f"   Price: ¥{book['price']} | Shop: {book['shop_info']['name']}")
        print(f"   Cover: {book['cover_img'][:50]}...")

    # 7. Fetch details for the first 3 books (for analysis)
    if search_result["ancient_books"]:
        book_ids = [book["book_id"] for book in search_result["ancient_books"][:3]]
        print(f"\n=== Fetching details for {len(book_ids)} books ===")
        book_details = data_manager.batch_get_ancient_books(book_ids)
        for detail in book_details:
            print(f"\nTitle: {detail['title']}")
            print(f"Edition: {detail['version']} | Binding: {detail['binding']} | Pages: {detail['pages']}")
            print(f"Summary: {detail['content_desc'][:150]}...")  # first 150 characters
            print(f"Shop: {detail['shop_info']['name']} (score: {detail['shop_info']['score']})")

    # 8. Clean up expired cache entries (optional)
    data_manager.clean_expired_cache()


if __name__ == "__main__":
    ancient_book_research_demo()
2. Shop screening and analysis of books on sale (dealer sourcing)
def ancient_shop_analysis_demo():
    """Example: screen high-quality ancient-book shops in Beijing and inspect their stock."""
    APP_KEY = "your_app_key"
    APP_SECRET = "your_app_secret"
    data_manager = KongfzDataManager(APP_KEY, APP_SECRET)

    # 1. VIP shops in Beijing that deal in ancient books, with a score of 4.5 or higher
    shop_search_params = {
        "keyword": "古籍",
        "location": "北京",
        "minScore": 4.5,
        "isVip": 1,  # 1 = VIP shop
        "minSales": 1000,  # total sales >= 1000 (active shops)
        "page": 1,
        "pageSize": 5,
        "sort": "sales_desc",  # highest sales first
    }
    print("=== Screening ancient-book shops in Beijing ===")
    shop_result = data_manager.shop_client.search_shops(**shop_search_params)
    if not shop_result:
        print("Shop screening failed")
        return

    # 2. Print the shop list
    print(f"Result: {shop_result['search_info']['total']} matching shops")
    for i, shop in enumerate(shop_result["shops"], 1):
        print(f"\n{i}. Shop: {shop['name']}")
        print(f"   Region: {shop['location']} | Score: {shop['score']} | Sales: {shop['sales']} orders")
        print(f"   Books on sale: {shop['goods_count']} | VIP: {'yes' if shop['is_vip'] else 'no'}")

    # 3. Fetch detail and stock for the first shop
    if shop_result["shops"]:
        first_shop_id = shop_result["shops"][0]["shop_id"]
        print(f"\n=== Fetching detail for shop [{shop_result['shops'][0]['name']}] ===")
        shop_detail = data_manager.shop_client.get_shop_detail(first_shop_id, goods_count=8)
        if shop_detail:
            print(f"Business scope: {shop_detail['business_scope']}")
            print(f"Sales this month: {shop_detail['month_sales']} orders | Region: {shop_detail['location']}")
            print(f"\nBooks on sale ({len(shop_detail['on_sale_books'])}):")
            for i, book in enumerate(shop_detail["on_sale_books"], 1):
                print(f"{i}. {book['title']} | Price: ¥{book['price']} | Condition: {book['book_condition']}")


# Run the example
# if __name__ == "__main__":
#     ancient_shop_analysis_demo()
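IV. Pitfalls (specific to the Kongfz API)
1. Common errors and fixes (with error codes)
Symptom | Code | Likely cause | Fix |
Signature failure | 401 | 1. Timestamp off by more than 5 minutes; 2. Wrong parameter order; 3. Wrong appSecret | 1. Sync the server clock; 2. Sort parameters in ascending ASCII order; 3. Re-check the appSecret |
Rate limit exceeded | 429 | QPS above 3, or more than 3000 calls in a day | 1. Space out calls with _control_qps; 2. Cache more (book details in particular); 3. Spread calls across the day |
Era filter returns nothing | 200 | 1. Malformed era value (e.g. "清朝" where "清代" is expected); 2. No matching data | 1. Use the standard era values from the platform docs (清代 / 民国 / 明代); 2. Relax the condition or price filters |
Image URL returns 404 | - | The API returned a relative path (e.g. "/books/123.jpg") | Use _complete_img_url to build a full HTTPS URL |
Shop has no items on sale | 200 | goodsCount above the limit (max 20), or the shop has nothing on sale | 1. Keep goodsCount ≤ 20; 2. Check whether the shop is operating normally (via the shop_status field) |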
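For 429 responses, a retry with exponential backoff complements the fixed-interval limiter. A minimal sketch, assuming the request is wrapped in a callable that raises a hypothetical `RateLimitError` when the platform returns 429. The exception type, the `call_with_backoff` helper, and the delay values are illustrative and not part of any platform SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical marker raised by the caller's wrapper on HTTP 429."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with delays of base_delay * 2**attempt."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")


# Usage: a fake request that is throttled twice, then succeeds
attempts = {"n": 0}
delays = []


def flaky_request() -> str:
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise RateLimitError("429")
    return "ok"


# Passing delays.append as the sleep function records the waits instead of sleeping
result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Backoff only helps with short QPS bursts. Once the daily quota is exhausted, retrying makes things worse, so the caller should stop instead.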
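2. Handling ancient-book data
- Era normalization: the era field may come back as "清", "清代", or "清朝". Map the variants to one standard value (e.g. "清" → "清代") to keep the data consistent.
- Richer condition text: show the bookCondition code together with its description (e.g. "3 - 九品") and add a business note (e.g. "九品: well preserved, light wear, no missing pages").
- Multi-volume sets: ancient books are often sold as "全 X 册" (complete in X volumes). Extract the volume count from the description field (for example with the regex r'全(\d+)册') and add it to the structured data.
- Edition: pay attention to the edition field (e.g. "乾隆刻本", "民国影印本"). Academic use cases should store edition information separately for sample classification.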
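The normalization and extraction tips above can be sketched as a few small helpers. The alias table is illustrative and should be extended with whatever variants appear in real responses; the function names are hypothetical, and only the regex comes from the tip above.

```python
import re
from typing import Optional

# Illustrative alias table, not an official list of era values
ERA_ALIASES = {
    "清": "清代", "清朝": "清代", "清代": "清代",
    "明": "明代", "明朝": "明代", "明代": "明代",
    "民国": "民国",
}

VOLUME_PATTERN = re.compile(r"全(\d+)册")  # "全 X 册" = complete in X volumes


def normalize_era(raw: Optional[str]) -> str:
    """Map era variants to one standard value; unknown input passes through."""
    if not raw:
        return "unknown era"
    value = raw.strip()
    return ERA_ALIASES.get(value, value)


def extract_volume_count(description: Optional[str]) -> Optional[int]:
    """Return the volume count from text like '全12册', or None if absent."""
    if not description:
        return None
    match = VOLUME_PATTERN.search(description)
    return int(match.group(1)) if match else None


def condition_label(code: int, desc: str) -> str:
    """Combine the code and its text, e.g. '3 - 九品'."""
    return f"{code} - {desc}"


print(normalize_era("清朝"))                        # 清代
print(extract_volume_count("线装,全12册,无缺页"))  # 12
print(extract_volume_count("单册"))                 # None
print(condition_label(3, "九品"))                   # 3 - 九品
```

The regex only matches Arabic digits. Descriptions that use Chinese numerals (e.g. "全十册") would need an extra conversion step.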
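3. Performance tips (lowering call cost)
- Tiered caching: cache book details (which rarely change) for 24 hours, search results (which may change) for 1 to 6 hours, and shop information for 12 hours.
- Incremental updates: record each book's fetch_time, and on the next run refresh only the items whose publish_time or price has changed, to avoid redundant calls.
- Batch pacing: when fetching book details in bulk, add an extra 0.5-second gap on top of the QPS limiter to avoid tripping the platform's burst-traffic limit.
- Field selection: if only core fields are needed (e.g. title, era, price), add a fields parameter to the request (e.g. fields=title,era,price) to reduce the payload.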
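A minimal sketch of the tiering and change-detection ideas, assuming cache entries carry a `fetch_time` string in the `%Y-%m-%d %H:%M:%S` format used by the code in this article. The helper names and the `CACHE_TTL` table are illustrative, and the search tier uses the low end of the 1 to 6 hour range.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# TTLs in seconds, following the tiers suggested above
CACHE_TTL = {
    "book_detail": 24 * 3600,
    "search": 1 * 3600,  # anywhere from 1 to 6 hours
    "shop": 12 * 3600,
}

TRACKED_FIELDS = ("price", "publish_time")


def is_fresh(kind: str, fetch_time: str, now: Optional[datetime] = None) -> bool:
    """True if a cache entry of the given kind is still within its TTL."""
    now = now or datetime.now()
    fetched = datetime.strptime(fetch_time, "%Y-%m-%d %H:%M:%S")
    return (now - fetched) <= timedelta(seconds=CACHE_TTL[kind])


def changed_fields(cached: Dict, latest: Dict) -> List[str]:
    """List the tracked fields whose value differs between two records."""
    return [f for f in TRACKED_FIELDS if cached.get(f) != latest.get(f)]


def build_fields_param(*names: str) -> str:
    """Join field names for the fields request parameter."""
    return ",".join(names)


now = datetime(2024, 6, 22, 12, 0, 0)
print(is_fresh("search", "2024-06-22 11:30:00", now))       # True
print(is_fresh("search", "2024-06-22 10:00:00", now))       # False
print(is_fresh("book_detail", "2024-06-21 13:00:00", now))  # True
print(changed_fields(
    {"price": 100.0, "publish_time": "1890"},
    {"price": 120.0, "publish_time": "1890"},
))  # ['price']
print(build_fields_param("title", "era", "price"))  # title,era,price
```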
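V. Compliance and extensions
- Compliance:
  - Data use: ancient-book data may be used only for your own business (e.g. academic research, internal management). Do not resell it or use it for unfair competition.
  - Call frequency: design around the limits of QPS = 3 and 3000 calls per day. Exceeding them leads to a temporary account ban.
  - Copyright: book descriptions, images, and other content returned by the API must be credited as "来源孔夫子旧书网" (source: Kongfz). Do not alter or remove the attribution.
- Extensions:
  - Digital archiving: combine the description and images returned by get_book_detail to build a digital archive of ancient books.
  - Price trend analysis: use cached historical prices to study price movements of ancient books (e.g. Qing-era woodblock editions).
  - Multi-platform integration: connect Kongfz ancient-book data with other ancient-book platforms (e.g. the Zhonghua Book Company digital library) to enrich academic samples.
If you run into specific problems during integration, such as era parsing, condition-field handling, or rate limiting for batch calls, describe your scenario in the comments (e.g. "searching Ming-era books returns nothing") and I will share a targeted solution. The core value of the Kongfz API lies in its ancient-book data, and handling those specialized fields well is what makes it useful in practice.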