文章目录
近期笔者在做Text2SQL的研究,于是调研了下Chat2DB,基于车辆订单业务做了一些SQL生成验证,有了一点心得,和大家分享一下.:
基于车辆订单业务,模拟新建了以下四张表,并添加了一些测试数据
1. organization:组织表,包含组织id,组织名称,组织分类等3个字段;
3. vehicle:车辆信息表,包含组织id,车辆id,车牌号码,使用年限等字段;
4. refueling_order:车辆加油订单表,包含组织id,车辆id,车牌号码,加油时间,加油费用等字段
5. **driven_distance**:车辆行驶里程表,包含组织id,车辆id,车牌号码,年份,行驶里程等字段
//通过docker,安装运行最新版本的chat2db容器 docker run --name=chat2db -ti -p 10824:10824 -v ~/.chat2db-docker:/root/.chat2db chat2db/chat2db:latest
从GIT上下载并剖析源码,最核心的Text-2-SQL生成代码部分:
/** * SQL转换模型 * * @param queryRequest * @param headers * @return * @throws IOException */ @GetMapping("/chat") @CrossOrigin public SseEmitter completions(ChatQueryRequest queryRequest, @RequestHeader Mapheaders) throws IOException { //默认30秒超时,设置为0L则永不超时 SseEmitter sseEmitter = new SseEmitter(CHAT_TIMEOUT); String uid = headers.get("uid"); if (StrUtil.isBlank(uid)) { throw new ParamBusinessException("uid"); } //提示消息不得为空 if (StringUtils.isBlank(queryRequest.getMessage())) { throw new ParamBusinessException("message"); } return distributeAISql(queryRequest, sseEmitter, uid); }
/** * distribute with different AI * * @return */ public SseEmitter distributeAISql(ChatQueryRequest queryRequest, SseEmitter sseEmitter, String uid) throws IOException { ConfigService configService = ApplicationContextUtil.getBean(ConfigService.class); Config config = configService.find(RestAIClient.AI_SQL_SOURCE).getData(); String aiSqlSource = AiSqlSourceEnum.CHAT2DBAI.getCode(); if (Objects.nonNull(config)) { aiSqlSource = config.getContent(); } AiSqlSourceEnum aiSqlSourceEnum = AiSqlSourceEnum.getByName(aiSqlSource); if (Objects.isNull(aiSqlSourceEnum)) { aiSqlSourceEnum = AiSqlSourceEnum.OPENAI; } uid = aiSqlSourceEnum.getCode() + uid; switch (Objects.requireNonNull(aiSqlSourceEnum)) { case OPENAI : return chatWithOpenAi(queryRequest, sseEmitter, uid); case CHAT2DBAI: return chatWithChat2dbAi(queryRequest, sseEmitter, uid); case RESTAI : case FASTCHATAI: return chatWithFastChatAi(queryRequest, sseEmitter, uid); case AZUREAI : return chatWithAzureAi(queryRequest, sseEmitter, uid); case CLAUDEAI: return chatWithClaudeAi(queryRequest, sseEmitter, uid); case WENXINAI: return chatWithWenxinAi(queryRequest, sseEmitter, uid); case BAICHUANAI: return chatWithBaichuanAi(queryRequest, sseEmitter, uid); case TONGYIQIANWENAI: return chatWithTongyiChatAi(queryRequest, sseEmitter, uid); case ZHIPUAI: return chatWithZhipuChatAi(queryRequest, sseEmitter, uid); } return chatWithOpenAi(queryRequest, sseEmitter, uid); }
/** * 使用OPENAI SQL接口 * * @param queryRequest * @param sseEmitter * @param uid * @return * @throws IOException */ private SseEmitter chatWithOpenAi(ChatQueryRequest queryRequest, SseEmitter sseEmitter, String uid) throws IOException { String prompt = buildPrompt(queryRequest); if (prompt.length() / TOKEN_CONVERT_CHAR_LENGTH > MAX_PROMPT_LENGTH) { log.error("提示语超出最大长度:{},输入长度:{}, 请重新输入", MAX_PROMPT_LENGTH, prompt.length() / TOKEN_CONVERT_CHAR_LENGTH); throw new ParamBusinessException(); } Listmessages = new ArrayList<>(); prompt = prompt.replaceAll("#", ""); log.info(prompt); Message currentMessage = Message.builder().content(prompt).role(Message.Role.USER).build(); messages.add(currentMessage); buildSseEmitter(sseEmitter, uid); OpenAIEventSourceListener openAIEventSourceListener = new OpenAIEventSourceListener(sseEmitter); OpenAIClient.getInstance().streamChatCompletion(messages, openAIEventSourceListener); LocalCache.CACHE.put(uid, JSONUtil.toJsonStr(messages), LocalCache.TIMEOUT); return sseEmitter; }
请根据以下table properties和SQL input将自然语言转换成SQL查询. MYSQL SQL tables, with their properties: ["CREATE TABLE `driven_distance` (\n `id` bigint(20) NOT NULL AUTO_INCREMENT,\n `organization_id` bigint(20) DEFAULT NULL,\n `vehicle_id` bigint(20) DEFAULT NULL,\n `license_plate` varchar(255) DEFAULT NULL,\n 。。。"] SQL input: 2023年,每个季度的加油金额各是多少元?
经过测试,通常的业务查询基本上都能准确生成,另外通过上述一路使用和分析,笔者发现Text2SQL的技术几大要点
上一篇:什么是栈,如何实现?