st2411020126 2026-01-08 16:10:25 +08:00
commit e25dff3734
16 changed files with 1575 additions and 0 deletions

16
Project_Design.md Normal file

@ -0,0 +1,16 @@
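# Intelligent Knowledge Base Q&A - Project Design Document
## One-Sentence Description
My application is called "Intelligent Knowledge Base Q&A" (智能知识库问答). It is built for companies and courses: it answers questions accurately from their own documents and reduces manual question answering.
## Core Features (MVP)
1. **Document upload** - Upload documents in PDF, Word, TXT, and similar formats; the system parses them automatically and builds a knowledge base index
2. **Intelligent Q&A** - The user enters a question, and the AI retrieves relevant content from the uploaded documents and answers from it
3. **Knowledge base management** - View the list of uploaded documents, delete documents, and check each document's processing status
## Interaction Flow
1. The user opens the app → sees the main screen, which contains the document upload area and the Q&A area
2. The user clicks the "Upload document" button → selects a local document file → the system shows upload progress and processing status
3. Once the document has been processed → the user types a question into the Q&A input box
4. The user clicks the "Ask" button → the AI searches the knowledge base and returns an answer, together with the source documents it referenced
5. The user can keep asking questions or upload more documents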

235
README.md Normal file

@ -0,0 +1,235 @@
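# 🧠 Intelligent Knowledge Base Q&A System
A Flask-based question-answering system for companies and courses. Upload your own documents and ask questions against them to cut the cost of manual question answering.
## ✨ Core Features
- 📚 **Document upload and management**: upload PDF, Word, and TXT documents, which are parsed automatically
- 🤖 **Intelligent Q&A**: answers are generated from the content of the uploaded documents
- 💾 **Conversation history**: every question and answer is saved automatically for later review
- 📱 **Responsive design**: works on both desktop and mobile browsers
- 🎨 **Clean interface**: a modern UI design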
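## 🚀 Quick Start
### Requirements
- Python 3.8+
- pip
### Installation
1. **Clone the project**
```bash
git clone <repository-url>
cd 12
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Configure environment variables**
Create a `.env` file with the variables that `app.py` reads. `DEEPSEEK_BASE_URL` is optional and defaults to `https://api.deepseek.com`:
```env
DEEPSEEK_API_KEY=your_deepseek_api_key_here
DEEPSEEK_BASE_URL=https://api.deepseek.com
```
4. **Initialize the database**
```bash
python app.py
```
On first start the tables are created automatically in `knowledge_base.db`. The path is relative, so run the command from the project root.
5. **Start the application**
```bash
python app.py
```
This is the same command as in step 4. The application starts at `http://localhost:5000`.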
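A missing API key only shows up when the first question is asked, so a start-up check can be useful. The following is a minimal sketch, assuming the variable names that `app.py` reads (`DEEPSEEK_API_KEY`, `DEEPSEEK_BASE_URL`). The helper `load_settings` is hypothetical and not part of the project.

```python
import os


def load_settings(env=None):
    """Collect the API settings and fail early if the key is missing.

    `load_settings` is a hypothetical helper; app.py reads the same
    variables directly with os.getenv().
    """
    env = os.environ if env is None else env
    api_key = env.get("DEEPSEEK_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set; add it to .env")
    return {
        "api_key": api_key,
        # Same default as in app.py
        "base_url": env.get("DEEPSEEK_BASE_URL", "https://api.deepseek.com"),
    }
```

Calling `load_settings()` after `load_dotenv()` would raise immediately when the key is absent, instead of returning an error on the first `/api/ask` request.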
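## 📖 Usage Guide
### 1. Upload documents
- Click the "📤 Click or drag to upload a document" area in the knowledge base panel on the left
- Select the document to upload (PDF, Word, and TXT are supported)
- The system parses the document content and adds it to the knowledge base
### 2. Ask questions
- Type a question into the chat input box on the right
- Click the "Send" button or press Enter to submit
- The system answers from the content of the uploaded documents
- Each answer lists its source documents by name; the page number is currently a placeholder (always 1)
### 3. Manage documents
- View all uploaded documents in the knowledge base panel
- Click the "🗑️ Delete" button to remove documents you no longer need
- The document status shows processing progress (processing / completed)
### 4. View history
- All questions and answers are saved automatically
- The conversation history is reloaded after a page refresh
- Earlier questions and answers can be reviewed at any time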
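The history feature above depends on storing each answer's sources as a JSON string in SQLite and decoding them again on load. The round trip can be sketched as follows, assuming the `conversations` table layout from `app.py`. The helpers `save_conversation` and `load_conversations` are illustrative names, and the in-memory database is used only so the sketch runs on its own.

```python
import json
import sqlite3


def save_conversation(conn, question, answer, sources):
    """Store one question/answer pair; sources are serialized to JSON text."""
    conn.execute(
        "INSERT INTO conversations (question, answer, sources) VALUES (?, ?, ?)",
        (question, answer, json.dumps(sources)),
    )
    conn.commit()


def load_conversations(conn, limit=50):
    """Return the latest `limit` conversations, oldest first."""
    rows = conn.execute(
        "SELECT question, answer, sources FROM conversations "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [
        {"question": q, "answer": a, "sources": json.loads(s) if s else []}
        for q, a, s in reversed(rows)
    ]


# In-memory database, used here only to keep the sketch self-contained
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, question TEXT NOT NULL, "
    "answer TEXT NOT NULL, sources TEXT, "
    "created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
)
save_conversation(conn, "Q1", "A1", [{"name": "handbook.pdf", "page": 1}])
save_conversation(conn, "Q2", "A2", [])
history = load_conversations(conn)
```

Returning the rows oldest first lets a chat view append them in reading order.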
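## 🎬 Demo Walkthrough
### Scenario 1: Course Q&A
1. **Preparation**
- Upload the course lecture notes as a PDF file
- Wait for the system to finish processing the document (about 2-3 seconds)
2. **Asking questions**
- Enter: "What are the main learning objectives of this course?"
- The system returns an answer based on the lecture notes, along with its reference sources
- Follow up with: "How do I complete the final assignment?"
- The system describes the assignment requirements in detail
3. **What to show**
- The accuracy of the answers and their reference sources
- Saving and reloading of the conversation history
### Scenario 2: Company Document Lookup
1. **Preparation**
- Upload the company rules and regulations document
- Upload the product manual
2. **Asking questions**
- Enter: "What is the company's leave request process?"
- The system extracts the relevant content from the rules and regulations
- Enter: "How long is the warranty for product A?"
- The system finds the answer in the product manual
3. **What to show**
- How a knowledge base built from several documents is combined
- The responsive design on a mobile device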
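## 🛠️ Technical Architecture
### Back-end stack
- **Flask**: lightweight web framework
- **SQLite**: local database that stores the conversation history and document records
- **DeepSeek API**: provides the question-answering capability; `app.py` calls it through the OpenAI Python SDK with a custom `base_url`
- **LangChain**: document processing and vector retrieval (planned)
- **ChromaDB**: vector database (planned)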
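Until the planned vector retrieval exists, `app.py` sends the first 3000 characters of every document to the model. To illustrate what a retrieval step adds, here is a minimal sketch that splits text into chunks and ranks them by character-bigram overlap with the question. It uses only the standard library and is a stand-in for the idea, not LangChain or ChromaDB. All function names and the chunk sizes are assumptions.

```python
from collections import Counter


def split_into_chunks(text, size=500, overlap=50):
    """Split text into overlapping fixed-size character windows."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    windows = (text[i:i + size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def bigrams(text):
    """Count character bigrams; works for Chinese and English alike."""
    compact = "".join(text.lower().split())
    return Counter(compact[i:i + 2] for i in range(len(compact) - 1))


def score(question, chunk):
    """Number of question bigrams that also occur in the chunk."""
    q, c = bigrams(question), bigrams(chunk)
    return sum(min(count, c[gram]) for gram, count in q.items())


def top_chunks(question, text, k=3, size=500, overlap=50):
    """Return up to k chunks that share at least one bigram with the question."""
    chunks = split_into_chunks(text, size, overlap)
    scored = [(score(question, chunk), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for s, chunk in scored[:k] if s > 0]
```

Passing only the top-ranked chunks to the model would keep the prompt short even for long documents. An embedding-based index, as planned, would also match paraphrases that share no characters with the question.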
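### Front-end stack
- **HTML5**: page structure
- **CSS3**: styling, including the responsive layout
- **JavaScript**: interaction logic and API calls
### Project structure
```
12/
├── app.py                # Flask application entry point
├── requirements.txt      # Python dependencies
├── Project_Design.md     # Project design document
├── README.md             # Project documentation
├── knowledge_base.db     # SQLite database (generated automatically)
├── templates/
│   └── index.html        # Front-end page template
└── static/
    ├── style.css         # Stylesheet
    └── script.js         # JavaScript
```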
## 🔧 API Endpoints
### Upload a document
```
POST /api/upload
Content-Type: multipart/form-data
Body: file (the document)
Response: { id, name, status }
```
### List documents
```
GET /api/documents
Response: [{ id, name, status, chunks, created_at }]
```
### Delete a document
```
DELETE /api/documents/{doc_id}
Response: { success: true }
```
### Ask a question
```
POST /api/ask
Content-Type: application/json
Body: { question: "question text" }
Response: { question, answer, sources: [{ doc_id, name, page }] }
```
### Conversation history
```
GET /api/conversations
Response: [{ id, question, answer, sources, created_at }]

DELETE /api/conversations
Response: { success: true }
```
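The endpoints above can be exercised from a script as well as from the browser. The following is a minimal client sketch using only the standard library. It assumes the development server address `http://localhost:5000`, and the helper names are illustrative. Building a request needs no running server; `send` does.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # assumption: default development address


def build_ask_request(question, base_url=BASE_URL):
    """Prepare POST /api/ask with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return request.Request(
        f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_delete_request(doc_id, base_url=BASE_URL):
    """Prepare DELETE /api/documents/{doc_id}."""
    return request.Request(f"{base_url}/api/documents/{doc_id}", method="DELETE")


def send(req):
    """Send a prepared request and decode the JSON reply (needs a running server)."""
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `send(build_ask_request("..."))` would return the answer dictionary. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx replies, so the `{ error }` body has to be read from the exception in that case.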
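## 🎨 Interface Features
### Responsive design
- **Desktop** (>1024px): two-column layout, knowledge base on the left and chat on the right
- **Tablet** (768px-1024px): single-column layout with adjusted spacing
- **Mobile** (<768px): full-width display, vertically stacked, with large buttons
### Interaction feedback
- Toast notifications that show the status of each operation
- A character counter that shows the input length
- Loading indicators
- Emoji icons for quick visual recognition
## 📝 Notes
1. **API key**: make sure the DeepSeek API key (`DEEPSEEK_API_KEY`) is configured correctly
2. **Document formats**: PDF, Word (.docx), and TXT are currently supported
3. **Question length**: questions must be at least 3 characters (checked in the browser) and at most 1000 characters (checked by the server); keeping them under 500 characters is recommended
4. **Database**: the conversation history is stored in a local SQLite database
5. **Browser compatibility**: a modern browser such as Chrome, Firefox, or Edge is recommended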
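The length limits are currently split between the browser (minimum of 3 characters in `static/script.js`) and the server (maximum of 1000 characters in `app.py`). A single server-side check could be sketched as follows. `validate_question` is a hypothetical helper, and the English error messages are placeholders.

```python
MIN_LENGTH = 3     # currently enforced only in static/script.js
MAX_LENGTH = 1000  # currently enforced in app.py


def validate_question(raw):
    """Return (question, error); exactly one of the two is None."""
    question = (raw or "").strip()
    if not question:
        return None, "question is empty"
    if len(question) < MIN_LENGTH:
        return None, f"question must be at least {MIN_LENGTH} characters"
    if len(question) > MAX_LENGTH:
        return None, f"question must be at most {MAX_LENGTH} characters"
    return question, None
```

Checking both bounds on the server means that clients which bypass the web page are held to the same rules.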
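## 🚧 Roadmap
- [ ] Integrate LangChain for more capable document processing
- [ ] Build a vector database with ChromaDB
- [ ] Support more document formats (Excel, PPT, and others)
- [ ] Add document preview
- [ ] Add conversation export
- [ ] Add user authentication and permission management
- [ ] Support multilingual Q&A
## 📄 License
MIT License
## 🤝 Contributing
Issues and pull requests are welcome.
## 📧 Contact
For questions or suggestions, get in touch by:
- Opening an issue
- Sending an email to your-email@example.com
---
**Enjoy the convenience of intelligent Q&A!** 🎉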

Binary file not shown.

334
app.py Normal file

@ -0,0 +1,334 @@
import os
import sqlite3
import json
from flask import Flask, render_template, request, jsonify
from werkzeug.utils import secure_filename
import uuid
from datetime import datetime
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'uploads'
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024
app.config['DATABASE'] = 'knowledge_base.db'
DEEPSEEK_API_KEY = os.getenv('DEEPSEEK_API_KEY')
DEEPSEEK_BASE_URL = os.getenv('DEEPSEEK_BASE_URL', 'https://api.deepseek.com')
client = OpenAI(
api_key=DEEPSEEK_API_KEY,
base_url=DEEPSEEK_BASE_URL
)
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'docx'}
os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
documents = {}
def init_db():
    """Create the conversations and documents tables if they do not exist."""
    conn = sqlite3.connect(app.config['DATABASE'])
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS conversations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            question TEXT NOT NULL,
            answer TEXT NOT NULL,
            sources TEXT,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS documents (
            id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            status TEXT NOT NULL,
            chunks INTEGER DEFAULT 0,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    conn.close()


def get_db_connection():
    """Open a connection whose rows can be read like dictionaries."""
    conn = sqlite3.connect(app.config['DATABASE'])
    conn.row_factory = sqlite3.Row
    return conn


def load_documents_from_db():
    """Refresh the in-memory document cache from the documents table."""
    global documents
    conn = get_db_connection()
    docs = conn.execute('SELECT * FROM documents ORDER BY created_at DESC').fetchall()
    conn.close()
    documents = {doc['id']: dict(doc) for doc in docs}


def allowed_file(filename):
    """Accept only file names that end in one of ALLOWED_EXTENSIONS."""
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
def _read_txt(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        return f.read()


def _read_pdf(filepath):
    import pypdf
    with open(filepath, 'rb') as f:
        reader = pypdf.PdfReader(f)
        # Guard against pages that yield no text (e.g. scanned images)
        return ''.join((page.extract_text() or '') + '\n' for page in reader.pages)


def _read_docx(filepath):
    from docx import Document
    doc = Document(filepath)
    return ''.join(paragraph.text + '\n' for paragraph in doc.paragraphs)


def read_document_content(doc_id):
    """Return the text of the uploaded file whose name starts with doc_id, or None."""
    try:
        for file in os.listdir(app.config['UPLOAD_FOLDER']):
            if not file.startswith(doc_id):
                continue
            filepath = os.path.join(app.config['UPLOAD_FOLDER'], file)
            lower = file.lower()
            # Choose the reader from the file extension
            if lower.endswith('.txt'):
                return _read_txt(filepath)
            if lower.endswith('.pdf'):
                return _read_pdf(filepath)
            if lower.endswith('.docx'):
                return _read_docx(filepath)
            # No usable extension (secure_filename() can strip it from a
            # non-ASCII name): try docx, then txt, then pdf
            for reader in (_read_docx, _read_txt, _read_pdf):
                try:
                    text = reader(filepath)
                    if text.strip():
                        return text
                except Exception:
                    continue
        return None
    except Exception as e:
        print(f"Error reading document: {e}")
        import traceback
        traceback.print_exc()
        return None
@app.route('/')
def index():
    load_documents_from_db()
    return render_template('index.html')


@app.route('/api/upload', methods=['POST'])
def upload_document():
    try:
        if 'file' not in request.files:
            return jsonify({'error': '没有文件'}), 400
        file = request.files['file']
        if file.filename == '':
            return jsonify({'error': '没有选择文件'}), 400
        if file and allowed_file(file.filename):
            doc_id = str(uuid.uuid4())
            # secure_filename() drops non-ASCII characters and can swallow the
            # extension (e.g. "文档.docx" becomes "docx"), so sanitize the stem
            # and re-attach the extension that allowed_file() already validated.
            stem, ext = file.filename.rsplit('.', 1)
            filename = f"{secure_filename(stem) or 'document'}.{ext.lower()}"
            filepath = os.path.join(app.config['UPLOAD_FOLDER'], f"{doc_id}_{filename}")
            file.save(filepath)
            conn = get_db_connection()
            conn.execute(
                'INSERT INTO documents (id, name, status, chunks) VALUES (?, ?, ?, ?)',
                (doc_id, filename, 'completed', 1)
            )
            conn.commit()
            conn.close()
            load_documents_from_db()
            return jsonify({
                'id': doc_id,
                'name': filename,
                'status': 'completed'
            })
        return jsonify({'error': '不支持的文件格式'}), 400
    except Exception as e:
        return jsonify({'error': f'上传失败:{str(e)}'}), 500
@app.route('/api/ask', methods=['POST'])
def ask_question():
    try:
        # get_json(silent=True) returns None instead of raising on a bad body
        data = request.get_json(silent=True) or {}
        question = data.get('question', '')
        if not question or not question.strip():
            return jsonify({'error': '请输入问题'}), 400
        if len(question) > 1000:
            return jsonify({'error': '问题长度不能超过1000字'}), 400
        load_documents_from_db()
        if not documents:
            return jsonify({'error': '请先上传文档'}), 400
        context_parts = []
        sources = []
        for doc_id, doc_info in documents.items():
            if doc_info['status'] == 'completed':
                content = read_document_content(doc_id)
                if content:
                    # Only the first 3000 characters of each document are sent
                    context_parts.append(f"文档:{doc_info['name']}\n内容:{content[:3000]}")
                    sources.append({
                        'doc_id': doc_id,
                        'name': doc_info['name'],
                        'page': 1  # placeholder: page numbers are not tracked yet
                    })
        if not context_parts:
            return jsonify({'error': '没有可用的文档内容'}), 400
        context = '\n\n'.join(context_parts)
        system_prompt = """你是一个智能知识库问答助手。请基于提供的文档内容回答用户的问题。
要求:
1. 只使用文档中的信息回答问题
2. 如果文档中没有相关信息,请明确说明
3. 回答要准确、简洁、有条理
4. 使用中文回答"""
        user_prompt = f"""文档内容:
{context}
用户问题:{question}
请基于以上文档内容回答用户的问题"""
        try:
            response = client.chat.completions.create(
                model="deepseek-chat",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                temperature=0.7,
                max_tokens=2000
            )
            answer = response.choices[0].message.content
            result = {
                'question': question,
                'answer': answer,
                'sources': sources
            }
            conn = get_db_connection()
            conn.execute(
                'INSERT INTO conversations (question, answer, sources) VALUES (?, ?, ?)',
                (question, result['answer'], json.dumps(result['sources']))
            )
            conn.commit()
            conn.close()
            return jsonify(result)
        except Exception as api_error:
            print(f"DeepSeek API Error: {api_error}")
            return jsonify({'error': f'AI服务暂时不可用:{str(api_error)}'}), 500
    except Exception as e:
        return jsonify({'error': f'回答问题时出错:{str(e)}'}), 500
@app.route('/api/documents', methods=['GET'])
def get_documents():
    try:
        load_documents_from_db()
        return jsonify(list(documents.values()))
    except Exception as e:
        return jsonify({'error': f'获取文档列表失败:{str(e)}'}), 500


@app.route('/api/documents/<doc_id>', methods=['DELETE'])
def delete_document(doc_id):
    try:
        conn = get_db_connection()
        cursor = conn.execute('DELETE FROM documents WHERE id = ?', (doc_id,))
        if cursor.rowcount == 0:
            conn.close()
            return jsonify({'error': '文档不存在'}), 404
        conn.commit()
        conn.close()
        # Also remove the uploaded file so it does not linger on disk
        for file in os.listdir(app.config['UPLOAD_FOLDER']):
            if file.startswith(doc_id):
                os.remove(os.path.join(app.config['UPLOAD_FOLDER'], file))
        load_documents_from_db()
        return jsonify({'success': True})
    except Exception as e:
        return jsonify({'error': f'删除文档失败:{str(e)}'}), 500


@app.route('/api/conversations', methods=['GET'])
def get_conversations():
    try:
        conn = get_db_connection()
        conversations = conn.execute(
            'SELECT * FROM conversations ORDER BY created_at DESC LIMIT 50'
        ).fetchall()
        conn.close()
        result = []
        for conv in conversations:
            conv_dict = dict(conv)
            # sources is stored as a JSON string; decode it for the client
            conv_dict['sources'] = json.loads(conv_dict['sources']) if conv_dict['sources'] else []
            result.append(conv_dict)
        return jsonify(result)
    except Exception as e:
        return jsonify({'error': f'获取对话历史失败:{str(e)}'}), 500


@app.route('/api/conversations', methods=['DELETE'])
def clear_conversations():
    try:
        conn = get_db_connection()
        conn.execute('DELETE FROM conversations')
        conn.commit()
        conn.close()
        return jsonify({'success': True})
    except Exception as e:
        return jsonify({'error': f'清除对话历史失败:{str(e)}'}), 500


if __name__ == '__main__':
    init_db()
    load_documents_from_db()
    app.run(debug=True, port=5000)

27
check_db.py Normal file

@ -0,0 +1,27 @@
import sqlite3
import os

conn = sqlite3.connect('knowledge_base.db')
cursor = conn.cursor()

print("=== 数据库中的文档 ===")
cursor.execute('SELECT * FROM documents')
docs = cursor.fetchall()
if docs:
    for row in docs:
        print(f"ID: {row[0]}, Name: {row[1]}, Status: {row[2]}, Chunks: {row[3]}")
else:
    print("数据库中没有文档")

print("\n=== uploads 文件夹中的文件 ===")
if os.path.exists('uploads'):
    files = os.listdir('uploads')
    if files:
        for f in files:
            print(f"文件: {f}")
    else:
        print("uploads 文件夹为空")
else:
    print("uploads 文件夹不存在")

conn.close()

34
cleanup_db.py Normal file

@ -0,0 +1,34 @@
import sqlite3
import os

conn = sqlite3.connect('knowledge_base.db')
cursor = conn.cursor()

print("=== 清理无效的文档记录 ===\n")
cursor.execute('SELECT * FROM documents')
docs = cursor.fetchall()
for doc in docs:
    doc_id, name, status, chunks, created_at = doc
    print(f"检查文档: ID={doc_id}, Name={name}, Status={status}")
    found = False
    if os.path.exists('uploads'):
        for file in os.listdir('uploads'):
            if file.startswith(doc_id):
                print(f" ✓ 找到文件: {file}")
                found = True
                break
    if not found:
        print(f" ✗ 未找到文件,删除记录")
        cursor.execute('DELETE FROM documents WHERE id = ?', (doc_id,))
    else:
        print(f" ✓ 保留记录")
    print()

conn.commit()
conn.close()
print("=== 清理完成 ===")

BIN
knowledge_base.db Normal file

Binary file not shown.

8
requirements.txt Normal file

@ -0,0 +1,8 @@
flask==3.0.0
openai>=1.50.0
python-dotenv==1.0.0
langchain==0.1.0
langchain-openai==0.0.2
pypdf==3.17.4
python-docx==1.1.0
chromadb==0.4.22

272
static/script.js Normal file

@ -0,0 +1,272 @@
const uploadArea = document.getElementById('upload-area');
const fileInput = document.getElementById('file-input');
const documentList = document.getElementById('document-list');
const chatMessages = document.getElementById('chat-messages');
const questionInput = document.getElementById('question-input');
const docCount = document.getElementById('doc-count');
const charCount = document.getElementById('char-count');
const toast = document.getElementById('toast');
function showToast(message, duration = 3000) {
toast.textContent = message;
toast.classList.add('show');
setTimeout(() => {
toast.classList.remove('show');
}, duration);
}
uploadArea.addEventListener('click', () => fileInput.click());
uploadArea.addEventListener('dragover', (e) => {
e.preventDefault();
uploadArea.style.borderColor = '#1e3c72';
uploadArea.style.background = '#e8f0fe';
});
uploadArea.addEventListener('dragleave', () => {
uploadArea.style.borderColor = '#cbd5e0';
uploadArea.style.background = 'transparent';
});
uploadArea.addEventListener('drop', (e) => {
e.preventDefault();
uploadArea.style.borderColor = '#cbd5e0';
uploadArea.style.background = 'transparent';
const files = e.dataTransfer.files;
if (files.length > 0) {
uploadFile(files[0]);
}
});
fileInput.addEventListener('change', (e) => {
if (e.target.files.length > 0) {
uploadFile(e.target.files[0]);
}
});
async function uploadFile(file) {
const formData = new FormData();
formData.append('file', file);
try {
showToast('⏳ 正在上传文档...');
const response = await fetch('/api/upload', {
method: 'POST',
body: formData
});
const data = await response.json();
if (data.error) {
showToast(`${data.error}`);
return;
}
loadDocuments();
showToast(`✅ 文档 "${data.name}" 上传成功,正在处理中...`);
addMessage('bot', `📄 文档 "${data.name}" 已上传,系统正在智能解析文档内容...`);
setTimeout(() => {
loadDocuments();
}, 2000);
} catch (error) {
showToast('❌ 上传失败,请重试');
console.error(error);
}
}
async function loadDocuments() {
try {
const response = await fetch('/api/documents');
const documents = await response.json();
docCount.textContent = `${documents.length} 个文档`;
if (documents.length === 0) {
documentList.innerHTML = `
<div class="empty-state">
<div class="empty-icon">📭</div>
<p>暂无文档</p>
<p class="empty-hint">上传文档后即可开始问答</p>
</div>
`;
return;
}
documentList.innerHTML = documents.map(doc => `
<div class="document-item">
<div class="document-info">
<div class="document-name">📄 ${doc.name}</div>
<div class="document-status ${doc.status}">
${doc.status === 'processing' ? '⏳ 处理中...' : '✅ 已完成'}
</div>
</div>
<button class="delete-btn" onclick="deleteDocument('${doc.id}')">🗑 删除</button>
</div>
`).join('');
} catch (error) {
console.error(error);
showToast('❌ 加载文档列表失败');
}
}
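async function deleteDocument(docId) {
    if (!confirm('确定要删除这个文档吗?')) {
        return;
    }
    try {
        showToast('⏳ 正在删除文档...');
        const response = await fetch(`/api/documents/${docId}`, {
            method: 'DELETE'
        });
        const data = await response.json();
        // The server answers 404/500 with { error }; do not report success blindly
        if (!response.ok || data.error) {
            showToast(`❌ ${data.error || '删除失败,请重试'}`);
            return;
        }
        loadDocuments();
        showToast('✅ 文档已删除');
        addMessage('bot', '🗑️ 文档已从知识库中删除');
    } catch (error) {
        showToast('❌ 删除失败,请重试');
        console.error(error);
    }
}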
// Escape text before it is placed into innerHTML
function escapeHtml(text) {
    const div = document.createElement('div');
    div.textContent = String(text);
    return div.innerHTML;
}

// Build the HTML for a bot answer and its list of sources
function renderAnswer(answer, sources) {
    let answerHtml = `<p>${escapeHtml(answer)}</p>`;
    if (sources && sources.length > 0) {
        answerHtml += `
            <div class="sources">
                <div class="sources-title">📚 参考来源</div>
                ${sources.map(source => `
                    <div class="source-item">📄 ${escapeHtml(source.name)} (第${escapeHtml(source.page)}页)</div>
                `).join('')}
            </div>
        `;
    }
    return answerHtml;
}

async function askQuestion() {
    const question = questionInput.value.trim();
    if (!question) {
        showToast('⚠️ 请输入问题');
        questionInput.focus();
        return;
    }
    if (question.length < 3) {
        showToast('⚠️ 问题太短,请输入至少3个字符');
        questionInput.focus();
        return;
    }
    // The question is user input, so escape it before rendering
    addMessage('user', escapeHtml(question));
    questionInput.value = '';
    updateCharCount();
    try {
        showToast('🤖 正在思考中...');
        const response = await fetch('/api/ask', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({ question })
        });
        const data = await response.json();
        if (data.error) {
            showToast(`${data.error}`);
            addMessage('bot', `❌ 抱歉,${escapeHtml(data.error)}`);
            return;
        }
        showToast('✅ 回答完成');
        addMessage('bot', renderAnswer(data.answer, data.sources));
    } catch (error) {
        showToast('❌ 回答问题时出错了,请重试');
        addMessage('bot', '❌ 抱歉,回答问题时出错了,请重试');
        console.error(error);
    }
}

// content is inserted as HTML; callers must escape untrusted text first
function addMessage(type, content) {
    const messageDiv = document.createElement('div');
    messageDiv.className = `message ${type}`;
    messageDiv.innerHTML = `
        <div class="message-content">
            ${content}
        </div>
    `;
    chatMessages.appendChild(messageDiv);
    chatMessages.scrollTop = chatMessages.scrollHeight;
}

function updateCharCount() {
    const length = questionInput.value.length;
    // Match the "0/1000" format used in index.html
    charCount.textContent = `${length}/1000`;
    if (length > 500) {
        charCount.style.color = '#e53e3e';
    } else if (length > 300) {
        charCount.style.color = '#d69e2e';
    } else {
        charCount.style.color = '#718096';
    }
}

async function loadConversationHistory() {
    try {
        const response = await fetch('/api/conversations');
        const conversations = await response.json();
        if (conversations.length === 0) {
            addMessage('bot', '👋 欢迎使用智能知识库问答系统!上传文档后,您可以向我提问任何与文档相关的问题。');
            return;
        }
        // The API returns the newest first; show the oldest first in the chat
        conversations.slice().reverse().forEach(conv => {
            addMessage('user', escapeHtml(conv.question));
            addMessage('bot', renderAnswer(conv.answer, conv.sources));
        });
    } catch (error) {
        console.error(error);
        addMessage('bot', '👋 欢迎使用智能知识库问答系统!上传文档后,您可以向我提问任何与文档相关的问题。');
    }
}

// Called by the "清空" button in index.html
async function clearHistory() {
    if (!confirm('确定要清空对话历史吗?')) {
        return;
    }
    try {
        const response = await fetch('/api/conversations', { method: 'DELETE' });
        const data = await response.json();
        if (!response.ok || data.error) {
            showToast(`❌ ${data.error || '清空失败,请重试'}`);
            return;
        }
        chatMessages.innerHTML = '';
        showToast('✅ 对话历史已清空');
    } catch (error) {
        showToast('❌ 清空失败,请重试');
        console.error(error);
    }
}
questionInput.addEventListener('input', updateCharCount);
questionInput.addEventListener('keydown', (e) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
askQuestion();
}
});
loadDocuments();
loadConversationHistory();

490
static/style.css Normal file

@ -0,0 +1,490 @@
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
min-height: 100vh;
padding: 20px;
}
.container {
max-width: 1400px;
margin: 0 auto;
background: white;
border-radius: 20px;
padding: 30px;
box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
}
.header {
text-align: center;
margin-bottom: 30px;
}
h1 {
color: #1e3c72;
margin-bottom: 10px;
font-size: 32px;
}
.subtitle {
color: #666;
font-size: 15px;
}
.main-content {
display: grid;
grid-template-columns: 380px 1fr;
gap: 25px;
}
.left-panel,
.right-panel {
display: flex;
flex-direction: column;
gap: 20px;
}
.panel-section {
background: #f8f9fa;
border-radius: 15px;
padding: 25px;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.panel-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 20px;
}
.panel-header h2 {
font-size: 20px;
color: #1e3c72;
margin: 0;
}
.doc-count {
background: #1e3c72;
color: white;
padding: 5px 12px;
border-radius: 20px;
font-size: 13px;
font-weight: 600;
}
.upload-area {
border: 3px dashed #cbd5e0;
border-radius: 15px;
padding: 50px 30px;
text-align: center;
cursor: pointer;
transition: all 0.3s;
background: white;
}
.upload-area:hover {
border-color: #1e3c72;
background: #e8f0fe;
transform: translateY(-2px);
box-shadow: 0 4px 12px rgba(30, 60, 114, 0.15);
}
.upload-icon {
font-size: 56px;
margin-bottom: 15px;
}
.upload-text {
color: #333;
font-size: 16px;
font-weight: 600;
margin-bottom: 8px;
}
.upload-hint {
font-size: 13px;
color: #666;
}
.document-list {
max-height: 450px;
overflow-y: auto;
}
.document-item {
background: white;
border-radius: 12px;
padding: 18px;
margin-bottom: 12px;
display: flex;
justify-content: space-between;
align-items: center;
box-shadow: 0 2px 6px rgba(0, 0, 0, 0.1);
transition: transform 0.2s, box-shadow 0.2s;
}
.document-item:hover {
transform: translateY(-2px);
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.15);
}
.document-info {
flex: 1;
}
.document-name {
font-weight: 600;
color: #333;
margin-bottom: 8px;
font-size: 15px;
}
.document-status {
font-size: 13px;
color: #666;
display: flex;
align-items: center;
gap: 5px;
}
.document-status.processing {
color: #f59e0b;
}
.document-status.completed {
color: #10b981;
}
.delete-btn {
background: #ef4444;
color: white;
border: none;
border-radius: 8px;
padding: 8px 16px;
cursor: pointer;
font-size: 13px;
font-weight: 600;
transition: all 0.3s;
}
.delete-btn:hover {
background: #dc2626;
transform: scale(1.05);
}
.empty-state {
text-align: center;
color: #999;
padding: 60px 30px;
}
.empty-icon {
font-size: 64px;
margin-bottom: 15px;
}
.empty-state p {
margin-bottom: 5px;
}
.empty-hint {
font-size: 13px;
color: #bbb;
}
.chat-container {
display: flex;
flex-direction: column;
height: 650px;
}
.chat-messages {
flex: 1;
overflow-y: auto;
padding: 25px;
background: white;
border-radius: 15px;
margin-bottom: 20px;
box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.05);
}
.message {
margin-bottom: 25px;
animation: fadeIn 0.3s ease-in;
}
@keyframes fadeIn {
from {
opacity: 0;
transform: translateY(10px);
}
to {
opacity: 1;
transform: translateY(0);
}
}
.message.user {
display: flex;
justify-content: flex-end;
}
.message.bot {
display: flex;
justify-content: flex-start;
}
.message-content {
max-width: 80%;
padding: 18px;
border-radius: 15px;
line-height: 1.7;
font-size: 15px;
}
.message.user .message-content {
background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
color: white;
box-shadow: 0 4px 12px rgba(30, 60, 114, 0.3);
}
.message.bot .message-content {
background: #f0f0f0;
color: #333;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.sources {
margin-top: 15px;
padding-top: 15px;
border-top: 2px solid #e0e0e0;
}
.sources-title {
font-size: 13px;
color: #666;
margin-bottom: 8px;
font-weight: 600;
}
.source-item {
font-size: 13px;
color: #1e3c72;
margin-bottom: 5px;
padding: 5px 10px;
background: #e8f0fe;
border-radius: 6px;
display: inline-block;
}
.chat-input {
display: flex;
gap: 15px;
}
.input-wrapper {
flex: 1;
position: relative;
}
.chat-input textarea {
width: 100%;
padding: 18px;
padding-right: 100px;
border: 2px solid #e0e0e0;
border-radius: 15px;
resize: none;
font-size: 15px;
font-family: inherit;
min-height: 80px;
transition: all 0.3s;
}
.chat-input textarea:focus {
outline: none;
border-color: #1e3c72;
box-shadow: 0 0 0 3px rgba(30, 60, 114, 0.1);
}
.char-count {
position: absolute;
bottom: 10px;
right: 15px;
font-size: 12px;
color: #999;
background: white;
padding: 3px 8px;
border-radius: 10px;
}
.send-btn {
padding: 18px 35px;
background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
color: white;
border: none;
border-radius: 15px;
cursor: pointer;
font-weight: 600;
font-size: 16px;
transition: all 0.3s;
display: flex;
align-items: center;
gap: 8px;
box-shadow: 0 4px 12px rgba(30, 60, 114, 0.3);
}
.send-btn:hover {
transform: translateY(-2px);
box-shadow: 0 6px 16px rgba(30, 60, 114, 0.4);
}
.send-btn:active {
transform: translateY(0);
}
.send-btn:disabled {
background: #ccc;
cursor: not-allowed;
transform: none;
box-shadow: none;
}
.btn-icon {
font-size: 18px;
}
.clear-btn {
background: #f59e0b;
color: white;
border: none;
border-radius: 8px;
padding: 8px 16px;
cursor: pointer;
font-size: 13px;
font-weight: 600;
transition: all 0.3s;
}
.clear-btn:hover {
background: #d97706;
transform: scale(1.05);
}
.toast {
position: fixed;
top: 20px;
right: 20px;
padding: 15px 25px;
background: #333;
color: white;
border-radius: 10px;
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.3);
transform: translateX(400px);
transition: transform 0.3s ease-out;
z-index: 1000;
font-weight: 600;
}
.toast.show {
transform: translateX(0);
}
.toast.success {
background: #10b981;
}
.toast.error {
background: #ef4444;
}
.toast.info {
background: #3b82f6;
}
@media (max-width: 1024px) {
.main-content {
grid-template-columns: 1fr;
}
.chat-container {
height: 550px;
}
h1 {
font-size: 28px;
}
}
@media (max-width: 768px) {
body {
padding: 10px;
}
.container {
padding: 20px;
border-radius: 15px;
}
h1 {
font-size: 24px;
}
.subtitle {
font-size: 13px;
}
.panel-section {
padding: 20px;
}
.chat-container {
height: 500px;
}
.chat-input {
flex-direction: column;
}
.send-btn {
width: 100%;
justify-content: center;
}
.upload-area {
padding: 40px 20px;
}
.upload-icon {
font-size: 48px;
}
}
@media (max-width: 480px) {
.container {
padding: 15px;
}
h1 {
font-size: 20px;
}
.panel-section {
padding: 15px;
}
.message-content {
max-width: 90%;
font-size: 14px;
padding: 15px;
}
.chat-messages {
padding: 15px;
}
}

80
templates/index.html Normal file

@ -0,0 +1,80 @@
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>智能知识库问答</title>
<link rel="stylesheet" href="/static/style.css">
</head>
<body>
<div class="container">
<div class="header">
<h1>🧠 智能知识库问答</h1>
<p class="subtitle">✨ 企业/课程 · 自有文档精准问答,减少人工答疑</p>
</div>
<div class="main-content">
<div class="left-panel">
<div class="panel-section">
<h2>📁 文档上传</h2>
<div class="upload-area" id="upload-area">
<input type="file" id="file-input" accept=".txt,.pdf,.docx" hidden>
<div class="upload-icon">📤</div>
<p class="upload-text">点击或拖拽文件到此处</p>
<p class="upload-hint">💡 支持 PDF、Word、TXT 格式(最大 16MB)</p>
</div>
</div>
<div class="panel-section">
<div class="panel-header">
<h2>📚 知识库</h2>
<span class="doc-count" id="doc-count">0 个文档</span>
</div>
<div class="document-list" id="document-list">
<div class="empty-state">
<div class="empty-icon">📭</div>
<p>暂无文档</p>
<p class="empty-hint">上传文档后即可开始问答</p>
</div>
</div>
</div>
</div>
<div class="right-panel">
<div class="panel-section">
<div class="panel-header">
<h2>💬 智能问答</h2>
<button class="clear-btn" onclick="clearHistory()" title="清除对话历史">🗑️ 清空</button>
</div>
<div class="chat-container">
<div class="chat-messages" id="chat-messages">
<div class="message bot">
<div class="message-content">
<p>👋 您好!我是您的智能知识库助手。</p>
<p>📋 请先上传文档,然后我可以基于文档内容回答您的问题。</p>
<p>🎯 支持精准检索,减少人工答疑时间!</p>
</div>
</div>
</div>
<div class="chat-input">
<div class="input-wrapper">
<textarea id="question-input" placeholder="💭 请输入您的问题...(按 Enter 发送,Shift+Enter 换行)" maxlength="1000"></textarea>
<div class="char-count" id="char-count">0/1000</div>
</div>
<button class="send-btn" onclick="askQuestion()" id="send-btn">
<span class="btn-text">发送</span>
<span class="btn-icon">🚀</span>
</button>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="toast" id="toast"></div>
<script src="/static/script.js"></script>
</body>
</html>

23
test_document_read.py Normal file

@ -0,0 +1,23 @@
import sys
import os

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from app import read_document_content

doc_id = '3cac70d6-cd02-493e-a08a-c7794927e7b4'
print(f"=== 测试文档读取功能 ===\n")
print(f"文档ID: {doc_id}\n")
content = read_document_content(doc_id)
if content:
    print(f"✓ 成功读取文档内容")
    print(f"内容长度: {len(content)} 字符")
    print(f"\n前200字符:")
    print(content[:200])
    print("\n✓ 文档读取功能正常!")
else:
    print("✗ 未能读取文档内容")
    print("请检查文件是否存在或格式是否正确")

17
test_output.txt Normal file

@ -0,0 +1,17 @@
=== 测试文档读取功能 ===
文件: 3cac70d6-cd02-493e-a08a-c7794927e7b4_docx
文件: 77320887-f79f-48a0-aa20-bea2d9c1f5af_SIT.docx
类型: DOCX
段落数: 30
内容长度: 336 字符
前100字符: 《长河入海时——致SIT七十一周年》
【第一篇章:长河溯源】
你从1956年的晨光中启航
一捧夯土,铸成应用之学的堤岸。
工程卷轴在黄浦江畔舒展,
墨迹里游动着钢铁与代码的基因链。
【第二篇章:...

39
test_read.py Normal file
View File

@ -0,0 +1,39 @@
import os
from docx import Document

uploads_folder = 'uploads'
print("=== 测试文档读取功能 ===\n")
for file in os.listdir(uploads_folder):
    filepath = os.path.join(uploads_folder, file)
    print(f"文件: {file}")
    if file.endswith('.txt'):
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        print(f"类型: TXT")
        print(f"内容长度: {len(content)} 字符")
        print(f"前100字符: {content[:100]}...\n")
    elif file.endswith('.pdf'):
        import pypdf
        with open(filepath, 'rb') as f:
            reader = pypdf.PdfReader(f)
            text = ''
            for page in reader.pages:
                text += page.extract_text() + '\n'
            print(f"类型: PDF")
            print(f"页数: {len(reader.pages)}")
            print(f"内容长度: {len(text)} 字符")
            print(f"前100字符: {text[:100]}...\n")
    elif file.endswith('.docx'):
        doc = Document(filepath)
        text = ''
        for paragraph in doc.paragraphs:
            text += paragraph.text + '\n'
        print(f"类型: DOCX")
        print(f"段落数: {len(doc.paragraphs)}")
        print(f"内容长度: {len(text)} 字符")
        print(f"前100字符: {text[:100]}...\n")

Binary file not shown.

Binary file not shown.