# LangChain Ollama API

An AI service API built on LangChain and a locally running Ollama large language model.

## Features

- Inference with a locally hosted Ollama large language model
- Prompt chains built on the LangChain framework
- Flask API service endpoints
- Support for custom prompt templates
- Fully local deployment, no external API required
- Complete API documentation (Swagger UI)
- JSON-formatted logging

## Requirements

- Python 3.8+
- Conda
- Ollama service (running locally)
- Qwen2.5 model (installed via Ollama)

## Installation

1. Clone the repository and enter the project directory:

```bash
git clone <repository-url>
cd <project-directory>
```

2. Create and activate a Conda environment:

```bash
# Create the environment
conda create -n langchain-ollama python=3.8
# Activate the environment
conda activate langchain-ollama
```

3. Install the dependencies:

```bash
pip install -r requirements.txt
```

4. Install the Qwen2.5 model (if not already installed):

```bash
ollama pull qwen2.5:latest
```

5. Configure environment variables:

Create a `.env` file and set the following variables (optional; defaults are provided):

```env
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_MODEL=qwen2.5:latest
FLASK_HOST=0.0.0.0
FLASK_PORT=5000
FLASK_DEBUG=False
MAX_TOKENS=2048
TEMPERATURE=0.7
```
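
For illustration, these variables could be read in Python with the same defaults. This is a hedged sketch, not the project's actual config module; the variable names simply mirror the `.env` keys listed above:

```python
import os

# Read each setting from the environment, falling back to the documented default.
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "qwen2.5:latest")
FLASK_HOST = os.getenv("FLASK_HOST", "0.0.0.0")
FLASK_PORT = int(os.getenv("FLASK_PORT", "5000"))
FLASK_DEBUG = os.getenv("FLASK_DEBUG", "False").lower() == "true"
MAX_TOKENS = int(os.getenv("MAX_TOKENS", "2048"))
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.7"))
```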

## Running the Service

### Development

1. Make sure the Conda environment is activated:

```bash
conda activate langchain-ollama
```

2. Make sure the Ollama service is started and running locally

3. Start the development server:

```bash
python app.py
```

4. Open the API documentation:
- Swagger UI: http://localhost:5000/docs
- ReDoc: http://localhost:5000/redoc

### Production

1. Install a production WSGI server:

```bash
pip install gunicorn
```

2. Start the service with Gunicorn:

```bash
# Basic startup
gunicorn -w 4 -b 0.0.0.0:5000 app:app

# Start with a config file (recommended)
gunicorn -c gunicorn.conf.py app:app
```

3. Create the Gunicorn config file `gunicorn.conf.py`:

```python
# Number of worker processes
workers = 4
# Worker class
worker_class = 'sync'
# Bind address
bind = '0.0.0.0:5000'
# Request timeout (seconds)
timeout = 120
# Restart a worker after it has handled this many requests
max_requests = 1000
# Random jitter added to max_requests to avoid simultaneous restarts
max_requests_jitter = 50
# Access log
accesslog = 'access.log'
# Error log
errorlog = 'error.log'
# Log level
loglevel = 'info'
```

4. Manage the service with systemd (Linux):

```ini
# /etc/systemd/system/langchain-ollama.service
[Unit]
Description=LangChain Ollama API Service
After=network.target

[Service]
User=your_user
Group=your_group
WorkingDirectory=/path/to/your/app
Environment="PATH=/path/to/your/conda/env/bin"
ExecStart=/path/to/your/conda/env/bin/gunicorn -c gunicorn.conf.py app:app
Restart=always

[Install]
WantedBy=multi-user.target
```

## API Endpoints

### Health Check
- GET `/api/v1/health`
- Returns the service status

### Chat
- POST `/api/v1/chat`
- Request body:

```json
{
    "question": "your question"
}
```

- Response:

```json
{
    "question": "the original question",
    "answer": "the AI's answer"
}
```
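
As an illustration, the chat endpoint can be called from Python using only the standard library. The helper names below are hypothetical, and the service is assumed to be running on the default host and port:

```python
import json
import urllib.request


def build_chat_request(question: str,
                       base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build a POST request carrying the JSON body the chat endpoint expects."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:5000") -> dict:
    """Send the question and return the parsed JSON response."""
    req = build_chat_request(question, base_url)
    # Local model inference can be slow, so allow a generous timeout.
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))
```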

## Logging

The service writes logs in JSON format, including:
- Timestamp
- Log level
- File name and line number
- Function name
- Log message
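
A minimal sketch of a formatter that emits one JSON line per record with the fields listed above. This illustrates the log shape only; the project's actual logging setup may differ:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "file": record.filename,
            "line": record.lineno,
            "function": record.funcName,
            "message": record.getMessage(),
        }, ensure_ascii=False)
```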

## Notes

1. Make sure the Ollama service is correctly installed and running
2. The qwen2.5:latest model is used by default; this can be changed via the DEFAULT_MODEL environment variable
3. Set an appropriate temperature and max-token limit in production
4. When using a Conda environment, activate it before every run
5. The development server is for testing only; deploy with Gunicorn in production

## Custom Prompt Chains

You can create a custom prompt chain by subclassing `BaseChain`. Example:

```python
from chains.base_chain import BaseChain

class CustomChain(BaseChain):
    def __init__(self, model_name="qwen2.5:latest", temperature=0.7):
        super().__init__(model_name, temperature)
        self.chain = self.create_chain(
            template="your prompt template",
            input_variables=["your_input_variable"]
        )
```

## License

MIT