Training Your Own Model on Top of OpenAI's Large Models: fine_tunes
2023-7-28|2023-7-28
夜火/xloong
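The title says "training your own large model on top of a large model", but this is really just fine-tuning with fine_tunes.
This post is more stream-of-consciousness than usual and is meant mostly as notes. The commands recorded below are all run from the command line as a technical proof of concept; none of it is Python code.
If your English is good enough, you can go straight to the official documentation: https://platform.openai.com/docs/guides/fine-tuning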
Preparation
Install openai
pip install --upgrade openai
Set the OpenAI API key
Linux
export OPENAI_API_KEY="<OPENAI_API_KEY>"
Windows
setx OPENAI_API_KEY "<OPENAI_API_KEY>"
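Before moving on, it can help to confirm that the key is actually visible to a new process (on Windows, `setx` only affects terminals opened afterwards). A minimal Python sketch of such a check follows; the helper names `load_api_key` and `mask` are made up for illustration and are not part of the openai tooling.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set in this shell")
    return key


def mask(key):
    """Keep only the first three characters so the key can be printed safely."""
    return key[:3] + "***"


if __name__ == "__main__":
    try:
        print("key found:", mask(load_api_key()))
    except RuntimeError as err:
        print(err)
```

If this reports that the variable is not set, open a fresh terminal and try again before suspecting the key itself.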
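Upload the training file (optional)
This step can be skipped: as it turns out, the fine_tunes.create command used later can upload a file from the current directory directly.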
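These notes never show what file.jsonl contains. For the legacy fine_tunes endpoint, each line of the training file is one JSON object with a `prompt` and a `completion` string. The sketch below builds and checks such a file; the sample rows, the `sample.jsonl` file name, and the helper names `to_jsonl` and `check_jsonl` are all assumptions for illustration.

```python
import json

# Made-up sample rows, for illustration only.
EXAMPLES = [
    {"prompt": "Who wrote this post? ->", "completion": " xloong\n"},
    {"prompt": "What is this post about? ->", "completion": " fine_tunes\n"},
]


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def check_jsonl(text):
    """Return the number of valid rows; raise ValueError on a malformed one."""
    count = 0
    for lineno, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. the trailing one
        row = json.loads(line)  # JSONDecodeError is a subclass of ValueError
        if not isinstance(row, dict):
            raise ValueError(f"line {lineno}: expected a JSON object")
        for field in ("prompt", "completion"):
            if not isinstance(row.get(field), str):
                raise ValueError(f"line {lineno}: missing string field {field!r}")
        count += 1
    return count


if __name__ == "__main__":
    text = to_jsonl(EXAMPLES)
    with open("sample.jsonl", "w", encoding="utf-8") as f:
        f.write(text)
    print(check_jsonl(text), "rows written to sample.jsonl")
```

The legacy CLI also shipped a helper, `openai tools fine_tunes.prepare_data -f <LOCAL_FILE>`, that inspects a file and suggests fixes, which is probably the better option for anything beyond a toy file.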
GPT: list all uploaded files
curl https://api.openai.com/v1/files -H "Authorization: Bearer sk-***"
or, in Windows cmd:
set OPENAI_API_KEY=sk-***
curl https://api.openai.com/v1/files -H "Authorization: Bearer %OPENAI_API_KEY%"
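For comparison, the same GET request to the files endpoint can be prepared with Python's standard library. This is only a sketch: the function name `build_list_files_request` is hypothetical, and the request is constructed but deliberately not sent, so nothing here touches the network.

```python
import os
import urllib.request

FILES_URL = "https://api.openai.com/v1/files"  # same endpoint as the curl call


def build_list_files_request(api_key):
    """Build (but do not send) the GET request that lists uploaded files."""
    return urllib.request.Request(
        FILES_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


if __name__ == "__main__":
    req = build_list_files_request(os.environ.get("OPENAI_API_KEY", "sk-***"))
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
```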
GPT: upload a file
curl https://api.openai.com/v1/files -H "Authorization: Bearer sk-***" -F purpose="fine-tune" -F file="@file.jsonl"
Response
{
  "object": "file",
  "id": "file-***",
  "purpose": "fine-tune",
  "filename": "file.jsonl",
  "bytes": 281,
  "created_at": 168*******,
  "status": "uploaded",
  "status_details": null
}
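The `id` field in this response is what gets passed to fine_tunes.create with `-t` later on. A small sketch of pulling it out of the JSON is shown below; the sample response uses a made-up ID and timestamp (the real ones are masked in these notes), and `training_file_id` is a hypothetical helper name.

```python
import json

# Made-up stand-in for the upload response; the ID and timestamp are invented.
SAMPLE_RESPONSE = """
{"object": "file", "id": "file-abc123", "purpose": "fine-tune",
 "filename": "file.jsonl", "bytes": 281, "created_at": 1690000000,
 "status": "uploaded", "status_details": null}
"""


def training_file_id(response_text):
    """Return the file ID from an upload response meant for fine-tuning."""
    data = json.loads(response_text)
    if data.get("object") != "file" or data.get("purpose") != "fine-tune":
        raise ValueError("response is not a fine-tune file object")
    return data["id"]


if __name__ == "__main__":
    print(training_file_id(SAMPLE_RESPONSE))
```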
Fine-tune the model
Create a fine-tuned model
openai api fine_tunes.create -t file.jsonl -m ada --suffix "test_model"
openai api fine_tunes.create -t file-*** -m davinci --suffix "test_d_model"
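You can skip the separate upload step above: when -t is given a local file name, press Enter at the prompt and the file in the current directory is uploaded automatically.
To use a file that has already been uploaded, pass its file ID directly, e.g. file-***.
List fine-tuned models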
openai api fine_tunes.list
Follow the event stream
openai api fine_tunes.follow -i ft-***
Output
[2023-07-***] Created fine-tune: ft-***
[2023-07-***] Fine-tune costs $0.00
[2023-07-***] Fine-tune enqueued. Queue number: 2
[2023-07-***] Fine-tune is in the queue. Queue number: 1
[2023-07-***] Fine-tune is in the queue. Queue number: 0
[2023-07-***] Fine-tune started
[2023-07-***] Completed epoch 1/4
[2023-07-***] Completed epoch 2/4
[2023-07-***] Completed epoch 3/4
[2023-07-***] Completed epoch 4/4
[2023-07-***] Uploaded model: ada:ft-8000:text-model-2023-07-***
[2023-07-***] Uploaded result file: file-***
[2023-07-***] Fine-tune succeeded
Job complete! Status: succeeded 🎉
Try out your fine-tuned model:
openai api completions.create -m ada:ft-8000:text-model-2023-07-*** -p <YOUR_PROMPT>
openai api completions.create -m davinci:ft-8000:test-d-model-2023-07-*** -p <YOUR_PROMPT>
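The model name needed for completions.create appears on the "Uploaded model:" line of the event stream. If the log has been saved to a file, a rough sketch like the following can pick the name out and assemble the follow-up command. The sample log uses invented timestamps and an invented model name, and the helper names are hypothetical.

```python
import re
import shlex

# Invented log excerpt in the same shape as the event stream output.
SAMPLE_LOG = """\
[2023-07-28 09:00:00] Fine-tune started
[2023-07-28 09:05:00] Completed epoch 4/4
[2023-07-28 09:06:00] Uploaded model: ada:ft-example-org:test-model-2023-07-28-09-06-00
[2023-07-28 09:06:05] Fine-tune succeeded
"""

MODEL_LINE = re.compile(r"Uploaded model:\s*(\S+)")


def uploaded_model(log_text):
    """Return the model name from the log, or None if it is not there yet."""
    match = MODEL_LINE.search(log_text)
    return match.group(1) if match else None


def completion_command(model, prompt):
    """Assemble the CLI call with shell-safe quoting of the prompt."""
    return shlex.join(
        ["openai", "api", "completions.create", "-m", model, "-p", prompt]
    )


if __name__ == "__main__":
    model = uploaded_model(SAMPLE_LOG)
    print(completion_command(model, "Hello there"))
```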
List fine-tuned models
openai api fine_tunes.list
Output
{
  "object": "list",
  "data": [
    {
      "object": "fine-tune",
      "id": "ft-***",
      "hyperparams": {
        "n_epochs": 4,
        "batch_size": 1,
        "prompt_loss_weight": 0.01,
        "learning_rate_multiplier": 0.1
      },
      "organization_id": "org-***",
      "model": "ada",
      "training_files": [
        {
          "object": "file",
          "id": "file-***",
          "purpose": "fine-tune",
          "filename": "file.jsonl",
          "bytes": 281,
          "created_at": 168*******,
          "status": "processed",
          "status_details": null
        }
      ],
      "validation_files": [],
      "result_files": [
        {
          "object": "file",
          "id": "file-***",
          "purpose": "fine-tune-results",
          "filename": "compiled_results.csv",
          "bytes": 894,
          "created_at": 168*******,
          "status": "processed",
          "status_details": null
        }
      ],
      "created_at": 168*******,
      "updated_at": 168*******,
      "status": "succeeded",
      "fine_tuned_model": "ada:ft-8000:text-model-2023-07-***"
    }
  ]
}
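Once several jobs exist, the list output gets long, and usually only the names of the models that finished are of interest. A minimal sketch of filtering the JSON for those follows; the sample data is invented (trimmed to the fields used here) and `succeeded_models` is a hypothetical helper name.

```python
import json

# Invented, trimmed-down stand-in for the fine_tunes.list output.
SAMPLE_LIST = json.dumps({
    "object": "list",
    "data": [
        {"id": "ft-aaa", "model": "ada", "status": "succeeded",
         "fine_tuned_model": "ada:ft-example-org:test-model-2023-07-28"},
        {"id": "ft-bbb", "model": "davinci", "status": "pending",
         "fine_tuned_model": None},
    ],
})


def succeeded_models(list_response_text):
    """Return the names of fine-tuned models whose job status is 'succeeded'."""
    data = json.loads(list_response_text)
    return [
        job["fine_tuned_model"]
        for job in data.get("data", [])
        if job.get("status") == "succeeded" and job.get("fine_tuned_model")
    ]


if __name__ == "__main__":
    for name in succeeded_models(SAMPLE_LIST):
        print(name)
```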