🦜Kabucamp Parrotalk (앵무말): GPT Fine-Tuning

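Wow, apparently my blog got posted on Kakao's internal bulletin board!

My mentor, gray, told me.

If I'd known this would happen, I would have written in a less childish tone.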

🦜 GPT Fine-Tuning

OpenAI's official fine-tuning guide

https://platform.openai.com/docs/guides/fine-tuning

Preparing the dataset

Start by testing with 50 to 100 examples.
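
To make the dataset step a bit more concrete, here is a minimal sketch of building training examples in the chat-style JSONL format (one JSON object per line, each holding a "messages" list). The helper names build_example and to_jsonl and the sample question/answer pairs are my own placeholders, not something from the guide.

```python
import json


def build_example(system_prompt, user_text, assistant_text):
    """Return one training example as a dict with a 'messages' list."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    }


def to_jsonl(examples):
    """Serialize examples as JSONL text: one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples) + "\n"


# Hypothetical sample pairs; a first test run would use 50 to 100 of these.
pairs = [
    ("Hello!", "Hi there! How can I help?"),
    ("What is fine-tuning?", "Training a base model further on your own examples."),
]
examples = [build_example("You are a helpful assistant.", q, a) for q, a in pairs]
jsonl_text = to_jsonl(examples)
print(jsonl_text)
```

Saving jsonl_text to a .jsonl file gives something that can be uploaded as training data.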

AI-Hub

www.aihub.or.kr

Train and test splits
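
A quick sketch of how the split could be done, assuming the examples are already loaded into a list. The 80/20 ratio, the fixed seed, and the function name train_test_split are my own choices for illustration.

```python
import random


def train_test_split(examples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    # A seeded Random instance keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder data standing in for 100 real training examples.
examples = [f"example-{i}" for i in range(100)]
train, test = train_test_split(examples)
print(len(train), len(test))
```

The two lists can then be written out as separate JSONL files, one for training and one for checking the result.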

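Checking the data format

After compiling the dataset and before creating a fine-tuning job, it is important to check the data format.

A simple Python script for finding potential errors, reviewing token counts, and estimating the cost of a fine-tuning job: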

Data preparation and analysis for chat model fine-tuning | OpenAI Cookbook

cookbook.openai.com
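
The Cookbook script is the one to actually use. Just to show the idea, here is a much smaller format check I sketched myself. It only looks at JSON validity, roles, and string content, and the set of allowed roles is a simplifying assumption (the real format allows more than these three).

```python
import json

# Simplified assumption: only these three roles are treated as valid here.
VALID_ROLES = {"system", "user", "assistant"}


def check_example(example):
    """Return a list of problems found in one parsed training example."""
    if not isinstance(example, dict):
        return ["example is not a JSON object"]
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]
    problems = []
    has_assistant = False
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i}: not a JSON object")
            continue
        role = message.get("role")
        if role not in VALID_ROLES:
            problems.append(f"message {i}: unrecognized role {role!r}")
        if role == "assistant":
            has_assistant = True
        if not isinstance(message.get("content"), str):
            problems.append(f"message {i}: content is not a string")
    if not has_assistant:
        problems.append("no assistant message")
    return problems


def check_jsonl(text):
    """Check JSONL text and return {line_number: [problems]} for bad lines."""
    report = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            example = json.loads(line)
        except json.JSONDecodeError:
            report[line_number] = ["invalid JSON"]
            continue
        problems = check_example(example)
        if problems:
            report[line_number] = problems
    return report
```

An empty report means every line passed these basic checks. It says nothing about token counts or cost, which the Cookbook script covers.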

Cost

(base training cost per 1M input tokens ÷ 1M) × number of tokens in the input file × number of epochs trained
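
Plugging numbers into the formula above: this sketch uses a made-up price of 3.00 per 1M tokens, a 100,000-token file, and 3 epochs. All three numbers are placeholders, so check the current pricing page for the real rate of your model.

```python
def estimate_training_cost(price_per_million_tokens, tokens_in_file, epochs):
    """Apply the formula: (price per 1M tokens / 1M) x tokens x epochs."""
    return (price_per_million_tokens / 1_000_000) * tokens_in_file * epochs


# Hypothetical inputs, not real pricing.
cost = estimate_training_cost(
    price_per_million_tokens=3.00,
    tokens_in_file=100_000,
    epochs=3,
)
print(f"Estimated training cost: {cost:.2f}")
```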

Training the model

from openai import OpenAI
client = OpenAI()

client.fine_tuning.jobs.create(
  training_file="file-abc123",
  model="gpt-4o-mini-2024-07-18"
)
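
The "file-abc123" above is the ID of a file that has already been uploaded. A minimal sketch of doing both steps together, assuming a local JSONL file: the function name start_fine_tune is my own, and client is expected to be an OpenAI() instance created as in the snippet above.

```python
def start_fine_tune(client, training_path, model="gpt-4o-mini-2024-07-18"):
    """Upload a JSONL training file, create a fine-tuning job, return the job ID.

    `client` is assumed to be an openai.OpenAI() instance.
    """
    # Training data has to be uploaded with purpose="fine-tune" first.
    with open(training_path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="fine-tune")
    # The uploaded file's ID is what the job takes as training_file.
    job = client.fine_tuning.jobs.create(training_file=uploaded.id, model=model)
    return job.id
```

With the real SDK this would be called as start_fine_tune(OpenAI(), "train.jsonl"), where "train.jsonl" is a placeholder path.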

from openai import OpenAI
client = OpenAI()

# List 10 fine-tuning jobs
client.fine_tuning.jobs.list(limit=10)

# Retrieve the state of a fine-tune
client.fine_tuning.jobs.retrieve("ftjob-abc123")

# Cancel a job
client.fine_tuning.jobs.cancel("ftjob-abc123")

# List up to 10 events from a fine-tuning job
client.fine_tuning.jobs.list_events(fine_tuning_job_id="ftjob-abc123", limit=10)

# Delete a fine-tuned model (must be an owner of the org the model was created in)
client.models.delete("ft:gpt-3.5-turbo:acemeco:suffix:abc123")
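
Training takes a while, so in practice the retrieve call gets repeated until the job finishes. Here is a small polling sketch built on it. The function name wait_for_job, the 30-second interval, and the poll limit are my own assumptions, and client is again expected to be an OpenAI() instance.

```python
import time

# Statuses after which a fine-tuning job will not change any more.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=30, max_polls=120):
    """Poll a fine-tuning job until it reaches a terminal status.

    `client` is assumed to be an openai.OpenAI() instance.
    """
    for _ in range(max_polls):
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

When the returned job has status "succeeded", its fine_tuned_model field holds the model name to pass to the chat completions call.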

Using the model

from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="ft:gpt-4o-mini:my-org:custom_suffix:id",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)
print(completion.choices[0].message)

Prepare the dataset ➡ Validate the file ➡ Train the model ➡ Use the model


은체공부
