Using the GPT-4o Model - (2) Using Images as Input

Programming/Machine Learning 2024. 6. 22. 23:44

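The service I'm adding this time with the GPT-4o model gives advice based on an image the user provides.

Now that the GPT development environment is set up, let's try actually passing an image as input.

2024.06.13 - [Programming/Machine Learning] - Using the GPT-4o Model - (1) Setting Up the OpenAI Development Environment

First, let's run the sample code.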

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='This image depicts a peaceful outdoor scene featuring a long, wooden boardwalk that stretches into the distance. The boardwalk is surrounded by tall, green grass and various bushes. In the background, there are trees and a clear blue sky with white clouds, suggesting a calm, sunny day. The scene evokes a sense of tranquility and open space, typical of a nature trail or a meadow.', role='assistant', function_call=None, tool_calls=None))

The sample code runs fine.
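The printed Choice object carries more than the answer itself, so in a service you would usually pull out just the text. Below is a minimal sketch of that, assuming the response has the same attribute layout as the output above. The helper name `extract_text` and the stand-in response object are my own additions for illustration, not part of the OpenAI library; the stand-in lets the snippet run without calling the API.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return (answer_text, was_truncated) from a chat completion response."""
    choice = response.choices[0]
    # finish_reason == "length" means the reply was cut off by max_tokens
    truncated = choice.finish_reason == "length"
    return choice.message.content, truncated


# Stand-in object (hypothetical data) mimicking the attribute layout of the
# real response, so this runs without an API call.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                content="This image depicts a peaceful outdoor scene."
            ),
        )
    ]
)

text, truncated = extract_text(fake_response)
print(text)
print("truncated:", truncated)
```

With the real client, passing the `response` from the sample code into `extract_text` should work the same way. If `truncated` comes back true, raising `max_tokens` is probably the first thing to try.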

For reference, one API call (max tokens: under 1,000) cost $0.01. That's about 14 won (at an exchange rate of 1,376 won per dollar).

You can check your API usage costs at the link below.
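The conversion is simple arithmetic, and it can be handy to wrap it in a small function when estimating costs for a service. This is only a rough sketch: the $0.01 per call and the 1,376 won exchange rate are the figures from my single test call, and the function name and the 1,000-call volume are made up for illustration. Real costs depend on image size and token counts.

```python
def estimate_cost_krw(calls, usd_per_call=0.01, krw_per_usd=1376):
    """Rough cost estimate in won, assuming every call costs the same."""
    return calls * usd_per_call * krw_per_usd


# One call, using the figures measured above
print(f"{estimate_cost_krw(1):.2f} KRW per call")

# Hypothetical volume: 1,000 calls
print(f"{estimate_cost_krw(1000):,.0f} KRW per 1,000 calls")
```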

https://platform.openai.com/organization/usage

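For reference, here is the GPT-4o pricing.

[Image: GPT-4o pricing]

When applying this to a service, swap the image URL above for a local file: read the file, base64-encode it, and pass it as a data URL, as shown below.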

import base64
import requests

# OpenAI API Key
api_key = "YOUR_OPENAI_API_KEY"

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

headers = {
  "Content-Type": "application/json",
  "Authorization": f"Bearer {api_key}"
}

payload = {
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What’s in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          }
        }
      ]
    }
  ],
  "max_tokens": 300
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)

print(response.json())

As you can see, adding OpenAI image analysis to a service is simpler than you might expect.

Reference: https://platform.openai.com/docs/guides/vision
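One thing to watch in a real service: the code above hard-codes `image/jpeg` and the API key. Here is a hedged sketch of one way around both, guessing the MIME type from the file extension with the standard `mimetypes` module and reading the key from the `OPENAI_API_KEY` environment variable. The helper names `image_to_data_url` and `build_payload` are my own, not from the OpenAI docs.

```python
import base64
import mimetypes
import os


def image_to_data_url(image_path):
    """Encode a local image as a data URL, guessing the MIME type from the extension."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {image_path}")
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_payload(image_path, question="What's in this image?", max_tokens=300):
    """Build the same request body as above, but for any supported image file."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {"url": image_to_data_url(image_path)},
                    },
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


# Read the key from the environment instead of hard-coding it in the source
api_key = os.environ.get("OPENAI_API_KEY", "")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
```

The resulting `headers` and `build_payload("path_to_your_image.jpg")` can then be passed to `requests.post` exactly as in the code above.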
