Gemini

Gemini is Google's flagship multimodal AI model, capable of processing text, images, audio, and video. It is integrated into many Google products and has its own API for developers.

📚 Related articles:

Generative AI - How AI works from prompt to response

Context Window - Understanding token limits and how to optimize for them

Context Engineering - The art of arranging context effectively

RAG - Integrating external data into an LLM

Gemini versions (2026)

Gemini 3 Series (Latest)

Version	Description	Use case
Gemini 3 Pro	Most capable model, advanced reasoning, 1M-token context	Research, complex analysis, agentic tasks
Gemini 3 Flash	Fast, cost-efficient, Pro-grade intelligence	Chatbots, coding, high-frequency workflows

Gemini 2 Series (Stable)

Version	Description	Use case
Gemini 2.5 Pro	Balance between performance and cost	General tasks, production apps
Gemini 2.0 Flash	Fast, large context (1M tokens)	Chatbots, coding, everyday tasks
Gemini 2.0 Flash Thinking	Optimized for step-by-step reasoning	Math, complex logic

Ways to use Gemini

1. Gemini Web (gemini.google.com)

Free access at gemini.google.com:

Chat with the AI in text
Upload images for analysis
Generate images with Imagen
Connect to Google Workspace

2. Gemini API (Google AI Studio)

For developers, available at ai.google.dev:


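# Legacy SDK (pip install google-generativeai); Google's newer SDK is google-genai
import google.generativeai as genai

# Replace the placeholder with your own API key
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-flash")

# Send a text prompt and print the generated answer
response = model.generate_content("Explain recursion in programming")
print(response.text)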
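
Hard-coding the key as in the snippet above is fine for a quick test, but a key is usually kept out of source code. The following is a minimal sketch of reading it from an environment variable instead. The helper `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions made for this illustration, not names taken from the article.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key stored in an environment variable.

    `var_name` is an assumed variable name for this sketch.
    `env` defaults to the process environment; a plain dict can be
    passed in for testing.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage with the SDK call shown above (not executed here):
#   genai.configure(api_key=load_api_key())
```

The key then lives in the shell or deployment configuration rather than in the repository.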

3. Gemini in Google Workspace

Gemini is built into:

Gmail - Write and summarize email
Google Docs - Draft documents
Google Sheets - Analyze data
Google Slides - Create presentations

4. Gemini Code Assist (IDE)

Integrated into IDEs for programmers:

VS Code extension
JetBrains plugin
Cloud Workstations

Key features

🖼️ Multimodal Input

Gemini can process several types of input in the same request:


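import PIL.Image

# `model` is the GenerativeModel created in the API example above
image = PIL.Image.open("diagram.png")
# Text and image parts are sent together as one list
response = model.generate_content([
    "Explain this diagram:",
    image
])
print(response.text)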

📄 Large context window

Gemini 2.0 Flash: 1 million tokens (~700,000 words)
Can analyze entire PDF files and large codebases
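
To get a feel for what fits in such a window, the sketch below turns the ratio quoted above (1M tokens for roughly 700,000 words, so about 0.7 words per token) into a quick estimate. The ratio is a rough rule of thumb that varies by language and content, and the function names are illustrative. For exact numbers, the SDK's `count_tokens` method on the model is the better source.

```python
WORDS_PER_TOKEN = 0.7             # rough ratio implied by "1M tokens ~ 700,000 words"
CONTEXT_LIMIT_TOKENS = 1_000_000  # context size quoted for Gemini 2.0 Flash


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on the whitespace-separated word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 0) -> bool:
    """Check whether the text, plus room for the answer, fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_LIMIT_TOKENS


sample = "Gemini can analyze large documents in a single request"
print(estimate_tokens(sample), fits_in_context(sample, reserved_for_output=2048))
```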

🔗 Grounding with Google Search

Combine real-time information from Google Search:


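# The legacy SDK enables Search grounding through the "google_search_retrieval"
# tool shortcut. Assumption: the chosen model accepts this tool; newer models
# may need the google_search tool from the google-genai SDK instead.
model = genai.GenerativeModel(
    "gemini-3-flash",
    tools="google_search_retrieval"
)
response = model.generate_content("Today's technology news")
print(response.text)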

🛠️ Function Calling

Lets Gemini call functions that you define:


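def get_weather(city: str) -> dict:
    """Get weather information for a city."""
    # Placeholder implementation that returns fixed values
    return {"temp": 28, "condition": "sunny"}

# Passing the Python function lets the SDK derive the tool schema
# from its signature and docstring
model = genai.GenerativeModel(
    "gemini-3-flash",
    tools=[get_weather]
)

# With automatic function calling, the chat session runs get_weather
# when the model requests it and sends the result back to the model
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("What is the weather in Hanoi?")
print(response.text)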
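
Conceptually, a function call from the model is just a function name plus arguments that the application routes to real code. The sketch below shows that routing step on its own, without the SDK. The plain-dict request with `name` and `args` keys, the `AVAILABLE_TOOLS` registry, and `dispatch` are hypothetical stand-ins for illustration, not the SDK's actual response types.

```python
def get_weather(city: str) -> dict:
    """Variant of the article's placeholder weather function."""
    return {"city": city, "temp": 28, "condition": "sunny"}


# Registry of functions the model is allowed to request.
AVAILABLE_TOOLS = {"get_weather": get_weather}


def dispatch(function_call: dict) -> dict:
    """Run the requested function and return its result.

    `function_call` is a hypothetical plain-dict form of a model request
    with `name` and `args` keys.
    """
    name = function_call["name"]
    if name not in AVAILABLE_TOOLS:
        # Never execute names that were not explicitly registered.
        return {"error": f"unknown function: {name}"}
    return AVAILABLE_TOOLS[name](**function_call.get("args", {}))


result = dispatch({"name": "get_weather", "args": {"city": "Hanoi"}})
print(result)
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose.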

Using Gemini for programming

Code Generation


Prompt: "Write a Python function that sorts a list of dictionaries by the key 'age'"
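
For reference, one plausible answer to a prompt like this is sketched below. It is an illustration of the kind of code to expect, not actual model output, and the name `sort_by_age` is an assumption.

```python
from operator import itemgetter


def sort_by_age(people: list[dict]) -> list[dict]:
    """Return a new list of dictionaries sorted by their 'age' value."""
    return sorted(people, key=itemgetter("age"))


people = [
    {"name": "An", "age": 30},
    {"name": "Binh", "age": 22},
    {"name": "Chi", "age": 25},
]
print(sort_by_age(people))
```

`sorted` leaves the input list untouched, which is usually the safer default for generated helper code.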

Code Review


Prompt: "Review the following code and suggest improvements:
[paste code here]"

Debugging


Prompt: "Explain the following error and how to fix it:
TypeError: Cannot read property 'map' of undefined"

Documentation


Prompt: "Write a docstring for the following function in Google style:
[paste function here]"
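
As a reference for checking the result, a Google-style docstring has `Args`, `Returns`, and (where relevant) `Raises` sections. The function `clamp` below is a made-up example chosen only to show the layout.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict a value to a closed interval.

    Args:
        value: The number to restrict.
        low: Lower bound of the interval.
        high: Upper bound of the interval.

    Returns:
        `value` if it lies within [low, high], otherwise the nearest bound.

    Raises:
        ValueError: If `low` is greater than `high`.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


print(clamp(12, 0, 10))
```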

API Pricing (2026)

Model	Input (per 1M tokens)	Output (per 1M tokens)
Gemini 3 Pro	$2.00	$8.00
Gemini 3 Flash	$0.15	$0.60
Gemini 2.5 Pro	$1.25	$5.00
Gemini 2.0 Flash	$0.10	$0.40
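
With per-1M-token prices like these, the cost of a workload is simple arithmetic: tokens times price, divided by one million, summed over input and output. The sketch below uses the figures from the table as listed in this article. The `PRICES` dictionary and `estimate_cost` are illustrative names.

```python
# USD per 1M tokens as (input, output), copied from the table in this article.
PRICES = {
    "Gemini 3 Pro": (2.00, 8.00),
    "Gemini 3 Flash": (0.15, 0.60),
    "Gemini 2.5 Pro": (1.25, 5.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload with the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: 10,000 requests, each with 2,000 input and 500 output tokens.
cost = estimate_cost("Gemini 3 Flash", 10_000 * 2_000, 10_000 * 500)
print(f"${cost:.2f}")
```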

Note: the free tier is limited to 15 RPM (requests per minute)
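
One simple way to stay under a requests-per-minute limit is to space calls evenly, which for 15 RPM means at least 4 seconds between requests. The sketch below is a minimal client-side throttle under that assumption. The class `MinIntervalThrottle` is a hypothetical helper, and even spacing is stricter than the limit itself because it never allows bursts.

```python
import time


class MinIntervalThrottle:
    """Space calls evenly so that at most `rpm` calls happen per minute."""

    def __init__(self, rpm: int = 15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last_call = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self._last_call + self.min_interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay


# Usage (not executed here): call throttle.wait() before each API request.
#   throttle = MinIntervalThrottle(rpm=15)
#   throttle.wait()
```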

Best Practices

1. Prompt Engineering


❌ "Write code"
✅ "Write a Python function named `calculate_total` that takes a 
   list of numbers as its parameter and returns their sum. Add a 
   docstring and type hints."
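
The specific prompt pins down the name, the input, the return value, and the style, so the expected result is easy to verify. A response along the following lines would satisfy it. This is an illustration, not actual model output.

```python
def calculate_total(numbers: list[float]) -> float:
    """Return the sum of the given numbers.

    Args:
        numbers: The values to add up.

    Returns:
        The total of all values, or 0 for an empty list.
    """
    return sum(numbers)


print(calculate_total([1, 2, 3.5]))
```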

2. Use System Instructions


# The system instruction applies to every request made with this model object
model = genai.GenerativeModel(
    "gemini-3-flash",
    system_instruction="You are a Python programming assistant. "
                       "Always explain code clearly. "
                       "Prefer clean and efficient code."
)

3. Temperature for creativity


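generation_config = genai.GenerationConfig(
    temperature=0.2,  # lower = more deterministic, higher = more varied
    max_output_tokens=2048
)
# Pass the config with the request (`model` as created earlier)
response = model.generate_content("Write a haiku about recursion", generation_config=generation_config)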
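
Why a lower temperature gives more predictable output can be seen from the standard softmax-with-temperature formulation, in which token scores are divided by the temperature before being turned into probabilities. The sketch below assumes that textbook formulation; the article does not describe Gemini's internals, and the scores are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        # Temperature 0 corresponds to always picking the top token,
        # which real systems handle as a special case.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.2)
varied = softmax_with_temperature(logits, 1.0)
print([round(p, 3) for p in focused])
print([round(p, 3) for p in varied])
```

At the lower temperature nearly all probability mass sits on the top-scoring token, while at 1.0 the alternatives keep a meaningful share.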

Comparison with other models

Feature	Gemini 3 Pro	Gemini 3 Flash	GPT-5	Claude Sonnet 4.5
Context window	1M tokens	1M tokens	400K tokens	200K - 1M tokens
Agentic capabilities	✅ Strong	✅ Good	✅	✅
Multimodal	✅	✅	✅	✅
Grounding/Search	✅	✅	❌ (separate)	❌
Speed	Medium	Very fast	Medium	Fast
Price (input)	$2.00/1M	$0.15/1M	$1.25/1M	$3.00/1M

Conclusion

Gemini is a strong choice for:

Developers: powerful API, large context window, competitive pricing
Everyday users: free via gemini.google.com
Businesses: deep integration with Google Workspace

With continued development from Google DeepMind, Gemini is one of the world's leading AI models.