Pydantic is a Python library for data validation based on type annotations. In version 2 the validation core was rewritten in Rust, and it runs roughly 5–50x faster than version 1. It is the de facto standard for data models in FastAPI and LangChain, and it is widely used in LLM applications.
Installation
uv add pydantic
BaseModel — the base model
from pydantic import BaseModel
class TextAnalysis(BaseModel):
    sentiment: str
    score: float
    keywords: list[str]
    language: str
# Creating from a dictionary
data = {"sentiment": "positive", "score": 0.9, "keywords": ["python"], "language": "ru"}
result = TextAnalysis.model_validate(data)
print(result.sentiment) # positive
print(result.score) # 0.9
print(result.keywords) # ['python']
model_validate() — parsing from a dict
import json
raw_json = '{"sentiment": "negative", "score": 0.2, "keywords": [], "language": "en"}'
data = json.loads(raw_json)
result = TextAnalysis.model_validate(data)
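The `json.loads()` step can also be skipped: Pydantic v2 provides `model_validate_json()`, which parses and validates a JSON string in one call. A minimal sketch, re-declaring the flat `TextAnalysis` model from the first example so that the snippet runs on its own:

```python
from pydantic import BaseModel


class TextAnalysis(BaseModel):
    sentiment: str
    score: float
    keywords: list[str]
    language: str


raw_json = '{"sentiment": "negative", "score": 0.2, "keywords": [], "language": "en"}'

# Parse and validate the JSON string in a single step (no json.loads needed)
result = TextAnalysis.model_validate_json(raw_json)
print(result.sentiment)  # negative
print(result.keywords)   # []
```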
ValidationError — invalid data
from pydantic import ValidationError
try:
    # "score", "keywords", and "language" are all missing
    bad = TextAnalysis.model_validate({"sentiment": "ok"})
except ValidationError as e:
    print(e.error_count())  # 3
    for err in e.errors():
        print(err["loc"], err["msg"])
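Missing fields are only one kind of failure. The sketch below is an illustration with made-up input, not part of the original example. It sends a value that cannot be converted to `float` and omits one field, then reads the `loc`, `type`, and `input` keys of each error dictionary:

```python
from pydantic import BaseModel, ValidationError


class TextAnalysis(BaseModel):
    sentiment: str
    score: float
    keywords: list[str]
    language: str


errors = []
try:
    # "score" is not a number and "language" is missing
    TextAnalysis.model_validate({"sentiment": "ok", "score": "high", "keywords": ["a"]})
except ValidationError as e:
    errors = e.errors()

for err in errors:
    # "loc" is a tuple path to the field, "type" is a machine-readable error code
    print(err["loc"], err["type"], err["input"])
```

The `type` code is usually a better thing to branch on than the human-readable `msg`, whose wording may change between releases.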
Nested models
class SentimentResult(BaseModel):
    label: str  # positive / negative / neutral
    confidence: float

class TextAnalysis(BaseModel):
    sentiment: SentimentResult
    keywords: list[str]
    language: str
    word_count: int

data = {
    "sentiment": {"label": "positive", "confidence": 0.87},
    "keywords": ["python", "api"],
    "language": "ru",
    "word_count": 150
}
result = TextAnalysis.model_validate(data)
print(result.sentiment.label) # positive
print(result.sentiment.confidence) # 0.87
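When a nested value is invalid, the error location is a path through the nested models. A short sketch with the same two models and a deliberately broken `confidence` value (the `bad_data` dictionary is invented for illustration):

```python
from pydantic import BaseModel, ValidationError


class SentimentResult(BaseModel):
    label: str
    confidence: float


class TextAnalysis(BaseModel):
    sentiment: SentimentResult
    keywords: list[str]
    language: str
    word_count: int


bad_data = {
    "sentiment": {"label": "positive", "confidence": "very"},  # not a number
    "keywords": ["python", "api"],
    "language": "ru",
    "word_count": 150,
}

nested_errors = []
try:
    TextAnalysis.model_validate(bad_data)
except ValidationError as e:
    nested_errors = e.errors()

for err in nested_errors:
    print(err["loc"])  # ('sentiment', 'confidence')
```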
Field() — constraints and descriptions
from pydantic import BaseModel, Field
class TextAnalysis(BaseModel):
    sentiment: str = Field(description="positive / negative / neutral")
    score: float = Field(ge=0.0, le=1.0, description="Confidence from 0 to 1")
    keywords: list[str] = Field(max_length=10, description="Keywords")
    language: str = Field(pattern=r"^[a-z]{2}$", description="ISO 639-1 language code")
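To see the constraints in action, the sketch below validates an input that breaks two of them. The `invalid` dictionary is invented for illustration. The last line shows that the `description` values also end up in the JSON Schema generated by `model_json_schema()`:

```python
from pydantic import BaseModel, Field, ValidationError


class TextAnalysis(BaseModel):
    sentiment: str = Field(description="positive / negative / neutral")
    score: float = Field(ge=0.0, le=1.0, description="Confidence from 0 to 1")
    keywords: list[str] = Field(max_length=10, description="Keywords")
    language: str = Field(pattern=r"^[a-z]{2}$", description="ISO 639-1 language code")


invalid = {
    "sentiment": "positive",
    "score": 1.5,           # violates le=1.0
    "keywords": ["python"],
    "language": "russian",  # violates the two-letter pattern
}

failed_fields = []
try:
    TextAnalysis.model_validate(invalid)
except ValidationError as e:
    # Collect the name of every field that failed validation
    failed_fields = [err["loc"][0] for err in e.errors()]

print(sorted(failed_fields))  # ['language', 'score']

# Field descriptions are carried into the generated JSON Schema
schema = TextAnalysis.model_json_schema()
print(schema["properties"]["score"]["description"])  # Confidence from 0 to 1
```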
model_dump() — back to dict
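# `data` must match the flat TextAnalysis model defined with Field() above
data = {"sentiment": "positive", "score": 0.9, "keywords": ["python"], "language": "ru"}
result = TextAnalysis.model_validate(data)
d = result.model_dump()  # dict
j = result.model_dump_json()  # JSON string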
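Dumping also works for nested models, and both output forms can be validated back into an equal instance. A self-contained sketch using a trimmed-down version of the nested models (only two fields, to keep it short):

```python
from pydantic import BaseModel


class SentimentResult(BaseModel):
    label: str
    confidence: float


class TextAnalysis(BaseModel):
    sentiment: SentimentResult
    keywords: list[str]


original = TextAnalysis.model_validate(
    {"sentiment": {"label": "positive", "confidence": 0.87}, "keywords": ["python"]}
)

as_dict = original.model_dump()       # nested models become nested dicts
as_json = original.model_dump_json()  # JSON string

print(as_dict["sentiment"])  # {'label': 'positive', 'confidence': 0.87}

# Both forms can be validated back into an equal model instance
assert TextAnalysis.model_validate(as_dict) == original
assert TextAnalysis.model_validate_json(as_json) == original

# exclude= drops the named fields from the output
without_keywords = original.model_dump(exclude={"keywords"})
print(without_keywords)  # {'sentiment': {'label': 'positive', 'confidence': 0.87}}
```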
Why Pydantic in LLM applications
By default, Claude replies with plain text. To get structured data, ask it to respond with JSON only, then parse the reply and validate it with Pydantic:
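# `response` is the reply object from an earlier Claude API call (not shown here)
raw = response.content[0].text.strip()
data = json.loads(raw)  # raises json.JSONDecodeError if the reply is not valid JSON
result = TextAnalysis.model_validate(data)  # raises ValidationError on a schema mismatch
# Now result is a typed object with validated fields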
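A model reply is not guaranteed to be valid JSON or to match the schema, so in practice both failure modes should be caught. The sketch below is one possible way to wrap the three steps. The helper name `parse_analysis` and the sample reply strings are hypothetical, and no API call is made:

```python
import json
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class TextAnalysis(BaseModel):
    sentiment: str
    score: float = Field(ge=0.0, le=1.0)
    keywords: list[str]
    language: str


def parse_analysis(raw: str) -> Optional[TextAnalysis]:
    """Hypothetical helper: return a validated model, or None if the reply is unusable."""
    try:
        data = json.loads(raw.strip())
        return TextAnalysis.model_validate(data)
    except json.JSONDecodeError:
        print("Reply is not valid JSON")
    except ValidationError as e:
        print(f"Reply does not match the schema: {e.error_count()} error(s)")
    return None


# Made-up replies standing in for response.content[0].text
good = '{"sentiment": "positive", "score": 0.9, "keywords": ["python"], "language": "en"}'
not_json = "Sure! Here is the analysis you asked for."
out_of_range = '{"sentiment": "positive", "score": 3, "keywords": [], "language": "en"}'

ok = parse_analysis(good)
print(ok.score if ok else None)      # 0.9
print(parse_analysis(not_json))      # None
print(parse_analysis(out_of_range))  # None
```

Returning `None` is only one design choice. Depending on the application, a caller might prefer to retry the request or to send the validation errors back to the model.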