What is it?
DataFrameIt processes text in DataFrames using Large Language Models (LLMs) and extracts structured information defined by Pydantic models. One function, one model, one prompt — done.
from typing import Literal

import pandas as pd
from pydantic import BaseModel

from dataframeit import dataframeit

# Schema describing the fields to extract from each text
class Sentiment(BaseModel):
    sentiment: Literal['positive', 'negative', 'neutral']
    confidence: Literal['high', 'medium', 'low']

df = pd.DataFrame({'text': ['Excellent product!', 'Terrible service.']})

# Apply the prompt to the 'text' column and extract the Sentiment fields
result = dataframeit(df, Sentiment, "Analyze the sentiment of the text.", text_column='text')
Features
Multiple Providers
Google Gemini, OpenAI GPT-5, Anthropic Claude 4.5, Cohere, Mistral — all via LangChain.
Structured Output
Automatic validation with Pydantic. Define fields, types, and descriptions, and each LLM response is validated against them.
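To make the validation step concrete, here is a minimal sketch that uses plain Pydantic only. The `parse_sentiment` helper and the field descriptions are hypothetical, written for illustration; they are not part of DataFrameIt and do not show how it validates responses internally. The sketch only demonstrates that a payload outside the declared `Literal` values, or one missing a field, is rejected.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class Sentiment(BaseModel):
    # Descriptions are illustrative; they document what each field means.
    sentiment: Literal['positive', 'negative', 'neutral'] = Field(
        ..., description="Overall tone of the text."
    )
    confidence: Literal['high', 'medium', 'low'] = Field(
        ..., description="How certain the classification is."
    )


def parse_sentiment(payload: dict) -> Optional[Sentiment]:
    """Hypothetical helper: return a validated Sentiment, or None if invalid."""
    try:
        return Sentiment(**payload)
    except ValidationError:
        return None


valid = parse_sentiment({"sentiment": "positive", "confidence": "high"})
invalid = parse_sentiment({"sentiment": "mixed", "confidence": "high"})
```

Here `valid` is a `Sentiment` instance, while `invalid` is `None` because `'mixed'` is not one of the allowed values.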
Resilience
Automatic retry with exponential backoff. Configurable rate limiting. Never lose progress.
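For readers unfamiliar with the pattern, the following is a generic sketch of retry with exponential backoff using only the standard library. It is not DataFrameIt's implementation: the function name `retry_with_backoff` and the parameters `max_retries`, `base_delay`, `max_delay`, and `sleep` are assumptions chosen for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with a doubling wait between tries."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Wait base_delay, then 2x, 4x, ... capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise RuntimeError("unreachable")  # loop always returns or raises


# Demo: a call that fails twice, then succeeds. The waits are recorded
# in a list instead of actually sleeping.
attempts = {"count": 0}
waits: List[float] = []


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


outcome = retry_with_backoff(flaky_call, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in production code the default `time.sleep` would apply, and adding random jitter to each wait is a common refinement.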
Quick Installation
pip install "dataframeit[google]"     # Google Gemini 3 (recommended)
pip install "dataframeit[openai]"     # OpenAI GPT-5
pip install "dataframeit[anthropic]"  # Anthropic Claude 4.5