Modern enterprises, even Fortune 500 firms, still operate on financial systems that were never designed with tax in mind. These systems accurately record transactions, invoices, and journal entries, but they often lack the necessary granularity, context, and automation to apply tax rules accurately and consistently. The result is a heavy reliance on manual reconciliation, spreadsheet adjustments, and judgment calls from overworked tax teams—all of which increase risk, slow reporting, and leave value on the table.
The Tax Sensitization Assistant project is motivated by a simple truth: organizations can't afford to keep treating tax compliance as a reactive clean-up job. If the systems feeding tax data aren't fully tax-sensitized, the CFO and tax department spend more time fixing data than interpreting it. That's wasted effort and missed insight.
By leveraging AI-driven classification and enrichment, this project aims to automate the painful middle ground between raw transaction data and tax-ready information. The system continuously monitors and enhances transactional records, flags inconsistencies, and fills in missing tax context (such as jurisdiction, tax code, rate, and deductibility), all without requiring a full ERP overhaul.
In this article, I'll walk through building an AI-powered tax classification system in six hours using FastAPI, OpenAI, and Google Sheets. The system processes 100+ transactions per request and reduces manual review work by up to 70%. More importantly, I'll cover the architecture decisions, the challenges we hit, and the solutions that make this a production-ready system.
This article is for:
Prerequisites:
What You'll Build:
Real-World Applications:
Any workflow requiring AI classification + validation
Fortune 500 companies still use financial systems not designed for tax. These systems produce:
The result? Tax teams spend 70% of their time fixing data instead of analyzing it. Manual reconciliation is time-consuming, error-prone, and doesn't scale.
Instead of replacing entire ERP systems (costly and disruptive), we layer intelligent automation on top of existing systems:
This approach:

Manual Trigger / Cron → OAuth Auth → Read Sheets → Preprocess → AI Classify → Validate → Detect Anomalies → Route → Write Sheets → Calculate Summary → Send Email
Architecture Note: The AI service layer uses a clean abstraction pattern. This naturally allows using different providers (OpenAI, Claude, Gemini, local models) without framework changes.
We need to authenticate with the Google Sheets, Drive, and Gmail APIs. The system must run in a variety of environments (interactive, WSL, headless) and use a single set of user credentials rather than separate service accounts.
A single OAuth 2.0 flow with all required scopes, environment detection, and automatic token refresh management.
Implementation Details:
```python
import os

# Environment detection
def is_wsl() -> bool:
    """Detect if running in a WSL environment."""
    if os.getenv('WSL_DISTRO_NAME') or os.getenv('WSLENV'):
        return True
    try:
        with open('/proc/version', 'r') as f:
            return 'microsoft' in f.read().lower()
    except OSError:
        return False

def is_headless() -> bool:
    """Detect if running headless (remote/CI session, or no display)."""
    if os.getenv('SSH_CONNECTION') or os.getenv('CI'):
        return True
    return not os.getenv('DISPLAY')
```

OAuth Flow Adaptation:
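The flow selection itself can be sketched as a small mapping from the detected environment to a flow variant. The function name and return values below are illustrative assumptions; the real flow uses google-auth-oauthlib's `InstalledAppFlow`.

```python
def choose_oauth_flow(wsl: bool, headless: bool) -> str:
    """Map the detected environment to an OAuth flow variant (illustrative)."""
    if headless or wsl:
        # No usable browser: print the auth URL and have the user paste the code back
        return "console"
    # Desktop session: spin up a localhost redirect and open the browser
    return "local_server"

# With google-auth-oauthlib this roughly corresponds to (sketch, not verbatim):
#   flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
#   creds = flow.run_local_server(port=0)   # "local_server"
#   creds = flow.run_console()              # "console" (available in older versions)
```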
Key Features:
This environment-aware approach ensures the system operates effectively in all deployment scenarios, ranging from local development to production servers.
For Phase 1 proof of concept, Google Sheets offers:
Read Operations:
Write Operations:
Data Flow:
Read: Raw Transactions → Processed Transactions
Read: Tax Reference Data → Validation Dictionary
Write: Clean Transactions → Tax-Ready Output
Write: Flagged Transactions → Needs Review
Key Features:
The Google Sheets API proved surprisingly easy to use, making it an excellent choice for rapid prototyping.
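As a sketch of the read path: the Sheets API's `values().get()` returns a list of rows, which we can map onto dicts keyed by the header row. The function name and column headers below are illustrative assumptions, not the project's actual schema.

```python
def rows_to_transactions(values: list[list[str]]) -> list[dict]:
    """Convert a raw Sheets value range (header row + data rows) to dicts."""
    header, *rows = values
    txs = []
    for row in rows:
        # The API drops trailing empty cells, so pad each row to the header width
        row = row + [""] * (len(header) - len(row))
        txs.append(dict(zip(header, row)))
    return txs

# Typical usage against the API (sketch):
#   result = sheets.values().get(spreadsheetId=SHEET_ID,
#                                range="Raw Transactions!A1:F").execute()
#   transactions = rows_to_transactions(result.get("values", []))
```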
The AI service infers missing tax attributes from transaction descriptions:
The AI service uses a clean abstraction pattern (standard good practice). This naturally allows using different providers (OpenAI, Claude, Gemini, local models). Not revolutionary—just proper separation of concerns. Demonstrated with OpenAI, but the framework supports any provider that implements the interface.
```python
from openai import OpenAI

# `settings` comes from the project's configuration module (API key, model name)

# Service interface (standard pattern)
class AIService:
    def classify_transaction(self, transaction) -> ClassificationResponse:
        raise NotImplementedError

# OpenAI implementation (one example)
class OpenAIService(AIService):
    def __init__(self):
        self.client = OpenAI(api_key=settings.openai_api_key)
        self.model = settings.openai_model

    def classify_transaction(self, transaction):
        prompt = self._build_prompt(transaction)
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": "You are a tax classification assistant..."},
                {"role": "user", "content": prompt}
            ],
            response_format={"type": "json_object"},
            temperature=0.3
        )
        # Parse and return ClassificationResponse
        return self._parse_response(response)

# Framework uses dependency injection
ai_service = get_ai_service()  # Returns configured implementation
result = ai_service.classify_transaction(transaction)
```

System Prompt:
You are a tax classification assistant. Analyze transactions and infer missing tax attributes. Return only valid JSON.
User Prompt:
Transaction details:
- Description: {description}
- Amount: ${amount:.2f}
- Location: {location}
Determine:
1. What type of purchase does this represent
2. Whether it's typically taxable in the given jurisdiction
3. The appropriate tax rate for that jurisdiction
4. Your confidence level in this classification
Return JSON only with: classification, jurisdiction, suggested_tax_rate, taxable_status, confidence, rationale
Response Format: Structured JSON ensures consistent parsing across all models.
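A minimal sketch of the parsing side, assuming the response fields listed above; the helper name and the strictness rules are illustrative, not the project's exact `_parse_response`.

```python
import json

def parse_classification(raw_json: str) -> dict:
    """Parse the model's JSON reply, enforcing the expected keys."""
    required = {"classification", "jurisdiction", "suggested_tax_rate",
                "taxable_status", "confidence", "rationale"}
    data = json.loads(raw_json)
    missing = required - data.keys()
    if missing:
        # A malformed reply is routed to review rather than silently accepted
        raise ValueError(f"Model reply missing fields: {sorted(missing)}")
    data["suggested_tax_rate"] = float(data["suggested_tax_rate"])
    return data
```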
Before AI classification, transactions are normalized:
This ensures consistent inputs for the AI model.
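A sketch of what that normalization might look like; the exact rules here (whitespace collapsing, uppercased location codes, currency stripping) are illustrative assumptions.

```python
import re

def normalize_transaction(tx: dict) -> dict:
    """Normalize free-text fields before they reach the AI model (illustrative rules)."""
    out = dict(tx)
    # Collapse runs of whitespace in the description
    out["description"] = re.sub(r"\s+", " ", tx.get("description", "")).strip()
    # Uppercase location codes so "ut" and "UT" match the same reference row
    out["location"] = tx.get("location", "").strip().upper()
    # Strip currency formatting and coerce to a rounded float
    out["amount"] = round(float(str(tx.get("amount", "0"))
                                .replace("$", "").replace(",", "")), 2)
    return out
```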
AI-suggested tax rates are validated against reference data:
Why This Approach:
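As a sketch, a rate check of this kind can be as small as a tolerance comparison against the reference table; the function signature, status strings, and tolerance below are assumptions for illustration.

```python
def validate_rate(jurisdiction: str, suggested_rate: float,
                  tax_reference: dict[str, float],
                  tolerance: float = 0.001) -> tuple[str, bool]:
    """Compare the AI-suggested rate to the reference rate for that jurisdiction."""
    expected = tax_reference.get(jurisdiction)
    if expected is None:
        # No reference row: cannot validate, so flag for review
        return "unknown_jurisdiction", False
    if abs(expected - suggested_rate) <= tolerance:
        return "validated", True
    return "rate_mismatch", False
```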
Two high-impact patterns catch most issues:
Pattern 1: Missing Tax on Taxable Items
Pattern 2: Rate Mismatch
Routing Decision:
This focused approach covers 70% of issues without over-engineering.
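The two patterns can be sketched as a single pure function; the field names and flag strings are illustrative assumptions.

```python
def detect_anomalies(tx: dict, classification: dict,
                     reference: dict[str, float]) -> list[str]:
    """Apply the two patterns: missing tax on taxable items, and rate mismatch."""
    anomalies = []
    # Pattern 1: classified taxable, but no tax was recorded on the transaction
    if classification["taxable_status"] == "taxable" and float(tx.get("tax_amount", 0)) == 0:
        anomalies.append("missing_tax_on_taxable_item")
    # Pattern 2: suggested rate disagrees with the reference rate
    expected = reference.get(classification["jurisdiction"])
    if expected is not None and abs(expected - classification["suggested_tax_rate"]) > 0.001:
        anomalies.append("rate_mismatch")
    return anomalies
```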
The workflow orchestrates all processing steps:
```python
async def process_transactions() -> SummaryStats:
    # 1. Initialize services
    sheets_service = SheetsService()
    ai_service = get_ai_service()  # dependency-injected implementation
    gmail_service = GmailService()

    # 2. Read raw transactions
    raw_transactions = sheets_service.read_raw_transactions()

    # 3. Read tax reference data
    tax_reference = sheets_service.read_tax_reference()

    # 4. Preprocess transactions
    processed_transactions = preprocess_batch(raw_transactions)

    # 5. Process each transaction
    clean_transactions, review_transactions = [], []
    for processed_tx in processed_transactions:
        # AI Classification
        classification = ai_service.classify_transaction(processed_tx)

        # Validation
        validation_status, rate_match = validate_rate(...)

        # Anomaly Detection (enrich the transaction with classification + flags)
        enriched_tx = detect_anomalies(processed_tx, classification)

        # Route
        if enriched_tx.anomaly_count > 0:
            review_transactions.append(enriched_tx)
        else:
            clean_transactions.append(enriched_tx)

    # 6. Write outputs
    sheets_service.write_clean_transactions(clean_transactions)
    sheets_service.write_review_queue(review_transactions)

    # 7. Calculate summary
    summary = calculate_summary_stats(...)

    # 8. Send email
    gmail_service.send_daily_summary(summary)
    return summary
```

Transaction-Level Resilience:
This ensures batch processing continues despite individual failures.
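A minimal sketch of that per-transaction isolation; the function and the three-way split are illustrative, not the project's exact code.

```python
import logging

def process_batch(transactions, classify):
    """Per-transaction try/except so one bad record cannot sink the batch."""
    clean, review, failed = [], [], []
    for tx in transactions:
        try:
            result = classify(tx)
            # Anomalous records go to the review queue, the rest pass through
            (review if result.get("anomalies") else clean).append(result)
        except Exception:
            # Log with traceback and keep going; the record lands in `failed`
            logging.exception("Transaction %s failed; continuing", tx.get("id"))
            failed.append(tx)
    return clean, review, failed
```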
Global state tracking enables monitoring:
APScheduler handles automated processing:
Priority System:
FastAPI provides REST API for monitoring and control:
```python
@router.post("/api/v1/process/trigger")
async def trigger_workflow():
    """Manually trigger the tax classification workflow"""
    job_id = await scheduler_service.trigger_manual()
    return {"status": "triggered", "job_id": job_id}

@router.get("/api/v1/process/status")
async def get_status():
    """Get current processing status"""
    state = get_processing_state()
    return {"status": state["last_status"], ...}

@router.get("/api/v1/process/summary")
async def get_summary():
    """Get last processing summary statistics"""
    state = get_processing_state()
    return {"summary": state["last_summary"], ...}
```

FastAPI Features Used:
OAuth-authenticated email sending using the same credentials as Sheets:
Daily summary includes:
Why Gmail API:
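A sketch of the send path: the Gmail API expects a base64url-encoded RFC 822 message, so the summary is rendered with `email.mime` and posted with the same OAuth credentials as Sheets. The summary fields and helper name below are illustrative assumptions.

```python
import base64
from email.mime.text import MIMEText

def build_summary_message(summary: dict, to_addr: str) -> dict:
    """Render the daily summary as a Gmail API message payload."""
    body = (
        f"Processed: {summary['total']}\n"
        f"Clean: {summary['clean']}\n"
        f"Needs review: {summary['review']}\n"
    )
    msg = MIMEText(body)
    msg["to"] = to_addr
    msg["subject"] = "Daily Tax Classification Summary"
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode()
    return {"raw": raw}

# Sending with google-api-python-client (sketch):
#   service = build("gmail", "v1", credentials=creds)
#   service.users().messages().send(
#       userId="me", body=build_summary_message(summary, addr)).execute()
```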
Problem: Google OAuth flow requires browser, but WSL and headless servers don't have GUI.
Solution: Environment detection with appropriate flow for each:
Impact: System works in all deployment scenarios.
Problem: API rate limits and high costs with AI providers.
Solution:
Note: The service abstraction allows switching providers if needed (standard architecture benefit).
Impact: Cost-effective processing that handles rate limits gracefully.
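One generic way to handle the rate limits mentioned above is exponential backoff with jitter. This retry helper is an illustrative sketch, not the project's exact implementation.

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a rate-limited API call with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            # Delay doubles each attempt; jitter avoids synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

In production this would catch the provider's specific rate-limit exception rather than bare `Exception`.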
Problem: Managing asynchronous tasks and data flow efficiently.
Solution:
Impact: Efficient processing, resilient to failures.
Problem: One bad transaction shouldn't crash entire batch.
Solution:
Impact: Batch processing continues despite individual failures.
A complete tax classification system with:
We built a complete AI-powered tax classification system in 6 hours using FastAPI, OpenAI GPT-4o-mini, and Google Sheets. The system demonstrates rapid prototyping with modern Python tools, solving real-world challenges (OAuth, async workflows, cost optimization) while creating production-ready architecture.
Built an AI-powered tax classification system in 6 hours using FastAPI, OpenAI GPT-4o-mini, and Google Sheets. The system automatically classifies transactions, validates tax rates, detects anomalies, and emails a daily summary. Key features: OAuth authentication, async workflows, cost-optimized AI usage, and resilient batch processing. It processes 100+ transactions per request and reduces manual review work by up to 70%. Built as independent volunteer work with Prof. Rohit Aggarwal, who provided conceptual guidance and project structure. Main challenges solved: OAuth in WSL/headless environments, OpenAI API rate limits, and async data flow management. The framework uses standard service-abstraction patterns, so different AI providers can be swapped in as needed.
I would like to extend a sincere thank you to Professor Rohit Aggarwal for providing the opportunity, the foundational framework, and invaluable guidance for this project.
Hitesh Balegar is a graduate student in Computer Science (AI Track) at the University of Utah, specializing in the design of production-grade AI systems and intelligent agents. He is particularly interested in how effective human-AI collaboration can be utilized to build sophisticated autonomous agents using frameworks such as LangGraph and advanced RAG pipelines. His work spans multimodal LLMs, voice-automation systems, and large-scale data workflows deployed across enterprise environments. Before attending graduate school, he led engineering initiatives at CVS Health and Zmanda, shipping high-impact systems used by thousands of users and spanning over 470 commercial locations.
Dr. Rohit Aggarwal is a professor, AI researcher and practitioner. His research focuses on two complementary themes: how AI can augment human decision-making by improving learning, skill development, and productivity, and how humans can augment AI by embedding tacit knowledge and contextual insight to make systems more transparent, explainable, and aligned with human preferences. He has done AI consulting for many startups, SMEs and public listed companies. He has helped many companies integrate AI-based workflow automations across functional units, and developed conversational AI interfaces that enable users to interact with systems through natural dialogue.
This article demonstrates how modern Python tools enable rapid prototyping of production-ready AI automation systems. The key is focusing on real problems, using cost-effective solutions, and building maintainable architecture from the start.