claude-home/development/2026-03-20.md
Cal Corum cd57645dd0
All checks were successful
Reindex Knowledge Base / reindex (push) Successful in 3s
docs: sync KB — 2026-03-20.md,release-2026.3.20.md
2026-03-20 22:00:43 -05:00

63 lines
4.0 KiB
Markdown

---
title: "ESB-56: MDR Object Handler Simplification & Cleanup"
description: "Code review and cleanup of MDR (Master Data Registry) object handler integration — added client singletons, hardened SQL, cached env vars, and filed follow-up issues."
type: context
domain: development
tags: [esb-monorepo, python, cloud-functions, mdr, code-review, refactoring, gitea]
---
# ESB-56: MDR Object Handler Simplification & Cleanup
**Date:** 2026-03-20
**Branch:** `feature/ESB-56-mdr-object-handler`
**Repo:** esb-monorepo
## What Was Done
Ran a three-agent parallel code review (reuse, quality, efficiency) on the MDR-to-object-handler integration, then applied fixes directly.
### Fixes Applied (commit `13f78f3`)
1. **Pub/Sub client singleton**`output_utils.publish_to_pubsub` was creating a new `PublisherClient` per call. Added module-level singleton matching the pattern already used in `mdr-api/event_publisher.py`.
2. **Firestore client singleton**`write_to_firestore` was creating a new `firestore.Client` (gRPC channel + auth) per call. Added keyed singleton dict `_firestore_clients` by `(project_id, database)`.
3. **SQL injection hardening**`write_to_cloudsql` built the Postgres function name via f-string (`f"data.fn_upsert_{object_type.lower()}"`). While guarded by an allowlist check, the pattern is fragile. Replaced with a pre-built lookup dict `_UPSERT_FUNC_MAP` so SQL identifiers are never derived by string interpolation.
4. **Removed verbose payload logging**`logging.info(f"Silver data: {silver_data}")` logged full business object payloads at INFO level. Replaced with identity-only log line.
5. **Dead code removal** — Two `if not project_id` guards in `send_to_downstream()` could never trigger because `os.environ.get("GCP_PROJECT_ID", "amenity-integrat-dev-svc")` always returns a non-None default.
6. **Env var caching**`_PROJECT_ID`, `_FIRESTORE_ENABLED`, `_CLOUDSQL_ENABLED` moved to module-level constants instead of being read from `os.environ` on every request.
7. **Double `list_sources()` call**`process_object_request` called `registry.list_sources()` twice (once for membership check, once for the error response). Assigned to local variable.
8. **Test singleton isolation** — Added `autouse` pytest fixture to reset module-level singletons between tests, fixing 3 test failures caused by cross-test state leakage.
### Decision: No Connection Pool for CloudSQL
Discussed whether a SQLAlchemy connection pool made sense. Decided **no** because:
- Cloud Run functions default to single concurrent request per instance
- A pool of size 1 just adds wrapper overhead
- The `Connector` singleton (which caches IAM auth) is already the expensive part, and it's cached
- Pool would only matter with concurrency > 1, which would need thread-safety work first
## Follow-Up Issues Created (Gitea)
| # | Title | Priority |
|---|-------|----------|
| [#1](https://git.manticorum.com/cal/esb-monorepo/issues/1) | Extract shared Pub/Sub publisher to py-esb-integrations | High |
| [#2](https://git.manticorum.com/cal/esb-monorepo/issues/2) | Extract shared CloudSQL connector to py-esb-integrations | High |
| [#3](https://git.manticorum.com/cal/esb-monorepo/issues/3) | Reduce duplication in MDR API route handlers | Medium |
| [#4](https://git.manticorum.com/cal/esb-monorepo/issues/4) | Parallelize independent downstream calls | Low |
### Key Finding: Three-Way Duplication
Both Pub/Sub publishing and CloudSQL connection logic are copy-pasted identically across three functions (`object-handler`, `outbound-event-handler`, `outbound-object-router`). The shared `packages/py-esb-integrations` package has no GCP utility module yet — that's the natural home for extraction.
## Files Changed
- `functions/tac/object-handler/main.py` — env var caching, dead code removal, list_sources dedup
- `functions/tac/object-handler/output_utils.py` — singletons, SQL hardening, logging cleanup
- `functions/tac/object-handler/tests/test_output_utils.py` — singleton reset fixture