--- title: "ESB-56: MDR Object Handler Simplification & Cleanup" description: "Code review and cleanup of MDR (Master Data Registry) object handler integration — added client singletons, hardened SQL, cached env vars, and filed follow-up issues." type: context domain: development tags: [esb-monorepo, python, cloud-functions, mdr, code-review, refactoring, gitea] --- # ESB-56: MDR Object Handler Simplification & Cleanup **Date:** 2026-03-20 **Branch:** `feature/ESB-56-mdr-object-handler` **Repo:** esb-monorepo ## What Was Done Ran a three-agent parallel code review (reuse, quality, efficiency) on the MDR-to-object-handler integration, then applied fixes directly. ### Fixes Applied (commit `13f78f3`) 1. **Pub/Sub client singleton** — `output_utils.publish_to_pubsub` was creating a new `PublisherClient` per call. Added module-level singleton matching the pattern already used in `mdr-api/event_publisher.py`. 2. **Firestore client singleton** — `write_to_firestore` was creating a new `firestore.Client` (gRPC channel + auth) per call. Added keyed singleton dict `_firestore_clients` by `(project_id, database)`. 3. **SQL injection hardening** — `write_to_cloudsql` built the Postgres function name via f-string (`f"data.fn_upsert_{object_type.lower()}"`). While guarded by an allowlist check, the pattern is fragile. Replaced with a pre-built lookup dict `_UPSERT_FUNC_MAP` so SQL identifiers are never derived by string interpolation. 4. **Removed verbose payload logging** — `logging.info(f"Silver data: {silver_data}")` logged full business object payloads at INFO level. Replaced with identity-only log line. 5. **Dead code removal** — Two `if not project_id` guards in `send_to_downstream()` could never trigger because `os.environ.get("GCP_PROJECT_ID", "amenity-integrat-dev-svc")` always returns a non-None default. 6. **Env var caching** — `_PROJECT_ID`, `_FIRESTORE_ENABLED`, `_CLOUDSQL_ENABLED` moved to module-level constants instead of being read from `os.environ` on every request. 7. **Double `list_sources()` call** — `process_object_request` called `registry.list_sources()` twice (once for membership check, once for the error response). Assigned to local variable. 8. **Test singleton isolation** — Added `autouse` pytest fixture to reset module-level singletons between tests, fixing 3 test failures caused by cross-test state leakage. ### Decision: No Connection Pool for CloudSQL Discussed whether a SQLAlchemy connection pool made sense. Decided **no** because: - Cloud Run functions default to single concurrent request per instance - A pool of size 1 just adds wrapper overhead - The `Connector` singleton (which caches IAM auth) is already the expensive part, and it's cached - Pool would only matter with concurrency > 1, which would need thread-safety work first ## Follow-Up Issues Created (Gitea) | # | Title | Priority | |---|-------|----------| | [#1](https://git.manticorum.com/cal/esb-monorepo/issues/1) | Extract shared Pub/Sub publisher to py-esb-integrations | High | | [#2](https://git.manticorum.com/cal/esb-monorepo/issues/2) | Extract shared CloudSQL connector to py-esb-integrations | High | | [#3](https://git.manticorum.com/cal/esb-monorepo/issues/3) | Reduce duplication in MDR API route handlers | Medium | | [#4](https://git.manticorum.com/cal/esb-monorepo/issues/4) | Parallelize independent downstream calls | Low | ### Key Finding: Three-Way Duplication Both Pub/Sub publishing and CloudSQL connection logic are copy-pasted identically across three functions (`object-handler`, `outbound-event-handler`, `outbound-object-router`). The shared `packages/py-esb-integrations` package has no GCP utility module yet — that's the natural home for extraction. ## Files Changed - `functions/tac/object-handler/main.py` — env var caching, dead code removal, list_sources dedup - `functions/tac/object-handler/output_utils.py` — singletons, SQL hardening, logging cleanup - `functions/tac/object-handler/tests/test_output_utils.py` — singleton reset fixture