63 lines
4.0 KiB
Markdown
63 lines
4.0 KiB
Markdown
---
|
|
title: "ESB-56: MDR Object Handler Simplification & Cleanup"
|
|
description: "Code review and cleanup of MDR (Master Data Registry) object handler integration — added client singletons, hardened SQL, cached env vars, and filed follow-up issues."
|
|
type: context
|
|
domain: development
|
|
tags: [esb-monorepo, python, cloud-functions, mdr, code-review, refactoring, gitea]
|
|
---
|
|
|
|
# ESB-56: MDR Object Handler Simplification & Cleanup
|
|
|
|
**Date:** 2026-03-20
|
|
**Branch:** `feature/ESB-56-mdr-object-handler`
|
|
**Repo:** esb-monorepo
|
|
|
|
## What Was Done
|
|
|
|
Ran a three-agent parallel code review (reuse, quality, efficiency) on the MDR-to-object-handler integration, then applied fixes directly.
|
|
|
|
### Fixes Applied (commit `13f78f3`)
|
|
|
|
1. **Pub/Sub client singleton** — `output_utils.publish_to_pubsub` was creating a new `PublisherClient` per call. Added module-level singleton matching the pattern already used in `mdr-api/event_publisher.py`.
|
|
|
|
2. **Firestore client singleton** — `write_to_firestore` was creating a new `firestore.Client` (gRPC channel + auth) per call. Added keyed singleton dict `_firestore_clients` by `(project_id, database)`.
|
|
|
|
3. **SQL injection hardening** — `write_to_cloudsql` built the Postgres function name via f-string (`f"data.fn_upsert_{object_type.lower()}"`). While guarded by an allowlist check, the pattern is fragile. Replaced with a pre-built lookup dict `_UPSERT_FUNC_MAP` so SQL identifiers are never derived by string interpolation.
|
|
|
|
4. **Removed verbose payload logging** — `logging.info(f"Silver data: {silver_data}")` logged full business object payloads at INFO level. Replaced with identity-only log line.
|
|
|
|
5. **Dead code removal** — Two `if not project_id` guards in `send_to_downstream()` could never trigger because `os.environ.get("GCP_PROJECT_ID", "amenity-integrat-dev-svc")` always returns a non-None default.
|
|
|
|
6. **Env var caching** — `_PROJECT_ID`, `_FIRESTORE_ENABLED`, `_CLOUDSQL_ENABLED` moved to module-level constants instead of being read from `os.environ` on every request.
|
|
|
|
7. **Double `list_sources()` call** — `process_object_request` called `registry.list_sources()` twice (once for membership check, once for the error response). Assigned to local variable.
|
|
|
|
8. **Test singleton isolation** — Added `autouse` pytest fixture to reset module-level singletons between tests, fixing 3 test failures caused by cross-test state leakage.
|
|
|
|
### Decision: No Connection Pool for CloudSQL
|
|
|
|
Discussed whether a SQLAlchemy connection pool made sense. Decided **no** because:
|
|
- Cloud Run functions default to single concurrent request per instance
|
|
- A pool of size 1 just adds wrapper overhead
|
|
- The `Connector` singleton (which caches IAM auth) is already the expensive part, and it's cached
|
|
- Pool would only matter with concurrency > 1, which would need thread-safety work first
|
|
|
|
## Follow-Up Issues Created (Gitea)
|
|
|
|
| # | Title | Priority |
|
|
|---|-------|----------|
|
|
| [#1](https://git.manticorum.com/cal/esb-monorepo/issues/1) | Extract shared Pub/Sub publisher to py-esb-integrations | High |
|
|
| [#2](https://git.manticorum.com/cal/esb-monorepo/issues/2) | Extract shared CloudSQL connector to py-esb-integrations | High |
|
|
| [#3](https://git.manticorum.com/cal/esb-monorepo/issues/3) | Reduce duplication in MDR API route handlers | Medium |
|
|
| [#4](https://git.manticorum.com/cal/esb-monorepo/issues/4) | Parallelize independent downstream calls | Low |
|
|
|
|
### Key Finding: Three-Way Duplication
|
|
|
|
Both Pub/Sub publishing and CloudSQL connection logic are copy-pasted identically across three functions (`object-handler`, `outbound-event-handler`, `outbound-object-router`). The shared `packages/py-esb-integrations` package has no GCP utility module yet — that's the natural home for extraction.
|
|
|
|
## Files Changed
|
|
|
|
- `functions/tac/object-handler/main.py` — env var caching, dead code removal, list_sources dedup
|
|
- `functions/tac/object-handler/output_utils.py` — singletons, SQL hardening, logging cleanup
|
|
- `functions/tac/object-handler/tests/test_output_utils.py` — singleton reset fixture
|