AI/ML · Live 2026

WorkFlex Bridge Agent

Applied-AI coding challenge — end-to-end pipeline turning Teams messages into Jira tickets via Gemini function-calling. Engineered with a 12-key Gemini rotation system (round-robin, per-key quota tracking, automatic failover, exponential backoff). Built in 90 minutes; passed engineering review.

Python · Gemini API · Function-calling · Schema-validated JSON · Jira API · Microsoft Teams API

What it is

A 90-minute live coding challenge: build a pipeline from Microsoft Teams messages to Jira tickets via a Gemini function-calling brain, plus a Teams reply on the way back. Submitted, passed engineering review, advanced to the next round.

The problem

The challenge tested two things at once: can you ship a working LLM pipeline under time pressure, and can you handle the boring infrastructure question of 'what happens when free-tier rate limits bite at 30 seconds in?' Most candidates would crash; the grader wanted to see graceful degradation.

What I built

End-to-end pipeline

Mock Teams server emits a message → Gemini function-calling decides whether to create a new Jira ticket or update an existing one → Jira API call lands → Teams reply confirms. Schema-validated JSON throughout (Zod), so malformed LLM responses are caught and re-issued instead of swallowed.
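The control flow above can be sketched as one orchestrating function. All four stage names (`decide`, `jiraCreate`, `jiraUpdate`, `replyInTeams`) and the field names in the decision payload are hypothetical stand-ins; only the shape of the flow mirrors the build.

```typescript
// Hypothetical decision payload: the two function-calls Gemini can emit.
type GeminiDecision =
  | { name: "create_ticket"; args: { summary: string } }
  | { name: "update_ticket"; args: { ticketId: string; comment: string } };

// Teams message in, Jira mutation out, Teams confirmation on the way back.
async function handleTeamsMessage(
  message: string,
  decide: (msg: string) => Promise<GeminiDecision>,
  jiraCreate: (summary: string) => Promise<string>,
  jiraUpdate: (id: string, comment: string) => Promise<string>,
  replyInTeams: (text: string) => Promise<void>,
): Promise<string> {
  const decision = await decide(message); // Gemini function-calling step
  const ticketId =
    decision.name === "create_ticket"
      ? await jiraCreate(decision.args.summary)
      : await jiraUpdate(decision.args.ticketId, decision.args.comment);
  await replyInTeams(`Jira ticket ${ticketId} handled.`); // confirmation
  return ticketId;
}
```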

12-key Gemini rotation engine

Round-robin dispatcher across 12 free-tier API keys with per-key quota tracking. Effective rate limit jumps from 15 req/min to ~180 req/min. Detailed walkthrough in the garden post 'A 12-key Gemini rotation system, in 90 minutes.'
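The dispatcher's core idea fits in a few lines. This is a minimal sketch, not the submitted code: the 15 req/min free-tier limit comes from the write-up, while the class and field names are illustrative.

```typescript
const PER_KEY_LIMIT = 15; // free-tier quota: requests per minute per key

interface KeyState {
  key: string;
  usedThisWindow: number; // per-key quota tracking
}

class KeyRotator {
  private cursor = 0;
  constructor(private keys: KeyState[]) {}

  /** Deterministic round-robin: call N goes to key N mod keys.length,
   *  skipping any key that has exhausted its window. */
  next(): string | null {
    for (let i = 0; i < this.keys.length; i++) {
      const state = this.keys[(this.cursor + i) % this.keys.length];
      if (state.usedThisWindow < PER_KEY_LIMIT) {
        state.usedThisWindow++;
        this.cursor = (this.cursor + i + 1) % this.keys.length;
        return state.key;
      }
    }
    return null; // every key exhausted in this window
  }
}
```

With 12 keys, twelve windows of 15 requests stack to the ~180 req/min effective limit quoted above.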

Failover + backoff

When a key returns a 429, the dispatcher marks it cooldown-until-next-window and retries on the next key. Exponential backoff (200ms × 2^attempt) fires only when every key is in cooldown, which under the test load essentially never happened.
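The failover order can be sketched as follows. `callGemini`, `isCooling`, and `markCooling` are hypothetical stand-ins injected for illustration; only the retry shape (rotate first, back off last) is from the write-up.

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function dispatch(
  keys: string[],
  isCooling: (key: string) => boolean,
  markCooling: (key: string) => void,
  callGemini: (key: string) => Promise<string>, // stand-in for the API call
  maxAttempts = 5,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    for (const key of keys) {
      if (isCooling(key)) continue; // skip keys in cooldown
      try {
        return await callGemini(key);
      } catch {
        // Treat a 429 as "this key is out of quota for the window"
        // and fail over to the next key immediately.
        markCooling(key);
      }
    }
    // Every key is cooling: 200ms × 2^attempt before rescanning.
    await sleep(200 * 2 ** attempt);
  }
  throw new Error("all keys in cooldown after max attempts");
}
```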

The bug caught at minute 78

Cooldown was stored as a boolean instead of a timestamp; a key rate-limited at second 5 stayed marked cooldown at second 95. Three-line fix: store the timestamp, check `Date.now() - cooldown > 60_000` before reusing.
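A sketch of the fixed version, under the assumption of a 60-second quota window as in the snippet above (map and function names are illustrative):

```typescript
const COOLDOWN_MS = 60_000; // one free-tier quota window

// Fix: store *when* the 429 happened, not just that it happened.
const cooldownAt = new Map<string, number>();

function markCooldown(key: string): void {
  cooldownAt.set(key, Date.now());
}

function isUsable(key: string): boolean {
  const t = cooldownAt.get(key);
  // A key is usable if it was never rate-limited, or its window expired.
  return t === undefined || Date.now() - t > COOLDOWN_MS;
}
```

The boolean version had no expiry path at all, so a single 429 retired a key for the rest of the run.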

Engineering decisions

Why deterministic round-robin over weighted

Deterministic dispatch is easier to reason about under tests — call N goes to key `N mod 12`, period. Weighted strategies (least-recently-used, lowest-quota-remaining) sound smarter but add state I would have had to debug under time pressure.

Why schema validation as a retry trigger

LLMs are not deterministic. Treating malformed responses as a transient error and re-issuing to the next key gave me a cheap retry budget — much cheaper than catching every Zod failure as a separate exception path.
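The build used Zod; here is a dependency-free sketch of the same validate-or-retry contract, with hypothetical field names (`action`, `ticketId`). The key point is the return type: a failed parse yields `null`, which the dispatcher treats exactly like a 429, rather than throwing.

```typescript
interface TicketCommand {
  action: "create" | "update";
  ticketId?: string; // required only when updating
}

// Returns null on any malformed response; the caller re-issues the
// request on the next key instead of raising an exception.
function parseCommand(raw: string): TicketCommand | null {
  try {
    const obj = JSON.parse(raw);
    if (obj.action !== "create" && obj.action !== "update") return null;
    if (obj.action === "update" && typeof obj.ticketId !== "string") return null;
    return obj as TicketCommand;
  } catch {
    return null; // not even valid JSON: also transient, also retried
  }
}
```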

Why function-calling over JSON-mode prompting

Function-calling is the contract Gemini's API enforces structurally. JSON-mode prompts work but rely on the model to produce conformant output, which is exactly the thing I just acknowledged isn't guaranteed.
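The difference is where the schema lives: with function-calling it is part of the request, so the API returns a structured call rather than free text the model merely promises is JSON. A declaration might look like the sketch below; this follows the general shape of Gemini's function-declaration schema (uppercase type names per its OpenAPI subset), but the tool name and fields are illustrative, not from the submission.

```typescript
// Hypothetical tool declaration for the "create" branch of the pipeline.
const createTicketDeclaration = {
  name: "create_jira_ticket",
  description: "Create a new Jira ticket from a Teams message",
  parameters: {
    type: "OBJECT",
    properties: {
      summary: { type: "STRING", description: "One-line ticket title" },
      description: { type: "STRING" },
      priority: { type: "STRING", enum: ["Low", "Medium", "High"] },
    },
    required: ["summary"],
  },
};

// Passed to the model as part of the request, e.g.
// tools: [{ functionDeclarations: [createTicketDeclaration] }]
```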

What I'd do differently

Write the time-window logic with a timestamp on the first try, not the second. Time-window logic is always wrong the first time you write it under pressure. Also: extract the rotation engine as a small npm / PyPI library (it's on the PROJECTS.md backlog) — there are enough free-tier LLM workflows out there that the rotation pattern deserves to be a one-line dependency, not a recurring 200-line copy-paste.