Converts export-control and multi-vendor routing guidance into a planning checklist for teams that cannot assume a single geography or chip supplier will stay available. Practitioners document primary and contingency model routes (including gateway configurations such as Helicone or LiteLLM Router), quantify the revenue or latency exposure if a region is blocked, and set investor and customer messaging for cases where leadership tells stakeholders to "expect nothing" from a market, as has been publicly reported when semiconductor vendors discuss China licensing uncertainty. The skill cross-checks legal and compliance sign-off, drills failover to alternate regions or domestic stacks, and records evidence before production launches tied to geopolitically sensitive deployments.
Use cases
- Shipping a SaaS assistant that must keep running if a major GPU export market closes
- Board review after public statements that a chip vendor has "largely conceded" a regional AI market
- Preparing investor guidance when primary-region GPU sales require licenses that may not arrive
- Consolidating Helicone or LiteLLM routes before expanding into Asia-Pacific data residency requirements
- Annual resilience review for teams dependent on U.S.-origin inference hardware
Key features
- Map revenue, latency, and compliance exposure per geography and per upstream model or hardware dependency.
- Document the primary route and at least one tested contingency route for each critical workload, and attach the gateway config artifacts.
- Align external messaging with finance: state whether forecasts assume zero, partial, or full access to restricted regions, and leave no hidden dependencies.
- Run a tabletop plus technical drill: block primary region credentials or endpoints and verify contingency paths meet SLOs.
- Capture legal/export-control review references and ticket IDs for any approved exceptions.
- Publish a signed readiness memo with open risks, retest date, and owners for routing config changes.
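The exposure mapping and drill checks above can be sketched as a small data model. This is a minimal illustration, not part of the skill itself: the `Route` and `Workload` classes, the `readiness_gaps` and `revenue_exposure` helpers, the region codes, and all figures are hypothetical, and the pass rule (at least one contingency route outside the primary region whose drilled p95 latency meets the SLO) is an assumed reading of "tested contingency route".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    name: str                             # hypothetical route label
    region: str
    drill_p95_ms: Optional[float] = None  # None: route has never been drilled


@dataclass
class Workload:
    name: str
    slo_p95_ms: float
    annual_revenue_by_region: Dict[str, float]
    primary: Route
    contingencies: List[Route] = field(default_factory=list)


def revenue_exposure(workloads: List[Workload], blocked_region: str) -> float:
    """Sum the annual revenue booked in the blocked region."""
    return sum(w.annual_revenue_by_region.get(blocked_region, 0.0) for w in workloads)


def readiness_gaps(workload: Workload) -> List[str]:
    """Return gap descriptions; an empty list means the workload passes."""
    # Only routes outside the primary region survive a regional block.
    outside = [r for r in workload.contingencies if r.region != workload.primary.region]
    if not outside:
        return [f"{workload.name}: no contingency route outside {workload.primary.region}"]
    # One drilled route that meets the latency SLO is enough to pass.
    if any(r.drill_p95_ms is not None and r.drill_p95_ms <= workload.slo_p95_ms
           for r in outside):
        return []
    gaps = []
    for r in outside:
        if r.drill_p95_ms is None:
            gaps.append(f"{workload.name}: route {r.name} has not been drilled")
        else:
            gaps.append(f"{workload.name}: route {r.name} p95 {r.drill_p95_ms:.0f} ms "
                        f"exceeds SLO {workload.slo_p95_ms:.0f} ms")
    return gaps


# Illustrative data only; names, regions, and figures are made up.
chat = Workload(
    name="assistant-chat",
    slo_p95_ms=800,
    annual_revenue_by_region={"us": 5_000_000.0, "apac": 1_200_000.0},
    primary=Route("primary-apac", "apac"),
    contingencies=[Route("fallback-us", "us", drill_p95_ms=650)],
)
batch = Workload(
    name="batch-summaries",
    slo_p95_ms=2000,
    annual_revenue_by_region={"apac": 300_000.0},
    primary=Route("primary-apac", "apac"),
    contingencies=[Route("fallback-eu", "eu")],  # listed but never drilled
)

for w in (chat, batch):
    print(w.name, readiness_gaps(w) or "ready")
print("APAC revenue exposure:", revenue_exposure([chat, batch], "apac"))
```

In this sketch a contingency route in the same region as the primary does not count, since a regional block would take out both.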
When to Use This Skill
- Before launching inference features that depend on hardware or APIs subject to export licensing
- After major news or earnings commentary shifts expectations for a regional AI chip market
- When auditors ask how the org would operate if a primary geography becomes unavailable
Expected Output
A multi-region readiness memo listing exposures, tested contingency routes, messaging alignment, and compliance references.
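One way to keep the memo complete is to treat it as a structured record and check for empty sections before sign-off. The sketch below is an assumption about how a team might do that: the `ReadinessMemo` fields mirror the items named in this listing, while the field names, the `missing_sections` helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# The three forecast assumptions named in the key features.
ACCESS_ASSUMPTIONS = {"zero", "partial", "full"}


@dataclass
class ReadinessMemo:
    exposures: List[str] = field(default_factory=list)
    tested_routes: List[str] = field(default_factory=list)
    access_assumption: Optional[str] = None   # messaging alignment with finance
    compliance_refs: List[str] = field(default_factory=list)  # review refs, ticket IDs
    open_risks: List[str] = field(default_factory=list)       # may legitimately be empty
    retest_date: Optional[date] = None
    owners: List[str] = field(default_factory=list)
    signed_by: Optional[str] = None


def missing_sections(memo: ReadinessMemo) -> List[str]:
    """List the required sections that are still empty or invalid."""
    missing = []
    if not memo.exposures:
        missing.append("exposures")
    if not memo.tested_routes:
        missing.append("tested_routes")
    if memo.access_assumption not in ACCESS_ASSUMPTIONS:
        missing.append("access_assumption")
    if not memo.compliance_refs:
        missing.append("compliance_refs")
    if memo.retest_date is None:
        missing.append("retest_date")
    if not memo.owners:
        missing.append("owners")
    if not memo.signed_by:
        missing.append("signed_by")
    return missing


# Illustrative draft; every value here is a placeholder.
draft = ReadinessMemo(
    exposures=["apac: revenue and latency exposure documented"],
    tested_routes=["fallback-us drilled against the latency SLO"],
    access_assumption="partial",
)
print(missing_sections(draft))
```

Open risks are deliberately not required to be non-empty here, because a memo with no open risks is still a valid memo.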
Frequently Asked Questions
- Is this only about China?
  No. The checklist is geography-agnostic; recent semiconductor export stories are one trigger, not the only scope.
- Do we need LiteLLM specifically?
  No, but teams using LiteLLM Router or Helicone gateways should attach their actual config files as evidence.
- Can we skip the drill if counsel approves staying in one region?
  No. Legal approval does not replace a technical failover test; document both sign-offs.
Related
LiteLLM Router fallback readiness review
Translates LiteLLM routing documentation into a pre-flight checklist before promoting multi-deployment LLM routes to production. Teams verify Router configuration covers primary and fallback model lists, retry policies, and load-balancing strategy documented at docs.litellm.ai/docs/routing, confirm proxy virtual keys and spend limits if traffic flows through LiteLLM Proxy, and rehearse provider outage drills using OpenAI-mapped exceptions (AuthenticationError, RateLimitError, APIError). The skill also points operators to enable `store_model_in_db` when MCP tools must persist alongside router definitions and to validate MCP server names comply with SEP-986 guidance referenced in LiteLLM v1.80.18 release notes.
OWASP GenAI LLM Top 10 (v1.1) threat review checklist
Maps the authoritative OWASP "Top 10 for Large Language Model Applications" (version 1.1) taxonomy—LLM01 Prompt Injection through LLM10 Model Theft—into an actionable readiness checklist for architects red-teaming Retrieval-Augmented Generation, Agents, plugins, training pipelines, or hosted inference gateways. Official project pages summarize each risk bucket (prompt injection bypassing safeguards, unchecked outputs enabling downstream exploits, poisoned corpora distorting reasoning, abusive workloads starving capacity, brittle supply-chain dependencies, sensitive data resurfacing inside generations, excessively privileged plugins/agents/autonomy, misplaced trust producing compliance failures, loss of proprietary model weights via API abuse). The skill pairs each category with tangible controls (policy, monitoring, toolchain limits) anchored to genai.owasp.org releases rather than anecdotes.
Example SLO document authoring
Operationalizes Appendix A from Google’s SRE workbook by translating the illustrative “Example Game Service” SLO dossier into a checklist teams can mimic: articulate the user-facing workload, nominate rolling measurement windows (the appendix uses four weeks), pair each subsystem with tightly defined SLIs (availability from load balancers excluding 5xx, latency percentile gates, freshness for derived tables, correctness via probers, completeness for pipelines), cite explicit numerator/denominator language, rationalize rounding policies, quantify per-objective error budgets, and cite the sibling error budget policy for enforcement.