fix: Remove nltk blocking download and make azure model listing non-blocking (#6214)
* Fix event loop blocking in NLTK downloads and Azure model listing

  Found via watchdog detecting 61.6s hang during file upload.

  **Root causes:**
  1. NLTK punkt_tab downloads blocking during file processing
  2. Azure model listing using sync requests.get() in async context

  **Fixes:**
  1. Pre-download NLTK data at Docker build time
  2. Async fallback download at startup if build failed
  3. Move Azure model fetch to thread pool with asyncio.to_thread()

  **Impact:**
  - Eliminates 60+ second event loop hangs
  - Startup: instant if data baked in, ~60s async if needs download
  - Requests: never block, all I/O offloaded to threads

* Fix Docker build: ensure /root/nltk_data exists even if download fails

  - Create directory before download attempt
  - Add verification step to confirm download success
  - Directory always exists so COPY won't fail in runtime stage

* Fix: use venv python for NLTK download in Docker build

  The builder stage installs NLTK in /app/.venv but we were using system python which doesn't have NLTK. Now using venv python so download actually works.

* Use uv run for NLTK download (more idiomatic)

  uv run automatically uses the synced venv, cleaner than hardcoding the venv path.
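The effect of moving a synchronous fetch onto a thread with asyncio.to_thread() can be illustrated with a minimal, self-contained sketch. It is not code from this commit: `blocking_fetch`, `heartbeat`, and `run_with_heartbeat` are hypothetical names, and `time.sleep()` stands in for a synchronous call such as requests.get(). A background heartbeat task counts how often the event loop gets to run while the fetch is in progress.

```python
import asyncio
import time


def blocking_fetch(delay: float = 0.2) -> list[str]:
    """Hypothetical stand-in for a synchronous HTTP call."""
    time.sleep(delay)  # blocks whichever thread calls it
    return ["model-a", "model-b"]


async def heartbeat(ticks: list[float], interval: float = 0.02) -> None:
    """Record a timestamp every `interval` seconds while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def run_with_heartbeat(offload: bool) -> tuple[list[str], int]:
    """Run the fetch directly or via a thread; return its result and the tick count."""
    ticks: list[float] = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat a chance to start

    if offload:
        # The event loop keeps running other tasks while the thread sleeps.
        models = await asyncio.to_thread(blocking_fetch)
    else:
        # The event loop is stalled for the full duration of the call.
        models = blocking_fetch()

    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return models, len(ticks)


if __name__ == "__main__":
    _, direct = asyncio.run(run_with_heartbeat(offload=False))
    _, offloaded = asyncio.run(run_with_heartbeat(offload=True))
    print(f"heartbeat ticks: direct={direct}, offloaded={offloaded}")
```

When the fetch runs directly, the heartbeat cannot fire until the call returns. When it is offloaded, the heartbeat keeps ticking. This is the same kind of stall the watchdog reported, at a much smaller scale.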
@@ -106,8 +106,10 @@ class AzureProvider(Provider):
         return list(latest_models.values())
 
     async def list_llm_models_async(self) -> list[LLMConfig]:
-        # TODO (cliandy): asyncify
-        model_list = self.azure_openai_get_deployed_model_list()
+        # Run blocking model list fetch in thread pool to avoid blocking event loop
+        import asyncio
+
+        model_list = await asyncio.to_thread(self.azure_openai_get_deployed_model_list)
         # Extract models that support text generation
         model_options = [m for m in model_list if m.get("capabilities").get("chat_completion") == True]
 
@@ -163,6 +163,18 @@ async def lifespan(app_: FastAPI):
     except Exception as e:
         logger.warning(f"[Worker {worker_id}] Failed to start watchdog: {e}")
 
+    # Pre-download NLTK data to avoid blocking during requests (fallback if Docker build failed)
+    try:
+        import asyncio
+
+        import nltk
+
+        logger.info(f"[Worker {worker_id}] Checking NLTK data availability...")
+        await asyncio.to_thread(nltk.download, "punkt_tab", quiet=True)
+        logger.info(f"[Worker {worker_id}] NLTK data ready")
+    except Exception as e:
+        logger.warning(f"[Worker {worker_id}] Failed to download NLTK data: {e}")
+
     if telemetry_settings.profiler:
         try:
             import googlecloudprofiler
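The startup fallback in the hunk above follows a general pattern: run a blocking download off the event loop, and log a warning instead of aborting startup if it fails. The sketch below is a hedged illustration of that pattern, not the commit's code. `warm_up` and the three `download_*` functions are hypothetical stand-ins so the example runs without NLTK or network access. It also checks the download's return value, on the assumption (worth verifying against the NLTK documentation) that nltk.download() reports many failures by returning False rather than raising.

```python
import asyncio
import logging
from typing import Callable

logger = logging.getLogger("startup")


def download_ok(resource: str, quiet: bool = True) -> bool:
    """Hypothetical download that succeeds."""
    return True


def download_reports_failure(resource: str, quiet: bool = True) -> bool:
    """Hypothetical download that signals failure through its return value."""
    return False


def download_raises(resource: str, quiet: bool = True) -> bool:
    """Hypothetical download that fails with an exception."""
    raise OSError("network unavailable")


async def warm_up(resource: str, download: Callable[..., bool]) -> bool:
    """Fetch `resource` in a worker thread; never let a failure stop startup."""
    try:
        # Keyword arguments are forwarded to the callable by asyncio.to_thread.
        ok = await asyncio.to_thread(download, resource, quiet=True)
    except Exception as e:
        logger.warning(f"Failed to download {resource}: {e}")
        return False
    if not ok:
        # Covers downloaders that return False instead of raising.
        logger.warning(f"Download of {resource} reported failure")
        return False
    logger.info(f"{resource} ready")
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    for fn in (download_ok, download_reports_failure, download_raises):
        print(fn.__name__, asyncio.run(warm_up("punkt_tab", fn)))
```

Logging and returning a boolean, rather than re-raising, lets the server start even when the data is missing. The trade-off is that a later request needing the data may still trigger a download unless it is guarded separately.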