Resolved
A single node in our production database caused approximately 10-15% of all incoming calls to the LangSmith API to experience degraded performance. A small number of those calls timed out and returned a 503 status error. The offending node has been removed from our replica pool, and performance has returned to normal.
Investigating
We are currently investigating intermittent query stalls affecting the LangSmith Frontend and API.