Resolved -
This incident has been resolved.
Oct 28, 23:19 UTC
Monitoring -
An infrastructure change caused a portion of our backend calls to time out intermittently.
This led to intermittent timeouts in the LangSmith Frontend and delayed LangSmith Run Ingestion by 1 to 5 minutes.
We have reverted the change and are monitoring for a recurrence.
Oct 25, 21:45 UTC
Investigating -
We are currently investigating an issue that is causing approximately 5-10% of all incoming API calls to fail with a 502 error after a 60-second timeout.
Oct 25, 21:08 UTC