For AWS developers integrating with third-party APIs outside of your AWS account, there are several common problems that can cause your production application to behave unexpectedly.
With Workbench, Stripe provides next-generation debugging tools that make it easier and faster to pinpoint, understand, and resolve production problems. Stripe logs detailed information about every API request in your account at no additional charge. Workbench provides filtering and insights capabilities that help you search those logs quickly and find the specific requests your application made.
This blog post shows how to find and resolve common production issues by using Workbench and provides recommendations for addressing those issues.
Getting started with Workbench
To use Workbench:
- Navigate to https://dashboard.stripe.com/ in your preferred browser and log in to your Stripe account.
- If you have multiple accounts configured, use the drop-down in the top-left to select which account API activity you want to view. Workbench reports and content are scoped to the account level.
- In the bottom-right corner of the browser, hover over the terminal icon to expand the menu, then select the caret symbol to open Workbench.
- Workbench opens in the lower portion of the window.
Workbench is not a browser extension and does not rely on CLIs or other tools in your development machine, so you can use it immediately without the need for installing additional software.
Detecting duplicate API calls from an application
In this scenario, you expected your code to call an API endpoint once, but the endpoint was called two or more times. The code worked as expected in development, and the issue appears only sporadically in production.
AWS offers a range of options to host your application, from serverless compute like AWS Lambda to containerized services like Amazon ECS. Some of these services provide high availability by hosting your code or application in multiple underlying availability zones. The trade-off is that there may not be an exactly-once processing guarantee.
In the case of Lambda, if the function errors out unexpectedly or there is a transient network error, the service may retry executing your code. This means that an API call can be sent again, and Stripe cannot tell that you didn’t explicitly retry it in code. This can also happen if the Lambda service experiences issues in an availability zone, so you should expect that the code may be executed more than once. While you often don’t see these issues during development, transient failures become more common in the long tail as systems receive more traffic.
To mitigate this issue, when calling a Stripe POST API, it’s recommended that you send an Idempotency-Key header with the request so that duplicate requests don’t have unintended side effects. Most common client libraries can add idempotency keys to API calls automatically, but the feature usually must be enabled first. Since you have limited control over compute services that may invoke your code more than once, the idempotency key allows Stripe to recognize a request it has already processed and return the original result instead of repeating the operation. Stripe’s GET and DELETE APIs are idempotent by definition and do not need this key.
Learn more about designing with idempotency with AWS Lambda and making retries safe in AWS compute services.
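The idea above can be sketched in Python using only the standard library. This is a minimal illustration, not Stripe's client library: the helper names, the `business_id` parameter (for example, an order number from your own system), and the placeholder API key are all assumptions. The point it shows is that the key is derived deterministically from the logical operation, so a retried invocation sends the same Idempotency-Key value rather than a fresh random one.

```python
import hashlib
import json
import urllib.parse
import urllib.request

STRIPE_PRICES_URL = "https://api.stripe.com/v1/prices"


def idempotency_key(operation: str, business_id: str, params: dict) -> str:
    """Derive a stable key from the logical operation (hypothetical helper).

    The same operation, business identifier, and parameters always produce
    the same key, regardless of how many times the code is invoked.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    raw = f"{operation}|{business_id}|{canonical}".encode("utf-8")
    return f"{operation}-{hashlib.sha256(raw).hexdigest()[:32]}"


def build_price_request(api_key: str, business_id: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/prices with an Idempotency-Key header."""
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(
        STRIPE_PRICES_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/x-www-form-urlencoded",
            "Idempotency-Key": idempotency_key("create-price", business_id, params),
        },
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    params = {"currency": "usd", "unit_amount": 1000, "product": "prod_123"}
    request = build_price_request("sk_test_placeholder", "order-42", params)
    # urllib stores header names capitalized, hence "Idempotency-key" here.
    print(request.get_header("Idempotency-key"))
```

If your code generates a new random key on every invocation, each retry looks like a new request to Stripe, which matches the symptom described in the next section.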
Using Workbench to find duplicate API calls
If your application unexpectedly calls an API more than once, you can use Workbench to detect this behavior. Go to the Logs tab and search by the resource ID to quickly filter the logs for a specific request. You can use additional filters to drill down by HTTP method and API endpoint if needed.
In the example below, this AWS-hosted application calls the /v1/prices endpoint multiple times with the same payload. The Idempotency header is different for each request, indicating that it either hasn’t been set correctly, or hasn’t been used by the client. Choosing one of the requests shows the multiple pricing requests in the UI.
Identifying why an application’s API request has failed
If your AWS logs confirm that an API request has been sent, you can search in Workbench to ensure that Stripe received it. If the log is missing, it was not processed by Stripe, and you should check that the code has the necessary permission to reach the API. You should also verify the URL used in the API call, and any egress restrictions in your AWS account that may prevent the traffic from reaching the endpoint.
The HTTP status code in the response defines the broad type of error. You can see this code in your AWS application logs, but it doesn’t always indicate an issue with your code or the Stripe service. Codes 401 and 403 indicate issues with your API key (no valid key was provided, or the key lacks permission for the request, respectively), while 402 means there is a problem with the payment details provided. Response codes in the 5xx range are rare and are caused by a problem at Stripe.
Learn about Stripe’s HTTP error codes.
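As a rough sketch of how application code might act on these status codes before you turn to Workbench, the function below sorts a response into the broad categories described above. The category names and the `retryable` flag are illustrative assumptions, not part of any Stripe library, and the catch-all handling of other 4xx codes is a simplification.

```python
from typing import NamedTuple


class StatusClass(NamedTuple):
    category: str    # illustrative label, not a Stripe-defined value
    retryable: bool  # whether retrying the same request might succeed


def classify_stripe_status(status: int) -> StatusClass:
    """Map an HTTP status code to a broad error category."""
    if 200 <= status < 300:
        return StatusClass("ok", False)
    if status in (401, 403):
        # Problem with the API key or its permissions: fix configuration.
        return StatusClass("api_key", False)
    if status == 402:
        # Problem with the payment details provided by the user.
        return StatusClass("payment_details", False)
    if status == 429:
        # Rate limited: retry later, with backoff.
        return StatusClass("rate_limited", True)
    if 500 <= status < 600:
        # Rare, and caused by a problem at Stripe.
        return StatusClass("stripe_side", True)
    if 400 <= status < 500:
        # Assumption: treat remaining 4xx codes as request problems.
        return StatusClass("client_request", False)
    return StatusClass("unknown", False)


if __name__ == "__main__":
    for code in (200, 401, 402, 429, 503):
        print(code, classify_stripe_status(code))
```

Logging the category alongside the request ID in your AWS application logs can make it quicker to find the matching entry in Workbench.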
Learning more in Workbench
The Logs tab in Workbench allows you to filter by Status (e.g. “Failed”) to list all the failed API calls from your application. From here, you can drill down into any request and learn more about the request and response bodies. While credit card data is masked, the Error insights panel can indicate problems with the card information provided.
This is useful for tracking individual errors, but if users of your production application are reporting multiple problems, the Errors tab can give you more information and help accelerate problem resolution. This tab aggregates errors by type, allowing you to find the common cause of the bulk of the errors and then drill down into specific data.
You can also use the Overview tab in conjunction with this view to detect rate limiting. Rate-limited requests receive HTTP 429 responses, which can appear in busy production systems. You can contact Stripe support to request limit increases, but in some cases you can rearchitect your AWS-based application to avoid bursts of requests to the Stripe API and smooth out traffic.
Learn more about Stripe’s rate limits.
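One common way to soften bursts on the client side is to retry 429 responses with exponential backoff and jitter. The sketch below is one possible approach under stated assumptions: `send` is a hypothetical callable that performs the request and returns a `(status, body)` tuple, and the base delay, cap, and attempt count are arbitrary values to tune for your workload.

```python
import random
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(
    max_attempts: int,
    base: float = 0.5,
    cap: float = 8.0,
    rng: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield one "full jitter" delay before each retry (max_attempts - 1 in total)."""
    for attempt in range(max_attempts - 1):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Tuple[int, object]:
    """Call send(), retrying only while the response status is 429."""
    status, body = send()
    for delay in backoff_delays(max_attempts, rng=rng):
        if status != 429:
            break
        sleep(delay)
        status, body = send()
    return status, body


if __name__ == "__main__":
    # Simulated responses: two rate-limited replies, then success.
    replies = iter([(429, None), (429, None), (200, {"ok": True})])
    print(call_with_backoff(lambda: next(replies), sleep=lambda s: None))
```

When the retried call is a POST, reusing the same idempotency key on every attempt keeps the retries safe. Backoff only spreads out retries; sustained bursts may still call for queueing or other changes to the architecture.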
Locating and resending Stripe events that never appear in your AWS account
Many Stripe processes are asynchronous and take time to complete. To avoid polling APIs for changes and to help synchronize state closer to real time, Stripe can push data changes directly to your AWS account as events. Once configured, events arrive on the partner event bus in Amazon EventBridge, where you can route them to other services to take appropriate action. However, EventBridge does not log events that fail to arrive, so your application is unaware of the missing data.
Using Workbench to detect and replay failed events
In the Event destinations tab, set the Status filter to “Failed/Pending” to show a list of events that have not been delivered successfully. Click on an event to load the detailed view. From here, the Delivery attempts panel shows the history of failed delivery, together with the option to resend to the event bus. Clicking one of these delivery attempts shows additional information about why the delivery failed.
Learn more about setting up Stripe events in your AWS account.
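Because a resent event may reach a target that already processed an earlier delivery, it can help to make the consumer tolerant of duplicates. The following is a minimal sketch of a Lambda-style handler, assuming that the EventBridge envelope carries the Stripe event type in `detail-type` and the Stripe event object (including its `id`) in `detail`; verify that shape against the events you actually receive. The in-memory set stands in for a durable store, which real code would need because Lambda execution environments are not shared.

```python
from typing import Any, Dict, MutableSet, Optional

# Stand-in for a durable store of processed event IDs (assumption:
# production code would use a database, since memory is per environment).
_seen_event_ids: MutableSet[str] = set()


def handler(
    event: Dict[str, Any],
    context: Optional[object] = None,
    seen: MutableSet[str] = _seen_event_ids,
) -> Dict[str, Any]:
    """Process an event once; acknowledge repeats without side effects."""
    detail = event.get("detail", {})
    # Prefer the Stripe event ID; fall back to the EventBridge envelope ID.
    event_id = detail.get("id") or event.get("id")
    event_type = event.get("detail-type", "unknown")

    if event_id in seen:
        return {"status": "duplicate", "event_id": event_id}

    seen.add(event_id)
    # Business logic for event_type would run here.
    return {"status": "processed", "event_id": event_id, "type": event_type}


if __name__ == "__main__":
    sample = {
        "id": "envelope-1",
        "detail-type": "payment_intent.succeeded",
        "detail": {"id": "evt_example"},
    }
    print(handler(sample))
    print(handler(sample))  # second delivery is reported as a duplicate
```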
Conclusion
Many AWS customers use Stripe for processing payments by calling Stripe’s APIs from their hosted applications. Many runtime errors can occur in production, from service failures to errors in user-provided credit card information. While you can use logs from your applications to locate failed calls, in many cases it's faster to use Stripe’s Workbench tool to isolate and resolve the cause of the problem.
Workbench logs verbose information about API requests at no extra charge and provides rich filtering capabilities and insights. This blog post showed how to use these capabilities to identify duplicate API calls, diagnose unexpected HTTP errors, and locate and resend events that are not reaching their AWS targets.
If there are more features you would like to see, let us know by clicking the Send feedback button at the top of the panel.