Once your Stripe integration is live, it’s easy to set it and forget it. You can continue your work and focus on other elements of your application. However, trouble may be brewing behind the scenes. Unless you’ve set up robust logging and alerting in your application, you may not be aware of rising Stripe API error rates, which could impact your bottom line. Luckily, with Workbench, the new developer-centric view of your Stripe data, you can analyze API and webhook failures without any changes to your existing code.
We can examine this in practice by taking a look at an account in the Stripe Dashboard.
At first glance, the account looks healthy. A few failed payments aren’t unusual, and sales are consistently flowing. Nothing immediately indicates an issue in the application. However, if you dig a bit deeper, you’ll see a different story.
Getting started with Workbench
To dig deeper, we'll use Workbench. This tool provides a more convenient way for developers to access and search logs at scale. It doesn’t require you to set a logging level, storing all available information by default while still obfuscating credit card numbers and other sensitive data. To see Workbench:
- Navigate to https://dashboard.stripe.com/ in your preferred browser and log into your Stripe account.
- If you have multiple accounts configured, use the drop-down in the top-left to select the store whose API activity you wish to view. Workbench reports and content are scoped to the store level.
- In the bottom-right corner of the browser, hover over the terminal icon to expand the menu, then select the caret symbol to open Workbench.
Workbench is not a browser extension and does not rely on CLIs or other tools on your development machine, so you can use it immediately without installing additional software.
Debugging with Workbench
Once you have opened Workbench you’ll see exactly what I meant.
The API requests graph is showing a lot of failures. Here’s a closer look at that graph:
According to the data, each day around 50% of the API calls result in some form of error. That’s alarmingly high. This could be caused by many different sources depending on the application’s structure. It could be something on the backend failing and retrying too much, it could be an issue on the front end causing real transactions to fail, or it could be some form of attack or abuse from a leaked secret key. You’ll have to dig deeper to pinpoint the cause, but there is definitely something interesting going on.
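As a rough illustration of the arithmetic behind that figure, the sketch below computes a per-day error rate from a list of request records. The record shape (a date plus an HTTP status) and the sample values are hypothetical, and treating any status of 400 or above as an error is an assumption. Workbench draws this graph for you, so this only shows what the number means.

```python
from collections import defaultdict
from datetime import date


def daily_error_rates(requests):
    """Return {day: fraction of that day's requests with status >= 400}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for day, status in requests:
        totals[day] += 1
        if status >= 400:  # assumption: any 4xx/5xx response counts as an error
            errors[day] += 1
    return {day: errors[day] / totals[day] for day in totals}


# Hypothetical sample data: (date, HTTP status) pairs.
sample = [
    (date(2024, 1, 1), 200),
    (date(2024, 1, 1), 402),
    (date(2024, 1, 2), 200),
    (date(2024, 1, 2), 200),
    (date(2024, 1, 2), 402),
    (date(2024, 1, 2), 400),
]
rates = daily_error_rates(sample)
```

With this sample, half of the requests on each day fail, which matches the kind of rate shown in the graph.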
With a busy account, there are many API requests every week. It would be a frustrating task to attempt to sift through those logs and determine what is failing. Luckily, you don’t need to filter and categorize all of those by hand. Inside Workbench, the “Errors” tab shows a simplified overview of the types of errors you’re experiencing.
This view has three areas: the list of errors from the last week on the left, a sample failing request in the middle, and a list of relevant logs for this error type on the right. This lets you quickly see which types of errors occur frequently and which are less common.
According to our data, there are four types of errors occurring 20+ times this week. Those errors are:
- “invalid_cvc”
- “invalid_expiry_month”
- “invalid_expiry_year”
- “incorrect_number”
If you open the Stripe documentation for payment decline codes, you’ll see that all of these errors relate to verifying credit card information. This suggests that something in the backend is probably retrying failed cards.
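If you have exported error codes from your own application logs, a similar tally can be approximated offline. This is a minimal sketch, not how the Errors tab is implemented: the function name, the threshold default of 20, and the sample counts are all made up for illustration.

```python
from collections import Counter


def frequent_errors(error_codes, threshold=20):
    """Return {code: count} for codes seen at least `threshold` times,
    ordered from most to least frequent."""
    counts = Counter(error_codes)
    return {code: n for code, n in counts.most_common() if n >= threshold}


# Hypothetical sample: one entry per failed request, counts invented.
sample = (
    ["incorrect_number"] * 25
    + ["invalid_cvc"] * 22
    + ["resource_missing"] * 3
)
top = frequent_errors(sample)
```

Codes that fall below the threshold, such as the three `resource_missing` entries here, are dropped so that only the high-volume errors remain.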
Choosing one of the “incorrect_number” API calls allows us to glean more information.
The rightmost pane displays data for one of the recent instances of the “incorrect_number” failures. In that pane, you can see the abridged “API Key”. This key begins with “sk_”, so you know it’s a secret key (i.e., one used server-side). We’d therefore expect the “Source” and “IP” fields to align with the backend servers.
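The prefix check described above can be sketched in a few lines. Stripe keys start with `sk_` (secret), `pk_` (publishable), or `rk_` (restricted), followed by `live_` or `test_`. The helper name and the abridged key format in the examples are assumptions for illustration.

```python
KEY_PREFIXES = {
    "sk_": "secret",
    "pk_": "publishable",
    "rk_": "restricted",
}


def classify_key(abridged_key):
    """Return (kind, mode) for a key string such as 'sk_live_...AbCd'."""
    for prefix, kind in KEY_PREFIXES.items():
        if abridged_key.startswith(prefix):
            rest = abridged_key[len(prefix):]
            if rest.startswith("live_"):
                return (kind, "live")
            if rest.startswith("test_"):
                return (kind, "test")
            return (kind, "unknown")
    return ("unknown", "unknown")


# Hypothetical abridged keys; the middle characters are not real.
secret_live = classify_key("sk_live_...AbCd")
publishable_test = classify_key("pk_test_...1234")
```

A secret key in live mode appearing on a request is what tells you the call should have come from a server you control.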
Let's say your application runs several backend server instances in multiple regions. That would make it tricky to figure out which of those instances is making these API calls. However, let’s say we do know that this backend application is written entirely in .NET and there is no production Golang code in the stack. That means the “Source” field claiming the request is from “Go-http-client/1.1” doesn’t line up. Either the backend is using a custom user-agent or some other code is running elsewhere.
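The same comparison can be applied to a batch of requests: flag any whose source does not start with a user-agent prefix you expect from your own stack. This is a sketch under assumptions. The record fields (`id`, `source`) are hypothetical, and the expected prefix `Stripe/v1 .NetBindings` is what the official .NET library is believed to send, so verify it against your own known-good traffic before relying on it.

```python
def unexpected_sources(requests, expected_prefixes):
    """Return the requests whose 'source' matches none of the expected prefixes."""
    prefixes = tuple(expected_prefixes)  # str.startswith accepts a tuple
    return [r for r in requests if not r["source"].startswith(prefixes)]


# Hypothetical request records; IDs and version numbers are invented.
sample_requests = [
    {"id": "req_1", "source": "Stripe/v1 .NetBindings/45.0.0"},
    {"id": "req_2", "source": "Go-http-client/1.1"},
    {"id": "req_3", "source": "Stripe/v1 .NetBindings/45.0.0"},
]

# Assumption: the prefix a .NET backend using the official library would send.
flagged = unexpected_sources(sample_requests, ["Stripe/v1 .NetBindings"])
```

Here only `req_2` is flagged, mirroring the mismatch between a .NET backend and a Go HTTP client.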
To verify that the API key associated with this request is actually the production API key, copy the last 4 characters of the key and open the API Keys section of the Dashboard.
The names of the keys here indicate that a lot of keys are in Live mode but being used for testing. That’s not right! It’s as if your team has never heard of Sandboxes, the successor to Test Mode, which allows dozens of Stripe developers to each have their own testing space. Sandboxes would allow a much shorter list of keys and make sure that developers don’t step on each other’s toes or, worse, step on production data.
Anyway, let's look for the key which made that failing request.
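When the list of keys is long, the lookup by last four characters can be mirrored in a few lines if you keep your own inventory of key names. The inventory shape, the key names, and the suffixes below are hypothetical; this does not call any Stripe API.

```python
def find_keys_by_suffix(inventory, last4):
    """Return the names of all keys whose last four characters match.

    A list is returned because four characters are not guaranteed
    to be unique across a large set of keys.
    """
    return [entry["name"] for entry in inventory if entry["last4"] == last4]


# Hypothetical inventory copied by hand from the API Keys page.
inventory = [
    {"name": "Production backend", "last4": "Qx7P"},
    {"name": "Checkout load test", "last4": "AbCd"},
    {"name": "Reporting job", "last4": "9fLm"},
]
matches = find_keys_by_suffix(inventory, "AbCd")
```

If more than one name comes back, compare the remaining visible characters of the key in the Dashboard to settle which one made the request.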
The key is named “Attacker Engineering Test Key”. That explains a lot: it looks like this key is being used by a team that is testing API calls which resemble potential attack scenarios.
At this point, you are able to track down this team in your organization, tell them about Sandboxes, and let them know just how much they’re polluting your production logs.
In this case, these failing API calls were coming from another good-natured team, but along the way you still found several ways to improve your organization's Stripe integration. For example, you should stop issuing production keys for testing, stop making test API calls in production, and start using Sandboxes.
Conclusion
Workbench helps you better understand your Stripe integration. Instead of worrying, you were able to quickly trace these requests back to their source in a way that was not possible before. The best part is you’ve hardly scratched the surface of the observability Workbench provides. Whether you’re looking to understand trends in your webhook invocations, the life cycles of Stripe objects like customers and subscriptions, or API logs like we did today, Workbench is a great place to start.
To learn more about Workbench, check out the Workbench documentation.
If there are more features you would like to see, let us know by clicking the Send feedback button at the top of the Workbench panel.
To learn more about developing applications with Stripe, visit our YouTube Channel.