How to ensure 100% reliability when scraping in ecommerce
It’s been about two and a half years since we started working on Two Tap. Even now, the most common question we get asked is “how reliable is Two Tap?” Or, in a different form, “what percentage of orders actually get sent through?”
The purpose of this article is to share (a little bit too much) information about how Two Tap works and how we achieved 100% reliability on placing orders with any of our supported retailers. Hopefully, our internal procedures will inspire other companies that deal with scraping.
Two Tap integrates with retailers using scraping.
We built the system like this based on our previous experiences in ecommerce and working with retailers and publishers. This was not an easy choice. Leaving aside the technical difficulties, engineers sometimes have a gut reaction to our approach – they consider it unreliable (“What if the integration breaks?”). It takes us around 10 minutes to explain our internals until we can alleviate these worries.
When we were running affiliate networks a long time ago, we understood that retailer priorities are different from what we thought they were. We wrote an article about it here, but the tl;dr is that retailers are focused on their core processes. This means keeping their websites up on Black Friday and improving loyalty/email retention, rather than trying to build tens of different deep integrations with their platforms. Each of these third-party solutions brings in an insignificant number of orders compared to the integration effort.
The second thing we understood is that deep technical integrations break just as often as scraping approaches. The PR of a lot of platforms says “it’s just a quick painless plugin install”, but the reality is that retailers heavily customize their platforms. Leaving aside the fact that retailers often have to change payment providers, these cartridges are almost never drag and drop, and they cause odd, hard-to-debug bugs. Once retailers upgrade their platforms, the plugins break as well.
No more philosophical discussions, how is Two Tap more reliable?
Two Tap now supports 1120+ retailers and any Shopify store is supported out of the box when they sign up for Two Tap. No other solution is even close to supporting as many stores as Two Tap does. What happens when an integration breaks?
The first thing we learned is that integration breakage is not the most common problem. By far, the biggest problem is retailer websites not responding for whatever reason. This is an issue that affects deep integrations as well. Retailers, like all companies, have a DB, an ISP, a network provider, a data center provider, and they receive a lot of traffic. As a platform you take an order from a consumer and send it somewhere, and that somewhere might not respond.
If you are a publisher sending orders, never trust an API that doesn’t handle things asynchronously, for this specific reason. An API might fail for reasons that don’t depend on the API itself, and if the API provider isn’t handling that edge case, you will have to take care of it. Retailer availability issues happen more often than you think, especially when scaling to 1200+ stores.
And here’s the main aspect of how Two Tap works. Two Tap receives an order from an app or consumer and adds it to a queue. If, for whatever reason, it can’t send the order (this could be the retailer platform going down, but also an integration breakage), Two Tap doesn’t give up. It raises an internal ticket and a human investigates the issue. In our experience, tickets are rare, and in 90% of these cases resolving one literally means waiting 5 to 15 minutes until the merchant website becomes available again.
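To make the queue idea concrete, here is a minimal Python sketch. It is not Two Tap’s actual code: the names Order, OrderQueue, RetailerUnavailable, the send callback and the max_attempts threshold are all assumptions made for illustration. The point it shows is that a failed order is put back on the queue and, after a few failures, a ticket is raised for a human, but the order itself is never dropped.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


class RetailerUnavailable(Exception):
    """Hypothetical error: the retailer site could not be reached."""


@dataclass
class Order:
    order_id: str
    attempts: int = 0
    status: str = "queued"  # queued -> placed, or queued -> ticketed -> placed


@dataclass
class OrderQueue:
    send: Callable[[Order], None]  # hypothetical function that places one order
    max_attempts: int = 3          # assumed threshold before a human is alerted
    pending: deque = field(default_factory=deque)
    tickets: list = field(default_factory=list)

    def submit(self, order: Order) -> None:
        self.pending.append(order)

    def run_once(self) -> None:
        """Try every pending order once; failed orders stay in the queue."""
        for _ in range(len(self.pending)):
            order = self.pending.popleft()
            order.attempts += 1
            try:
                self.send(order)
                order.status = "placed"
            except RetailerUnavailable as exc:
                # Raise one ticket per order once the threshold is reached.
                if order.attempts >= self.max_attempts and order.status != "ticketed":
                    order.status = "ticketed"
                    self.tickets.append((order.order_id, str(exc)))
                self.pending.append(order)  # requeue, never drop
```

In a real system run_once would be driven by a scheduler with a delay between passes; that part is left out here to keep the sketch short.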
For instance, Two Tap powers international checkout on a top 5 US retailer (and many others) through one of our clients. That third party accepts international orders and sends them to the merchant’s US website through Two Tap. During Black Friday 2015 that retailer’s main website was down for a couple of hours, and for the rest of the day it had a waiting queue for any purchase, where shoppers would have to repeatedly hit add to cart until the store said “OK you can purchase the product”.
Any international order sent through Two Tap went through just fine. That is because Two Tap accepted orders and tried to place them until the website was back online, and then until the queue issue was resolved.
By that metric, sending an order to Two Tap is more reliable than trying to place that order with retailer platforms themselves.
What happens when a retailer changes code?
That being said, integrations do break. Here is how Two Tap doesn’t lose any orders.
The merchant support team
Two Tap has a merchant support team that adds and maintains stores. Someone is always on call monitoring the integrations.
Because of the way Two Tap was designed, nobody writes scrapers. Think of adding stores as Excel macros: the team fills out certain fields with information. We are able to take a person with zero coding knowledge and have them adding and maintaining stores in two weeks. Our merchant support team’s background is in cooking, call center support, farming and more.
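One way to picture the “fill out fields” approach is a store definition that is pure data, interpreted by a shared engine. The Python sketch below is only an illustration under that assumption; the field names (add_to_cart, checkout_button, email_input), the StoreDefinition class and the step format are hypothetical and not taken from Two Tap.

```python
from dataclasses import dataclass

# Hypothetical fields a support person would fill in for each store.
REQUIRED_FIELDS = ("add_to_cart", "checkout_button", "email_input")


@dataclass(frozen=True)
class StoreDefinition:
    name: str
    fields: dict  # field name -> page selector, entered by hand


def validate(store):
    """Return the required fields that are still empty or missing."""
    return [f for f in REQUIRED_FIELDS if not store.fields.get(f)]


def checkout_plan(store, email):
    """Turn a filled-in definition into generic steps for a shared engine."""
    missing = validate(store)
    if missing:
        raise ValueError(f"{store.name}: missing fields {missing}")
    return [
        ("click", store.fields["add_to_cart"]),
        ("click", store.fields["checkout_button"]),
        ("type", store.fields["email_input"], email),
    ]
```

With this shape, fixing a store after a small site change means editing one field value rather than touching code.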
Automated Daily tests
All 1200+ retailers are tested daily. When something breaks, our 24/7 on-call team gets an alert.
About 4% of our websites have one tiny change a month, with 1% changing completely. The tiny changes are resolved in under 5 minutes.
Our daily tests are designed not to affect retailers’ stores, and we never hit the final ‘place order’ button.
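A daily check of this kind can be sketched as a dry run that walks the checkout steps and stops before the last one. The following Python is a minimal illustration, assuming hypothetical step names and a perform callback that drives the page; it is not Two Tap’s test harness.

```python
def run_daily_check(steps, perform):
    """Run checkout steps in order, stopping before 'place_order'.

    Returns (ok, failed_step). `perform` is a hypothetical callback that
    executes one step against the store and raises on failure.
    """
    for step in steps:
        if step == "place_order":
            break  # the daily check never submits a real order
        try:
            perform(step)
        except Exception:
            return False, step  # report which step broke, for the alert
    return True, None
```

Returning the failing step gives the on-call person a starting point, which matters when most fixes are meant to take only a few minutes.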
Automated Weekly tests
Once a week we thoroughly test all the integrations. We run multiple tests (guest checkout, authenticated checkout, pickup from store) and we go all the way to make sure everything is fine.
Let’s say a website changes between two tests. If something breaks, instead of giving up, the on-call team receives a ticket and investigates what’s happening. Once the issue has been resolved, the purchase can be retried and the order confirmed.
Highly paranoid Two Tap
It was obvious early on that reliability is our number one priority so we became paranoid about it. At Two Tap we’re so careful about not losing orders that if for whatever reason our API crashes it’s designed to create tickets before dying.
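The “create tickets before dying” behaviour can be approximated with a wrapper around the code that handles orders. This is a rough Python sketch, not the real implementation; ticket_on_crash, in_flight and create_ticket are hypothetical names, and it only covers ordinary exceptions, not a killed process or a power loss.

```python
import contextlib


@contextlib.contextmanager
def ticket_on_crash(in_flight, create_ticket):
    """On an unhandled exception, file a ticket per in-flight order, then re-raise.

    `in_flight` is a mutable collection of order ids currently being processed;
    `create_ticket(order_id, reason)` is a hypothetical ticketing callback.
    """
    try:
        yield
    except Exception as exc:
        for order_id in list(in_flight):
            create_ticket(order_id, repr(exc))
        raise  # the process still fails; the orders are just not forgotten
```

Anything that can take the process down without raising an exception would need a separate safeguard, such as persisting the queue before work starts.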
There’s a lot of magic happening in the background. For instance, product crawling is completely separate from the placing of orders. Two Tap has a RabbitMQ-powered PhantomJS cloud that just fetches product availability in real time. It takes into account that a website may be down for a bit, and retries multiple times from different servers.
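Retrying from different servers can be sketched as a simple failover loop. The Python below is an assumption-laden illustration: fetch_with_failover, the server list and the fetch callback are made up for this example, and the RabbitMQ and PhantomJS parts are deliberately left out.

```python
def fetch_with_failover(url, servers, fetch, rounds=2):
    """Try each server in turn for several rounds; return the first success.

    `fetch(server, url)` is a hypothetical callback that asks one crawler
    server for product availability and raises ConnectionError on failure.
    """
    if not servers:
        raise ValueError("at least one server is required")
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    last_error = None
    for _ in range(rounds):
        for server in servers:
            try:
                return fetch(server, url)
            except ConnectionError as exc:
                last_error = exc  # remember it, then move to the next server
    raise last_error
```

A production version would likely add a delay between rounds so a site that is briefly down has time to recover.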
It’s all about the retailers
By far, we consider Two Tap’s biggest advantage the fact that retailers can focus on their core business and we’ll take care of the rest.
Whatever people say about scraping (and Two Tap indirectly) the fact is that retailers love us, and consider us “as magic”. Two Tap is the most reliable solution on the market and we have a guarantee that we don’t lose any customer order that can be fulfilled by a retailer partner.
We are reliable because we are highly paranoid and always assume and take into consideration the worst possible outcome. And unlike deep integrations, which depend on retailer IT departments, if something breaks, it’s on Two Tap’s end to fix. This allows us to guarantee a level of security to our publishers.
Two Tap is now able to pay out commissions from over 400 retailers, including American Eagle Outfitters, Target, BestBuy, Forever21, and more. The 400+ retailers have Two Tap accounts where you can send them messages, get coupons, learn of deals ahead of time, and more.
Everything we do is public
We can’t mention all our clients publicly due to NDAs but Two Tap has been powering order placement on 50 leading retailers for the biggest dropshipper in the world – with no orders lost in the past 18 months. They’re using Two Tap because it’s easier to manage one API integration than 50 different deep integrations with these retailers and they can be up and running in a matter of days instead of months – all with zero investment on the retailer’s side.
Two Tap is completely open about what we do, and we never hide. Our docs are publicly available. As a consumer, if you want to buy something right now you can go to our simple consumer interface at http://twotapit.com.