April 19, 2020

Amazon Wishlist Price Check

My wife reads a lot of books. I try to keep up but barely manage to read twelve in a year. She likes physical copies, but since we started traveling full-time this isn’t an option. Our local library offers a digital library card to check out books through the Libby app. While this is excellent, they don’t always have the book we want to read. Amazon, on the other hand, has every book imaginable, many of them being available digitally. We don’t, however, like to pay full price. So we are typically stuck waiting for a deep discount before making a purchase. The problem with this approach is knowing when the books on our wishlist go on sale. We needed a way to track when books on multiple wishlists fell below a certain price point. Of course, we can solve this with some code.

Requirements

I had some basic requirements for the tool.

Support multiple wishlists
Alert me for any books that met a price threshold
Link to buy now for each book

Solution

As of the time of this writing, Amazon doesn’t provide an API to fetch your wishlist details. This indicated some scraping would be required to get the needed data. Some people, myself included, keep multiple wishlists sorted by genre. This will require we loop over multiple lists when scraping. For the alerting, email appeared to be the best option for a couple of reasons. First, we could potentially have many items on the list return below our price point. Second, it is helpful to see the books laid out in list form with the title, price, and a buy link. This allows us to easily pick and choose which books to buy now and which to save for later. The last requirement is an easy one. Knowing we have a plan lets get into some code.

Scraping Wishlists

I decided to write the tool in Go so I could practice the language. First, it must be noted that the Amazon wishlist(s) must be set to PUBLIC. I am using the Colly library to scrape the wishlist page for data. The library gives us easy mechanisms to configure the parallelism and timing of requests. Amazon has been known to block scrapers, so we need to keep the settings sane by human standards. Since we aren’t concerned with speed here, we don’t need to lookup pages faster than we would browsing the UI. Based on that assumption, the code loads one page at a time, with a one-second delay between requests. Most Amazon sales are 24 hours, so if you run it twice per day, you will achieve the maximum result without alerting the powers that be.

You can’t view all the wishlist books in a single view. This means we need to support scraping paginated pages of books. This is easily accomplished by looking for the SEE MORE link at the bottom of the page.

seeMoreLink := e.ChildAttr("a.wl-see-more", "href")	 

 if seeMoreLink != "" {	 
     c.Visit("https://smile.amazon.com" + seeMoreLink)	 
}

As you can see from the code snippet above, we get the Child Attribute of the list, check if the link is present, and, if so, visit the page.

The Price Threshold

To keep the code simple, I am matching on whole dollar amount, dropping the cents. For my use case, I don’t care about 0.99 cents. Plus, I don’t have to worry about the accuracy of the int64 values. Setting a maxPrice of 2 will return any book in the price range of $0.01 to $2.99. The setting is hardcoded but could easily be moved into a variable.

Emails

The code supports sending emails via Sendgrid. You just have to provide a token, to, and from emails via files. Emails can be sent to multiple addresses at once so multiple people can manage book purchases. See the Configuration section below for more details.

Get the Binary

Download the binary from the releases page on Github

Configuration

Wishlist IDs

To get the Amazon wishlist IDs you will need to go to each wishlist and get the ID from the URL.

Amazon URL: https://amazon.com/hz/wishlist/ls/3GMM56QSK63GT

Wishlist ID: 3GMM56QSK63GT

Secrets

The code will look for secrets stored in the following directory /var/amzn_wishlist_price_check/secrets

This directory should contain the following files:

wishlistids - A comma-separated string of wishlist IDs
sendgridtoken - The API token from Sendgrid

fromemail - This is a JSON file with the From information

{
    "name": "Amazon Wishlist Price Check",
    "email": "[email protected]"
}

toemail - This is a JSON array with the To information

[
    {
        "name": "John Doe",
        "email": "[email protected]"
    },
    {
        "name": "Jane Doe",
        "email": "[email protected]"
    }
]

Logs

Create the log file touch var/log/amzn-wishlist-price-check.log

Run with Cron

You can edit the cron jobs on a Linux machine with crontab -e

0 6,18 * * * /usr/local/bin/amzn-wishlist-price-check >> /var/log/amzn-wishlist-price-check.log 2>&1

Conclusion

As with anything scraped from the web, you will have days where the email shows more or fewer books than it should. I have made a best effort to account for inconsistencies in the scrapped data, but I know I missed a few. Overall though, it gets things right more times than not so I can purchase the books I want while they are on sale. Feel free to fork the code and alter it to your liking. You can run multiple copies for different lists and different price points or even use it for regular Amazon items outside of books, although this is untested.