Improving the books-like pages and misc fun :)
Preview of the "Best Books Of The Year" category pages...
Every two weeks, I share my notes on building Shepherd.com.
It will include struggles and frustrations 🤣.
I have a polished newsletter for authors here (it is focused on how we help authors + sharing data that helps them + business updates).
Shepherd is like wandering around your favorite bookstore. We make exploring and discovering your next book magical (while helping authors get their books in front of the most likely readers). Please join us as a founding member and support our mission.
June 19th to June 30th
Fun week! Lots of design work :)
Books-like page improvements…
I’ve been working to improve the books-like page (for an example of the current page, try books like “The Spy Who Came in From the Cold”).
What do I want to improve?
I want to show why each book was picked as being “like” that book. This is something readers have requested and that I want. I want to show how humans connected this book to the other (or, if we don’t have human data, explain how the algo worked).
I want to add filters so that readers can filter the books to see only books within certain genres/topics. So if I loved Dune because it is classic science fiction, I could limit the results to only classic science fiction that fans recommend.
I want to improve the quality of the results. Now that we have genres/ages, we will use those to improve the results. And I want to do more to show the most recommended books alongside that one. For example, if Project Hail Mary is recommended alongside Dune more than any other book, I want to show it first.
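To make the ranking and filter ideas above concrete, here is a minimal Python sketch. It is not Shepherd's actual code: the function name `books_like`, the data shapes, and the sample genre labels are all assumptions for illustration. It counts how often each book appears on the same recommendation list as the target book, sorts by that count, and optionally keeps only books in one genre.

```python
from collections import Counter

def books_like(target, recommendation_lists, genres_by_book, genre=None):
    """Rank books by how often they are recommended alongside `target`.

    recommendation_lists: list of lists of book titles (hypothetical shape).
    genres_by_book: dict mapping a title to a set of genre labels (hypothetical).
    genre: if given, keep only books tagged with this genre.
    """
    counts = Counter()
    for books in recommendation_lists:
        if target in books:
            # Count each co-recommended book once per list.
            counts.update(b for b in set(books) if b != target)

    # Most co-recommended first; ties broken alphabetically for stable output.
    ranked = [b for b, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

    if genre is not None:
        ranked = [b for b in ranked if genre in genres_by_book.get(b, set())]
    return ranked


# Made-up sample data, for illustration only.
lists = [
    ["Dune", "Project Hail Mary", "Foundation"],
    ["Dune", "Project Hail Mary", "Hyperion"],
    ["Dune", "Foundation", "Project Hail Mary"],
    ["Neuromancer", "Snow Crash"],
]
genres = {
    "Project Hail Mary": {"science fiction"},
    "Foundation": {"classic science fiction"},
    "Hyperion": {"classic science fiction"},
}

all_results = books_like("Dune", lists, genres)
classic_only = books_like("Dune", lists, genres, genre="classic science fiction")
```

In this toy data, Project Hail Mary shows up next to Dune on every list, so it comes first in `all_results`, and the genre filter narrows `classic_only` to the two books tagged as classic science fiction.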
Here is what we are playing with:
I’ve done 20+ random video tests with users to figure out a design that clearly explains why the books are connected and is not confusing. I still need to play more with it, but we are finally getting somewhere.
And here is the opening we are playing with:
What do you think? Hit reply and let me know.
Best Books Of The Year Category pages…
We are playing with something like this for the opener of the BBOY category pages.
The body I am still struggling with.
We are looking at doing a left column on the desktop to navigate the genres.
And the right is still a mess, but we are working on some ideas.
I’ll share more as we make progress.
What about the front page?
It is 99% done on the design side, and I am super excited to start this work soon (Marton, our part-time dev, gets back in a week). It looks beautiful and does a better job of showing our ethos.
Topic and genre accuracy…
Shepherd has an internal mess, as many things we call genres are topics (and many things we call topics are genres).
The book industry is an utter mess in this regard. I am stuck using the book industry’s data for now, but soon I want to clean it up. And I want to improve the quality of our genre and topic data.
Why is this data such a mess?
I don’t understand it, but the publishers don’t know how to implement this data, don’t spend time making sure it is correct, and some abuse it.
I’ve seen editions of Dune that say it was published in the 1700s and others that say the 1800s. If you see that level of mistake on a book selling hundreds of thousands of copies each year, you can guess how bad this is.
Or a book like Good Omens is marked as science fiction by the publisher. Or a fiction book is marked as being about A.I., which automatically classifies it as nonfiction (so it ends up marked as both fiction and nonfiction).
And one of the most frustrating things is that publishers try to slip marketing crap into the book description so that instead of a clean book description, you have quotes from newspapers, award mentions, and other spammy stuff (weird characters as well).
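Some of these mistakes could at least be flagged automatically, even if fixing them still needs a human. Here is a rough Python sketch under stated assumptions: the `Edition` record, its field names, and the `find_problems` helper are hypothetical, not the format any publisher feed actually uses. It flags an edition dated before the book was first published, and a book marked as both fiction and nonfiction.

```python
from dataclasses import dataclass, field

@dataclass
class Edition:
    # Hypothetical record shape, for illustration only.
    title: str
    publication_year: int
    categories: set = field(default_factory=set)

def find_problems(edition, first_published_year):
    """Return a list of human-readable problems found in one edition record."""
    problems = []
    # An edition cannot be older than the book's first publication.
    if edition.publication_year < first_published_year:
        problems.append("edition dated before first publication")
    # A book should not be tagged as both fiction and nonfiction.
    if {"fiction", "nonfiction"} <= edition.categories:
        problems.append("marked as both fiction and nonfiction")
    return problems


# Dune was first published in 1965, so a 1700s edition date is flagged.
dune_problems = find_problems(Edition("Dune", 1780, {"fiction"}), 1965)

# A made-up novel that picked up a nonfiction tag because of its topic.
novel_problems = find_problems(
    Edition("Example Novel", 2020, {"fiction", "nonfiction"}), 2020
)
```

A check like this only catches the obvious cases. It would not catch Good Omens being filed under science fiction, because that error looks perfectly valid to a machine.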
Nobody is checking this data or teaching publishers how to use it properly.
Which makes sense, as the book industry has no idea what to do with the internet or how to empower a book ecosystem to grow.
Instead, companies like Nielsen, Ingram, and Bowker sell access to this database but have no incentive to improve it. And they make it so expensive that almost no one can afford to use this data to make cool apps or websites for readers (which would also help authors/publishers).
It feels like I am watching someone sitting in a pot of water, and it is slowly getting hotter and hotter. They sit there doing the same thing they did for the last twenty years and refuse to change or try anything new.
(Note, I am venting some frustration, and I could be wrong, but this is my read on the situation as of June 30th, 2023. )
What can I do to fix this data?
This is incredibly hard to fix.
For example, with Good Omens being assigned to Science Fiction, how do you fix that? How do you even know if it is incorrect if you haven’t personally read that book?
We have 30,000+ books in our database right now; any fix requires a scaled solution.
One option is to crowdsource improvements.
The idea is that website users can improve the data about books (both readers and authors). They could suggest a book's genre, fix errors from the publisher’s data, and other suggestions.
Then I would need to build something to review those submissions, approve them and ensure the system is working correctly. This is hard to get right.
This is what Goodreads does; they have a large team of volunteers. I think it is also what The StoryGraph is doing on a smaller scale.
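One way the crowdsourcing idea could work, sketched in Python: collect genre suggestions, and only send one to review once enough different users agree on it. Everything here is an assumption for illustration, including the `GenreSuggestions` class, its method names, and the threshold of three supporters. It is not how Goodreads or any other site actually does it.

```python
from collections import defaultdict

class GenreSuggestions:
    """Collects genre suggestions and surfaces the ones with enough support."""

    def __init__(self, min_supporters=3):
        # The threshold is an arbitrary assumption; tune it to taste.
        self.min_supporters = min_supporters
        # Maps (book, genre) to the set of user ids who suggested it.
        self._supporters = defaultdict(set)

    def suggest(self, user_id, book, genre):
        # Using a set means one user repeating a suggestion counts once.
        self._supporters[(book, genre)].add(user_id)

    def ready_for_review(self):
        """Return (book, genre) pairs that enough distinct users agree on."""
        return sorted(
            key
            for key, users in self._supporters.items()
            if len(users) >= self.min_supporters
        )


# Made-up suggestions, for illustration only.
queue = GenreSuggestions(min_supporters=3)
for user in ["reader_1", "reader_2", "author_1"]:
    queue.suggest(user, "Good Omens", "fantasy")
# One user submitting the same thing twice does not add support.
queue.suggest("reader_3", "Good Omens", "horror")
queue.suggest("reader_3", "Good Omens", "horror")

pending = queue.ready_for_review()
```

Requiring agreement from several distinct users is a cheap first filter against spam and one-off mistakes, but a human (or a second check) would still need to approve what comes out of the queue.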
Another option is to use artificial intelligence…
We use the fantastic Wikifier project to identify Wikipedia topics in a book.
This project runs Natural Language Processing on the data we give it about a book to identify topics. It isn’t perfect, but it scales very well. It struggles with fiction, where topical connections are harder to make.
I am looking to switch this process to ChatGPT as, in early tests, it appears to be quite a bit better (although there are a lot of A.I. hallucination problems).
When we make this switch, I could also use ChatGPT to improve each book's genre data. I’ve been playing with this for a few months… and I think ChatGPT might be able to do this in six months.
It will still require a lot of work on our end, but I think this could significantly improve genre and topic accuracy.
What is going on outside of Shepherd?
My son had his last day of school yesterday! He has officially graduated 1st grade. Unfortunately, he was sick and missed the last day of school.
We leave in a few days to drive to rural France for our month-long family vacation. I am super excited and looking forward to more time with my wife and son, reading, and biking.
What am I reading?
If you love crime fiction, check out Brian Klingborg’s Inspector Lu Fei series. It is fantastic; I read all three books last week. Book three was the best yet, and I look forward to the next one. I found it on his list of the best books about international crime, both fiction and nonfiction.
Savage Peace - I need to finish this one. I put it down and haven’t picked it up in a few weeks, even though I love it. It is about the year right after WW1.
Thanks, Ben
P.S. Beautiful Cannes, France…