The struggle to automate a book recommendation website - a matter of the available data

Montero

For a year or so now, I've been supporting the book recommendation website Shepherd.com, and every so often emailing the founder with "hey, this needs a fix" or "that book really doesn't belong in that category". Shepherd.com is a work in progress - already a good place to go to browse virtual bookshelves - and is being worked on to make it better yet.
Anyway, back to the automation of book recommendations - I get replies from Ben Fox of Shepherd, saying "oh yes, that happened because ...". I've just learned that he has in fact been blogging about the development for a while (how did I miss that?) and the blog is here:
A chunk of the problem is the underlying book data. So for anyone who has asked "why did Amazon/Goodreads/Whatever recommend that book to me, when it has nothing to do with anything I've read or with the other books in the recommendation list?", here are a few answers as they pertain to the available book data, and also how Shepherd.com is working on handling it. (I'm not saying that Amazon/Goodreads use the same methods as Shepherd, but they do have data sources in common.)
 
I really wish someone would bite the bullet and make a book recommendation system that analyzes the full text of books and uses whatever tools we have now (LLMs, sentiment analyzers, etc.) to find books that align with books we've liked in the past.
 
I suspect that would be really difficult, but it would be interesting to get better-targeted results.

What Shepherd is doing is two-fold:

1. Authors create themed lists recommending up to five books on that theme, and the author's own book appears as an advert at the top. For example, @Stephen Palmer's list "The best books that explain the mystery of consciousness".

2. The system lists books by topic and genre, picking up on links created when different authors' lists have one or two books in common, and also on the data about the books. The article at the top of the thread largely deals with the book data provided by the various indexing and catalogue systems and publishers.

I rather like the whole books-on-a-theme part of it, as per bullet point 1, as it gives you a clear feel for why the author likes those five books, and a way to stroll sideways to other lists with the same books on them. For me it really is like browsing along bookshelves and going "ooh, looks interesting, wouldn't have thought of that." It is orthogonal to getting precisely the book you wanted and went looking for - it is getting the book you didn't realise you wanted.
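
As a rough illustration of the list-linking in point 2 (a minimal Python sketch, not Shepherd's actual code; the function name `link_lists`, the list names and the book titles are all made up for the example), two lists can be treated as connected whenever they share at least one book:

```python
from collections import defaultdict
from itertools import combinations


def link_lists(lists):
    """Map each pair of list names to the set of books both lists recommend."""
    # Invert the data: book -> names of the lists that recommend it.
    lists_by_book = defaultdict(set)
    for name, books in lists.items():
        for book in books:
            lists_by_book[book].add(name)

    # Every pair of lists that recommends the same book gets a link.
    links = defaultdict(set)
    for book, names in lists_by_book.items():
        for a, b in combinations(sorted(names), 2):
            links[(a, b)].add(book)
    return dict(links)


# Hypothetical themed lists, each holding up to five books.
lists = {
    "consciousness": ["Book A", "Book B", "Book C"],
    "mind and machine": ["Book B", "Book D"],
    "far futures": ["Book E", "Book D", "Book C"],
}

links = link_lists(lists)
for pair, shared in sorted(links.items()):
    print(pair, "share", sorted(shared))
```

Following those links from one list to the next is the "stroll sideways" part: each shared book is a doorway to another author's shelf.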
 
There is another site which, I think, is something like what you describe: Five Books

It has some well-curated lists on a broad range of subjects, and some impressive contributors.
 
Thanks for that. I'd not heard of Five Books, so I went to look. What I've found is that the two sites are similar but not identical.
Five Books explains on the page "Experts we've interviewed for book recommendations, listed by discipline" that "In case you’re new to Five Books, the format is: an expert, a topic and the five best books on that topic, explained in an interview. Tip: you’ll normally learn quite a lot about the topic from our interviews, even if you don’t get around to reading all the books." That looks good, and I will go exploring a bit further.

What Shepherd.com is doing is letting authors pick the theme that their five books are on - which is not quite the same as a topic. It can be a topic, or it can be something offbeat such as
or Tad Williams' pick
The authors explain the theme and why each book on the list qualifies, and get to advertise their own book, which also fits the theme, at the top of the page.

Shepherd is also all about finding connections, and wandering off to browse.

A second feature, which is a work in progress, is exploring a book. Take for example one of the books from Tad Williams' list, Zelazny's Lord of Light. On its page, "Why read Lord of Light?", you then get offered:

Exploring books like Lord of Light,
Exploring other book lists with this book,
Exploring why people like this book
Topics from the book - Space Colonisation, Hinduism and Buddhism
Genre - Science Fiction

Due to problems with book data, sometimes the "Exploring books like xxxx" can have some mismatches. That is being worked on.

Finally, both sites have a straightforward search: type in your search and press the magnifying glass icon.

Last week I was on Shepherd and put in "cat", as I fancied a non-fiction book or a memoir about cats, and found a problem which I've reported and which is now on the "to fix" list. I've just found exactly the same problem on Five Books.
You type in "cat" and you get author names with "cat" in them - Catriona for example - or book titles with "cat" in them like "The Cat in the Hat" by Dr Seuss, or thrillers with a "cat and mouse chase", etc. It failed to pop up anything like what I was looking for: there was no topic headed "cat" with books about cats on a topic page.
 
I really wish someone would bite the bullet and make a book recommendation system that analyzes the full text of books and uses whatever tools we have now (LLMs, sentiment analyzers, etc.) to find books that align with books we've liked in the past.

While I can't ever look at the full text of the book... I am working toward mapping someone's "Book DNA" and then finding people who share their "Book DNA" to hopefully find similar books.

I.e., what are your absolute favorite 10 books? Then look within our website for people who have, say, a 7-point match, look at what those people also liked, and go from there...

We are a long way from this, as we don't even have user accounts yet :). But that is the direction I am working toward for readers, as it is something I really want for myself with a "book DNA" feature :giggle:.
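
To make the "7-point match" idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function name `book_dna_suggestions`, the reader names and the book IDs are hypothetical, and nothing like this exists on the site yet. It keeps the readers who share at least seven of your ten favourites and counts which other books those readers liked:

```python
from collections import Counter


def book_dna_suggestions(my_favourites, other_readers, min_match=7):
    """Suggest books liked by readers who share at least `min_match` favourites."""
    mine = set(my_favourites)
    counts = Counter()
    for favourites in other_readers.values():
        theirs = set(favourites)
        # A reader shares your "Book DNA" if enough favourites overlap.
        if len(mine & theirs) >= min_match:
            # Count only the books you have not already listed.
            counts.update(theirs - mine)
    # Most frequently shared books first.
    return counts.most_common()


# Made-up data: ten favourites for you, and three other readers.
me = [f"b{i}" for i in range(1, 11)]
readers = {
    "ann": ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "x", "y", "z"],  # 7 shared
    "bob": ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8", "x", "w"],  # 8 shared
    "cy": ["b1", "b2", "b3", "q", "r", "s", "t", "u", "v", "p"],  # only 3 shared
}

suggestions = book_dna_suggestions(me, readers)
print(suggestions[0])
```

With this toy data, "x" comes out on top because both matching readers liked it, while nothing from "cy" is suggested. The threshold of 7 is just the number mentioned above; a real system would presumably tune it.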
 
Last week I was on Shepherd and put in "cat", as I fancied a non-fiction book or a memoir about cats, and found a problem which I've reported and which is now on the "to fix" list. I've just found exactly the same problem on Five Books.
You type in "cat" and you get author names with "cat" in them - Catriona for example - or book titles with "cat" in them like "The Cat in the Hat" by Dr Seuss, or thrillers with a "cat and mouse chase", etc. It failed to pop up anything like what I was looking for: there was no topic headed "cat" with books about cats on a topic page.
Ya, I am working to fix this :). It's a problem with search weights, where we are currently putting author/book too far ahead of genre/topic. I should have a fix in a few weeks to get that balanced right.
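
For anyone curious how field weights can cause the "cat" problem, here is a toy Python sketch. It is an assumption-laden illustration, not the site's real search: the scoring rule, the weight values and two of the three books are invented for the example. Each field that contains the query adds its weight to the book's score, so the balance of weights decides what floats to the top:

```python
def field_text(book, field):
    """Return a field as one string, joining list-valued fields such as topics."""
    value = book.get(field, "")
    return " ".join(value) if isinstance(value, list) else value


def score(query, book, weights):
    """Add up the weight of every field that contains the query as a substring."""
    q = query.lower()
    return sum(w for field, w in weights.items() if q in field_text(book, field).lower())


def search(query, books, weights):
    """Return titles of matching books, highest score first (ties keep input order)."""
    scored = [(score(query, b, weights), b) for b in books]
    hits = [(s, b) for s, b in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [b["title"] for s, b in hits]


# "The Cat in the Hat" comes from the thread; the other two books are hypothetical.
BOOKS = [
    {"title": "The Cat in the Hat", "author": "Dr. Seuss", "topics": ["picture books"]},
    {"title": "A Quiet Thriller", "author": "Catriona Example", "topics": ["thrillers"]},
    {"title": "Living with Felines", "author": "A. N. Other", "topics": ["cats", "memoir"]},
]

# Illustrative weights only: author/title far ahead of topics, then rebalanced.
AUTHOR_HEAVY = {"author": 3.0, "title": 3.0, "topics": 1.0}
TOPIC_HEAVY = {"author": 1.0, "title": 1.5, "topics": 3.0}

before = search("cat", BOOKS, AUTHOR_HEAVY)
after = search("cat", BOOKS, TOPIC_HEAVY)
print(before)
print(after)
```

With the author-heavy weights the book that is actually about cats lands last; with topics weighted higher it comes first, which is the kind of rebalancing described above.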
 
Thanks :)

Happy to answer any questions, and happy to answer broader questions if anyone here is thinking of building a book-related website. I can tell you a lot of the pain points :)
 
