If you've ever wanted to get Related Posts based on similar Tags from other Posts using SQL, the following SQL code should help with creating your own SQL query. In this example, the Related Posts are articles, but they could be Blog posts or any other data that uses Tags. The SQL query below gets the count of Posts that have the most amount of Tags in common and orders these Related Posts highest. So in essence, the Related Posts returned should be more highly related based on the number of tags that the Posts have in common. SEE BELOW for the SQL Table Structure of the Tags.
Basic code to query the number of Common Tags an Article has with other Related Articles (Ordered by Common Tag):
SELECT tpe1.ArticleId, tpe2.ArticleId, COUNT(*) as CommonTags
FROM dbo.article_TagPostEntry tpe1
INNER join dbo.article_TagPostEntry tpe2 ON tpe1.TagId = tpe2.TagId
WHERE tpe1.ArticleId <> tpe2.ArticleId
GROUP BY tpe1.ArticleId, tpe2.ArticleId
ORDER BY CommonTags DESC;
Here's the SQL code to query Related Article Posts based on similar Tags:
DECLARE @ArticleId INT
SET @ArticleId = 3 -- TEST NUMBER
-- Get the Related Articles based on similiar Tags
SELECT TOP 5 a.Title, a.Description FROM dbo.article_Articles a
WHERE a.ArticleId IN (
SELECT TOP 8 tpe2.ArticleId
FROM dbo.article_TagPostEntry tpe1
INNER join dbo.article_TagPostEntry tpe2 ON tpe1.TagId = tpe2.TagId
WHERE tpe1.ArticleId != tpe2.ArticleId AND tpe1.ArticleId = @ArticleId
GROUP BY tpe2.ArticleId
ORDER BY NEWID())
Here's another way to query Related Article Posts based on similar Tags (THIS METHOD WILL PROBABLY be SLOWER, but you could modify the code to query TAGS from seperate tables that may not be related by IDs):
DECLARE @ArticleId INT
SET @ArticleId = 3 -- TEST NUMBER
-- Get the Related Articles based on similiar Tags
SELECT TOP 5 a.Title, a.Description FROM dbo.article_Articles a
WHERE a.ArticleId IN (
SELECT TOP 5 ArticleId
FROM dbo.article_TagPostEntry
WHERE ArticleId <> @ArticleId AND TagId IN (SELECT tpe.TagId FROM dbo.article_TagPostEntry tpe WHERE tpe.ArticleId = @ArticleId)
GROUP BY ArticleId
ORDER BY COUNT(TagId) DESC)
ORDER BY NEWID()
Alternatively, to get a more randomized listing of related Posts, replace ORDER BY COUNT(TagID) DESC wtih:
ORDER BY NEWID(), COUNT(TagId) DESC
----------------------
Keep in mind the Table Structure for Tags uses 2 tables:
Table #1 is the "article_TagPostEntry" table and looks like this:
TABLE [dbo].[article_TagPostEntry](
[PostTagId] [bigint] IDENTITY(1,1) NOT NULL,
[ArticleId] [int] NOT NULL,
[TagId] [bigint] NOT NULL,
[DateCreated] [datetime] NOT NULL,
Table #2 is the "article_TagNames" table and is where the Names of the Tags are actually kept. TagId is used to relate the two Tag tables together. The Related Posts SQL code in the example above doesn't need to access this table.
TABLE [dbo].[article_TagNames](
[TagId] [bigint] IDENTITY(1,1) NOT NULL,
[TagName] [nvarchar](50) NOT NULL,
[TagPathName] [nvarchar](100) NOT NULL,
[TagCount] [int] NOT NULL,
[DateCreated] [date] NOT NULL,
NOTE: The main Articles table gets related to the "article_TagPostEntry" table using ArticleId.
JUST Keep in mind that these queries will put a serious strain on your database if you run them dynamically everytime someone visits a page. You'd probably be better off inserting the Related Posts that this query outputs into a Table, and retrieving the data by ID's. Otherwise, you may find you get a lot of SQL Timeout errors if your database server is not powerful enough to handle the load on a high traffic site. You don't want to deal with the following error:
"Timeout expired. The timeout period elapsed prior to completion of the
operation or the server is not responding"