Evolution of AI & Search Engines
Basically, search engines are getting smarter: they use increasingly advanced techniques to analyze pages, sites & links in order to return more relevant results. One of these techniques is topic analysis (in many, many forms), which tells a search engine whether a page is focused on a specific topic (term/phrase) based on the other words and phrases on the page (and their format, usage, placement, etc.) and in links & linking pages (not just anchor text). The basic method of retrieving these "related terms" is the first item I'll focus on, followed by some basic instructions on how to use these terms to improve optimization on and off-page.
Retrieval and Discovery of Related Terms
- Search for the target term/phrase at Google and use 100 results per page.
- Analyze (either manually, or through a script) the top 100 results and put the text into rows of a table that can be compared and picked apart. Note: I will try to have a tool that will do this for you by the end of the month on the site in my signature.
- Pull out the 20 most frequently occurring phrases/terms of 1, 2, 3 and 4 words in length (don't count stop words - a good stop word list can be found at http://www.princeton.edu/~biolib/instruct/MedSW.html)
- Conduct semantic connectivity (C-Index) analysis on each word/phrase in comparison to the target term/phrase
C-Indices use the following formula to come up with a ppt (parts per thousand) number:
C = 1000 * Z/(X+Y-Z)
Where:
X = The number of pages containing keyword 1 (your target term/phrase)
Y = The number of pages containing keyword 2 (the term/phrase you're comparing it against)
Z = The number of pages containing BOTH keyword 1 & keyword 2
This is important to understand and use, so I'll create a sample for the phrase 'seattle restaurants' compared to another phrase 'lake union':
C = 1000 * Z/(X+Y-Z), which gives 1000 * 6740/(58100+405000-6740) = 14.77 ppt
In this equation:
X = The number of results at Google for a search of "seattle restaurants" (always use quotes for a multi-word phrase) - 58,100
Y = The number of results at Google for a search of "lake union" - 405,000
Z = The number of results at Google for a search of "seattle restaurants" "lake union" - 6,740
The highest C-Index I've ever seen is between norton & antivirus - 140. Commonly, I'd start thinking of a word as semantically connected at around 10ppt and closely related over 25ppt.
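If you'd rather script this than use a spreadsheet, here is a minimal Python sketch of the C-Index calculation. The function names `c_index` and `describe` are my own, and the labels simply restate the rough 10 and 25 ppt cut-offs mentioned above. You still have to collect the result counts yourself.

```python
def c_index(x: int, y: int, z: int) -> float:
    """Return the C-Index in parts per thousand (ppt).

    x: pages containing keyword 1
    y: pages containing keyword 2
    z: pages containing both keywords
    """
    union = x + y - z  # pages containing at least one of the two keywords
    if union <= 0:
        return 0.0  # no pages at all, so nothing to compare
    return 1000.0 * z / union


def describe(ppt: float) -> str:
    """Label a C-Index using the rough 10 / 25 ppt rules of thumb."""
    if ppt > 25:
        return "closely related"
    if ppt >= 10:
        return "semantically connected"
    return "weak"


# Counts from the 'seattle restaurants' vs. 'lake union' example
score = c_index(58_100, 405_000, 6_740)
print(f"{score:.2f} ppt: {describe(score)}")  # 14.77 ppt: semantically connected
```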
- A high C-Index means the terms are related. Rank your 10-25 phrases/terms according to C-Index and remove any that are lower than 10 ppt. For caution's sake, I often repeat this activity at Yahoo!. BTW, Excel makes this task very easy.
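For the phrase-counting step in the list above, a script can do the tallying once you have the page text. This is only a rough sketch: `top_phrases` and the tiny `STOP_WORDS` set are placeholders of mine, and I'm assuming that "don't count stop words" means skipping any phrase that contains one. Fetching the 100 result pages is left out.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (assumption); use a fuller list in practice.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "on", "or", "the", "to"}


def top_phrases(texts, n, limit=20):
    """Return the `limit` most common n-word phrases across all `texts`."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into simple word tokens
        words = re.findall(r"[a-z0-9']+", text.lower())
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Assumption: drop any phrase that contains a stop word
            if any(word in STOP_WORDS for word in gram):
                continue
            counts[" ".join(gram)] += 1
    return counts.most_common(limit)


# Made-up page snippets, just to show the call
pages = [
    "Seattle restaurants near Lake Union",
    "Best Seattle restaurants on Lake Union",
]
for length in (1, 2, 3, 4):
    print(length, top_phrases(pages, length, limit=5))
```

Run it once per phrase length (1 to 4 words) and feed the survivors into the C-Index step.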
Many SEO specialists recommend natural language writing and I could not agree more. Write your text without thinking of SEO at all; the SEO pieces can be added in later. Just remember to base the topic of your page on the term/phrase you're optimizing for. Once again, I'll use a step-by-step guide:
- Write your page naturally, think of marketing and conversion rates, not SEO (but keep the topic on the subject of your keyword).
- Go back over your text and see if you can use the related terms/phrases discovered above one or more times in the text effectively. If you can't, don't worry. Just do your best.
- Check the term weight of your target term/phrase using the two tf*idf (term frequency * inverse document frequency) formulas below:
Classic Normalized Term Weight uses the following equation:
Wi = (tfdi / max tfdi) * log(D / dfi)
Where:
tfdi = term (or phrase of a given length) frequency in document
max tfdi = maximum frequency of any (same number word) phrase in document
D = number of documents in the database (when using Google, I estimate at 8.1 billion)
dfi = number of documents containing the term/phrase (# of results for a search in quotes)
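Here is a small Python sketch of the classic normalized weight for anyone who wants to script it. The name `classic_term_weight` is mine, and since the formula doesn't say which logarithm base to use, base 10 is an assumption. Only the relative ordering of weights matters, so any base works as long as you stay consistent.

```python
import math

GOOGLE_INDEX_SIZE = 8_100_000_000  # D: the 8.1 billion estimate used in this post


def classic_term_weight(tf, max_tf, df, total_docs=GOOGLE_INDEX_SIZE):
    """Wi = (tfdi / max tfdi) * log(D / dfi), with log base 10 (assumption)."""
    if tf <= 0 or max_tf <= 0 or df <= 0:
        return 0.0  # term absent from the page or from the index
    return (tf / max_tf) * math.log10(total_docs / df)


# Hypothetical page: the target phrase appears 6 times, the most frequent
# two-word phrase appears 8 times; 58,100 results for the quoted search.
print(classic_term_weight(tf=6, max_tf=8, df=58_100))
```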
A second equation, Glasgow Weight, can also be useful (I generally use both when analyzing my own site vs. the competition):
Wij = (log(freqij + 1) / log(lengthj)) * log(N / ni)
Where:
freqij = frequency of term i (a word or phrase of a given length) in document j
lengthj = number of unique terms (word or phrase of the same length) in document j
N = number of documents in database (again, I use 8.1 billion for Google)
ni = number of documents containing the term (results of a search in quotes)
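The Glasgow weight can be scripted the same way. Again, this is a sketch rather than a finished tool: `glasgow_weight` is a name I made up, log base 10 is an assumption, and I return 0 when the document has only one unique term because log(1) would otherwise cause a division by zero.

```python
import math


def glasgow_weight(freq, length, docs_with_term, total_docs=8_100_000_000):
    """Wij = (log(freqij + 1) / log(lengthj)) * log(N / ni), base 10 (assumption)."""
    if freq <= 0 or length <= 1 or docs_with_term <= 0:
        return 0.0  # also guards against log(1) == 0 in the denominator
    normalized_freq = math.log10(freq + 1) / math.log10(length)
    return normalized_freq * math.log10(total_docs / docs_with_term)


# Hypothetical page: the phrase appears 6 times among 240 unique two-word
# phrases; 58,100 results for the quoted search.
print(glasgow_weight(freq=6, length=240, docs_with_term=58_100))
```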
Once again, I'll try to have a tool built to do this automatically for a page very soon. In the meantime, it's still worth using, and once again, Excel can come in handy.
- Check the term weight of your top related words - ideally they should be lower than your target term, but higher than any other term (of the same word length, not counting stop words). You really do not need to get this exactly right; close really is good enough.
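Once you have the weights, the ordering check from the last step is easy to automate. The helper below is purely illustrative (`weights_look_balanced` is my own name). It does a strict comparison, so treat a False as "take a look", not as a failure, since close is good enough.

```python
def weights_look_balanced(target_weight, related_weights, other_weights):
    """True if every related weight is below the target and above all other terms."""
    if not related_weights:
        return False  # nothing to compare
    highest_other = max(other_weights, default=0.0)
    return (max(related_weights) < target_weight
            and min(related_weights) > highest_other)


# Hypothetical weights for one page (same phrase length, stop words excluded)
print(weights_look_balanced(4.1, [3.2, 2.9], [1.5, 0.8]))  # True
print(weights_look_balanced(4.1, [4.6, 2.9], [1.5, 0.8]))  # False
```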
Once you have the list of related terms and the formulas for term weight, you can see where off-page optimization can be done. Simply check the term weight of your target phrase and related phrases at the sites and pages you want to get links from. The more on-topic the pages/sites are to your phrase, the more relevant the link will be. You don't even need the page or site to mention your particular term once, as long as the term weights of your related phrases are high.
I hope this has been of use to everyone. Please give me your honest feedback and I'll try to edit any errors/omissions.