[nycphp-talk] Dynamically Add Links to Text

tedd tedd at sperling.com
Fri Aug 28 10:51:36 EDT 2009


At 1:00 PM +0300 8/28/09, Petros Ziogas wrote:
>I would just like to mention a point of failure in that automated 
>process. I had to deal with this in a previous project so it's 
>quite fresh.
>
>What will happen if:

Problem 1

>There are 3 articles. Article A is titled "History of America". 
>Article B is titled "Glorious History of America". In article C 
>there is this text "The book is talking about the glorious history 
>of America". If you run an automated process and the test for 
>article A comes first then the text will be "The book is talking 
>about the glorious <a href="/id1111/">history of America</a>" and 
>the next test will fail.
>
>If you run a test for article B first the text will become "The book 
>is talking about the <a href="/id2222/">glorious history of 
>America</a>". Then if you test for article A it might end up 
>being "The book is talking about the <a href="/id2222/">glorious <a 
>href="/id1111/">history of America</a></a>"
>
>The possibilities of such procedures practically ruining your 
>content are endless. If you want to dive into tag nesting and html 
>validation you will be opening another whole.

Problem 2

>Also what will happen if an editor wants to insert this "I loved the 
>book <a href="LINKTOAMAZON">George Washington and the 
>Glorious history of America</a>." and there are articles with titles 
>using "George Washington", "Glorious history", "History of America", 
>"America"?
>
>I think you get my point...

Petros:

Yes, I see your point and the two problems you raise (good concerns).

Problem 1

My initial solution would solve the first problem *provided* that the 
titles were unique and not contained within another title, right? So 
why not start with the longest title and search/replace downwards?

For example, "Glorious History of America" is searched, found, and 
made a link. Then "History of America" is searched -- however -- the 
search excludes links! The phrase "History of America" in "Glorious 
History of America" would never be considered because it's within a 
link.

The process would continue until you run out of titles -- simple, right?
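
The longest-first pass can be sketched roughly as follows. This is a 
minimal illustration in Python rather than PHP, not a tested 
implementation: the function name autolink, the title-to-URL dictionary, 
and the regular expression used to spot existing links are all 
assumptions made for the example. It also assumes titles begin and end 
with a letter or digit and that the stored HTML has no nested or 
malformed anchors.

```python
import re

# Matches a complete existing link (organic or automated) so it can be skipped.
# The capturing group makes re.split() keep the links in its result.
ANCHOR_RE = re.compile(r'(<a\b[^>]*>.*?</a>)', re.IGNORECASE | re.DOTALL)


def autolink(text, titles):
    """Link each title found in text, longest title first, never inside a link.

    titles is a hypothetical dict mapping an article title to its URL.
    """
    for title in sorted(titles, key=len, reverse=True):
        pattern = re.compile(r'\b' + re.escape(title) + r'\b', re.IGNORECASE)
        url = titles[title]

        def make_link(match, url=url):
            # Keep the wording as it appears in the article text.
            return '<a href="%s" class="autotag">%s</a>' % (url, match.group(0))

        # Even indexes are plain text, odd indexes are existing links.
        parts = ANCHOR_RE.split(text)
        for i in range(0, len(parts), 2):
            parts[i] = pattern.sub(make_link, parts[i])
        text = ''.join(parts)
    return text
```

With the titles "Glorious History of America" (/id2222/) and "History of 
America" (/id1111/), the sentence from Problem 1 ends up with a single 
link to /id2222/, because the shorter title is only searched for in text 
that is outside that link. The text is re-split for every title so that 
links added by earlier, longer titles are skipped as well.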

Problem 2

The second problem can be solved two ways:

Way one -- by removing all organic links from the initial search. In 
other words, when the FULL TEXT search is started the search is done 
on articles stripped of all organic links. You can easily add the 
organic links back in after the search/replace is finished.
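
One way to picture Way one is with placeholder tokens. The sketch below 
is in Python rather than PHP and is only an illustration: the names 
shelve_links and restore_links and the NUL-delimited token format are 
made up for the example, and it assumes the article text contains no NUL 
characters and no nested anchors.

```python
import re

# Matches a complete existing link, including its text.
ANCHOR_RE = re.compile(r'<a\b[^>]*>.*?</a>', re.IGNORECASE | re.DOTALL)


def shelve_links(text):
    """Swap every existing link for a numbered token; return (text, links)."""
    shelved = []

    def stash(match):
        shelved.append(match.group(0))
        return '\x00%d\x00' % (len(shelved) - 1)

    return ANCHOR_RE.sub(stash, text), shelved


def restore_links(text, shelved):
    """Put each shelved link back where its token sits."""
    return re.sub(r'\x00(\d+)\x00', lambda m: shelved[int(m.group(1))], text)
```

The search/replace then runs between the two calls, on text that no 
longer contains the editor's links:

```python
article = 'I loved <a href="LINKTOAMAZON">Glorious history of America</a>, a book about America.'
bare, shelved = shelve_links(article)
bare = bare.replace('America', '<a href="/id3333/" class="autotag">America</a>')
article = restore_links(bare, shelved)
```

Only the second "America" is linked; the editor's link comes back 
untouched.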

Please note that when the automated links are added, they also get a 
unique class attribute, such as class="autotag", which allows 
them to be easily identified and removed for a rebuild.
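
Removing the automated links before a rebuild might look something like 
this. It is a rough sketch in Python rather than PHP; the function name 
strip_autolinks is hypothetical, and the pattern assumes the attribute is 
written exactly as class="autotag" with double quotes and no other class 
names alongside it.

```python
import re

# An <a> tag whose attributes include class="autotag"; group 1 is the link text.
AUTOTAG_RE = re.compile(
    r'<a\b[^>]*\bclass="autotag"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def strip_autolinks(text):
    """Unwrap automated links, leaving their text and all organic links intact."""
    return AUTOTAG_RE.sub(r'\1', text)
```

Links without the class attribute never match the pattern, so anything 
an editor added by hand survives the rebuild as it was.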

Way two -- you could solve the problem by excluding organic links 
from the search because they DO NOT have the unique class attribute 
identifier -- thus there is no real reason to remove them at all for 
the search/replace routine (as Way one does). I only presented Way one 
to get you to think in terms of removing the organic links from the 
problem.

Possible problem

The only fly in the ointment here would be if an editor wants to 
manually link an article by trying to mimic the automated process. 
For example, he/she inserts a "<a href="/id1111/">History of 
America</a>" using the *index* of the article. Everything would still 
work unless that article is deleted. In that case the link would 
become dead.

However, if the editor simply added the class identifier (i.e., 
class="autotag") to the link, then the automated process would treat 
his entry like its own and adjust accordingly.

If the editors simply followed the rules, which aren't complicated, 
then editors could participate as they want in the process.

The solution presented here doesn't require tag nesting or HTML 
validation. As such, I don't see any additional problems -- do you?

Cheers,

tedd

-- 
-------
http://sperling.com  http://ancientstones.com  http://earthstones.com


