Dynamic RSS Feed from ColdFusion and bad characters like Â
Are you creating a dynamic RSS feed in ColdFusion? Maybe you learned some tips from Pete Freitag's blog article. Maybe your feed worked fine for a long time, and then one day a bad character started throwing an error in it. If you use Internet Explorer, you may see an error like this: An invalid character was found in text content. Line: 71 Character: 271
At the end of this blog I'll describe a simple fix that helped me and hopefully will help you too.
Let's say you work at a fictitious company and are responsible for publishing news content to the website. Today, each news release is saved as a content record in your CMS database: title, publish date, keywords, content. For the content, each news release has a text file saved to the web server's file system, and the path to that file is saved in the database. The content file originated in MS Word from your company's communications department. You create an HTML version of the file, using mainly simple HTML tags: paragraphs, italics, bold, etc.
The website's news.cfm page queries the CMS database for a list of news titles, showing the most recent at the top. If a user clicks a news title, they visit the page newsdetail.cfm which does a
<cfquery name="qNews" datasource="CMS" maxrows="10">
select id,title,pubdate,contentpath
from news_table
order by pubdate desc
</cfquery>
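<!--- variables.currentoffset is assumed to be set elsewhere: the hours to add to server time to get GMT --->
<cfset gmtNow = DateAdd('h', variables.currentoffset, Now())>
<!--- cfoutput is needed so the #...# expressions below are evaluated --->
<cfsavecontent variable="xmldata"><cfoutput><?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>Company ABC News Releases</title>
<link>http://www.companyabc.com/</link>
<description>Company ABC is a worldwide leader in making news that people like to read.</description>
<lastBuildDate>#DateFormat(gmtNow,'ddd, dd mmm yyyy')# #TimeFormat(gmtNow,'HH:mm:ss')# GMT</lastBuildDate>
<language>en-us</language>
<cfloop query="qNews">
<!--- Shift the date and the time together so they stay consistent around midnight --->
<cfset gmtPub = DateAdd('h', variables.currentoffset, qNews.pubdate)>
<item>
<title>#XmlFormat(qNews.title)#</title>
<link>http://www.companyabc.com/newsdetail.cfm?id=#qNews.id#</link>
<pubDate>#DateFormat(gmtPub,'ddd, dd mmm yyyy')# #TimeFormat(gmtPub,'HH:mm:ss')# GMT</pubDate>
<content:encoded><![CDATA[<cfinclude template="#qNews.contentpath#">]]></content:encoded>
</item>
</cfloop>
</channel>
</rss>
</cfoutput></cfsavecontent>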
<cfcontent type="text/xml" reset="yes" /><cfoutput>#variables.xmldata#</cfoutput>
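If your content source is MS Word, it may contain special characters outside plain ASCII, such as curly quotes, dashes, and accented letters. These display as garbage when the feed's UTF-8 bytes are read in a different encoding. Examples: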
2010–2012 [long/short dash - you are seeing: 2010â€“2012]
€110 bn [Euro - you are seeing: â‚¬110 bn]
Castañeda [foreign alpha character - you are seeing: CastaÃ±eda]
value of θ [greek character - you are seeing: value of Î¸]
“my quote” [angled quotes - you are seeing: â€œmy quoteâ€�]
rental. Minneapolis [double space - you may see nothing obvious or maybe you are seeing: rental. Â Minneapolis]
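To see where the extra characters come from, here is a minimal sketch that reproduces the first example. It assumes the wrong encoding being applied is windows-1252, which is a guess that matches the symptoms above rather than something confirmed for every browser. The variable names are made up for illustration.

```cfml
<!--- Build the string with Chr() so the result does not depend on this file's own encoding --->
<cfset original = "2010" & Chr(8211) & "2012">   <!--- Chr(8211) is the en dash --->
<!--- Turn the text into its UTF-8 bytes: the dash becomes three bytes --->
<cfset utf8Bytes = CharsetDecode(original, "utf-8")>
<!--- Read those same bytes back as windows-1252 (assumed wrong encoding) --->
<cfset misread = CharsetEncode(utf8Bytes, "windows-1252")>
<cfoutput>#Len(original)# characters became #Len(misread)#: #misread#</cfoutput>
```

The single dash is stored as three bytes in UTF-8, and each byte is shown as its own character when the bytes are read as windows-1252, which is why one bad character turns into two or three.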
The fix is very simple: append charset=utf-8 to the type attribute of the cfcontent tag in the last line of the page, so that it reads type="text/xml; charset=utf-8".
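Applied to the example above, the last line of the page would look something like this. It is a sketch that assumes the feed is still held in variables.xmldata, as in the earlier code, and that nothing else on the page sets the content type.

```cfml
<!--- Same output line as before, with the charset added to the content type --->
<cfcontent type="text/xml; charset=utf-8" reset="yes" /><cfoutput>#variables.xmldata#</cfoutput>
```

With the charset in the Content-Type header, the browser is told how to decode the bytes before it ever parses the XML.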
For some reason, even though the feed declares encoding="UTF-8" in its XML declaration, the browser does not always interpret it that way. Read this blog to learn more






