public string CleanHtml(object Html) {
    var s = Html.ToString();
    var b = new StringBuilder();
    foreach (char c in s) {
        var l = b.Length;
        // Known Replacements
        if (c == '&') {
            b.Append("&amp;");
        }
        if (c == '<') {
            b.Append("&lt;");
        }
        if (c == '>') {
            b.Append("&gt;");
        }
        if (c == '`') {
            b.Append("'");
        }
        // Always Encode Extended Ascii Codes
        if ((int)(c) > 127) {
            b.Append("&#" + (int)(c) + ";");
        }
        // If nothing was added, add the original char
        if (l == b.Length) {
            b.Append(c);
        }
    }
    return b.ToString();
}
Refactorings
Moonshield
March 6, 2009, 00:13
Use 'else if' so the remaining conditions are not evaluated once one of them has matched. As Rikkus said, you had better use the method provided in the framework. I still don't know why people use the var keyword when the type is known; I'm curious. It is very useful with LINQ queries, but aside from that...
// Requires: using System.Text;
public static string CleanHtml(string pHtml)
{
    StringBuilder oSb = new StringBuilder();
    if (!string.IsNullOrEmpty(pHtml))
    {
        foreach (char c in pHtml)
        {
            // Known Replacements
            if (c == '&')
                oSb.Append("&amp;");
            else if (c == '<')
                oSb.Append("&lt;");
            else if (c == '>')
                oSb.Append("&gt;");
            else if (c == '`')
                oSb.Append("'");
            else if ((int)(c) > 127)
            {
                // Always Encode Extended Ascii Codes
                oSb.AppendFormat("&#{0};", (int)c);
            }
            else
            {
                // No replacement applies, so keep the original char
                oSb.Append(c);
            }
        }
    }
    return oSb.ToString();
}
Shawn
March 8, 2009, 20:23
The "var" keyword means "variable", not "variant". In the original poster's usage, it simply shortens the declaration by removing redundant type information.
Some.Long.TypeName myVariable = new Some.Long.TypeName();
// compared to
var myVariable = new Some.Long.TypeName(); // no need to repeat "Some.Long.TypeName" twice as the type is evident.
Moonshield
March 8, 2009, 20:36
I know what it means and what it should be used for. For example, I think it should not be used to hold a length, as in the posted code var l = b.Length;. Use a simple int there. And for long type names, the using keyword should be the right solution, IMHO.
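For readers unfamiliar with the "using keyword" suggestion, here is a minimal sketch of a using alias, which gives a long type name a short local name so the explicit type can still be written out. The names ReplacementMap and AliasDemo are hypothetical and exist only for illustration.

```csharp
using System;
// Alias a long generic type name once per file (hypothetical alias name).
using ReplacementMap = System.Collections.Generic.Dictionary<char, string>;

public static class AliasDemo
{
    public static ReplacementMap Build()
    {
        // The declared type stays explicit, yet short, with no need for var.
        ReplacementMap map = new ReplacementMap();
        map['<'] = "&lt;";
        int count = map.Count; // a plain int reads more clearly than var here
        Console.WriteLine("Entries: " + count);
        return map;
    }
}
```

The alias is only visible in the file that declares it, which is one reason some people still prefer var for long type names.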
ybo
March 11, 2009, 13:59
Reduce nesting and don't instantiate the StringBuilder if it is not needed. Preallocate a minimum capacity for the StringBuilder. The type cast when comparing c to 127 is redundant. Don't hesitate to use longer variable names (unless you code with Notepad).
Open-Closed Principle: your class should be open to extension and closed for modification. Introducing the knownReplacements dictionary and filling it from any data source (configuration, file, database, ...) makes your code open to extension without modification.
// Must be populated (from configuration, a file, a database, ...) before CleanHtml is called.
static Dictionary<char, string> knownReplacements;

public static string CleanHtml(string pHtml)
{
    if (string.IsNullOrEmpty(pHtml)) return string.Empty;

    var cleanedHtml = new StringBuilder(pHtml.Length);
    foreach (char c in pHtml)
    {
        string replacement;
        if (knownReplacements.TryGetValue(c, out replacement))
        {
            // Known Replacements
            cleanedHtml.Append(replacement);
        }
        else if (c > 127)
        {
            // Always Encode Extended Ascii Codes
            cleanedHtml.AppendFormat("&#{0};", (int)c);
        }
        else
        {
            // No replacement applies, so keep the original char
            cleanedHtml.Append(c);
        }
    }
    return cleanedHtml.ToString();
}
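As posted, knownReplacements is declared but never assigned, so the first call would throw a NullReferenceException. The following is a minimal sketch of one way to fill it, assuming the default entries simply mirror the replacements in the original post. The class name HtmlCleaner is hypothetical, and a real version might load the entries from configuration instead of hard-coding them.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HtmlCleaner
{
    // Assumption: these defaults mirror the original post's replacements.
    private static readonly Dictionary<char, string> knownReplacements =
        new Dictionary<char, string>
        {
            { '&', "&amp;" },
            { '<', "&lt;" },
            { '>', "&gt;" },
            { '`', "'" }
        };

    public static string CleanHtml(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        var cleaned = new StringBuilder(html.Length);
        foreach (char c in html)
        {
            string replacement;
            if (knownReplacements.TryGetValue(c, out replacement))
                cleaned.Append(replacement);                      // known replacement
            else if (c > 127)
                cleaned.Append("&#").Append((int)c).Append(';');  // numeric character reference
            else
                cleaned.Append(c);                                // plain ASCII, keep as is
        }
        return cleaned.ToString();
    }
}
```

Adding a new replacement then means adding a dictionary entry rather than another branch in the loop, which is the extension point the Open-Closed remark is after.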
EGarren
March 13, 2009, 05:13
Pretty hackish.
using System;
using System.Collections.Generic;

namespace Encode {
    public static class MyExtensions {
        // Characters that are reserved in HTML and their entity names
        private static readonly Dictionary<char, string> reserved = new Dictionary<char, string>() {
            {'<', "&lt;"}, {'>', "&gt;"}, {'&', "&amp;"}
        };

        public static string HTMLEscape(this Char c) {
            string encodedChar, tmp = "" + c;
            if (!reserved.TryGetValue(c, out encodedChar)) {
                // Numeric character references must end with a semicolon
                encodedChar = (c > 127) ? string.Format("&#{0};", (int) c) : tmp;
            }
            return encodedChar;
        }
    }

    class Program {
        static void Main(string[] args) {
            char[] tests = {'>', '<', '&', '€', 'A'};
            foreach (var c in tests) {
                Console.WriteLine("'{0}' encoded is '{1}'", c, c.HTMLEscape());
            }
        }
    }
}
Rick R
July 31, 2009, 21:25
Why reinvent the wheel?
using System;
using System.Web;

namespace Encode {
    class Program {
        static void Main(string[] args) {
            char[] tests = {'>', '<', '&', '€', 'A'};
            foreach (var c in tests) {
                Console.WriteLine("'{0}' encoded is '{1}'", c, HttpUtility.HtmlEncode(c.ToString()));
            }
        }
    }
}
This is my own attempt at writing an HTML encoder for ASP.NET.
C&C welcome :)