Search engine friendly URLs to work with Scandinavian letters?

edited August 2006 in Vanilla 1.0 Help
I'm having a small problem.

I would like to see the Finnish letter "ä" transformed into an "a" and the Finnish letter "ö" transformed into an "o" in the URLs of my forum posts. Right now it just cuts those letters out, so the URL looks a bit weird whenever they are used (which is quite often).

For example I have a discussion called "Onko nettideittailu vieläkään hyväksyttävää nykypäivänä?" which shows in the URL as "/onko-nettideittailu-vielkn-hyvksyttv-nykypivn/". Here's the live example (hope you see all the letters correctly): http://otamut.org/keskustelu/discussion/5/onko-nettideittailu-vielkn-hyvksyttv-nykypivn/

How could I fix this?

Comments

  • Well spotted, Mania! Applies to Swedish as well.
  • And while we're at it, Danish too (æøå).
  • edited August 2006
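    Some function like this one, maybe?

    function unaccent($text) {
        // For each entity whose literal starts with a byte >= 192, use the
        // first letter of the entity name as the replacement (e.g. &auml; -> a).
        $search = array();
        $replace = array();
        $trans = get_html_translation_table(HTML_ENTITIES);
        foreach ($trans as $literal => $entity) {
            if (ord($literal) >= 192) {
                $replace[] = substr($entity, 1, 1);
                $search[] = $literal;
            }
        }
        return str_replace($search, $replace, $text);
    }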
  • I think this list covers the Scandinavian languages? Or am I missing something? å = a, ä = a, ö = o, æ = a, ø = o
  • Think more global please. What about other languages with special characters?
  • Mark (Vanilla Staff)
    Here is the current function used to encode the url titles (found in library/framework/framework.functions.php):

    function CleanupString($InString) {
        $Code = explode(',', '<,>,&#039;,&,",À,Á,Â,Ã,Ä,&Auml;,Å,Ā,Ą,Ă,Æ,Ç,Ć,Č,Ĉ,Ċ,Ď,Đ,Ð,È,É,Ê,Ë,Ē,Ę,Ě,Ĕ,Ė,Ĝ,Ğ,Ġ,Ģ,Ĥ,Ħ,Ì,Í,Î,Ï,Ī,Ĩ,Ĭ,Į,İ,IJ,Ĵ,Ķ,Ł,Ľ,Ĺ,Ļ,Ŀ,Ñ,Ń,Ň,Ņ,Ŋ,Ò,Ó,Ô,Õ,Ö,&Ouml;,Ø,Ō,Ő,Ŏ,Œ,Ŕ,Ř,Ŗ,Ś,Š,Ş,Ŝ,Ș,Ť,Ţ,Ŧ,Ț,Ù,Ú,Û,Ü,Ū,&Uuml;,Ů,Ű,Ŭ,Ũ,Ų,Ŵ,Ý,Ŷ,Ÿ,Ź,Ž,Ż,Þ,Þ,à,á,â,ã,ä,&auml;,å,ā,ą,ă,æ,ç,ć,č,ĉ,ċ,ď,đ,ð,è,é,ê,ë,ē,ę,ě,ĕ,ė,ƒ,ĝ,ğ,ġ,ģ,ĥ,ħ,ì,í,î,ï,ī,ĩ,ĭ,į,ı,ij,ĵ,ķ,ĸ,ł,ľ,ĺ,ļ,ŀ,ñ,ń,ň,ņ,ʼn,ŋ,ò,ó,ô,õ,ö,&ouml;,ø,ō,ő,ŏ,œ,ŕ,ř,ŗ,š,ù,ú,û,ü,ū,&uuml;,ů,ű,ŭ,ũ,ų,ŵ,ý,ÿ,ŷ,ž,ż,ź,þ,ß,ſ,А,Б,В,Г,Д,Е,Ё,Ж,З,И,Й,К,Л,М,Н,О,П,Р,С,Т,У,Ф,Х,Ц,Ч,Ш,Щ,Ъ,Ы,Э,Ю,Я,а,б,в,г,д,е,ё,ж,з,и,й,к,л,м,н,о,п,р,с,т,у,ф,х,ц,ч,ш,щ,ъ,ы,э,ю,я');
        $Translation = explode(',', ',,,,,A,A,A,A,Ae,A,A,A,A,A,Ae,C,C,C,C,C,D,D,D,E,E,E,E,E,E,E,E,E,G,G,G,G,H,H,I,I,I,I,I,I,I,I,I,IJ,J,K,K,K,K,K,K,N,N,N,N,N,O,O,O,O,Oe,Oe,O,O,O,O,OE,R,R,R,S,S,S,S,S,T,T,T,T,U,U,U,Ue,U,Ue,U,U,U,U,U,W,Y,Y,Y,Z,Z,Z,T,T,a,a,a,a,ae,ae,a,a,a,a,ae,c,c,c,c,c,d,d,d,e,e,e,e,e,e,e,e,e,f,g,g,g,g,h,h,i,i,i,i,i,i,i,i,i,ij,j,k,k,l,l,l,l,l,n,n,n,n,n,n,o,o,o,o,oe,oe,o,o,o,o,oe,r,r,r,s,u,u,u,ue,u,ue,u,u,u,u,u,w,y,y,y,z,z,z,t,ss,ss,A,B,V,G,D,E,YO,ZH,Z,I,Y,K,L,M,N,O,P,R,S,T,U,F,H,C,CH,SH,SCH,Y,Y,E,YU,YA,a,b,v,g,d,e,yo,zh,z,i,y,k,l,m,n,o,p,r,s,t,u,f,h,c,ch,sh,sch,y,y,e,yu,ya');
        $sReturn = $InString;
        $sReturn = str_replace($Code, $Translation, $sReturn);
        $sReturn = urldecode($sReturn);
        $sReturn = preg_replace('/[^A-Za-z0-9 ]/', '', $sReturn);
        $sReturn = str_replace(' ', '-', $sReturn);
        return strtolower(str_replace('--', '-', $sReturn));
    }
  • It works by brute force. Maybe something like this: Lussumo tries to create an appropriate URL, and if that fails it gives you the option to name it yourself. (There will always be some missing characters.) WordPress has its "post slug" for this.
  • Impressive list of characters :)
  • Hmm.. so what to do? :)

    If I understand correctly that function replaces some of the special characters into more common characters. And I spotted my letters there so it should work?
  • Mark (Vanilla Staff)
    Yep, it should work...

    I'm not great with the inner workings of character encodings, but I know that there are a few other members on here who are - I'm hoping one or two of them will join in on the discussion...
  • mania wrote:
    If I understand correctly that function replaces some of the special characters into more common characters. And I spotted my letters there so it should work?
    …if you add the corresponding "flat" char at the same place in the translation array.
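
The point about keeping each character and its "flat" replacement at the same position can be sidestepped by storing them as pairs. What follows is only a rough sketch, not Vanilla's code: slugify() is a hypothetical name, the map is a small illustrative subset, the title is assumed to be UTF-8, and "æ" is mapped to "ae" as in Mark's list (it could just as well be "a").

```php
<?php
// Hypothetical helper, not part of Vanilla. Each special character sits next
// to its replacement, so the two sides cannot drift out of alignment.
function slugify(string $title): string
{
    // Illustrative subset only; assumes $title is UTF-8 encoded.
    $map = [
        'ä' => 'a',  'Ä' => 'A',
        'ö' => 'o',  'Ö' => 'O',
        'å' => 'a',  'Å' => 'A',
        'æ' => 'ae', 'Æ' => 'Ae',
        'ø' => 'o',  'Ø' => 'O',
    ];
    // Transliterate the mapped characters.
    $slug = strtr($title, $map);
    // Drop anything that is still not an ASCII letter, digit or space.
    $slug = preg_replace('/[^A-Za-z0-9 ]/', '', $slug);
    // Turn each run of spaces into a single hyphen.
    $slug = preg_replace('/ +/', '-', trim($slug));
    return strtolower($slug);
}

// prints: onko-nettideittailu-vielakaan-hyvaksyttavaa-nykypaivana
echo slugify('Onko nettideittailu vieläkään hyväksyttävää nykypäivänä?'), "\n";
```

Adding another language then means adding one pair to the map rather than editing two long lists in step.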
This discussion has been closed.