
Search not working correctly for Japanese language - multibyte character problem?

edited September 2011 in Vanilla 2.0 - 2.8
We are using Vanilla version 2.0.18b2, and our users post in Japanese. The problem is that search only returns a result when the query is the exact, complete phrase; searching for part of it (an individual word) finds nothing.

For example, we can search for: いつもご利用ありがとうございます
But not this: いつも (which is part of this phrase)

I noticed in another post that someone had the same problem with Chinese and other Unicode (multibyte) characters. Does anyone have any tips or a solution for this? The database uses UTF-8, so I don't think the problem is in the database.

I have been checking the source files, and I'm guessing that /applications/dashboard/models/class.searchmodel.php handles this, but maybe we'd need to change something somewhere else?

Here's searchmodel.php:
---------------
class SearchModel extends Gdn_Model {
/// PROPERTIES ///
protected $_Parameters = array();

protected $_SearchSql = array();

protected $_SearchMode = 'match';

public $ForceSearchMode = '';

protected $_SearchText = '';

/// METHODS ///
public function AddSearch($Sql) {
$this->_SearchSql[] = $Sql;
}

/** Add the sql to perform a search.
*
* @param Gdn_SQLDriver $Sql
* @param string $Columns a comma separated list of columns to search on.
*/
public function AddMatchSql($Sql, $Columns, $LikeRelavenceColumn = '') {
if ($this->_SearchMode == 'like') {
if ($LikeRelavenceColumn)
$Sql->Select($LikeRelavenceColumn, '', 'Relavence');
else
$Sql->Select(1, '', 'Relavence');

$Sql->BeginWhereGroup();

$ColumnsArray = explode(',', $Columns);
foreach ($ColumnsArray as $Column) {
$Column = trim($Column);

$Param = $this->Parameter();
$Sql->OrWhere("$Column like $Param", NULL, FALSE, FALSE);
}

$Sql->EndWhereGroup();
} else {
$Boolean = $this->_SearchMode == 'boolean' ? ' in boolean mode' : '';

$Param = $this->Parameter();
$Sql->Select($Columns, "match(%s) against($Param{$Boolean})", 'Relavence');
$Param = $this->Parameter();
$Sql->Where("match($Columns) against ($Param{$Boolean})", NULL, FALSE, FALSE);
}
}

public function Parameter() {
$Parameter = ':Search'.count($this->_Parameters);
$this->_Parameters[$Parameter] = '';
return $Parameter;
}

public function Reset() {
$this->_Parameters = array();
$this->_SearchSql = array();
}

public function Search($Search, $Offset = 0, $Limit = 20) {
// If there are no searches then return an empty array.
if(trim($Search) == '')
return NULL;

// Figure out the exact search mode.
if ($this->ForceSearchMode)
$SearchMode = $this->ForceSearchMode;
else
$SearchMode = strtolower(C('Garden.Search.Mode', 'matchboolean'));

if ($SearchMode == 'matchboolean') {
if (strpos($Search, '+') !== FALSE || strpos($Search, '-') !== FALSE)
$SearchMode = 'boolean';
else
$SearchMode = 'match';
} else {
$this->_SearchMode = $SearchMode;
}
$this->_SearchMode = $SearchMode;

$this->FireEvent('Search');

if(count($this->_SearchSql) == 0)
return NULL;

// Perform the search by unioning all of the sql together.
$Sql = $this->SQL
->Select()
->From('_TBL_ s')
->OrderBy('s.DateInserted', 'desc')
->Limit($Limit, $Offset)
->GetSelect();

$Sql = str_replace($this->Database->DatabasePrefix.'_TBL_', "(\n".implode("\nunion all\n", $this->_SearchSql)."\n)", $Sql);

$this->EventArguments['Search'] = $Search;
$this->FireEvent('AfterBuildSearchQuery');

if ($this->_SearchMode == 'like')
$Search = '%'.$Search.'%';

foreach($this->_Parameters as $Key => $Value) {
$this->_Parameters[$Key] = $Search;
}

$Result = $this->Database->Query($Sql, $this->_Parameters)->ResultArray();
foreach ($Result as $Key => $Value) {
if (isset($Value['Summary'])) {
$Value['Summary'] = Gdn_Format::Text($Value['Summary']);
$Result[$Key] = $Value;
}
}
$this->Reset();
$this->SQL->Reset();
return $Result;
}
}

---------------
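
For reference, the mode selection inside Search() boils down to the sketch below. It is a standalone illustration only: resolveSearchMode and bindValue are hypothetical helper names, not Vanilla functions, and the sketch assumes the configured mode arrives as a plain string (what C('Garden.Search.Mode', 'matchboolean') would return).

```php
<?php
// Hypothetical standalone helper; it mirrors the branch logic in
// SearchModel::Search() but is not part of Vanilla.
function resolveSearchMode($configuredMode, $search) {
    $mode = strtolower($configuredMode);
    if ($mode == 'matchboolean') {
        // A '+' or '-' in the term switches to MySQL boolean full-text mode.
        $hasOperator = strpos($search, '+') !== false || strpos($search, '-') !== false;
        return $hasOperator ? 'boolean' : 'match';
    }
    // Any other configured value (for example 'like') is used as-is.
    return $mode;
}

// Hypothetical helper: in 'like' mode the term is wrapped in wildcards
// before it is bound to the query parameters.
function bindValue($mode, $search) {
    return $mode == 'like' ? '%' . $search . '%' : $search;
}

echo resolveSearchMode('matchboolean', 'いつも'), "\n";                  // match
echo bindValue(resolveSearchMode('like', 'いつも'), 'いつも'), "\n";      // %いつも%
```

With the default configuration a plain Japanese term therefore goes through match(...) against(...), while 'like' mode turns it into a substring pattern.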

Any help would be greatly appreciated! Thanks!

Comments

  • How about the like mode?

  • van919 is right.
    You can use MySQL like mode by adding this to your conf/config.php:
    $Configuration['Garden']['Search']['Mode'] = 'like';
    We did this for the Japanese installer and it's working for me.
  • Thanks for the help, guys, but this doesn't seem to work either. I changed the mode to like, but now the search finds *all* discussions every time, regardless of the search term. I also added the other extra lines to the config file (locale, charset...) for Japanese, the same as in the GitHub link yu_tang provided.

    Any clues??
  • Hi achikochi. I think your PHP mbstring settings are incorrect.

    Have you checked your php.ini settings?
    /--check--/
    mbstring.language = Japanese
    mbstring.internal_encoding = UTF-8
    mbstring.http_input = pass or UTF-8
    mbstring.http_output = pass
    mbstring.encoding_translation = Off
    mbstring.detect_order = UTF-8,SJIS,EUC-JP,JIS,ASCII
    mbstring.substitute_character = none
    mbstring.func_overload = 0
    mbstring.strict_detection = Off

    I had the same problem too, and these settings resolved it :awesome:
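
For anyone comparing their own setup against the list above, a small script along these lines prints the values PHP is actually running with. It is only a sketch: currentSetting is a hypothetical helper name, the "suggested" column is copied from the comment above rather than from PHP's defaults, and some of these directives are deprecated or removed in newer PHP versions, in which case they show up as not available.

```php
<?php
// Suggested values are copied from the comment above; they are that poster's
// recommendations, not PHP defaults.
$expected = array(
    'mbstring.language' => 'Japanese',
    'mbstring.internal_encoding' => 'UTF-8',
    'mbstring.http_input' => 'pass or UTF-8',
    'mbstring.http_output' => 'pass',
    'mbstring.encoding_translation' => 'Off',
    'mbstring.detect_order' => 'UTF-8,SJIS,EUC-JP,JIS,ASCII',
    'mbstring.substitute_character' => 'none',
    'mbstring.func_overload' => '0',
    'mbstring.strict_detection' => 'Off',
);

// Hypothetical helper: ini_get() returns false when a directive does not
// exist (for example when the mbstring extension is not loaded).
function currentSetting($name) {
    $value = ini_get($name);
    return $value === false ? '(not available)' : (string) $value;
}

// Print each directive with its current and suggested value side by side.
foreach ($expected as $name => $suggested) {
    printf("%-32s current: %-30s suggested: %s\n", $name, currentSetting($name), $suggested);
}
```

Note that ini_get() reports boolean directives set to Off as an empty string or "0" and On as "1", so those rows will not literally read "Off". Run the script through the web server rather than the command line if the two use different php.ini files.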
