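The screenshot above shows the improved search results. WordPress's built-in search is widely regarded as poor: it appears to match the whole search phrase against each field, so searching for the keyword "ida调试器" (IDA debugger) gave the following result, which is nothing at all: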
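This blog does not get much traffic, but as someone with a touch of OCD I found that result hard to accept. I looked for related articles and plugins, and none of them seemed to help, so I had to build it myself. The jieba segmenter is quite convenient to use in Python, and it turns out there is a PHP version of jieba as well: https://github.com/jonnywang/phpjieba. That makes things simple. First install jieba by following the instructions on GitHub. During installation you may run into the following error: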
configure: error: Cannot find php-config. Please use --with-php-config=PATH
This happens because configure could not locate the php-config file. Pass its path with the --with-php-config option: ./configure --with-php-config=/usr/local/php/bin/php-config
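If you are not sure where php-config lives, a small shell helper can look for it. This is only a sketch: the function name find_php_config is made up for illustration, and the directories passed to it in the usage line are guesses based on the paths seen in this post, so adjust them to your own install.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable php-config found
# in the directories given as arguments, or fail if none has one.
find_php_config() {
    for dir in "$@"; do
        if [ -x "$dir/php-config" ]; then
            printf '%s\n' "$dir/php-config"
            return 0
        fi
    done
    return 1
}
```

It could then be used as: ./configure --with-php-config="$(find_php_config /usr/local/php/bin /usr/bin /usr/local/bin)". When several PHP installs coexist, the php-config you pick decides which PHP the extension is built for.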
The full installation commands and output log are as follows:
    root@blog:~# git clone https://github.com/jonnywang/phpjieba.git
    Cloning into 'phpjieba'...
    remote: Enumerating objects: 178, done.
    remote: Total 178 (delta 0), reused 0 (delta 0), pack-reused 178
    Receiving objects: 100% (178/178), 4.25 MiB | 220.00 KiB/s, done.
    Resolving deltas: 100% (73/73), done.
    root@blog:~# cd phpjieba/cjieba/
    root@blog:~/phpjieba/cjieba# make
    g++ -O2 -o lib/cjieba.o -c -DLOGGING_LEVEL=LL_WARNING -I./deps/ include/jieba.cpp
    g++ -O2 -fPIC -o lib/libcjieba.so -c -DLOGGING_LEVEL=LL_WARNING -I./deps/ include/jieba.cpp
    ar rs lib/libcjieba.a lib/cjieba.o
    ar: creating lib/libcjieba.a
    gcc -O2 -o demo demo.c -L./lib -lcjieba -lstdc++ -lm
    root@blog:~/phpjieba/cjieba# cd ..
    root@blog:~/phpjieba# php
    php php-fpm phpize
    root@blog:~/phpjieba# php
    php php-fpm phpize
    root@blog:~/phpjieba# phpize
    Configuring for:
    PHP Api Version: 20180731
    Zend Module Api No: 20180731
    Zend Extension Api No: 320180731
    root@blog:~/phpjieba# ./config
    -bash: ./config: No such file or directory
    root@blog:~/phpjieba# ./configure
    checking for grep that handles long lines and -e... /usr/bin/grep
    checking for egrep... /usr/bin/grep -E
    checking for a sed that does not truncate output... /usr/bin/sed
    checking for cc... cc
    checking whether the C compiler works... yes
    checking for C compiler default output file name... a.out
    checking for suffix of executables...
    checking whether we are cross compiling... no
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether cc accepts -g... yes
    checking for cc option to accept ISO C89... none needed
    checking how to run the C preprocessor... cc -E
    checking for icc... no
    checking for suncc... no
    checking whether cc understands -c and -o together... yes
    checking for system library directory... lib
    checking if compiler supports -R... no
    checking if compiler supports -Wl,-rpath,... yes
    checking build system type... x86_64-pc-linux-gnu
    checking host system type... x86_64-pc-linux-gnu
    checking target system type... x86_64-pc-linux-gnu
    configure: error: Cannot find php-config. Please use --with-php-config=PATH
    root@blog:~/phpjieba# make
    make: *** No targets specified and no makefile found. Stop.
    root@blog:~/phpjieba# which php
    /usr/bin/php
    root@blog:~/phpjieba# ls /usr/bin/php
    php php-fpm phpize
    root@blog:~/phpjieba# ls /usr/bin/php
    php php-fpm phpize
    root@blog:~/phpjieba# ls /usr/local/php/bin/php
    php php-cgi php-config phpdbg phpize
    root@blog:~/phpjieba# ls /usr/local/php/bin/php
    php php-cgi php-config phpdbg phpize
    root@blog:~/phpjieba# ls /usr/local/php/bin/php
    php php-cgi php-config phpdbg phpize
    root@blog:~/phpjieba# ls /usr/local/php/bin/php-c
    php-cgi php-config
    root@blog:~/phpjieba# ls /usr/local/php/bin/php-c
    php-cgi php-config
    root@blog:~/phpjieba# ls /usr/local/php/bin/php-config
    /usr/local/php/bin/php-config
    root@blog:~/phpjieba# ./configure --with-php-config=/usr/local/php/bin/php-config
    checking for grep that handles long lines and -e... /usr/bin/grep
    checking for egrep... /usr/bin/grep -E
    checking for a sed that does not truncate output... /usr/bin/sed
    checking for cc... cc
    checking whether the C compiler works... yes
    checking for C compiler default output file name... a.out
    checking for suffix of executables...
    checking whether we are cross compiling... no
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether cc accepts -g... yes
    checking for cc option to accept ISO C89... none needed
    checking how to run the C preprocessor... cc -E
    checking for icc... no
    checking for suncc... no
    checking whether cc understands -c and -o together... yes
    checking for system library directory... lib
    checking if compiler supports -R... no
    checking if compiler supports -Wl,-rpath,... yes
    checking build system type... x86_64-pc-linux-gnu
    checking host system type... x86_64-pc-linux-gnu
    checking target system type... x86_64-pc-linux-gnu
    checking for PHP prefix... /usr/local/php
    checking for PHP includes... -I/usr/local/php/include/php -I/usr/local/php/include/php/main -I/usr/local/php/include/php/TSRM -I/usr/local/php/include/php/Zend -I/usr/local/php/include/php/ext -I/usr/local/php/include/php/ext/date/lib
    checking for PHP extension directory... /usr/local/php/lib/php/extensions/no-debug-non-zts-20180731
    checking for PHP installed headers prefix... /usr/local/php/include/php
    checking if debug is enabled... no
    checking if zts is enabled... no
    checking for re2c... re2c
    checking for re2c version... 1.3 (ok)
    checking for gawk... gawk
    checking whether to enable jieba support... yes, shared
    checking for ld used by cc... /usr/bin/ld
    checking if the linker (/usr/bin/ld) is GNU ld... yes
    checking for /usr/bin/ld option to reload object files... -r
    checking for BSD-compatible nm... /usr/bin/nm -B
    checking whether ln -s works... yes
    checking how to recognize dependent libraries... pass_all
    checking for ANSI C header files... yes
    checking for sys/types.h... yes
    checking for sys/stat.h... yes
    checking for stdlib.h... yes
    checking for string.h... yes
    checking for memory.h... yes
    checking for strings.h... yes
    checking for inttypes.h... yes
    checking for stdint.h... yes
    checking for unistd.h... yes
    checking dlfcn.h usability... yes
    checking dlfcn.h presence... yes
    checking for dlfcn.h... yes
    checking the maximum length of command line arguments... 1572864
    checking command to parse /usr/bin/nm -B output from cc object... ok
    checking for objdir... .libs
    checking for ar... ar
    checking for ranlib... ranlib
    checking for strip... strip
    checking if cc supports -fno-rtti -fno-exceptions... no
    checking for cc option to produce PIC... -fPIC
    checking if cc PIC flag -fPIC works... yes
    checking if cc static flag -static works... yes
    checking if cc supports -c -o file.o... yes
    checking whether the cc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
    checking whether -lc should be explicitly linked in... no
    checking dynamic linker characteristics... GNU/Linux ld.so
    checking how to hardcode library paths into programs... immediate
    checking whether stripping libraries is possible... yes
    checking if libtool supports shared libraries... yes
    checking whether to build shared libraries... yes
    checking whether to build static libraries... no
    creating libtool
    appending configuration tag "CXX" to libtool
    configure: creating ./config.status
    config.status: creating config.h
    root@blog:~/phpjieba# make
    /bin/bash /root/phpjieba/libtool --mode=compile cc -I. -I/root/phpjieba -DPHP_ATOM_INC -I/root/phpjieba/include -I/root/phpjieba/main -I/root/phpjieba -I/usr/local/php/include/php -I/usr/local/php/include/php/main -I/usr/local/php/include/php/TSRM -I/usr/local/php/include/php/Zend -I/usr/local/php/include/php/ext -I/usr/local/php/include/php/ext/date/lib -I/root/phpjieba/cjieba/include -DHAVE_CONFIG_H -g -O2 -c /root/phpjieba/jieba.c -o jieba.lo
    mkdir .libs
    cc -I. -I/root/phpjieba -DPHP_ATOM_INC -I/root/phpjieba/include -I/root/phpjieba/main -I/root/phpjieba -I/usr/local/php/include/php -I/usr/local/php/include/php/main -I/usr/local/php/include/php/TSRM -I/usr/local/php/include/php/Zend -I/usr/local/php/include/php/ext -I/usr/local/php/include/php/ext/date/lib -I/root/phpjieba/cjieba/include -DHAVE_CONFIG_H -g -O2 -c /root/phpjieba/jieba.c -fPIC -DPIC -o .libs/jieba.o
    /bin/bash /root/phpjieba/libtool --mode=link cc -DPHP_ATOM_INC -I/root/phpjieba/include -I/root/phpjieba/main -I/root/phpjieba -I/usr/local/php/include/php -I/usr/local/php/include/php/main -I/usr/local/php/include/php/TSRM -I/usr/local/php/include/php/Zend -I/usr/local/php/include/php/ext -I/usr/local/php/include/php/ext/date/lib -I/root/phpjieba/cjieba/include -DHAVE_CONFIG_H -g -O2 -o jieba.la -export-dynamic -avoid-version -prefer-pic -module -rpath /root/phpjieba/modules jieba.lo -Wl,-rpath,/root/phpjieba/cjieba/lib -L/root/phpjieba/cjieba/lib -lcjieba -lstdc++
    cc -shared .libs/jieba.o -L/root/phpjieba/cjieba/lib -lcjieba -lstdc++ -Wl,-rpath -Wl,/root/phpjieba/cjieba/lib -Wl,-soname -Wl,jieba.so -o .libs/jieba.so
    creating jieba.la
    (cd .libs && rm -f jieba.la && ln -s ../jieba.la jieba.la)
    /bin/bash /root/phpjieba/libtool --mode=install cp ./jieba.la /root/phpjieba/modules
    cp ./.libs/jieba.so /root/phpjieba/modules/jieba.so
    cp ./.libs/jieba.lai /root/phpjieba/modules/jieba.la
    PATH="$PATH:/sbin" ldconfig -n /root/phpjieba/modules
    ----------------------------------------------------------------------
    Libraries have been installed in:
       /root/phpjieba/modules
    If you ever happen to want to link against installed libraries
    in a given directory, LIBDIR, you must either use libtool, and
    specify the full pathname of the library, or use the `-LLIBDIR'
    flag during linking and do at least one of the following:
       - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
         during execution
       - add LIBDIR to the `LD_RUN_PATH' environment variable
         during linking
       - use the `-Wl,--rpath -Wl,LIBDIR' linker flag
       - have your system administrator add LIBDIR to `/etc/ld.so.conf'
    See any operating system documentation about shared libraries for
    more information, such as the ld(1) and ld.so(8) manual pages.
    ----------------------------------------------------------------------
    Build complete.
    Don't forget to run 'make test'.
    root@blog:~/phpjieba# make install
    Installing shared extensions: /usr/local/php/lib/php/extensions/no-debug-non-zts-20180731/
    root@blog:~/phpjieba# ls /usr/local/php/lib/php/extensions/no-debug-non-zts-20180731/
    jieba.so opcache.a opcache.so
    root@blog:~/phpjieba# ls ~/phpjieba/cjieba/dict/
    hmm_model.utf8 idf.utf8 jieba.dict.utf8 stop_words.utf8 user.dict.utf8
Once the installation is finished, edit php.ini and add the following:
    ; jieba
    extension=jieba.so
    jieba.enable=1
    jieba.dict_path=/root/phpjieba/cjieba/dict
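Since jieba.dict_path has to point at a directory holding the dictionary files, it may be worth checking that directory before restarting PHP. The sketch below is an assumption-laden helper, not part of phpjieba: check_jieba_dict is a made-up name, and the file list is simply the one shown by the ls of cjieba/dict in the log above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every dictionary file listed in
# the install log exists and is readable inside the given directory.
check_jieba_dict() {
    dir=$1
    missing=0
    for f in hmm_model.utf8 idf.utf8 jieba.dict.utf8 stop_words.utf8 user.dict.utf8; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing or unreadable: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_jieba_dict /root/phpjieba/cjieba/dict && echo "dictionary looks complete". The check is only meaningful when run as the user php-fpm runs under. On many systems /root is not readable by other users, so if php-fpm does not run as root it may be safer to copy the dict directory somewhere else and point jieba.dict_path there.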
Then restart the php-fpm service. After the restart, you can run test_jieba.php from the example directory of phpjieba to check that the extension works.
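A quick command-line check can also tell whether the extension is loaded at all. The helper below is a sketch under assumptions: module_listed is a hypothetical name, it expects the output of php -m on standard input, and it assumes the extension registers itself under the name jieba.

```shell
#!/bin/sh
# Hypothetical helper: read a module list (one name per line, as printed
# by `php -m`) on stdin and succeed if the given module name is in it.
# -q quiet, -i case-insensitive, -x whole-line match, -F fixed string.
module_listed() {
    grep -qixF -- "$1"
}
```

Usage might look like: /usr/local/php/bin/php -m | module_listed jieba && echo "jieba is loaded". The log above shows two PHP installs on this machine (/usr/bin/php and /usr/local/php/bin/php), so run the binary that belongs to the install the extension was built for. The CLI and php-fpm can also read different php.ini files, so a positive result here does not fully replace the test script.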
If it runs correctly, the installation succeeded. That completes the first step. The second step is to modify the search code.
Add the following code to the theme's functions.php:
    function custom_search( $search_result, $wp_query ) {
        global $wpdb;

        // Only rewrite the WHERE fragment for search queries.
        if ( ! $wp_query->is_search ) {
            return $search_result;
        }
        // Leave the default clause alone when there is no search string.
        if ( empty( $wp_query->query_vars['s'] ) ) {
            return $search_result;
        }

        // Split the search string into keywords with jieba.
        $key_string = $wp_query->query_vars['s'];
        $keywords   = jieba( $key_string );

        if ( is_array( $keywords ) && count( $keywords ) > 0 ) {
            $search_result = '';
            foreach ( $keywords as $keyword ) {
                if ( ! empty( $keyword ) ) {
                    // Use a separate variable so the array being iterated is not overwritten.
                    $like = '%' . esc_sql( $keyword ) . '%';
                    // Every keyword must match the title, the content, or a meta value.
                    $search_result .= " AND ( {$wpdb->posts}.post_title LIKE '{$like}' OR {$wpdb->posts}.post_content LIKE '{$like}' OR {$wpdb->posts}.ID IN ( SELECT distinct post_id FROM {$wpdb->postmeta} WHERE meta_value LIKE '{$like}' ) ) ";
                }
            }
        }

        return $search_result;
    }
    add_filter( 'posts_search', 'custom_search', 10, 2 );
Once the code has been added without errors, you can try out the new search.
In addition, if you want the 404 page to support word segmentation as well, change its code to the following:
    foreach ( $result as $value ) {
        //echo "{$value}<br />";
        // Run one search query per segmented keyword.
        $args      = array( 's' => $value );
        $the_query = new WP_Query( $args );
        if ( $the_query->have_posts() ) {
            //_e("<h2 style='font-weight:bold;color:#000'>Search Results for: ".get_query_var('s')."</h2>");
            while ( $the_query->have_posts() ) {
                $the_query->the_post();
                ?>
                <li><a href="<?php the_permalink(); ?>"><?php the_title(); ?> -- <?php the_modified_date(); ?></a> (Keyword: <?php echo esc_html( $value ); ?>)</li>
                <?php
            }
        }
        // Restore the global post data after the custom loop.
        wp_reset_postdata();
    }
The result after the change:
References:
https://designsupply-web.com/media/knowledgeside/5811/
https://github.com/jonnywang/phpjieba
https://www.zhaokeli.com/article/1570.html
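Original article. If you repost it, please credit: reposted from obaby@mars
Article title: "WordPress 中文分词搜索" (WordPress Chinese word segmentation search)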
The related code is available at: https://github.com/obaby/baby-word-press