Compiling and installing Xapiand on Ubuntu

乐果   posted on 2023-12-08   Tags: xapian, c++

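Xapiand is a service that wraps the open-source search engine Xapian and exposes it through a RESTful API.

In an earlier project at my company we needed to sort buildings and venues by latitude/longitude distance, so we used the Xapiand service. Because we only deployed it in a simple Docker setup and never studied the service itself in depth, colleagues later reported problems such as "index databases are easily lost" and "memory usage stays high".

In the spirit of learning, I decided to pull the source code and dig in, hoping that tinkering with it might reveal the causes of the problems my colleagues reported.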

Introduction to Xapiand

Xapiand is a modern, highly available, distributed RESTful search and storage engine, built for the cloud and with data locality in mind.

It takes JSON (or MessagePack) documents and indexes them.

Official site: https://kronuz.io/Xapiand

Code repository: https://github.com/Kronuz/Xapiand

Building and installing

Fetch the code and build it:

git clone https://github.com/Kronuz/Xapiand.git
cd Xapiand
mkdir build
cd build
cmake -GNinja ..
ninja

Note that the build above uses Ninja, a fast build tool, so its package must be installed beforehand:

sudo apt install ninja-build

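Various errors may come up during compilation; fix them as they appear.

For example:

Error 1:

You may see an error like this: error: ‘numeric_limits’ is not a member of ‘std’

Fix: locate the source file named in the error and add the following two lines at the top of the file: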

#include <stdexcept>
#include <limits>
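
As a minimal sketch of why this fix works: `std::numeric_limits` is declared in `<limits>`, and the error most likely appears because newer standard library releases no longer pull that header in indirectly through other headers. The helper `checked_narrow` below is hypothetical and not taken from the Xapiand source; it only illustrates code that depends on both includes.

```cpp
#include <cstdint>
#include <limits>     // declares std::numeric_limits
#include <stdexcept>  // declares std::out_of_range

// Hypothetical helper (not from the Xapiand source): narrows a 64-bit value
// to 32 bits and throws if the value is outside the int32_t range.
inline std::int32_t checked_narrow(std::int64_t value) {
    if (value < std::numeric_limits<std::int32_t>::min() ||
        value > std::numeric_limits<std::int32_t>::max()) {
        throw std::out_of_range("value does not fit in int32_t");
    }
    return static_cast<std::int32_t>(value);
}
```

Without the `#include <limits>` line, a translation unit like this one may fail with the same "not a member of ‘std’" message, depending on which other headers happen to be included.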

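Error 2:

The C++ compiler is strict about type conversions and will not convert implicitly. Either switch to a different compiler or patch the source: locate the source file named in the error message and fix the conversion there.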
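
As a generic illustration of this kind of patch, an implicit conversion that the compiler rejects can be replaced with an explicit `static_cast`. The function `length_as_int` below is hypothetical and not taken from the Xapiand source, and the actual line that fails in Xapiand may look quite different.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example, not from the Xapiand source.
// Brace initialization forbids narrowing conversions, so a line such as
//     int length{text.size()};   // std::size_t -> int
// is ill-formed since C++11 and gets diagnosed (as an error or a warning,
// depending on the compiler). Spelling the conversion out makes it compile.
inline int length_as_int(const std::string& text) {
    int length{static_cast<int>(text.size())};
    return length;
}
```

An explicit cast documents that the truncation is intentional, which is usually a smaller change than switching compilers.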

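Running a test

Keep compiling until the build succeeds. Three executables will then appear in the bin directory; the xapiand file is the main program of the service. Run it in a terminal window; if it starts up normally, the build from source has succeeded.

List the available options: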

$ xapiand -h

Xapiand v0.40.0 (rev:20200512201748) (git:a71570859)
[https://github.com/Kronuz/Xapiand/issues]
Usage: 
   xapiand  [-D <path>] [--force] [--strict] [--solo] [-d] [--human]
            [--no-human] [--echo] [--no-echo] [--comments] [--no-comments]
            [--pretty] [--no-pretty] [--colors] [--no-colors] [--admin-commands]
            [-L <file>] [-P <file>] [--uid <uid>] [--gid <gid>] [--log <epoch
            |iso8601|timeless|seconds|milliseconds|microseconds|thread-names
            |locations|replicas>] ...  [--iterm2] [--bind-address <bind>] [--port
            <port>] [--replica-port <port>] [--xapian-port <port>]
            [--primary-node <name>] [--use <select|poll|epoll>] [--processors
            <processors>] [--max-clients <clients>] [--http-servers <servers>]
            [--http-clients <threads>] [--database-stall-time <seconds>]
            [--replication-servers <servers>] [--replication-clients <threads>]
            [--remote-servers <servers>] [--remote-clients <threads>]
            [--flush-threshold <threshold>] [--max-files <files>] [--discoverers
            <discoverers>] [--replicators <replicators>] [--fsynchers
            <fsynchers>] [--wal-writer-cache-size <size>] [--resolver-cache-size
            <size>] [--scripts-cache-size <size>] [--schema-versions-size <size>]
            [--schema-pool-size <size>] [--database-pool-size <size>]
            [--max-database-readers <databases>] [--committers <threads>]
            [--bulk-indexers <threads>] [--bulk-preparers <threads>] [--matchers
            <threads>] [--shards <shards>] [--replicas <replicas>] [--writers
            <writers>] [--name <node>] [--cluster <cluster>] [--discovery-group
            <group>] [--discovery-port <port>] [--uuid <vanilla|compact|encoded
            |partition>] ...  [--verbosity <verbosity>] [-v] ...  [--restore
            <endpoint>] [-i <file>] [--dump <endpoint>] [-o <file>] [--]
            [--version] [-h]

Where: 
   -D <path>,  --database <path>   Path to the root of the node.
   --force                         Force using path as the root of the node.
   --strict                        Force the user to define the type for each field.
   --solo                          Run solo indexer. (no replication or discovery)
   -d,  --detach                   detach process. (run in background)
   --human                         Enables objects humanizer in results.
   --no-human                      Disables objects humanizer in results.
   --echo                          Enables objects echo in results.
   --no-echo                       Disables objects echo in results.
   --comments                      Enables result comments.
   --no-comments                   Disables result comments.
   --pretty                        Enables pretty results.
   --no-pretty                     Disables pretty results.
   --colors                        Enables colors on the console.
   --no-colors                     Disables colors on the console.
   --admin-commands                Enables administrative HTTP commands.
   -L <file>,  --logfile <file>    Save logs in <file>.
   -P <file>,  --pidfile <file>    Save PID in <file>.
   --uid <uid>                     User ID.
   --gid <gid>                     Group ID.
   --log <epoch|iso8601|timeless|seconds|milliseconds|microseconds
      |thread-names|locations|replicas>
                                   Enable logging settings. (accepted multiple
                                   times)
   --iterm2                        Set marks, tabs, title, badges and growl.
   --bind-address <bind>           Bind address to listen to.
   --port <port>                   TCP HTTP port number to listen on for REST API.
   --replica-port <port>           Xapiand replication protocol TCP port number to listen on.
   --xapian-port <port>            Xapian binary protocol TCP port number to listen on.
   --primary-node <name>           Primary node (the one with the primary cluster database).
   --use <select|poll|epoll>       Connection processing backend.
   --processors <processors>       Number of processors to use.
   --max-clients <clients>         Max number of open client connections.
   --http-servers <servers>        Number of http servers.
   --http-clients <threads>        Number of http client threads.
   --database-stall-time <seconds>
                                   Seconds before allowing a shard to be
                                   promoted to primary.
   --replication-servers <servers>
                                   Number of replication protocol servers.
   --replication-clients <threads>
                                   Number of replication protocol client
                                   threads.
   --remote-servers <servers>      Number of remote protocol servers.
   --remote-clients <threads>      Number of remote protocol client threads.
   --flush-threshold <threshold>   Xapian flush threshold.
   --max-files <files>             Maximum number of files to open.
   --discoverers <discoverers>     Number of discoverers doing cluster discovery.
   --replicators <replicators>     Number of replicators triggering database replication.
   --fsynchers <fsynchers>         Number of threads handling the fsyncs.
   --wal-writer-cache-size <size>  Cache size wal writer.
   --resolver-cache-size <size>    Cache size for index resolver.
   --scripts-cache-size <size>     Cache size for scripts.
   --schema-versions-size <size>   Maximum number of versions of schema in cache.
   --schema-pool-size <size>       Maximum number of schemas in schema pool.
   --database-pool-size <size>     Maximum number of databases in database pool.
   --max-database-readers <databases>
                                   Max number of open databases.
   --committers <threads>          Number of threads handling the commits.
   --bulk-indexers <threads>       Number of threads handling bulk documents indexing.
   --bulk-preparers <threads>      Number of threads handling bulk documents preparing.
   --matchers <threads>            Number of threads handling parallel document matching.
   --shards <shards>               Default number of database shards per index.
   --replicas <replicas>           Default number of database replicas per index.
   --writers <writers>             Number of database async wal writers.
   --name <node>                   Node name.
   --cluster <cluster>             Cluster name to join.
   --discovery-group <group>       Discovery UDP group name.
   --discovery-port <port>         Discovery UDP port number to listen on.
   --uuid <vanilla|compact|encoded|partition>
                                   Toggle modes for compact and/or encoded
                                   UUIDs and UUID index path partitioning. (accepted multiple times)
   --verbosity <verbosity>         Set verbosity.
   -v,  --verbose                  Increase verbosity. (accepted multiple times)
   --restore <endpoint>            Restore endpoint from stdin.
   -i <file>,  --in <file>         Input filename for restore.
   --dump <endpoint>               Dump endpoint to stdout.
   -o <file>,  --out <file>        Output filename for dump.
   --,  --ignore_rest              Ignores the rest of the labeled arguments following this flag.
   --version                       Displays version information and exits.
   -h,  --help                     Displays usage information and exits.

I will keep digging into it over time.
