Elasticsearch-我需要JDBC驱动程序吗？

Z时代
2024-01-10
分类：问答

要将我的Elasticsearch服务器与SQL数据库中的新数据和过期数据同步

我可以通过两种非常不同的方式来实现这一目标，但我不知道哪种更好。我可以使用JDBC river插件通过直接连接到SQL数据库的方式信息

入elasticsearch。另外，我可以使用下面的代码作为示例，使用PHP客户端数据

送到elasticsearch：

// The Id of the document
$id = 1;
// Create a document
$tweet = array(
    'id'      => $id,
    'user'    => array(
        'name'      => 'mewantcookie',
        'fullName'  => 'Cookie Monster'
    ),
    'msg'     => 'Me wish there were expression for cookies like there is for apples. "A cookie a day make the doctor diagnose you with diabetes" not catchy.',
    'tstamp'  => '1238081389',
    'location'=> '41.12,-71.34',
    '_boost'  => 1.0
);
// First parameter is the id of document.
$tweetDocument = new \Elastica\Document($id, $tweet);
// Add tweet to type
$elasticaType->addDocument($tweetDocument);
// Refresh Index
$elasticaType->getIndex()->refresh();

我打算每30分钟运行一次cron，以检查数据库中不仅有“活动”标志而且没有“索引”标志的项目，这意味着我需要将它们添加到索引中。

看来我有两种方法可以通过两种不同方式在elasticsearch和mysql之间同步数据，每种选择的优点和缺点是什么。是否有一个特定的用例定义了一个使用？

回答：

如果您暂时忘了需要将初始数据导入到Elasticsearch中，那么我将使用事件系统将数据推

送到Elasticsearch。从长远来看，这会更有效率。

你的应用程序知道正是

当事情需要由Elasticsearch索引。以您的tweet为例，有时某个新的tweet将进入您的应用程序（例如，用户编写了一个）。这将触发一个newTweet事件。您已经有了一个侦听器，该侦听器将侦听该事件，并在分派此类事件时将其存储在Elasticsearch中。

如果你不希望使用资源/时间在Web请求要做到这一点（和你肯定不

希望这样做），听众可以一个作业添加到队列（Gearman的或Beanstalkd为例）。然后，您将需要一个工作人员来接替该工作并将推文存储在Elasticsearch中。

主要优势是使Elasticsearch保持最新的实时性。您不需要会造成延迟的cronjob。您（通常）一次将处理一个文档。您无需费心SQL数据库来找出需要（重新）索引的内容。

另一个优点是，当事件/数据量失控时，您可以轻松扩展。当Elasticsearch本身需要更多功能时，请将服务器添加到集群中。当工作人员无法处理负载时，只需添加更多负载（并将它们放置在专用机器上）。再加上您的Web服务器和SQL数据库不会有任何影响。

以上是 Elasticsearch-我需要JDBC驱动程序吗？的全部内容，来源链接： utcz.com/qa/421323.html

Elasticsearch-我需要JDBC驱动程序吗？

回答：

其他人也看了：