3 posts tagged with "elasticsearch"

What to do when Elasticsearch cannot find the ik_max_word tokenizer?

March 8, 2024 · One min read

Product Developer

Problem

When a field in an index uses the ik_max_word tokenizer but the IK plugin is not installed on Elasticsearch, the following error appears:

Custom Analyzer [synonym] failed to find tokenizer under name [ik_max_word]

Steps to resolve

Enter the Elasticsearch container:

docker exec -it elasticsearch bash

Download the IK plugin release that matches your Elasticsearch version (the example below is for 7.17.0):

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.0/elasticsearch-analysis-ik-7.17.0.zip
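
Because the plugin version has to match the Elasticsearch version, it can help to derive the download URL from the version number. The following is a minimal sketch: `ik_plugin_url` is a hypothetical helper name, it only reuses the URL pattern from the command above, and it assumes a release actually exists for the version you pass in.

```shell
# Hypothetical helper: build the IK plugin download URL for an Elasticsearch version.
# Assumption: the release follows the same URL pattern as the 7.17.0 example.
ik_plugin_url() {
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}
```

Inside the container this could be used as `./bin/elasticsearch-plugin install "$(ik_plugin_url 7.17.0)"`. Plugins are loaded at startup, so the node needs a restart afterwards (for example `docker restart elasticsearch` from the host), and `./bin/elasticsearch-plugin list` shows what is installed.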

Reference

https://github.com/infinilabs/analysis-ik

What to do when Elasticsearch reports "usage exceeded flood-stage watermark"?

March 7, 2024 · 2 min read

junminhong(jasper)

Product Developer

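Problem

Recently Elasticsearch has frequently hit the disk watermark. This happens when disk space runs low and triggers Elasticsearch's watermark protection: the affected indices are temporarily blocked from writes, although they can still be read.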

TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block

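Steps to resolve

First, enlarge the disk or free up space. Commands such as df and du help locate what is consuming disk space so it can be cleaned up.
If needed, temporarily raise the Elasticsearch watermarks. The defaults are 85% (low), 90% (high), and 95% (flood_stage).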

curl -X PUT --user es_account:password "https://elasticsearch_domain/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "97%",
    "cluster.routing.allocation.disk.watermark.high": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}
'

Once Elasticsearch is healthy again, remove the watermark overrides so the defaults apply:

curl -X PUT --user es_account:password "https://elasticsearch_domain/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.high": null,
    "cluster.routing.allocation.disk.watermark.flood_stage": null
  }
}
'
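
The two requests above differ only in the value they send. As a rough sketch, a small helper can build the request body for either case and make sure percentages are sent as JSON strings. `watermark_payload` is a hypothetical name, and like the example above it applies one value to all three watermarks.

```shell
# Hypothetical helper: print the cluster-settings body for one watermark value.
# Pass a percentage such as 97% to override, or the word null to reset to defaults.
watermark_payload() {
  if [ "$1" = "null" ]; then
    wm_value=null          # JSON null removes the persistent override
  else
    wm_value="\"$1\""      # percentages must be JSON strings, e.g. "97%"
  fi
  printf '{"persistent":{"cluster.routing.allocation.disk.watermark.low":%s,"cluster.routing.allocation.disk.watermark.high":%s,"cluster.routing.allocation.disk.watermark.flood_stage":%s}}\n' "$wm_value" "$wm_value" "$wm_value"
}
```

The output can then be passed to curl in place of the inline body, for example `-d "$(watermark_payload 97%)"` to raise the limits and `-d "$(watermark_payload null)"` to remove the overrides.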

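Recommendations

Use monitoring such as Grafana to keep an eye on disk space, so indices do not get locked when the disk fills up.
Tune the index lifecycle policy so data is not retained indefinitely and does not consume excessive disk space.
Disk space on the machines is fairly expensive. Consider block storage services that offer an HDD option to cut costs, or move cold data to cheaper storage such as S3 or GCS.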
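
Where a full monitoring stack is not available yet, a small disk check can act as a stopgap. This is only a sketch under assumptions: the function names are hypothetical, the path you pass should be the Elasticsearch data directory, and the threshold is whatever warning level you choose (something below the 85% low watermark leaves room to react).

```shell
# Print the used-space percentage (digits only) of the filesystem holding a path.
# Relies on the POSIX output format of df -P, where column 5 is the capacity.
disk_usage_percent() {
  df -P "$1" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage is at or above the threshold.
over_threshold() {
  [ "$1" -ge "$2" ]
}

# Print a warning when the filesystem holding path $1 has reached threshold $2.
check_disk() {
  usage=$(disk_usage_percent "$1")
  if over_threshold "$usage" "$2"; then
    echo "disk usage ${usage}% on $1 is at or above $2%"
  fi
}
```

A call such as `check_disk /usr/share/elasticsearch/data 80` could run from cron; the data path here is an assumption and depends on how Elasticsearch is installed.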

How to monitor Elasticsearch with a shell script

October 4, 2023 · 3 min read

junminhong(jasper)

Product Developer

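Preface

I kept receiving error alerts from our Elasticsearch monitoring recently, so I investigated what was going on and improved the monitoring along the way 🚧

Implementation

Use an all-purpose shell script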

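#!/bin/bash
# elasticsearch_health_check.sh

# Elasticsearch cluster URL
ES_URL="https://localhost:9200"

# Call the cluster health API; this set of parameters is highly recommended.
# The body is written to response.txt and RESPONSE holds the HTTP status code.
RESPONSE=$(curl -s -o response.txt -w "%{http_code}" --user username:password "${ES_URL}/_cluster/health?level=shards&wait_for_status=green&timeout=60s")

# Extract the cluster status; jq is used here, but any JSON parser works
STATUS=$(jq -r .status response.txt)

# Logic to handle the status
message=""
if [ "$STATUS" = "green" ]; then
  # Message to output when the cluster is green (placeholder: stay silent)
  :
elif [ "$STATUS" = "yellow" ]; then
  # Message to output when the cluster is yellow (placeholder text)
  message="Elasticsearch cluster is yellow (HTTP ${RESPONSE})"
else
  # Message to output when the cluster has a serious problem (placeholder text)
  message="Elasticsearch cluster is unhealthy: status=${STATUS:-unknown} (HTTP ${RESPONSE})"
fi

if [ -n "$message" ]; then
  # Hook this up to the messaging tool you use; echo is a stand-in
  echo "$message"
fi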
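
For the final step of the script, forwarding the message to a chat tool, one possible shape is a webhook call. This is a sketch under assumptions: `build_payload` and `send_alert` are hypothetical names, `WEBHOOK_URL` is a variable you would set yourself, and the `{"text": ...}` body is only one common webhook format, so check what your tool expects.

```shell
# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Newlines and other control characters are not handled in this sketch.
build_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$escaped"
}

# Hypothetical sender: WEBHOOK_URL must point at your chat tool's incoming webhook.
send_alert() {
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")" "$WEBHOOK_URL"
}
```

Inside the health check script, `send_alert "$message"` would take the place of the last branch's body.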

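Cluster health API parameters explained

# API endpoint
_cluster/health

# Why set level to shards?
# Monitoring exists so that relevant records are available for troubleshooting when something goes wrong, so the more information the better.
# That is why adjusting level is recommended.
level=shards

# Wait for the cluster status to become green
wait_for_status=green

# Defaults to 30s; consider adjusting it
timeout=60s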
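
With wait_for_status and timeout set, the request can return before the cluster reaches green. As far as I understand the 7.17 behavior, that case comes back as HTTP 408 while the body still carries the current status, so a script can tell "still not green after waiting" apart from "request failed". A minimal sketch, where `classify_health` is a hypothetical helper that takes the HTTP code and the parsed status:

```shell
# Hypothetical helper: turn the HTTP code and cluster status into a one-line summary.
# Assumption: 408 means the wait_for_status timeout elapsed before the target status.
classify_health() {
  case "$1" in
    200) echo "ok: cluster is $2" ;;
    408) echo "timeout: cluster is still $2" ;;
    *)   echo "error: unexpected HTTP $1" ;;
  esac
}
```

For example, `classify_health 408 yellow` prints `timeout: cluster is still yellow`.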

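Pair it with crontab

# List the current user's crontab entries
crontab -l

# Edit the crontab
crontab -e

# Run the shell script every ten minutes to check Elasticsearch health regularly
# (cron runs with a minimal PATH, so use the absolute path; /path/to is a placeholder)
*/10 * * * * /path/to/elasticsearch_health_check.sh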

Conclusion

Adjust the logic of this basic shell script to your own needs and you have a quite usable Elasticsearch monitor.

Make good use of crontab ~~ 👍🏻

References

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/cluster-health.html
