Elasticsearch: how to "rescue" documents that can't be parsed by the mapping?

Answer 1

It turns out this is quite simple: just allow "malformed" fields. There are two ways to do it. Either for the whole index:

PUT /_template/ignore_malformed_attributes
{
  "index_patterns": ["my_index"],
  "settings": {
      "index.mapping.ignore_malformed": true
  }
}

Or per field (see the example here: https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-malformed.html)

PUT my_index
{
  "mappings": {
    "properties": {
      "number_one": {
        "type": "integer",
        "ignore_malformed": true
      },
      "number_two": {
        "type": "integer"
      }
    }
  }
}

# Will work
PUT my_index/_doc/1
{
  "text":       "Some text value",
  "number_one": "foo" 
}

# Will be rejected
PUT my_index/_doc/2
{
  "text":       "Some text value",
  "number_two": "foo" 
}
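
To make the accept/reject behaviour above concrete, here is a minimal Python sketch that models it. It is a simplified illustration, not Elasticsearch's actual parsing logic: the helper names `is_valid_integer` and `simulate_index` are hypothetical, and details such as coercion of floats or the 32-bit range check are deliberately left out.

```python
# Simplified model of how `ignore_malformed` affects integer fields.
# Illustration only: this is not how Elasticsearch parses values internally.

def is_valid_integer(value):
    """Return True if the value can be read as a whole number."""
    if isinstance(value, bool):
        return False  # assumption: booleans are not treated as integers here
    if isinstance(value, int):
        return True
    if isinstance(value, str):
        try:
            int(value)
            return True
        except ValueError:
            return False
    return False


def simulate_index(mapping, doc):
    """Return (accepted, ignored_fields) for a document under the given mapping."""
    ignored = []
    for field, value in doc.items():
        spec = mapping.get(field)
        if spec is None or spec.get("type") != "integer":
            continue  # unmapped or non-integer fields are not checked in this sketch
        if is_valid_integer(value):
            continue
        if spec.get("ignore_malformed", False):
            ignored.append(field)  # the field is skipped, the rest of the document is kept
        else:
            return False, []  # a single malformed field rejects the whole document
    return True, ignored


MAPPING = {
    "number_one": {"type": "integer", "ignore_malformed": True},
    "number_two": {"type": "integer"},
}

print(simulate_index(MAPPING, {"text": "Some text value", "number_one": "foo"}))
print(simulate_index(MAPPING, {"text": "Some text value", "number_two": "foo"}))
```

In a real cluster, the names of the fields that were skipped this way are recorded in the document's `_ignored` metadata field, so an `exists` query on `_ignored` should find the documents that were only partially indexed.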

Note that you can also change this setting on an existing index, but you need to close the index first:

POST my_existing_index/_close
PUT my_existing_index/_settings
{
  "index.mapping.ignore_malformed": true
}
POST my_existing_index/_open

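Note: the type change will not show up in Kibana until you refresh the index pattern. You will then run into a type conflict, which requires you to reindex your data before you can search it again... what a pain.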

POST _reindex
{
  "source": {
    "index": "my_index"
  },
  "dest": {
    "index": "my_new_index"
  }
}

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Answer 2

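An alternative that may be preferable for quite a few use cases is to put Logstash between the producers and Elasticsearch. Logstash can reformat and/or check documents and route them to a specific index.
Or, of course, if you have in-house producers, have them do the validating and routing themselves.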
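
As a rough illustration of the "validate and route" idea on the producer side, the Python sketch below checks a document against an expected schema and picks a target index. Everything in it is an assumption made for the example: the schema in `EXPECTED_TYPES`, the quarantine index name `my_index_malformed`, and the helper names. It only decides where a document should go; sending it to Elasticsearch is left out.

```python
# Hypothetical producer-side check: route documents by whether they fit the mapping.

EXPECTED_TYPES = {"number_one": int, "number_two": int}  # assumed schema
MAIN_INDEX = "my_index"
QUARANTINE_INDEX = "my_index_malformed"  # hypothetical index for documents that fail the check


def find_malformed_fields(doc):
    """Return the names of fields whose values do not match the expected type."""
    bad = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in doc:
            continue  # missing fields are allowed in this sketch
        value = doc[field]
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            bad.append(field)
    return bad


def route(doc):
    """Return (target_index, malformed_fields) for a document."""
    bad = find_malformed_fields(doc)
    return (QUARANTINE_INDEX if bad else MAIN_INDEX), bad


print(route({"text": "Some text value", "number_one": 1}))
print(route({"text": "Some text value", "number_one": "foo", "number_two": 2}))
```

Routing bad documents to a separate index, rather than dropping them, keeps them available for inspection and later repair, which is the same goal the `ignore_malformed` approach serves inside Elasticsearch.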
