Spring Data Elasticsearch (Part 1): Common Annotations

2023-11-10

1. The @Document annotation

1.1. @Document source code

@Persistent
@Inherited
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE})
public @interface Document {
    // Name of the index
    String indexName();
    // Mapping type within the index
    String type() default "";
    // Use the server-side settings when creating the index
    boolean useServerConfiguration() default false;
    // Number of shards, 5 by default
    short shards() default 5;
    // Number of replicas, 1 by default
    short replicas() default 1;
    // Refresh interval
    String refreshInterval() default "1s";
    // Index store type
    String indexStoreType() default "fs";
    // Whether to create the index
    boolean createIndex() default true;
}

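1.2. Using @Document

@Document is applied to a class and marks the entity class as a document object. Its commonly used attributes are:

(1) indexName: the name of the index the entity maps to;

(2) type: the mapping type within that index (mapping types are deprecated as of Elasticsearch 7, so this attribute only matters on older versions);

(3) shards: the number of shards;

(4) replicas: the number of replicas.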
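Because @Document is declared with RetentionPolicy.RUNTIME, its attributes can be read through reflection at runtime. The following is a minimal sketch of that idea only. It assumes a locally declared stand-in annotation that copies a few attributes from the source in 1.1 so that it compiles without Spring on the classpath; the stand-in, the empty Item class and the describe helper are all hypothetical and are not part of Spring Data Elasticsearch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in mirroring some attributes of the @Document source in 1.1.
// The real annotation lives in org.springframework.data.elasticsearch.annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE})
@interface Document {
    String indexName();
    String type() default "";
    short shards() default 5;
    short replicas() default 1;
}

// Only indexName and type are set here, so shards and replicas
// fall back to the defaults declared on the annotation.
@Document(indexName = "item", type = "docs")
class Item {
}

class DocumentDemo {
    // Reads the annotation from the class at runtime and renders its attributes.
    static String describe(Class<?> entity) {
        Document doc = entity.getAnnotation(Document.class);
        if (doc == null) {
            return entity.getSimpleName() + " is not a document";
        }
        return "index=" + doc.indexName() + ", type=" + doc.type()
                + ", shards=" + doc.shards() + ", replicas=" + doc.replicas();
    }

    public static void main(String[] args) {
        System.out.println(describe(Item.class));   // index=item, type=docs, shards=5, replicas=1
        System.out.println(describe(String.class)); // String is not a document
    }
}
```

Attributes that are left out of the annotation simply take the declared defaults, which is why an entity that does not set shards or replicas gets 5 and 1 with the source version shown above.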

2. The @Field annotation

2.1. @Field source code

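@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD})
@Documented
@Inherited
public @interface Field {
    // Field type; Auto detects it from the Java property type
    FieldType type() default FieldType.Auto;
    // Whether the field is indexed (searchable)
    boolean index() default true;
    // Format for date fields
    DateFormat format() default DateFormat.none;
    // Custom date pattern
    String pattern() default "";
    // Whether the value is stored separately; not stored by default
    boolean store() default false;
    // Whether fielddata is enabled for the field
    boolean fielddata() default false;
    // Analyzer applied to the query text at search time
    String searchAnalyzer() default "";
    // Analyzer applied to the field value at index time
    String analyzer() default "";
    // Fields that should be ignored
    String[] ignoreFields() default {};
    // For nested objects: also include the values in the parent document
    boolean includeInParent() default false;
}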

2.2. The FieldType enum

[FieldType source code]

public enum FieldType {
    Text,
    Integer,
    Long,
    Date,
    Float,
    Double,
    Boolean,
    Object,
    Auto,
    Nested,
    Ip,
    Attachment,
    Keyword;

    private FieldType() {
    }
}

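2.3. Using @Field

@Field is applied to a member variable, marks it as a field of the document, and specifies its mapping attributes:

(1) @Id: a separate annotation, also applied to a member variable, that marks the field as the document id (primary key); an id field generally needs neither separate storage nor analysis;

(2) type: the field type, a value of the FieldType enum;

(3) index: whether the field is indexed, a boolean that defaults to true;

(4) store: whether the field is stored separately, a boolean that defaults to false;

(5) analyzer: the name of the analyzer.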

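[Difference between @Field(type = FieldType.Keyword) and @Field(type = FieldType.Text)]

Before Elasticsearch 5.x, string data had a single string field type. From 5.x on it is split into keyword and text, both of which hold strings. A FieldType.Keyword field is not analyzed: the whole value is indexed as one term, which suits exact matching, sorting and aggregations. A FieldType.Text field is run through an analyzer and indexed as individual terms for full-text search, which also takes up additional index space.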
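The difference comes down to which terms end up in the index. The sketch below imitates this in plain Java; it does not call Elasticsearch. The toy analyzer (lower-casing and splitting on non-alphanumeric characters) is an assumption standing in for a real analyzer, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class KeywordVsTextDemo {
    // Keyword-style indexing: the whole value becomes one un-analyzed term.
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }

    // Text-style indexing: the value is analyzed into several terms first.
    // This toy analyzer is only a stand-in for a real Elasticsearch analyzer.
    static List<String> textTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A term-level lookup matches only if the query equals an indexed term.
    static boolean termMatches(List<String> indexedTerms, String query) {
        return indexedTerms.contains(query);
    }

    public static void main(String[] args) {
        String brand = "Xiaomi Phone";
        System.out.println(keywordTerms(brand)); // [Xiaomi Phone]
        System.out.println(textTerms(brand));    // [xiaomi, phone]
        System.out.println(termMatches(keywordTerms(brand), "xiaomi")); // false
        System.out.println(termMatches(textTerms(brand), "xiaomi"));    // true
    }
}
```

In this model a keyword field only matches the complete original value, while a text field matches any of its analyzed terms. That is why fields such as category or brand are usually mapped as Keyword and free-form text such as a title as Text.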

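[@Field(store = true)]

Whether store is set to true or false, Elasticsearch still keeps the field value. So what is the difference?

(1) store = false (the default): the value is kept only inside "_source";

(2) store = true: the value is also kept in a separate stored field alongside _source, in addition to the copy in _source, so there are two copies.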

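In which scenarios, then, is a stored field useful?

(1) _source is disabled in the index mapping. In that case, a field that is not defined with store=true cannot be returned in the search results.

(2) _source is very large. Extracting the value of a single field from the returned _source is then expensive (source filtering can reduce the network overhead, though). For example, if a document holds a whole book, it may have the fields title, date and content. If we only want the title and date and do not need to parse the whole, very large _source, we can consider setting store=true on title and date.

Note that although storing fields looks like it reduces the cost of a query, it also increases how often the disk is accessed. If 10 fields are defined as stored, reading them can take 10 disk seeks, whereas returning _source takes a single seek. This is a trade-off to balance when defining the mapping.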
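For the book scenario, fields declared with store=true can be requested on their own through the stored_fields parameter of the Elasticsearch search API. The sketch below only builds such a request body as a string. The helper name is hypothetical, it assumes the field names are plain identifiers that need no JSON escaping, and how the request is sent to the cluster is left out.

```java
import java.util.StringJoiner;

class StoredFieldsDemo {
    // Builds a search request body that asks only for the given stored fields.
    // Assumes simple field names that need no JSON escaping.
    static String searchBody(String... storedFields) {
        StringJoiner names = new StringJoiner(", ", "[", "]");
        for (String field : storedFields) {
            names.add("\"" + field + "\"");
        }
        return "{ \"query\": { \"match_all\": {} }, \"stored_fields\": " + names + " }";
    }

    public static void main(String[] args) {
        // { "query": { "match_all": {} }, "stored_fields": ["title", "date"] }
        System.out.println(searchBody("title", "date"));
    }
}
```

A request like this returns title and date from their stored copies instead of from the large _source, which is where the trade-off between extra disk access and less parsing comes from.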

3. Entity class code

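import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "item", type = "docs", shards = 1, replicas = 0)
public class Item {
    // Document id
    @Id
    private Long id;
    // Full-text field, analyzed with the IK analyzer (requires the IK plugin)
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String title;
    // Exact-value fields, not analyzed
    @Field(type = FieldType.Keyword)
    private String category;
    @Field(type = FieldType.Keyword)
    private String brand;
    @Field(type = FieldType.Double)
    private Double price;
    // Kept in the document but not searchable
    @Field(index = false, type = FieldType.Keyword)
    private String images;
}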

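[Note] (1) @Field(index = true) controls whether the field is indexed; an indexed field can be searched.

(2) @Field(analyzer = "ik_max_word", searchAnalyzer = "ik_max_word") sets the analyzers used at index time and at search time. An analyzed field is matched by its individual terms; a field that is not analyzed is matched as a whole value.

(3) @Field(store = true) controls whether the value is additionally kept as a separate stored field. It is not required for displaying the value on a page, because the value is still returned through _source when store is false.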
