etcd Standalone and Cluster Installation Notes

I. Ways to install etcd
1. etcd is written in Go, so you can download the source from GitHub, then build and run it;
or install it directly with go get

go get github.com/etcd-io/etcd

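2. Alternatively, download a prebuilt binary package, extract it, and run it
3. Install via Docker

This note focuses on installing from the prebuilt binary package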

II. Quick start: install and launch
1. Download
Download links for every version are listed under Assets on the releases page: https://github.com/etcd-io/etcd/releases/

wget https://github.com/etcd-io/etcd/releases/download/v3.4.14/etcd-v3.4.14-linux-amd64.tar.gz
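
To fetch a different version or platform, the URL above can be assembled from its parts. This is a minimal sketch, assuming the release assets keep the naming pattern seen in the wget command; etcd_release_url is a hypothetical helper name, not something etcd ships:

```shell
# Build the download URL for an etcd release tarball.
# Assumes the asset naming pattern etcd-<version>-<os>-<arch>.tar.gz
# seen in the wget command above; etcd_release_url is a hypothetical helper.
etcd_release_url() {
  version="$1"          # e.g. v3.4.14
  os="${2:-linux}"
  arch="${3:-amd64}"
  echo "https://github.com/etcd-io/etcd/releases/download/${version}/etcd-${version}-${os}-${arch}.tar.gz"
}

# Print the URL for the version used in this note.
etcd_release_url v3.4.14
```

It could then be used as: wget "$(etcd_release_url v3.4.14)"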

2. Extract
The extracted package contains two binaries, both directly runnable: etcd is the server and etcdctl is the client

tar -zxvf etcd-v3.4.14-linux-amd64.tar.gz

3. Start the etcd server

cd etcd-v3.4.14-linux-amd64

nohup ./etcd >/tmp/etcd.log 2>&1 &
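
Because the server is started in the background, a script may need to wait until it is ready before calling etcdctl. This is a minimal sketch that polls the /health endpoint on the client URL with curl; wait_for_etcd, its default endpoint and its retry count are assumptions for illustration:

```shell
# Poll the etcd health endpoint until it answers or the retries run out.
# wait_for_etcd is a hypothetical helper; it returns 0 when /health responds
# successfully and 1 otherwise. Requires curl.
wait_for_etcd() {
  endpoint="${1:-http://127.0.0.1:2379}"
  tries="${2:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "${endpoint}/health" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example: wait_for_etcd http://127.0.0.1:2379 10 && ./etcdctl put /blog/name niliu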

4. Check the version

# server
./etcd --version
etcd Version: 3.4.14
Git SHA: 8a03d2e96
Go Version: go1.12.17
Go OS/Arch: linux/amd64

# client
./etcdctl version
etcdctl version: 3.4.14
API version: 3.4

5. Write a value

./etcdctl put /blog/name niliu
OK

6. Read a value

./etcdctl get /blog/name
/blog/name
niliu
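
As the output shows, get prints the key on one line and the value on the next. In a script it is often handier to get only the value, which etcdctl supports with --print-value-only. A small sketch; ETCDCTL is an assumed variable for the binary path and etcd_get_value is a hypothetical helper:

```shell
# Read only the value of a key, without the key line that `get` prints by default.
# ETCDCTL is an assumed variable for the binary path; it defaults to ./etcdctl
# as used in this note. etcd_get_value is a hypothetical helper.
etcd_get_value() {
  "${ETCDCTL:-./etcdctl}" get "$1" --print-value-only
}
```

After the put above, etcd_get_value /blog/name should print just the stored value.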

III. Starting the server with custom flags
1. See what each startup flag means

./etcd -h

Member:
  --name 'default'
    Human-readable name for this member.
  --data-dir '${name}.etcd'
    Path to the data directory.
  --wal-dir ''
    Path to the dedicated wal directory.
  --snapshot-count '100000'
    Number of committed transactions to trigger a snapshot to disk.
  --heartbeat-interval '100'
    Time (in milliseconds) of a heartbeat interval.
  --election-timeout '1000'
    Time (in milliseconds) for an election to timeout. See tuning documentation for details.
  --initial-election-tick-advance 'true'
    Whether to fast-forward initial election ticks on boot for faster election.
  --listen-peer-urls 'http://localhost:2380'
    List of URLs to listen on for peer traffic.
  --listen-client-urls 'http://localhost:2379'
    List of URLs to listen on for client traffic.
  --max-snapshots '5'
    Maximum number of snapshot files to retain (0 is unlimited).
  --max-wals '5'
    Maximum number of wal files to retain (0 is unlimited).
  --quota-backend-bytes '0'
    Raise alarms when backend size exceeds the given quota (0 defaults to low space quota).
  --backend-batch-interval ''
    BackendBatchInterval is the maximum time before commit the backend transaction.
  --backend-batch-limit '0'
    BackendBatchLimit is the maximum operations before commit the backend transaction.
  --max-txn-ops '128'
    Maximum number of operations permitted in a transaction.
  --max-request-bytes '1572864'
    Maximum client request size in bytes the server will accept.
  --grpc-keepalive-min-time '5s'
    Minimum duration interval that a client should wait before pinging server.
  --grpc-keepalive-interval '2h'
    Frequency duration of server-to-client ping to check if a connection is alive (0 to disable).
  --grpc-keepalive-timeout '20s'
    Additional duration of wait before closing a non-responsive connection (0 to disable).

Clustering:
  --initial-advertise-peer-urls 'http://localhost:2380'
    List of this member's peer URLs to advertise to the rest of the cluster.
  --initial-cluster 'default=http://localhost:2380'
    Initial cluster configuration for bootstrapping.
  --initial-cluster-state 'new'
    Initial cluster state ('new' or 'existing').
  --initial-cluster-token 'etcd-cluster'
    Initial cluster token for the etcd cluster during bootstrap.
    Specifying this can protect you from unintended cross-cluster interaction when running multiple clusters.
  --advertise-client-urls 'http://localhost:2379'
    List of this member's client URLs to advertise to the public.
    The client URLs advertised should be accessible to machines that talk to etcd cluster. etcd client libraries parse these URLs to connect to the cluster.
  --discovery ''
    Discovery URL used to bootstrap the cluster.
  --discovery-fallback 'proxy'
    Expected behavior ('exit' or 'proxy') when discovery services fails.
    "proxy" supports v2 API only.
  --discovery-proxy ''
    HTTP proxy to use for traffic to discovery service.
  --discovery-srv ''
    DNS srv domain used to bootstrap the cluster.
  --discovery-srv-name ''
    Suffix to the dns srv name queried when bootstrapping.
  --strict-reconfig-check 'true'
    Reject reconfiguration requests that would cause quorum loss.
  --pre-vote 'false'
    Enable to run an additional Raft election phase.
  --auto-compaction-retention '0'
    Auto compaction retention length. 0 means disable auto compaction.
  --auto-compaction-mode 'periodic'
    Interpret 'auto-compaction-retention' one of: periodic|revision. 'periodic' for duration based retention, defaulting to hours if no time unit is provided (e.g. '5m'). 'revision' for revision number based retention.
  --enable-v2 'false'
    Accept etcd V2 client requests.

Security:
  --cert-file ''
    Path to the client server TLS cert file.
  --key-file ''
    Path to the client server TLS key file.
  --client-cert-auth 'false'
    Enable client cert authentication.
  --client-crl-file ''
    Path to the client certificate revocation list file.
  --client-cert-allowed-hostname ''
    Allowed TLS hostname for client cert authentication.
  --trusted-ca-file ''
    Path to the client server TLS trusted CA cert file.
  --auto-tls 'false'
    Client TLS using generated certificates.
  --peer-cert-file ''
    Path to the peer server TLS cert file.
  --peer-key-file ''
    Path to the peer server TLS key file.
  --peer-client-cert-auth 'false'
    Enable peer client cert authentication.
  --peer-trusted-ca-file ''
    Path to the peer server TLS trusted CA file.
  --peer-cert-allowed-cn ''
    Required CN for client certs connecting to the peer endpoint.
  --peer-cert-allowed-hostname ''
    Allowed TLS hostname for inter peer authentication.
  --peer-auto-tls 'false'
    Peer TLS using self-generated certificates if --peer-key-file and --peer-cert-file are not provided.
  --peer-crl-file ''
    Path to the peer certificate revocation list file.
  --cipher-suites ''
    Comma-separated list of supported TLS cipher suites between client/server and peers (empty will be auto-populated by Go).
  --cors '*'
    Comma-separated whitelist of origins for CORS, or cross-origin resource sharing, (empty or * means allow all).
  --host-whitelist '*'
    Acceptable hostnames from HTTP client requests, if server is not secure (empty or * means allow all).

Auth:
  --auth-token 'simple'
    Specify a v3 authentication token type and its options ('simple' or 'jwt').
  --bcrypt-cost 10
    Specify the cost / strength of the bcrypt algorithm for hashing auth passwords. Valid values are between 4 and 31.
  --auth-token-ttl 300
    Time (in seconds) of the auth-token-ttl.

Profiling and Monitoring:
  --enable-pprof 'false'
    Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
  --metrics 'basic'
    Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
  --listen-metrics-urls ''
    List of URLs to listen on for the metrics and health endpoints.

Logging:
  --logger 'capnslog'
    Specify 'zap' for structured logging or 'capnslog'. [WARN] 'capnslog' will be deprecated in v3.5.
  --log-outputs 'default'
    Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd, or list of comma separated output targets.
  --log-level 'info'
    Configures log level. Only supports debug, info, warn, error, panic, or fatal.

v2 Proxy (to be deprecated in v4):
  --proxy 'off'
    Proxy mode setting ('off', 'readonly' or 'on').
  --proxy-failure-wait 5000
    Time (in milliseconds) an endpoint will be held in a failed state.
  --proxy-refresh-interval 30000
    Time (in milliseconds) of the endpoints refresh interval.
  --proxy-dial-timeout 1000
    Time (in milliseconds) for a dial to timeout.
  --proxy-write-timeout 5000
    Time (in milliseconds) for a write to timeout.
  --proxy-read-timeout 0
    Time (in milliseconds) for a read to timeout.

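2. A few of the more important flags
--name: the name of this node (member).
--listen-peer-urls: the URLs this member listens on for traffic from the other cluster members. An all-zero IP (0.0.0.0) means listen on all of this member's interfaces.
--initial-advertise-peer-urls: the peer URLs this member advertises to the rest of the cluster; the other members use them to reach this member.
--listen-client-urls: the URLs this member listens on for client traffic. An all-zero IP (0.0.0.0) means listen on all of this member's interfaces.
--advertise-client-urls: the client URLs this member advertises; etcd clients use them to reach this member.
--initial-cluster: the bootstrap membership list, i.e. the name=initial-advertise-peer-urls pairs of every member in the cluster.
--initial-cluster-state: new marks a newly created cluster (the other accepted value is existing).
--initial-cluster-token: a token that uniquely identifies the cluster during bootstrap.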
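
The listen/advertise split matters once members run on separate hosts: a member can listen on 0.0.0.0, but it has to advertise an address that others can actually connect to. This is a sketch of that pairing; member_url_flags is a hypothetical helper and 10.0.0.11 is a made-up host IP:

```shell
# Print the URL-related flags for one member in a multi-host setup.
# The member listens on all interfaces (0.0.0.0) but advertises its real IP,
# because peers and clients cannot connect to 0.0.0.0.
# member_url_flags and the IP used below are hypothetical, for illustration only.
member_url_flags() {
  ip="$1"
  printf '%s\n' \
    "--listen-peer-urls http://0.0.0.0:2380" \
    "--initial-advertise-peer-urls http://${ip}:2380" \
    "--listen-client-urls http://0.0.0.0:2379" \
    "--advertise-client-urls http://${ip}:2379"
}

member_url_flags 10.0.0.11
```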

3. Reading the startup settings from the log

tail -n 50 /tmp/etcd.log
nohup: ignoring input
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2021-01-14 22:53:00.826635 I | etcdmain: etcd Version: 3.4.14
2021-01-14 22:53:00.826679 I | etcdmain: Git SHA: 8a03d2e96
2021-01-14 22:53:00.826685 I | etcdmain: Go Version: go1.12.17
2021-01-14 22:53:00.826690 I | etcdmain: Go OS/Arch: linux/amd64
2021-01-14 22:53:00.826695 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2021-01-14 22:53:00.826704 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2021-01-14 22:53:00.826769 N | etcdmain: the server is already initialized as member before, starting as etcd member...
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2021-01-14 22:53:00.827428 I | embed: name = default
2021-01-14 22:53:00.827440 I | embed: data dir = default.etcd
2021-01-14 22:53:00.827446 I | embed: member dir = default.etcd/member
2021-01-14 22:53:00.827451 I | embed: heartbeat = 100ms
2021-01-14 22:53:00.827456 I | embed: election = 1000ms
2021-01-14 22:53:00.827460 I | embed: snapshot count = 100000
2021-01-14 22:53:00.827474 I | embed: advertise client URLs = http://localhost:2379
2021-01-14 22:53:00.827480 I | embed: initial advertise peer URLs = http://localhost:2380
2021-01-14 22:53:00.827487 I | embed: initial cluster =
2021-01-14 22:53:00.828261 I | etcdserver: restarting member 8e9e05c52164694d in cluster cdf818194e3a8c32 at commit index 5
raft2021/01/14 22:53:00 INFO: 8e9e05c52164694d switched to configuration voters=()
raft2021/01/14 22:53:00 INFO: 8e9e05c52164694d became follower at term 2
raft2021/01/14 22:53:00 INFO: newRaft 8e9e05c52164694d [peers: [], term: 2, commit: 5, applied: 0, lastindex: 5, lastterm: 2]
2021-01-14 22:53:00.830378 W | auth: simple token is not cryptographically signed
2021-01-14 22:53:00.832714 I | etcdserver: starting server... [version: 3.4.14, cluster version: to_be_decided]
raft2021/01/14 22:53:00 INFO: 8e9e05c52164694d switched to configuration voters=(10276657743932975437)
2021-01-14 22:53:00.834100 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2021-01-14 22:53:00.834255 N | etcdserver/membership: set the initial cluster version to 3.4
2021-01-14 22:53:00.834300 I | etcdserver/api: enabled capabilities for version 3.4
2021-01-14 22:53:00.835193 I | embed: listening for peers on 127.0.0.1:2380
raft2021/01/14 22:53:02 INFO: 8e9e05c52164694d is starting a new election at term 2
raft2021/01/14 22:53:02 INFO: 8e9e05c52164694d became candidate at term 3
raft2021/01/14 22:53:02 INFO: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 3
raft2021/01/14 22:53:02 INFO: 8e9e05c52164694d became leader at term 3
raft2021/01/14 22:53:02 INFO: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 3
2021-01-14 22:53:02.530035 I | embed: ready to serve client requests
2021-01-14 22:53:02.530163 I | etcdserver: published {Name:default ClientURLs:[http://localhost:2379]} to cluster cdf818194e3a8c32
2021-01-14 22:53:02.530832 N | embed: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!

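The output above carries a lot of information. The most important points:
name is the node name; the default is default.
data-dir is the directory that stores the log (WAL) and snapshots; the default is default.etcd/ under the current working directory.
The node talks to the other cluster members at http://localhost:2380.
The node serves client requests at http://localhost:2379.
heartbeat is how often the leader sends a heartbeat to the followers; the default is 100ms.
election is the election timeout: a follower that receives no heartbeat within this interval triggers a new election; the default is 1000ms.
snapshot count is the number of committed transactions that triggers a snapshot to disk; the default is 100000.
The cluster and each node get a generated ID (cdf818194e3a8c32 and 8e9e05c52164694d in this log).
On startup the node runs Raft and elects a leader.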
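
The effective settings can also be pulled out of the log mechanically. This is a rough sketch that assumes the etcd 3.4 capnslog line format shown above ("... I | embed: key = value"); etcd_log_settings is a hypothetical helper that reads the log on stdin:

```shell
# Extract the "embed: key = value" settings from an etcd 3.4 capnslog-style log
# (the format shown above) and print them as key=value lines.
# Lines without " = " after "embed:" (e.g. "listening for peers on ...") are skipped.
# etcd_log_settings is a hypothetical helper; it reads the log on stdin.
etcd_log_settings() {
  sed -n 's/^.* I | embed: \(.*\) = \(.*\)$/\1=\2/p'
}
```

For example: etcd_log_settings < /tmp/etcd.log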

4. Start the server with custom flags
Option 1: pass the flags on the command line

./etcd --name niliu1 \
--listen-client-urls http://127.0.0.1:2379 \
--advertise-client-urls http://127.0.0.1:2379 \
--listen-peer-urls http://127.0.0.1:2380 \
--initial-advertise-peer-urls http://127.0.0.1:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new

Option 2: use a configuration file
Put the following into a new file, /etc/etcd/conf.yml

cat /etc/etcd/conf.yml

name: niliu1
listen-client-urls: http://127.0.0.1:2379
advertise-client-urls: http://127.0.0.1:2379
listen-peer-urls: http://127.0.0.1:2380
initial-advertise-peer-urls: http://127.0.0.1:2380
initial-cluster-token: etcd-cluster-1
initial-cluster-state: new

Start command

nohup ./etcd --config-file=/etc/etcd/conf.yml &
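
The configuration file can also be generated from a script. This is a minimal sketch; in the YAML file the keys are the flag names without the leading "--". write_etcd_conf is a hypothetical helper, and the scratch path under /tmp is an assumption so that the example does not need write access to /etc/etcd:

```shell
# Write a single-node etcd config file. The keys are the flag names without
# the leading "--". write_etcd_conf is a hypothetical helper; the path is
# passed in so the sketch can target a scratch file instead of /etc/etcd.
write_etcd_conf() {
  conf_path="$1"
  member_name="$2"
  cat > "$conf_path" <<EOF
name: ${member_name}
listen-client-urls: http://127.0.0.1:2379
advertise-client-urls: http://127.0.0.1:2379
listen-peer-urls: http://127.0.0.1:2380
initial-advertise-peer-urls: http://127.0.0.1:2380
initial-cluster-token: etcd-cluster-1
initial-cluster-state: new
EOF
}

write_etcd_conf /tmp/etcd-conf-sketch.yml niliu1
```

The generated file can then be passed to etcd with --config-file=/tmp/etcd-conf-sketch.yml.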

IV. Starting a cluster
1. Start three nodes (all on one machine here, as three pseudo-nodes on different ports)

nohup ./etcd --name niliu1 \
--listen-peer-urls http://127.0.0.1:2380 \
--initial-advertise-peer-urls http://127.0.0.1:2380 \
--listen-client-urls http://127.0.0.1:2379 \
--advertise-client-urls http://127.0.0.1:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster 'niliu1=http://127.0.0.1:2380,niliu2=http://127.0.0.1:2381,niliu3=http://127.0.0.1:2383' \
--initial-cluster-state new > /tmp/etcd_niliu1.log 2>&1 &

nohup ./etcd --name niliu2 \
--listen-peer-urls http://127.0.0.1:2381 \
--initial-advertise-peer-urls http://127.0.0.1:2381 \
--listen-client-urls http://127.0.0.1:2382 \
--advertise-client-urls http://127.0.0.1:2382 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster 'niliu1=http://127.0.0.1:2380,niliu2=http://127.0.0.1:2381,niliu3=http://127.0.0.1:2383' \
--initial-cluster-state new > /tmp/etcd_niliu2.log 2>&1 &

nohup ./etcd --name niliu3 \
--listen-peer-urls http://127.0.0.1:2383 \
--initial-advertise-peer-urls http://127.0.0.1:2383 \
--listen-client-urls http://127.0.0.1:2384 \
--advertise-client-urls http://127.0.0.1:2384 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster 'niliu1=http://127.0.0.1:2380,niliu2=http://127.0.0.1:2381,niliu3=http://127.0.0.1:2383' \
--initial-cluster-state new > /tmp/etcd_niliu3.log 2>&1 &
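
The three commands differ only in the member name and the two ports, so they can be generated from a small table. This sketch only prints the commands and starts nothing; cluster_member_cmd is a hypothetical helper, and the names and ports are the ones used above:

```shell
# Print the start command for one member of the three-node local cluster.
# The member table (name, peer port, client port) mirrors the commands above;
# cluster_member_cmd is a hypothetical helper that only prints, it starts nothing.
INITIAL_CLUSTER='niliu1=http://127.0.0.1:2380,niliu2=http://127.0.0.1:2381,niliu3=http://127.0.0.1:2383'

cluster_member_cmd() {
  name="$1"; peer_port="$2"; client_port="$3"
  echo "./etcd --name ${name}" \
    "--listen-peer-urls http://127.0.0.1:${peer_port}" \
    "--initial-advertise-peer-urls http://127.0.0.1:${peer_port}" \
    "--listen-client-urls http://127.0.0.1:${client_port}" \
    "--advertise-client-urls http://127.0.0.1:${client_port}" \
    "--initial-cluster-token etcd-cluster-1" \
    "--initial-cluster '${INITIAL_CLUSTER}'" \
    "--initial-cluster-state new"
}

# One line per member: name, peer port, client port.
while read -r m_name m_peer m_client; do
  cluster_member_cmd "$m_name" "$m_peer" "$m_client"
done <<EOF
niliu1 2380 2379
niliu2 2381 2382
niliu3 2383 2384
EOF
```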

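2. Check member status
Note: the --endpoints used below are the peer ports of this setup; etcdctl is normally pointed at the client URLs (127.0.0.1:2379, 127.0.0.1:2382 and 127.0.0.1:2384 here).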

./etcdctl --endpoints=127.0.0.1:2380,127.0.0.1:2381,127.0.0.1:2383  member list -w table
+------------------+---------+--------+-----------------------+-----------------------+------------+
|        ID        | STATUS  |  NAME  |      PEER ADDRS       |     CLIENT ADDRS      | IS LEARNER |
+------------------+---------+--------+-----------------------+-----------------------+------------+
| 8f65a5fdd9ee42ad | started | niliu3 | http://127.0.0.1:2383 | http://127.0.0.1:2384 |      false |
| bf9071f4639c75cc | started | niliu1 | http://127.0.0.1:2380 |                       |      false |
| e7b968b9fb1bc003 | started | niliu2 | http://127.0.0.1:2381 | http://127.0.0.1:2382 |      false |
+------------------+---------+--------+-----------------------+-----------------------+------------+

3. Check leader election status

./etcdctl endpoint status --endpoints=127.0.0.1:2380,127.0.0.1:2381,127.0.0.1:2383 -w table
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 127.0.0.1:2380 | bf9071f4639c75cc |  3.4.14 |   20 kB |      true |      false |         4 |         11 |                 11 |        |
| 127.0.0.1:2381 | e7b968b9fb1bc003 |  3.4.14 |   20 kB |     false |      false |         8 |         10 |                 10 |        |
| 127.0.0.1:2383 | 8f65a5fdd9ee42ad |  3.4.14 |   20 kB |      true |      false |         8 |         10 |                 10 |        |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

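Note: two nodes show IS LEADER = true here because they are in different terms (RAFT TERM 4 vs. 8). This state is not healthy: delete the data under the --data-dir directory and restart the services; once RAFT TERM is the same on every node, the cluster is back to normal.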
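
This condition can be checked in a script by counting the rows that report a leader. A rough sketch, assuming the 3.4 table layout shown above; count_leaders is a hypothetical helper that reads the table on stdin:

```shell
# Count how many rows of an `endpoint status -w table` listing report
# IS LEADER = true. A healthy cluster should report exactly one.
# Assumes the 3.4 column layout shown above: IS LEADER is the 5th column,
# which is awk field 6 because every row starts with "|".
# count_leaders is a hypothetical helper; it reads the table on stdin.
count_leaders() {
  awk -F'|' '$6 ~ /true/ { n++ } END { print n + 0 }'
}
```

Fed the table above, it prints 2, which flags the problem described in the note.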

4. Check health

./etcdctl endpoint health --endpoints=127.0.0.1:2380,127.0.0.1:2381,127.0.0.1:2383 -w table
+----------------+--------+------------+-------+
|    ENDPOINT    | HEALTH |    TOOK    | ERROR |
+----------------+--------+------------+-------+
| 127.0.0.1:2380 |   true | 1.141665ms |       |
| 127.0.0.1:2383 |   true | 1.639378ms |       |
| 127.0.0.1:2381 |   true | 1.610224ms |       |
+----------------+--------+------------+-------+

5. Write on one node, read from the others

./etcdctl --endpoints=127.0.0.1:2380 put /niliu/config/date 20210116
OK
./etcdctl --endpoints=127.0.0.1:2380 get /niliu/config/date
/niliu/config/date
20210116
./etcdctl --endpoints=127.0.0.1:2381 get /niliu/config/date
./etcdctl --endpoints=127.0.0.1:2383 get /niliu/config/date
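
The per-node reads can be wrapped in a loop so that each member's answer is labelled. A small sketch; ETCDCTL is an assumed variable for the binary path and get_from_each is a hypothetical helper:

```shell
# Read the same key from several endpoints, one at a time, to confirm that a
# write made through one member is visible through the others.
# ETCDCTL is an assumed variable for the binary path (defaults to ./etcdctl);
# get_from_each is a hypothetical helper.
get_from_each() {
  key="$1"
  shift
  for ep in "$@"; do
    echo "== ${ep}"
    "${ETCDCTL:-./etcdctl}" --endpoints="${ep}" get "${key}" --print-value-only
  done
}
```

For example: get_from_each /niliu/config/date 127.0.0.1:2380 127.0.0.1:2381 127.0.0.1:2383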

Common problems:
1. wget reports a 302
https://unix.stackexchange.com/questions/74334/how-to-download-files-with-wget-where-the-page-makes-you-wait-for-download

2. Data is inconsistent on some nodes, and the log shows

2021-01-17 00:50:20.655998 E | rafthttp: request cluster ID mismatch (got 38cb1decd4761d5f want 7bdc11851051b492)
2021-01-17 00:50:20.697749 E | rafthttp: request cluster ID mismatch (got 38cb1decd4761d5f want 7bdc11851051b492)

Fix: delete the data under the --data-dir directory and restart the service.
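
A more cautious variation on this fix is to rename the data directory instead of deleting it, so the old state can still be inspected afterwards. This is only a sketch: reset_member_data is a hypothetical helper, and the member should be stopped before it is run:

```shell
# Set a member's data directory aside so the member starts from a clean state
# on the next launch. Stop the member first. This sketch renames the directory
# instead of deleting it (a deliberate, more cautious variation on the fix above).
# reset_member_data is a hypothetical helper; without --data-dir the directory
# is <name>.etcd under the working directory.
reset_member_data() {
  data_dir="$1"
  if [ -d "$data_dir" ]; then
    mv "$data_dir" "${data_dir}.bak.$(date +%s)"
  fi
}
```

For example: reset_member_data niliu2.etcd, then start the member again with the same flags as before.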

References:
《etcd技术内幕》
https://www.youtube.com/watch?v=O0cKW9BKmzs&pbjreload=101
