1. 03 Mar, 2019 16 commits
      [FAB-12709] Enable CheckQuorum · 50a09fd0
      When CheckQuorum is enabled, the leader steps down if it cannot reach
      a quorum of the network, so that clients have a chance to disconnect
      and try other nodes.
      [FAB-12709] Use another way to elect leader in UT · 5c3e2fce
      In etcdraft UT, we often need to deterministically elect a leader.
      This was done by ticking ONLY one node in the network, so that it is
      the only node that starts a campaign.
      HOWEVER, there are several problems with this approach:
      1. It's slow. We need a real time interval between ticks because of
         the way the fake clock is implemented: it drops ticks on the floor
         when the consumer is slow.
      2. There is a random factor in the election timeout of etcd/raft. It
         is calculated as follows:
            randomElectionTimeout = electionTimeout + rand.Intn(electionTimeout)
         In other words, if we send electionTimeout ticks, it's not
         guaranteed to trigger a leader election.
      3. If CheckQuorum is enabled, a lease is imposed on follower nodes,
         which expires if
            electionTimeout <= elapsedTicks < randomElectionTimeout
         (if it's greater than randomElectionTimeout, it's reset to 0 and
         the node starts a campaign).
      In this CR, we send an artificial MsgTimeoutNow to the node to be
      elected. This message reliably triggers a campaign and skips the lease.
      This CR also fixes several potential data races and flakes in tests.
      [FAB-13848] Fix flaky integration test in raft cft · 20dc27fc
      Test may query chaincode too fast after invocation, before block
      is actually committed.
      [FAB-13845] Increase default raft tick interval · ebd9127c
      Increase default etcdraft tick interval to 500ms, for several reasons:
      - in a WAN/Cloud environment, this is more realistic
      - WAL sync in CI often exceeds 1s, which prevents heartbeats from being
        sent in time. Increasing the election timeout decreases the chance of
        unexpected leader failover.
      This CR also increases default timeout of peer cmd, because now
      it takes 5~10s to elect a leader for a newly created channel, and
      `peer channel create` can only retrieve genesis block of that channel
      when leader exists (Deliver API returns error if leaderless).
      Note that this value is still configurable by users.
      FAB-13705 refine Bundle.validateNew · 2979b8cc
      This task addresses two issues:
      1) In common/multichannel/Bundle.ValidateNew() it is possible
         to identify the system channel using:
             _, isSys := b.ConsortiumsConfig()
         this can be used to refine validateMigrationStep() such that
         it deals more accurately with the migration-state transitions
         on the system vs. standard channels.
      2) In addition, prevent user from adding ConsortiumsConfig() to
         standard channels. This will protect multichannel.Registrar
         from blowing up on next initialization. Explanation:
         - Looking at the code in multichannel.Registrar, we see that
             _, ok := ledgerResources.ConsortiumsConfig()
         is used to identify the system channel. If two system channels
         are identified, the code panics.
         - Now, in Bundle.ValidateNew(), currently there is no mechanism
         to prevent a user (orderer admin) from adding a ConsortiumsConfig()
         to a standard channel. If a user does that, multichannel.Registrar
         will blow up in the next initialization.
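      A minimal sketch of the second check, under simplifying assumptions: the
      resources interface and fakeBundle type below are hypothetical stand-ins
      for the real bundle types, and only the ConsortiumsConfig() accessor
      mentioned above is modelled (with a simplified return type).

```go
package main

import (
	"errors"
	"fmt"
)

// resources is a hypothetical, reduced view of a channel config bundle.
type resources interface {
	ConsortiumsConfig() (interface{}, bool)
}

// fakeBundle lets the example toggle whether a consortiums config exists.
type fakeBundle struct{ hasConsortiums bool }

func (f fakeBundle) ConsortiumsConfig() (interface{}, bool) {
	return nil, f.hasConsortiums
}

// validateNew rejects an update that would add a consortiums config to a
// channel that is not the system channel.
func validateNew(current, next resources) error {
	_, isSys := current.ConsortiumsConfig()
	_, nextHasConsortiums := next.ConsortiumsConfig()
	if !isSys && nextHasConsortiums {
		return errors.New("cannot add ConsortiumsConfig to a standard channel")
	}
	return nil
}

func main() {
	standard := fakeBundle{hasConsortiums: false}
	system := fakeBundle{hasConsortiums: true}

	fmt.Println(validateNew(standard, system))   // rejected
	fmt.Println(validateNew(system, system))     // <nil>
	fmt.Println(validateNew(standard, standard)) // <nil>
}
```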
      FAB-13704 Update doc of ConsensusType proto · 5af9c275
      Update documentation of ConsensusType in protos/orderer/configuration.proto
      to reflect implementation.
       - spell out permitted type strings: "solo" / "kafka" / "etcdraft"
       - update migration_state for which messages are permitted on system / standard channels
       - update migration_context for what is required on each migration_state
       - make protos
      [FAB-13808] Address code review comments for FAB-13363 · 6ebade28
      This change set addresses code review comments for FAB-13363.
      - Deletion of the redundant logger instance in the server main.go
      [FAB-13363] Block verification for onboarding · b6dc844a
      This change set connects the block verification infrastructure
      for onboarding to the production code.
      Now, whenever an orderer onboards a channel - it also verifies the blocks
      of the application channels, by:
      1) Creating a bundle from the genesis block, which is derived from
         the system channel (which is verified using backward hash chain validation).
      2) Verifying blocks using the bundle.
      3) Replacing the bundle with a new bundle whenever a config block is pulled.
      It also adds a check in the integration test, that ensures that no errors
      are reported in the log of the onboarded OSN.
      [FAB-13178] Move `SendSubmit` out of serveRequest · 100e1ad7
      When the gRPC buffer of the `Submit` stream is full, `SendSubmit` would
      block, which freezes the `serveRequest` goroutine. This CR moves
      the call out of that goroutine, so that clients are the ones blocked
      waiting for room in the buffer.
      [FAB-13716] Block verifier book-keeping for onboarding · e0e3ddbb
      This change set adds the following supporting structs for adding
      support for verifying blocks pulled by onboarding in future CRs:
      - Ledger interceptor: intercepts a commit of a block, and invokes
        a callback.
      - VerificationRegistry: tracks commit of config blocks, and builds
        channelconfig bundles from them, in order to support verification
        of blocks pulled.
      - BlockVerifierAssembler and BlockValidationPolicyVerifier: together
        they build block verifiers out of config blocks.
      - verifierLoader: Loads a mapping of chainID->cluster.BlockVerifier,
        which is to be used at OSN startup to preload the existing verifiers.
        It is needed in case we recover from a crash, or if we do
        dynamic onboarding and the previous config blocks have been committed
        to the ledger before the OSN was started.
      In the next CR, I will wire all these into the onboarding infrastructure
      itself, and they will be used to hold the latest bundle per channel
      in order to verify block signatures.
      [FAB-13178] Use MaxInflightMsgs to throttle requests · 9b78a9d8
      If there are MaxInflightMsgs blocks proposed but not yet
      committed, the chain blocks further incoming requests.
      Change-Id: I58c84e23c882ccc152e5c9a248434e466a8b5266
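      One way to picture this throttle is a counting semaphore sized to the
      in-flight limit. This is a sketch under assumptions, not the chain's
      actual code: the inflight type and its methods are hypothetical, and the
      non-blocking tryPropose is used only so the example terminates (the
      commit describes the chain as blocking instead).

```go
package main

import "fmt"

// inflight tracks blocks that were proposed but not yet committed.
type inflight struct {
	slots chan struct{}
}

func newInflight(max int) *inflight {
	return &inflight{slots: make(chan struct{}, max)}
}

// tryPropose reserves a slot for a proposed block. It returns false when
// the limit is reached; a blocking variant would wait here instead.
func (f *inflight) tryPropose() bool {
	select {
	case f.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// commit releases one slot once a block has been committed.
func (f *inflight) commit() {
	select {
	case <-f.slots:
	default: // nothing in flight; ignore
	}
}

func main() {
	f := newInflight(2)
	fmt.Println(f.tryPropose(), f.tryPropose(), f.tryPropose()) // true true false
	f.commit()
	fmt.Println(f.tryPropose()) // true
}
```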
      [FAB-13438] Errored should reflect correct state · 0276480c
      This CR changes Errored to return a channel that is
      closed when node becomes candidate.
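      The idea of signalling an errored state through a closed channel can be
      sketched as follows. The chain type and its method names other than
      Errored are hypothetical, and re-arming the channel once a leader is
      known again is an assumption not stated in the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// chain is a reduced, hypothetical model of a raft chain.
type chain struct {
	mu       sync.Mutex
	erroredC chan struct{}
	closed   bool
}

func newChain() *chain {
	return &chain{erroredC: make(chan struct{})}
}

// Errored returns a channel that is closed while the node is a candidate.
func (c *chain) Errored() <-chan struct{} {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.erroredC
}

// becomeCandidate closes the current channel so that readers unblock.
func (c *chain) becomeCandidate() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.closed {
		close(c.erroredC)
		c.closed = true
	}
}

// leaderKnown installs a fresh open channel (assumed behaviour).
func (c *chain) leaderKnown() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		c.erroredC = make(chan struct{})
		c.closed = false
	}
}

// isErrored polls the channel without blocking.
func isErrored(c *chain) bool {
	select {
	case <-c.Errored():
		return true
	default:
		return false
	}
}

func main() {
	c := newChain()
	fmt.Println(isErrored(c)) // false
	c.becomeCandidate()
	fmt.Println(isErrored(c)) // true
	c.leaderKnown()
	fmt.Println(isErrored(c)) // false
}
```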
      [FAB-13438] Store raft SoftState · 21a49bad
      Store raft SoftState in raft chain so it returns error
      while election is ongoing. This prevents a disconnected
      follower from returning success on Broadcast API.
      [FAB-13438] pass SoftState on observe channel · 657b8095
      This CR changes type of etcdraft observe channel from uint64
      to raft.SoftState, so that chain_test can assert not only leader
      id, but also the state of node.
      [FAB-13613] Fix race in etcdraft chain UT · 5dadb3a5
      Add a lock to guard manipulation of `StepStub`.
      [FAB-13447] new leader should wait for in flight msg · 0d247c1d
      Newly elected raft leader should wait for in flight blocks
      to be committed, before accepting new envelopes and creating
      new blocks. Otherwise all those blocks created would be uncle
      blocks and we don't permit this situation in Fabric.
  2. 02 Mar, 2019 1 commit
  3. 01 Mar, 2019 7 commits
      Merge changes I67612917,I226ee34d,If5e785e0,Ibefe74a3,I4e32cb15, ... into release-1.4 · 160a228c
      Merge changes I12f42470,I3c2a84e6,I9fe663c4,Ib6acf6fd,I3331f2ab, ... into release-1.4 · c4c0ce0c
      Merge changes I3aa68e4b,Idf10bff7,I5db2adbd,If1ce27b2,Ica00d5e6, ... into release-1.4 · 4445fa12
  4. 28 Feb, 2019 5 commits
  5. 27 Feb, 2019 11 commits
      [FAB-13447] Streamline the code · d735a06c
      Instead of returning status several levels up, several methods
      in etcdraft chain can just set member var to store current state.
      [FAB-13465] Max retry attempts for orderer replication · 0da0ecee
      This change set adds an option to configure the block puller
      used for replication with a maximum number of retry attempts.
      It is needed because during onboarding, a specific application channel
      might become unavailable, but it shouldn't block onboarding now when
      we have dynamic periodical onboarding for channels we were unable to join.
      [FAB-13331] Refactor metadata updates in nwo · 4f802d51
      This change set refactors metadata updates by making them
      use a function that dictates how to handle consensus metadata.
      FAB-12986: ledger per chain for raft chain_test.go · c7db89e0
      Currently there is a single ledger instance shared between instances
      of chain mock in unit-tests. This commit introduces ledger instance per
      [FAB-12945] add raft reconfiguration unit-tests · 07b7309c
      [FAB-14332] disable flaky CouchDB healthcheck test · c0450096
      [FAB-13178] A dumb version of etcdraft BlockCreator · ff843afd
      This CR rewrites BlockCreator so that it doesn't return nil block.
      blockcreator holds a channel of created blocks, which is buffered
      with size of createdBlocksBuffersize (default 20). It also stores
      the hash and number of latest block.
      When requested to create a new block, blockcreator does so
      by assembling a block based on that hash and number, and enqueues the
      block to the buffered channel. If the channel is full, nil is returned.
      When committing a block, it drains the channel. If there's nothing in
      the channel, it implies the blockcreator is manipulated by a raft
      follower, therefore blockcreator simply updates hash and number.
      What we need is actually as simple as: a blockcreator holds the
      hash and number of the latest block. When it is requested to create
      a block, it just uses that hash and number to assemble one.
      And ONLY the raft leader holds a blockcreator. Followers blindly
      commit whatever comes from consensus. When a follower is elected
      as the new leader, it simply looks up the ledger, finds the hash and
      number of the latest block, and creates a new blockcreator.
      [FAB-13180] Orderer: auto-join existing inactive chains · 4bc13c8e
      This change set makes cluster type OSNs autonomously detect channels
      that exist and that they should be part of (the channel configuration
      has their public credentials as a consenter for the channel),
      but that they do not run chains for, or have the blocks in their ledger.
      This can happen for several reasons:
      - The OSN is added to an existing chain, and since it didn't participate
        in the chain so far, it didn't get the blocks that tell it is now
        part of the channel.
      - The OSN tried to detect whether it is part of a channel, but it
        wasn't able, because all OSNs of the system channel returned
        service-unavailable. This can happen if:
        - a leader election takes place
        - the network is acting up so the leadership was lost
        - a channel has been deserted (all OSNs left it).
      To take care of such use cases, all OSNs now:
      - Track inactive chains that they know of, but they do not participate in
      - Periodically(*) probe the system channel OSNs to see if they are now
        part of these chains or not.
      - If so, then they replicate the chains, and create instances of them,
        and replace the instances of the inactive chains in the registrar
        with the new instances of type etcdraft.
      (*) 10 seconds after boot, then after 20 seconds,
          then after 40 seconds, and so on, eventually every 5 minutes.
      [FAB-13298] Fix test flake on MacOS · 567981aa
      The test flaked on MacOS because it seems that the error string
      that is returned from the operating system's system call
      differs between Linux and Mac.
      This change set addresses this by making the panic error
      comparison look for a substring instead of doing a full comparison.
      [FAB-13178] Remove global leader var in etcdraft chain · f28884a4
      This CR removes the global leader var in etcdraft chain because
      it is racy in the following case: several requests are to be enqueued
      into submitC while the leader loses its leadership.
      This also removes the lock on rpc.SendSubmit because it's guarded
      by the channel.
      [FAB-13456] Fix race in etcdraft test · e8514271
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>