1. 03 Mar, 2019 16 commits
    • [FAB-12709] Enable CheckQuorum · 50a09fd0
      Jay Guo authored
      When CheckQuorum is enabled, the leader steps down if it cannot reach
      a quorum of the network, so that clients have a chance to disconnect
      and try other nodes.
      Change-Id: I901c0e3009f9d354a2b504fe16174432345055b3
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
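The step-down rule described in this commit can be illustrated with a small self-contained simulation. This is only a sketch of the idea, not the etcd/raft implementation: the `leader` type, its fields, and the `checkQuorum` method are hypothetical names chosen for illustration, and the real library performs this check internally.

```go
package main

import "fmt"

// leader is a hypothetical, simplified model of a raft leader that tracks
// which followers it has heard from since the last quorum check.
type leader struct {
	clusterSize    int          // total number of nodes, including the leader
	recentlyActive map[int]bool // follower ID -> heard from since last check
	isLeader       bool
}

func newLeader(clusterSize int) *leader {
	return &leader{clusterSize: clusterSize, recentlyActive: map[int]bool{}, isLeader: true}
}

// heardFrom records that a follower responded to the leader.
func (l *leader) heardFrom(id int) { l.recentlyActive[id] = true }

// checkQuorum steps the leader down if it has not heard from a majority
// (counting itself), then resets the activity record for the next round.
func (l *leader) checkQuorum() {
	active := 1 // the leader always counts itself
	for _, ok := range l.recentlyActive {
		if ok {
			active++
		}
	}
	if active < l.clusterSize/2+1 {
		l.isLeader = false
	}
	l.recentlyActive = map[int]bool{}
}

func main() {
	l := newLeader(3)
	l.heardFrom(2)
	l.checkQuorum()
	fmt.Println("still leader after reaching one follower:", l.isLeader)
	l.checkQuorum() // nobody responded during this round
	fmt.Println("still leader after reaching no follower:", l.isLeader)
}
```

In a three-node cluster the quorum is two, so the leader survives a round in which one follower answered and steps down after a round in which none did.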
    • [FAB-12709] Use another way to elect leader in UT · 5c3e2fce
      Jay Guo authored
      In etcdraft UT, we often need to deterministically elect a leader.
      This was done by ticking ONLY one node in the network, so that it is
      the only node that starts a campaign.
      HOWEVER, there are several problems with this approach:
      1. It is slow. We need a real time interval between ticks due to the
         way the fake clock is implemented: it drops ticks on the floor in
         case of a slow consumer.
      2. There is a random factor in the election timeout of etcd/raft. It is
         calculated as follows:
            randomElectionTimeout = electionTimeout + rand.Intn(electionTimeout)
         In other words, if we send electionTimeout ticks, it is not
         guaranteed to trigger a leader election.
      3. If CheckQuorum is enabled, a lease is imposed on follower nodes,
         which expires if
            electionTimeout <= elapsedTicks < randomElectionTimeout
         (if it is greater than randomElectionTimeout, it is reset to 0 and
         the node starts a campaign).
      In this CR, we send an artificial MsgTimeoutNow to the node to be
      elected. This message reliably triggers a campaign and skips the lease.
      This CR also fixes several potential data races and flakes in tests.
      Change-Id: I3c8e0bcadbb8cfa1ae3393de2ea711fdd0d8b7aa
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
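The second problem listed in this commit message can be made concrete with a short sketch. It reuses the formula quoted in the message; the function names and the injected `intn` parameter are assumptions made for illustration and testability, not etcd/raft identifiers.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomizedTimeout applies the formula quoted in the commit message:
// electionTimeout + rand.Intn(electionTimeout). The random source is
// injected so that the bounds can be checked deterministically.
func randomizedTimeout(electionTimeout int, intn func(int) int) int {
	return electionTimeout + intn(electionTimeout)
}

// campaigns reports whether a follower that has counted elapsedTicks ticks
// without hearing from a leader has reached its randomized timeout.
func campaigns(elapsedTicks, randomTimeout int) bool {
	return elapsedTicks >= randomTimeout
}

func main() {
	const electionTimeout = 10
	const trials = 1000
	triggered := 0
	for i := 0; i < trials; i++ {
		if campaigns(electionTimeout, randomizedTimeout(electionTimeout, rand.Intn)) {
			triggered++
		}
	}
	// Only the trials that happened to draw a random offset of 0 campaign.
	fmt.Printf("%d of %d trials campaign after exactly %d ticks\n",
		triggered, trials, electionTimeout)
}
```

Because the randomized timeout lies anywhere between electionTimeout and 2*electionTimeout - 1 ticks, sending exactly electionTimeout ticks only occasionally starts a campaign, which is why an explicit trigger is more reliable in tests.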
    • [FAB-13848] Fix flaky integration test in raft cft · 20dc27fc
      Jay Guo authored
      The test may query chaincode too soon after invocation, before the block
      is actually committed.
      Change-Id: I4159fb2dfb31310eccfd64fcb9a9a99ceef54db0
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13845] Increase default raft tick interval · ebd9127c
      Jay Guo authored
      Increase default etcdraft tick interval to 500ms, for several reasons:
      - in a WAN/Cloud environment, this is more realistic
      - WAL sync in CI often exceeds 1s, which prevents heartbeats from being
        sent in time. Increasing the election timeout decreases the chance of
        unexpected leader failover.
      This CR also increases the default timeout of peer cmd, because it now
      takes 5~10s to elect a leader for a newly created channel, and
      `peer channel create` can only retrieve the genesis block of that channel
      when a leader exists (Deliver API returns an error if leaderless).
      Note that this value is still configurable by users.
      Change-Id: I94fbbc750fa096cce6ef9e2d65eb981c6202b675
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
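The "5~10s" figure in this commit message follows from simple arithmetic, sketched below. The election tick count of 10 is an assumption (the message does not state it), and `electionWindow` is a hypothetical helper; the upper bound uses the randomized timeout range of electionTimeout to 2*electionTimeout - 1 ticks.

```go
package main

import (
	"fmt"
	"time"
)

// electionWindow returns the shortest and longest time a follower waits
// without a heartbeat before campaigning, given the tick interval and the
// number of ticks that make up the election timeout.
func electionWindow(tick time.Duration, electionTick int) (lo, hi time.Duration) {
	lo = time.Duration(electionTick) * tick
	hi = time.Duration(2*electionTick-1) * tick
	return lo, hi
}

func main() {
	// 500ms is the new default from the commit; 10 ticks is an assumed value.
	lo, hi := electionWindow(500*time.Millisecond, 10)
	fmt.Printf("a follower campaigns after %v to %v without a heartbeat\n", lo, hi)
}
```

Under these assumed values the window works out to between 5 and 9.5 seconds, which is consistent with the range quoted in the message.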
    • FAB-13705 refine Bundle.validateNew · 2979b8cc
      Yoav Tock authored
      This task addresses two issues:
      1) In common/multichannel/Bundle.ValidateNew() it is possible
         to identify the system channel using:
             _, isSys := b.ConsortiumsConfig()
         this can be used to refine validateMigrationStep() such that
         it deals more accurately with the migration-state transitions
         on the system vs. standard channels.
      2) In addition, prevent user from adding ConsortiumsConfig() to
         standard channels. This will protect multichannel.Registrar
         from blowing up on next initialization. Explanation:
         - Looking at the code in multichannel.Registrar, we see that
             _, ok := ledgerResources.ConsortiumsConfig()
         is used to identify the system channel. If two system channels
         are identified, the code panics.
         - Now, in Bundle.ValidateNew(), currently there is no mechanism
         to prevent a user (orderer admin) from adding a ConsortiumsConfig()
         to a standard channel. If a user does that, multichannel.Registrar
         will blow up in the next initialization.
      Change-Id: Ia7551cbd27389a9988757af0224abdc0d1bfef5b
      Signed-off-by: Yoav Tock <tock@il.ibm.com>
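The second point of this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the Fabric code: `bundle`, `consortiumsConfig`, and `validateNew` are hypothetical stand-ins, and only the `ConsortiumsConfig()` accessor pattern quoted in the message is modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// consortiumsConfig is a hypothetical placeholder for the real config type.
type consortiumsConfig struct{}

// bundle is a hypothetical, stripped-down channel configuration bundle.
type bundle struct {
	consortiums *consortiumsConfig
}

// ConsortiumsConfig mirrors the accessor shape used in the commit message:
// the boolean reports whether the channel carries a consortiums section,
// which is how the system channel is recognized.
func (b bundle) ConsortiumsConfig() (*consortiumsConfig, bool) {
	return b.consortiums, b.consortiums != nil
}

// validateNew rejects a proposed configuration that would add a
// ConsortiumsConfig to a channel that currently has none.
func validateNew(current, proposed bundle) error {
	_, isSys := current.ConsortiumsConfig()
	_, willBeSys := proposed.ConsortiumsConfig()
	if !isSys && willBeSys {
		return errors.New("a standard channel must not gain a ConsortiumsConfig")
	}
	return nil
}

func main() {
	standard := bundle{}
	system := bundle{consortiums: &consortiumsConfig{}}
	fmt.Println("standard -> standard:", validateNew(standard, standard))
	fmt.Println("standard -> system:  ", validateNew(standard, system))
	fmt.Println("system   -> system:  ", validateNew(system, system))
}
```

Rejecting the update at validation time means a second apparent system channel can never reach the registrar, so the panic described in the message cannot be triggered by a config update.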
    • FAB-13704 Update doc of ConsensusType proto · 5af9c275
      Yoav Tock authored
      Update documentation of ConsensusType in protos/orderer/configuration.proto
      to reflect implementation.
       - spell out permitted type strings: "solo" / "kafka" / "etcdraft"
       - update migration_state for which messages are permitted on system / standard channels
       - update migration_context for what is required on each migration_state
       - make protos
      Change-Id: Ia27d9cd162fe6656fd2bd56ceaf5aae8a6fe5222
      Signed-off-by: Yoav Tock <tock@il.ibm.com>
    • [FAB-13808] Address code review comments for FAB-13363 · 6ebade28
      yacovm authored
      This change set addresses code review comments for FAB-13363.
      - Deletion of the redundant logger instance in the server main.go
      Change-Id: I0ee0db21894a352c7d1679efc27524162723a895
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13363] Block verification for onboarding · b6dc844a
      yacovm authored
      This change set connects the block verification infrastructure
      for onboarding to the production code.
      Now, whenever an orderer onboards a channel - it also verifies the blocks
      of the application channels, by:
      1) Creating a bundle from the genesis block, which is derived from
         the system channel (which is verified using backward hash chain validation).
      2) Verifying blocks using the bundle.
      3) Replacing the bundle with a new bundle whenever a config block is pulled.
      It also adds a check in the integration test, that ensures that no errors
      are reported in the log of the onboarded OSN.
      Change-Id: I3c5714f9d4491cdfd78e4e47407925136906d413
      Signed-off-by: yacovm <yacovm@il.ibm.com>
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-13178] Move `SendSubmit` out of serveRequest · 100e1ad7
      Jay Guo authored
      When the gRPC buffer of the `Submit` stream is full, `SendSubmit` would
      block, which freezes the `serveRequest` goroutine. This CR moves
      it out of the goroutine, so that clients are instead blocked waiting
      for room in the buffer.
      Change-Id: I62cd261b9419bd8df3fa1bfaeff14551168d2e65
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13716] Block verifier book-keeping for onboarding · e0e3ddbb
      yacovm authored
      This change set adds the following supporting structs for adding
      support for verifying blocks pulled by onboarding in future CRs:
      - Ledger interceptor: intercepts a commit of a block, and invokes
        a callback.
      - VerificationRegistry: tracks commit of config blocks, and builds
        channelconfig bundles from them, in order to support verification
        of blocks pulled.
      - BlockVerifierAssembler and BlockValidationPolicyVerifier: together
        they build block verifiers out of config blocks.
      - verifierLoader: Loads a mapping of chainID->cluster.BlockVerifier,
        which is to be used at OSN startup to preload the existing verifiers.
        It is needed in cases we recover from a crash, or if we do
        dynamic onboarding and the previous config blocks have been committed
        to the ledger before the OSN was started.
      In the next CR, I will wire all these into the onboarding infrastructure
      itself, and they will be used to hold the latest bundle per channel
      in order to verify block signatures.
      Change-Id: Ic9fc99243baa5c2cef97103d001180207414d98a
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13178] Use MaxInflightMsgs to throttle requests · 9b78a9d8
      Jay Guo authored
      If there are MaxInflightMsgs blocks proposed but not
      committed, chain blocks further incoming requests.
      Change-Id: I58c84e23c882ccc152e5c9a248434e466a8b5266
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
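The throttling idea in this commit can be sketched with a buffered channel used as a counting semaphore. The `inflightLimiter` type and its methods are hypothetical names, and the sketch reports refusal instead of blocking the caller so that the demonstration stays deterministic; the commit itself describes blocking behaviour.

```go
package main

import "fmt"

// inflightLimiter is a hypothetical helper that allows at most maxInflight
// blocks to be proposed but not yet committed.
type inflightLimiter struct {
	slots chan struct{}
}

func newInflightLimiter(maxInflight int) *inflightLimiter {
	return &inflightLimiter{slots: make(chan struct{}, maxInflight)}
}

// tryPropose claims a slot for a newly proposed block. It returns false
// when the limit is reached; a blocking variant would send on the channel
// without the default case.
func (l *inflightLimiter) tryPropose() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// committed releases one slot after a block has been committed. Calling it
// with no block in flight is a no-op.
func (l *inflightLimiter) committed() {
	select {
	case <-l.slots:
	default:
	}
}

func main() {
	l := newInflightLimiter(2)
	fmt.Println(l.tryPropose(), l.tryPropose(), l.tryPropose())
	l.committed()
	fmt.Println(l.tryPropose())
}
```

With a limit of two, the third proposal is refused until one of the earlier blocks has been committed.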
    • [FAB-13438] Errored should reflect correct state · 0276480c
      Jay Guo authored
      This CR changes Errored to return a channel that is
      closed when the node becomes a candidate.
      Change-Id: Ibd0ece763b9d93c4da93825d1b302ecc55a9b32e
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
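The closed-channel signalling described in this commit is a common Go idiom and can be sketched as below. The `chain` type here is a hypothetical miniature, not the etcdraft chain; how the real chain recovers once a leader is elected again is not covered by the message and is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// chain is a hypothetical miniature of a consensus chain that exposes its
// health through a channel.
type chain struct {
	errorC    chan struct{}
	closeOnce sync.Once
}

func newChain() *chain { return &chain{errorC: make(chan struct{})} }

// Errored returns a channel that is closed once the node is a candidate.
// Receiving from a closed channel never blocks, so every caller observes
// the state change.
func (c *chain) Errored() <-chan struct{} { return c.errorC }

// becomeCandidate marks the chain as errored. sync.Once guards against
// closing the channel twice, which would panic.
func (c *chain) becomeCandidate() {
	c.closeOnce.Do(func() { close(c.errorC) })
}

// isErrored polls the channel without blocking.
func isErrored(c *chain) bool {
	select {
	case <-c.Errored():
		return true
	default:
		return false
	}
}

func main() {
	c := newChain()
	fmt.Println("errored before election:", isErrored(c))
	c.becomeCandidate()
	fmt.Println("errored while candidate:", isErrored(c))
}
```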
    • [FAB-13438] Store raft SoftState · 21a49bad
      Jay Guo authored
      Store raft SoftState in raft chain so that it returns an error
      while an election is ongoing. This prevents a disconnected
      follower from returning success on Broadcast API.
      Change-Id: Ib6619b230938f0d6c10240b8cd8e34e346056145
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13438] pass SoftState on observe channel · 657b8095
      Jay Guo authored
      This CR changes the type of the etcdraft observe channel from uint64
      to raft.SoftState, so that chain_test can assert not only the leader
      id, but also the state of the node.
      Change-Id: Ia0c5f8c9060c234ceb84133e0c5598ed064dd1ee
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13613] Fix race in etcdraft chain UT · 5dadb3a5
      Jay Guo authored
      Add a lock to guard manipulation of `StepStub`.
      Change-Id: Icaadb1f5aea0cb7f266f24ed6756c4f6541768bd
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13447] new leader should wait for in flight msg · 0d247c1d
      Jay Guo authored
      A newly elected raft leader should wait for in-flight blocks
      to be committed before accepting new envelopes and creating
      new blocks. Otherwise all the blocks created would be uncle
      blocks, and we don't permit this situation in Fabric.
      Change-Id: Ia5adac185263735eace1fc805ebea0f5c98b2fb1
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
  2. 02 Mar, 2019 1 commit
  3. 01 Mar, 2019 7 commits
    • Merge changes I67612917,I226ee34d,If5e785e0,Ibefe74a3,I4e32cb15, ... into release-1.4 · 160a228c
      Yacov Manevich authored
      * changes:
        [FAB-13447] Streamline the code
        [FAB-13178] A dumb version of etcdraft BlockCreator
        [FAB-13178] Remove global leader var in etcdraft chain
        [FAB-13178] Move raft logic to its own file
        [FAB-13178] do not accept new env when conf in flight
        [FAB-13178] Refactor etcdraft chain to avoid sync
        [FAB-13694] Move LastConfigBlock to orderer common
        [FAB-13698] disable flaky test TestReconnect
        [FAB-13643] Leader crash and failover integration test
        FAB-13265 migration status in channelconfig
        FAB-12984 consensus migration protos
        [FAB-13633] Make Step RPC failures non blocking
        [FAB-13178] Simplify the proposition of config block
        [FAB-11996] Fix failed UT
        [FAB-13481] Make onboarding code more idiomatic
        [FAB-13495] Activate onboarding max retries
        FAB-12983 capability V2_0 for Kafka2RaftMigration
    • Merge changes I12f42470,I3c2a84e6,I9fe663c4,Ib6acf6fd,I3331f2ab, ... into release-1.4 · c4c0ce0c
      Yacov Manevich authored
      * changes:
        [FAB-13465] Max retry attempts for orderer replication
        [FAB-13180] Orderer: auto-join existing inactive chains
        [FAB-13456] Fix race in etcdraft test
        [FAB-13456] Use empty peer list to join raft cluster
        [FAB-13444] Prepare onboarding to multi-time use
        [FAB-13362] Pulling not servicing chains in onboarding
        [FAB-13441] Properly capture OSN output
        [FAB-13428] Make TestReplicateChainsFailures robust
        [FAB-13427] Make replication tests not depend on time
        [FAB-13360] Fix an etcdraft flaky UT
        [FAB-13415] DRY up UpdateConsensusMetadata in nwo
        [FAB-13367] Fix flaky etcdraft UT
        [FAB-1337] Raft: Commit genesis blocks for non-members
        [FAB-13208] Raft Reconfig&Onboarding integration test
        [FAB-13333] Orderer config update to use orderer creds
    • Merge changes I3aa68e4b,Idf10bff7,I5db2adbd,If1ce27b2,Ica00d5e6, ... into release-1.4 · 4445fa12
      Yacov Manevich authored
      * changes:
        [FAB-13331] Refactor metadata updates in nwo
        [FAB-13298] Fix test flake on MacOS
        [FAB-13332] Add cryptogen extend to integration tests
        [FAB-13334] Onboarding: Allow empty channels
        [FAB-13330] Rename GetConfigBlock to GetConfig in nwo
        [FAB-13349] Add more assertion to etcdraft UT.
        [FAB-13095] fix UT flake RPC timeout
        [FAB-13350] Fix etcdraft flaky test
        [FAB-13298] Fix TestConfigureClusterListener in MacOS
        [FAB-13299] Onboarding: Skip committing existing blocks
        [FAB-12579] Separate TLS listener for intra-cluster
        [FAB-13262] typo in configblock.go
        [FAB-13053] Add an UT to assert retransmission.
        [FAB-12949] Fix etcdraft reconfiguration UT
        [FAB-12729] Support subset of system channel OSNs
        [FAB-13150] Re-enable etcdraft for v2.0 development
        [FAB-13225] address code review comments
        [FAB-13057] Remove applied index check in storage
        [FAB-13199] Reduce etcdraft test time.
        [FAB-12949] finish reconfiguration after restart
  4. 28 Feb, 2019 5 commits
  5. 27 Feb, 2019 11 commits
    • [FAB-13447] Streamline the code · d735a06c
      Jay Guo authored
      Instead of returning status several levels up, several methods
      in etcdraft chain can just set member var to store current state.
      Change-Id: I67612917bf3bb3225f1507c8b7376d730b18e9f4
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13465] Max retry attempts for orderer replication · 0da0ecee
      yacovm authored
      This change set adds an option to configure the block puller
      used for replication with a maximum number of retry attempts.
      It is needed because during onboarding, a specific application channel
      might become unavailable, but it shouldn't block onboarding now that
      we have dynamic periodic onboarding for channels we were unable to join.
      Change-Id: I12f4247040c258809885f0e5fdc07d60914a56e2
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13331] Refactor metadata updates in nwo · 4f802d51
      yacovm authored
      This change set refactors metadata updates by making them
      use a function that dictates how to handle consensus metadata.
      Change-Id: I3aa68e4b268a24887e4cba891e02ebce1a2ec65d
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • FAB-12986: ledger per chain for raft chain_test.go · c7db89e0
      Artem Barger authored
      Currently there is a single ledger instance shared between instances
      of the chain mock in unit tests. This commit introduces a ledger instance per
      chain.
      Change-Id: I333fa2819490c995931a7e0d241eb6428e67c87e
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-12945] add raft reconfiguration unit-tests · 07b7309c
      Artem Barger authored
      Change-Id: Ib77c866a30ed5108ad53908b0ca25a60a89e9a7c
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-14332] disable flaky CouchDB healthcheck test · c0450096
      Artem Barger authored
      Change-Id: Ic17894d5eff66a195f93fcccacf2e3115587d7a5
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-13178] A dumb version of etcdraft BlockCreator · ff843afd
      Jay Guo authored
      This CR rewrites BlockCreator so that it doesn't return nil block.
      blockcreator holds a channel of created blocks, which is buffered
      with size of createdBlocksBuffersize (default 20). It also stores
      the hash and number of latest block.
      When requested to create a new block, blockcreator does so
      by assembling a block based on that hash and number, and enqueues the
      block to the buffered channel. If the channel is full, nil is returned.
      When committing a block, it drains the channel. If there is nothing in
      the channel, it implies the blockcreator is manipulated by a raft
      follower, therefore blockcreator simply updates the hash and number.
      What we need is actually as simple as: a blockcreator holds the
      hash and number of latest block. When it is requested to create
      a block, it just uses that hash and number to assemble one.
      And ONLY raft leader holds a blockcreator. Followers blindly
      commit whatever comes from consensus. When a follower is elected
      as new leader, it simply looks up the ledger, find hash and number
      of latest block, and creates a new blockcreator.
      Change-Id: I226ee34d666fbb1e8d034dc22ea6800df993f7a4
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
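The simplified design described at the end of this commit message (a creator that only remembers the hash and number of the latest block) might look roughly like the sketch below. All types and names are hypothetical, and the hash is a plain SHA-256 over a made-up encoding rather than Fabric's real block header hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// block is a hypothetical, minimal block: a number, a link to the previous
// block, and an opaque payload.
type block struct {
	Number       uint64
	PreviousHash [32]byte
	Data         []byte
}

// hash computes an illustrative hash over the block's fields.
func (b block) hash() [32]byte {
	buf := make([]byte, 8, 8+len(b.PreviousHash)+len(b.Data))
	binary.BigEndian.PutUint64(buf, b.Number)
	buf = append(buf, b.PreviousHash[:]...)
	buf = append(buf, b.Data...)
	return sha256.Sum256(buf)
}

// blockCreator remembers only the hash and number of the latest block.
type blockCreator struct {
	hash   [32]byte
	number uint64
}

// newBlockCreator is what a newly elected leader would call after looking
// up the latest block in its ledger.
func newBlockCreator(lastHash [32]byte, lastNumber uint64) *blockCreator {
	return &blockCreator{hash: lastHash, number: lastNumber}
}

// createNextBlock assembles the next block on top of the remembered state
// and advances that state, so it never returns a nil block.
func (bc *blockCreator) createNextBlock(data []byte) block {
	b := block{Number: bc.number + 1, PreviousHash: bc.hash, Data: data}
	bc.hash = b.hash()
	bc.number = b.Number
	return b
}

func main() {
	bc := newBlockCreator([32]byte{}, 0)
	b1 := bc.createNextBlock([]byte("tx1"))
	b2 := bc.createNextBlock([]byte("tx2"))
	fmt.Println(b1.Number, b2.Number, b2.PreviousHash == b1.hash())
}
```

Keeping the creator stateless beyond these two fields is what lets followers skip it entirely and lets a new leader rebuild it from the ledger.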
    • [FAB-13180] Orderer: auto-join existing inactive chains · 4bc13c8e
      yacovm authored
      This change set makes cluster type OSNs autonomously detect channels
      that exist and that they should be part of (the channel configuration
      has their public credentials as a consenter for the channel),
      but that they do not run chains for, or have the blocks in their ledger.
      This can happen for several reasons:
      - The OSN is added to an existing chain, and since it didn't participate
        in the chain so far, it didn't get the blocks that tell it is now
        part of the channel.
      - The OSN tried to detect whether it is part of a channel, but it
        wasn't able, because all OSNs of the system channel returned
        service-unavailable. This can happen if:
        - a leader election takes place
        - the network is acting up so the leadership was lost
        - a channel has been deserted (all OSNs left it).
      To take care of such use cases, all OSNs now:
      - Track inactive chains that they know of, but they do not participate in
      - Periodically(*) probe the system channel OSNs to see if they are now
        part of these chains or not.
      - If so, then they replicate the chains, and create instances of them,
        and replace the instances of the inactive chains in the registrar
        with the new instances of type etcdraft.
      (*) - 10 seconds after boot, then after 20 seconds,
            then after 40 seconds, etc. etc. eventually- every 5 minutes.
      Change-Id: I3c2a84e6f4f402e011e7a895345b3d3982247083
      Signed-off-by: yacovm <yacovm@il.ibm.com>
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
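The probing schedule in the footnote of this commit message (10 seconds, then 20, then 40, and eventually every 5 minutes) reads like a doubling backoff with a ceiling. The sketch below works under that interpretation; `probeDelays` is a hypothetical helper, not a Fabric function.

```go
package main

import (
	"fmt"
	"time"
)

// probeDelays returns the first n waiting periods of a schedule that starts
// at initial, doubles after every probe, and never exceeds ceiling.
func probeDelays(initial, ceiling time.Duration, n int) []time.Duration {
	delays := make([]time.Duration, 0, n)
	d := initial
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		d *= 2
		if d > ceiling {
			d = ceiling
		}
	}
	return delays
}

func main() {
	for i, d := range probeDelays(10*time.Second, 5*time.Minute, 7) {
		fmt.Printf("probe %d after waiting %v\n", i+1, d)
	}
}
```

Under this reading the waits are 10s, 20s, 40s, 80s and 160s, after which every further probe is spaced 5 minutes apart.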
    • [FAB-13298] Fix test flake on MacOS · 567981aa
      yacovm authored
      Fixed a problem on MacOS but it seems that the error string
      that is returned from the operating system's system call
      differs on linux and Mac.
      This change set addresses this by making the panic error
      comparison look for a substring instead of a full comparison.
      Change-Id: Idf10bff7b4dde6009ce01bb83b7bd576be4df2b4
      Signed-off-by: yacovm <yacovm@il.ibm.com>
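The substring comparison this commit describes can be sketched as follows. The `panicsWith` helper and both error strings are illustrative assumptions; the actual messages compared in the Fabric test are not given in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// panicsWith runs f and reports whether it panicked with a message that
// contains substr. A full-string comparison would be brittle when the
// operating system words the underlying error differently.
func panicsWith(f func(), substr string) (matched bool) {
	defer func() {
		if r := recover(); r != nil {
			matched = strings.Contains(fmt.Sprint(r), substr)
		}
	}()
	f()
	return false
}

func main() {
	// Two illustrative wordings of the same failure on different platforms.
	onLinux := func() { panic("listen failed: bind: cannot assign requested address") }
	onMac := func() { panic("listen failed: bind: can't assign requested address") }

	const common = "assign requested address"
	fmt.Println(panicsWith(onLinux, common), panicsWith(onMac, common))
}
```

Matching on the shared fragment lets one assertion pass on both platforms without listing every variant.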
    • [FAB-13178] Remove global leader var in etcdraft chain · f28884a4
      Jay Guo authored
      This CR removes the global leader var in etcdraft chain because
      it is racy in the following case: several requests are to be enqueued
      into submitC while the leader loses its leadership.
      This also removes the lock on rpc.SendSubmit because it's guarded
      by the channel.
      Change-Id: If5e785e05dcf9bfc60e403f2d5813baf769ee103
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13456] Fix race in etcdraft test · e8514271
      Jay Guo authored
      Change-Id: I9fe663c4efa46e5644571d238f4d7ea8f4e51626
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>