1. 28 Mar, 2019 2 commits
  2. 27 Mar, 2019 1 commit
    • FAB-14838 properly clean IT test resources · d568cc02
      Artem Barger authored
      
      
      This commit moves the code that frees resources and stops the peer
      and orderer processes after the integration test. It also makes
      sure the system channel is actually ready to pull configuration blocks,
      by waiting for a leader to be elected.
      
      Change-Id: Iefb37732edbd2b375022190691667700840d2b52
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
  3. 24 Mar, 2019 1 commit
  4. 22 Mar, 2019 1 commit
  5. 20 Mar, 2019 1 commit
    • FAB-14540 transfer leader if cert of it is rotated · 7e440c73
      Jay Guo authored
      
      
      When the certificate of the leader is rotated, the leader will certainly be
      disconnected after reconfiguring communication. Instead of waiting
      for ElectionTimeout and electing a new leader, the old leader should be
      more cooperative and transfer its leadership to another node.
      
      Note that proposals sent during this transition will be automatically
      dropped by etcd/raft; however, the transition should be fairly short.
      
      Change-Id: Iabd005d00864afe09b4738f1ed36b939b1d83eed
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
  6. 16 Mar, 2019 1 commit
    • FAB-14593 Refine etcdraft parameters · 2a4e15e9
      Jay Guo authored
      
      
      - MaxInflightMsgs is internal to etcd/raft and should be exposed
      to users under a more appropriate name: MaxInflightBlocks

      - MaxSizePerMsg is also internal to etcd/raft. It now defaults
      to PreferredMaxBytes in BatchSize, so that if a big block is created,
      it is sent in its own etcd/raft message instead of being batched
      with other blocks. This parameter takes effect when a batch of entries
      is sent to a lagging node. During normal replication, each block is
      sent in its own message. It is not necessary to expose this config
      option to users.
      
      - SnapInterval is renamed to SnapshotIntervalSize
      
      FAB-14593 #done
      
      Change-Id: Icaf2848a41c5f0f0a02f4b0b4a80ba852fddd584
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
  7. 14 Mar, 2019 1 commit
    • FAB-14619 Rename Raft metadata protos · d645c833
      Jason Yellick authored
      
      
      There are presently two etcdraft protos around metadata.  One is the
      metadata stored in the config, named 'Metadata'; the other is
      the metadata stored in each block, named 'RaftMetadata'.  This
      causes confusion when reading the code.  This CR renames them
      as follows:
      
       Metadata -> ConfigMetadata
       RaftMetadata -> BlockMetadata
      
      Change-Id: Ia0394ebe78f5541996c010c3c67d760f336f75d8
      Signed-off-by: Jason Yellick <jyellick@us.ibm.com>
  8. 13 Mar, 2019 1 commit
    • [FAB-14607] Dynamically add channel verifiers · 668b81c5
      yacovm authored
      
      
      The channel verifiers for onboarding are used to verify blocks
      that are pulled.
      
      When the code was written, it was not possible to create channels
      with a subset of the OSNs, and therefore all OSNs that were
      dynamically added to channels loaded the verifiers for these
      channels at their startup.

      However, now that we can create channels with subsets of OSNs,
      we need the ability to dynamically register verifiers for channels
      that are created.

      This change set adds this capability in the least intrusive way:
      before the blocks are pulled via dynamic onboarding, the verifiers
      for channels that were discovered are created on demand.
      
      This change set also makes the block puller empty
      its internal block buffer when Close() is called,
      for additional safety.
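
      A minimal sketch of creating verifiers on demand, assuming a simple
      registry keyed by channel name. The names Verifier, VerifierRegistry,
      VerifierFor and acceptAll are hypothetical and do not reflect the
      actual Fabric types.

```go
package main

import (
	"fmt"
	"sync"
)

// Verifier checks pulled blocks for one channel (hypothetical interface).
type Verifier interface {
	VerifyBlock(block []byte) error
}

// acceptAll is a stand-in verifier used only to make the sketch runnable.
type acceptAll struct{ channel string }

func (acceptAll) VerifyBlock([]byte) error { return nil }

// VerifierRegistry creates a verifier the first time a channel is requested
// and returns the cached instance afterwards.
type VerifierRegistry struct {
	mu        sync.Mutex
	create    func(channel string) Verifier
	verifiers map[string]Verifier
}

func NewVerifierRegistry(create func(string) Verifier) *VerifierRegistry {
	return &VerifierRegistry{create: create, verifiers: make(map[string]Verifier)}
}

// VerifierFor returns the verifier for the channel, creating it on demand.
func (r *VerifierRegistry) VerifierFor(channel string) Verifier {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.verifiers[channel]
	if !ok {
		v = r.create(channel)
		r.verifiers[channel] = v
	}
	return v
}

func main() {
	created := 0
	reg := NewVerifierRegistry(func(ch string) Verifier {
		created++
		return acceptAll{channel: ch}
	})
	reg.VerifierFor("testchannel")
	reg.VerifierFor("testchannel")
	fmt.Println("verifiers created:", created)
}
```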
      
      Change-Id: Ife68ec1fe5b554a2089cc70640e4bc1d3014c39e
      Signed-off-by: yacovm <yacovm@il.ibm.com>
  9. 09 Mar, 2019 1 commit
  10. 07 Mar, 2019 1 commit
  11. 05 Mar, 2019 6 commits
  12. 04 Mar, 2019 4 commits
    • [FAB-14136] Always Deliver if cluster smaller than 3 · 8b87f05a
      yacovm authored
      
      
      When a single node etcdraft cluster is expanded and a new node
      is added, the new node needs to pull blocks from the existing node.
      
      However, the existing node loses leadership and then rejects deliver requests.
      
      This change set makes a raft leader not reject deliver requests if the cluster
      has fewer than 3 members in it.
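
      The rule can be sketched as a small predicate. This assumes that, outside
      of the small-cluster case, deliver requests are rejected whenever the
      node sees no leader; shouldRejectDeliver is a hypothetical helper, not
      the actual Fabric function.

```go
package main

import "fmt"

// shouldRejectDeliver reports whether a deliver request should be rejected.
// Assumption: a node normally rejects deliver requests while there is no
// leader, except when the cluster has fewer than 3 members.
func shouldRejectDeliver(hasLeader bool, clusterSize int) bool {
	if clusterSize < 3 {
		// Small clusters always serve deliver, so that a newly added
		// node can pull blocks from the existing one.
		return false
	}
	return !hasLeader
}

func main() {
	fmt.Println("2 nodes, no leader:", shouldRejectDeliver(false, 2))
	fmt.Println("3 nodes, no leader:", shouldRejectDeliver(false, 3))
	fmt.Println("3 nodes, leader:   ", shouldRejectDeliver(true, 3))
}
```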
      
      Change-Id: I75bd028d5a46fcb6ae81dc29012e3e839149c319
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-14217] Harden etcdraft eviction integration test · 5ac7e000
      yacovm authored
      
      
      The test finds a leader, and then creates a channel,
      and then removes the leader from both channels.
      
      However, that node is only the system channel leader,
      and it may not be the leader of the application channel.

      When it's not the leader, the other nodes might cease communication
      with the evicted node (in case it's slow), and then the test fails
      because the node doesn't detect its own eviction.
      
      This change set makes the test only evict the leader from the system channel,
      and removes the creation of the application channel.
      
      Change-Id: I8ff384c0c2003d137dccf9eba948d25bcc14188e
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • FAB-14057 Specify tx base profile in configtxgen · 104ec7c1
      Jason Yellick authored
      
      
      In order to support altering ordering parameters at channel creation
      time, we add support for specifying a base orderer system channel
      configuration profile.  The expected usage is:
      
       configtxgen -outputChannelCreateTx my.pb.tx \
                   -channelCreateTxBaseProfile sysChannelProfile
      
      Change-Id: I191bb730251178241b57f7ec10a7810bc76b66bd
      Signed-off-by: Jason Yellick <jyellick@us.ibm.com>
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • FAB-12991 kafka2raft e2e tests green path · 6604d494
      Yoav Tock authored
      
      
      This task focuses on testing the consensus-type migration "green" path,
      that was introduced in FAB-13264.
      
      The main contribution is in migration_test.go, which defines 3 test-cases
      that test the green path. This is not the complete test suite for migration.
      It is introduced in this stage to allow reviewers to get the full picture
      of the feature defined in FAB-13264. The tests for the abort path and failure
      scenarios will be added in later tasks.
      
      The three test-cases are:
      
      1. A test that executes the migration flow on the Kafka side
         (from START-TX until COMMIT-TX), on the system channel only.
      2. A test that executes the migration flow on the Kafka side
         (from START-TX, CONTEXT-TX until COMMIT-TX), on the system
         channel and a single application channel. 
      3. A test that executes the migration flow on the Raft side
         (from START-TX, CONTEXT-TX until COMMIT-TX, followed by restart
         of the orderer), on the system channel and two application channels.
      
      The tests are somewhat overlapping but are verifying different aspects of
      the expected behavior. Overall, the tests verify that the flow of:
      
       - START-TX => CONTEXT-TX (x #std-channels) => COMMIT-TX => Restart
         => (optional NONE-TX)
      
      results in a functional etcdraft-based ordering service. That is, normal
      transactions can be executed, and new channels can be created.
       
      
      The task introduces some minor changes to the nwo test framework in order to
      support the testing of the new feature:
      
       - Add OrdererCapabilities to Config, since kafka-to-raft migration is gated
         by a new V2_0 capability
         - add the following to tests that read the network config from file
      
           orderercapabilities:
             v20: false
      
       - Add a method to verify failure to update the OrdererConfig
       - Extend the configtx template to include support for the V2_0 orderer
         capability
       - Extend network.go to
         - support V2_0 orderer capability
         - verify channel creation is blocked
         - support ConsensusType.Type changes
       - Extend standard_networks.go to
         - support V2_0 orderer capability
         - define Kafka2Raft and Kafka2RaftMultiChannel configurations for
           migration tests
      
      Change-Id: I043b133b4c716f3bf53512f1999c7dfbc8aa67bb
      Signed-off-by: Yoav Tock <tock@il.ibm.com>
  13. 03 Mar, 2019 9 commits
    • [FAB-14181] GinkgoRecover should be deferred directly · 9543fa2b
      Jay Guo authored
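
      Background for the title, as a sketch using only the standard library:
      in Go, recover() stops a panic only when it is called directly by a
      deferred function. The helper recoverHelper below is a hypothetical
      stand-in for a function that calls recover() internally, as
      GinkgoRecover does; it is not the Ginkgo implementation.

```go
package main

import "fmt"

// recoverHelper stands in for a helper that calls recover() internally.
func recoverHelper(recovered *bool) {
	if r := recover(); r != nil {
		*recovered = true
	}
}

// deferredDirectly defers the helper itself, so its recover() call is made
// directly by a deferred function and stops the panic.
func deferredDirectly() (recovered bool) {
	defer recoverHelper(&recovered)
	panic("assertion failed")
}

// deferredViaWrapper calls the helper from inside a deferred closure. The
// helper's recover() is then one call too deep and returns nil, so the
// panic keeps propagating until the outer deferred function catches it.
func deferredViaWrapper() (recovered bool, escaped bool) {
	defer func() {
		if r := recover(); r != nil {
			escaped = true
		}
	}()
	func() {
		defer func() {
			recoverHelper(&recovered)
		}()
		panic("assertion failed")
	}()
	return
}

func main() {
	fmt.Println("deferred directly, recovered:", deferredDirectly())
	recovered, escaped := deferredViaWrapper()
	fmt.Println("deferred via wrapper, recovered:", recovered, "escaped:", escaped)
}
```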
      
      
      Change-Id: I7d5d3d70705668d18dfd5c4f1cc68b4d59573eda
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-14179] Perform checks of instantiation in test · 4e612182
      Jay Guo authored
      
      
      `checkPeers` should be supplied to `nwo.InstantiateChaincode`
      if it is called separately, so that checks are performed to confirm
      the instantiation. Otherwise, an immediate query/invoke following
      it could fail, because the deploy tx may not be committed yet.
      
      Change-Id: I8e870b183c279aca53961745031fdee7085efe18
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13656] Size-based snapshotting · 566562e7
      Jay Guo authored
      
      
      Instead of taking a snapshot every N blocks, this CR
      changes it to taking a snapshot every N bytes.

      This also sets the default SnapshotInterval to 100MB if
      it's unset. Otherwise, data in memory is never compacted
      until OOM.

      Meanwhile, DefaultSnapshotCatchUpEntries is reduced so
      that preserving extra entries every time a snapshot is taken
      does not take too much space. Slow nodes catch up
      using blockpuller, which is also efficient.
      
      Change-Id: I79cfeb8652fcbafdeb5793bf4f06267b95a858d6
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13934] Add GinkgoRecover to integration tests. · 9e9000a2
      Jay Guo authored
      
      
      This makes assertion failures easier to debug.
      
      Change-Id: I66f8ac8c9b755eaab37f89a10a39c3bfa44ef39a
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13618] Fix test flake in OSN eviction test · 06671310
      yacovm authored
      
      
      The integration test that checks that an orderer is evicted from a channel
      and stops its service for the channel broke due to:

      1) A log message that it used as an indicator was removed
         in a parallel CR.
      2) In another parallel CR, the communication layer now puts messages into
         the log asynchronously and doesn't block. As a result,
         a node might be evicted from the channel, but the other nodes will
         close the connection to it before it has a chance to obtain the block
         that evicts it from the channel.
      
      For (1) - the message that no longer exists was removed from the test.
      
      For (2) - the node that is removed is now always the leader, and this way
                it always gets the block update (because it sends it in the first
                place).
      
      Change-Id: Ib67d1a448447ef44d9b41f52c8ee8bddb6b064ce
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-14010] Integration test- remove OSN from cluster · e1b2171d
      yacovm authored
      
      
      This change set adds an integration test that removes an OSN
      from an application channel and system channel and ensures
      that the OSN gracefully shuts down for these channels.
      
      Change-Id: Idcdad8083f5881c6194185ad5f623c9c64323a02
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-12709] Add integration test for CheckQuorum · bd6bd0ec
      Jay Guo authored
      
      
      Change-Id: Ie80ff2f11de59a216a94fc61330f9d625ed16e59
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13848] Fix flaky integration test in raft cft · 20dc27fc
      Jay Guo authored
      
      
      The test may query the chaincode too soon after invocation, before the
      block is actually committed.
      
      Change-Id: I4159fb2dfb31310eccfd64fcb9a9a99ceef54db0
      Signed-off-by: Jay Guo <guojiannan1101@gmail.com>
    • [FAB-13363] Block verification for onboarding · b6dc844a
      yacovm authored
      
      
      This change set connects the block verification infrastructure
      for onboarding to the production code.
      
      Now, whenever an orderer onboards a channel - it also verifies the blocks
      of the application channels, by:
      
      1) Creating a bundle from the genesis block, which is derived from
         the system channel (which is verified using backward hash chain validation).
      2) Verifying blocks using the bundle.
      3) Replacing the bundle with a new bundle whenever a config block is pulled.
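
      The backward hash chain validation mentioned in step 1 can be sketched
      as follows. This is a simplified illustration: the Block type, the way
      the hash is computed, and the helpers buildChain and verifyBackward are
      assumptions, and do not match the actual Fabric block header encoding.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Block is a simplified stand-in for a ledger block (hypothetical type).
type Block struct {
	Number       uint64
	PreviousHash []byte
	Data         []byte
}

// hash digests the block number, previous hash and data. The real header
// hash is computed differently; this is only for illustration.
func (b Block) hash() []byte {
	var num [8]byte
	binary.BigEndian.PutUint64(num[:], b.Number)
	h := sha256.New()
	h.Write(num[:])
	h.Write(b.PreviousHash)
	h.Write(b.Data)
	return h.Sum(nil)
}

// buildChain links the payloads into a chain, oldest block first.
func buildChain(payloads ...string) []Block {
	chain := make([]Block, 0, len(payloads))
	var prev []byte
	for i, p := range payloads {
		b := Block{Number: uint64(i), PreviousHash: prev, Data: []byte(p)}
		chain = append(chain, b)
		prev = b.hash()
	}
	return chain
}

// verifyBackward walks from the newest block to the oldest and checks that
// each block's PreviousHash matches the hash of its predecessor.
func verifyBackward(chain []Block) error {
	for i := len(chain) - 1; i > 0; i-- {
		if !bytes.Equal(chain[i].PreviousHash, chain[i-1].hash()) {
			return fmt.Errorf("block %d does not link to block %d",
				chain[i].Number, chain[i-1].Number)
		}
	}
	return nil
}

func main() {
	chain := buildChain("genesis", "config", "tx")
	fmt.Println("intact chain:", verifyBackward(chain))
	chain[1].Data = []byte("tampered")
	fmt.Println("tampered chain:", verifyBackward(chain))
}
```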
      
      It also adds a check in the integration test, that ensures that no errors
      are reported in the log of the onboarded OSN.
      
      Change-Id: I3c5714f9d4491cdfd78e4e47407925136906d413
      Signed-off-by: yacovm <yacovm@il.ibm.com>
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
  14. 28 Feb, 2019 1 commit
    • [FAB-14331] - Fix flaky Couch DB health check test · 67b42faf
      Saad Karim authored
      
      
      There is a delay between the call to stop container,
      and the container getting removed from the docker network.
      
      Modified the test to poll the health check endpoint until
      the docker container is completely removed and the health
      check returns the expected result.
      
      Enables the Couch DB health check test [FAB-14333]
      
      Change-Id: If6da40793e63378e6fd79e1d446b66ffeb72af72
      Signed-off-by: Saad Karim <skarim@us.ibm.com>
  15. 27 Feb, 2019 9 commits
    • [FAB-13331] Refactor metadata updates in nwo · 4f802d51
      yacovm authored
      
      
      This change set refactors metadata updates by making them
      use a function that dictates how to handle consensus metadata.
      
      Change-Id: I3aa68e4b268a24887e4cba891e02ebce1a2ec65d
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-14332] disable flaky CouchDB healthcheck test · c0450096
      Artem Barger authored
      
      
      Change-Id: Ic17894d5eff66a195f93fcccacf2e3115587d7a5
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-13180] Orderer: auto-join existing inactive chains · 4bc13c8e
      yacovm authored
      
      
      This change set makes cluster-type OSNs autonomously detect channels
      that exist and that they should be part of (the channel configuration
      lists their public credentials as a consenter for the channel),
      but for which they neither run chains nor have the blocks in their ledger.
      
      This can happen for several reasons:
      - The OSN is added to an existing chain, and since it didn't participate
        in the chain so far, it didn't get the blocks that tell it is now
        part of the channel.
      - The OSN tried to detect whether it is part of a channel, but it
        wasn't able, because all OSNs of the system channel returned
        service-unavailable. This can happen if:
        - a leader election takes place
        - the network is acting up so the leadership was lost
        - a channel has been deserted (all OSNs left it).
      
      To take care of such use cases, all OSNs now:
      - Track inactive chains that they know of, but they do not participate in
      - Periodically(*) probe the system channel OSNs to see if they are now
        part of these chains or not.
      - If so, then they replicate the chains, and create instances of them,
        and replace the instances of the inactive chains in the registrar
        with the new instances of type etcdraft.
      
      (*) - 10 seconds after boot, then after 20 seconds,
            then after 40 seconds, and so on, until eventually every 5 minutes.
      
      Change-Id: I3c2a84e6f4f402e011e7a895345b3d3982247083
      Signed-off-by: yacovm <yacovm@il.ibm.com>
      Signed-off-by: Artem Barger <bartem@il.ibm.com>
    • [FAB-13332] Add cryptogen extend to integration tests · 6e34e329
      yacovm authored
      
      
      This change set adds an ability to call "cryptogen extend"
      in integration tests.
      
      Change-Id: I5db2adbdb1260bf47da33ad1b5df9022a8fb1c95
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13330] Rename GetConfigBlock to GetConfig in nwo · 4ba5d615
      yacovm authored
      
      
      This change set renames GetConfigBlock to GetConfig, to fit
      what it actually returns.
      
      Change-Id: Ica00d5e6dab91852767c1c4fd1d8af0454bd1bd5
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13362] Pulling not servicing chains in onboarding · 12948f81
      yacovm authored
      
      
      An orderer might not have permission to try and probe whether it belongs
      to a certain application channel.
      In addition, since the OSNs of an application channel might be a subset
      of the system channel OSNs, they may be unreachable at the time of
      onboarding, so all we will get from other OSNs is "service unavailable".
      
      This change set addresses this as follows: if we try to pull blocks
      in order to see whether we belong to the channel (by pulling the latest block)
      and we only get bad responses from all OSNs (unauthorized, not available),
      we don't panic. Instead, we just skip pulling the chain.

      If some orderer returns unauthorized, and the rest either do not return
      anything, or return a bad request, unavailable, etc., we return
      that we are unauthorized.
      
      If some orderer returns service unavailable, and the rest return
      anything that is not unauthorized, then we classify it as service
      unavailable.
      
      If no orderer returns unauthorized/unavailable,
      and all orderers return something bad or do not return anything at all,
      we still panic as before, because it means we probably misconfigured the
      node, or we are in a network partition, so we don't want to
      skip pulling blocks.
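
      The three rules above can be sketched as a classification function. The
      Status and Outcome types and the classify helper are hypothetical;
      treating any good response as "pull the chain" is an assumption that the
      text above implies but does not spell out.

```go
package main

import "fmt"

// Status is the kind of response received from one orderer (hypothetical).
type Status int

const (
	StatusOK           Status = iota
	StatusUnauthorized        // the orderer says we are not authorized
	StatusUnavailable         // the orderer returned service unavailable
	StatusOtherError          // bad request, no response, and so on
)

// Outcome is the decision taken for the channel (hypothetical).
type Outcome int

const (
	OutcomePull             Outcome = iota
	OutcomeSkipUnauthorized         // skip the chain: we are unauthorized
	OutcomeSkipUnavailable          // skip the chain: service unavailable
	OutcomeFatal                    // panic, as before this change set
)

// classify applies the precedence described in the commit message:
// unauthorized wins over unavailable, and unavailable wins over other errors.
func classify(responses []Status) Outcome {
	var unauthorized, unavailable bool
	for _, s := range responses {
		switch s {
		case StatusOK:
			// Assumption: a single good response is enough to pull.
			return OutcomePull
		case StatusUnauthorized:
			unauthorized = true
		case StatusUnavailable:
			unavailable = true
		}
	}
	switch {
	case unauthorized:
		return OutcomeSkipUnauthorized
	case unavailable:
		return OutcomeSkipUnavailable
	default:
		return OutcomeFatal
	}
}

func main() {
	mixed := []Status{StatusUnavailable, StatusUnauthorized, StatusOtherError}
	fmt.Println(classify(mixed) == OutcomeSkipUnauthorized)
	fmt.Println(classify([]Status{StatusOtherError}) == OutcomeFatal)
}
```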
      
      This change set also enhances the reconfiguration integration test
      to include a third channel for which the onboarded OSN is not authorized.
      
      Change-Id: I6f9b0cfe3671794ef1c036b432e77e2ac55b1efd
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13441] Properly capture OSN output · c73870ff
      yacovm authored
      
      
      The reconfiguration and onboarding integration tests ensure
      that the OSNs stop logging errors at the end of the test,
      in order to ensure there aren't any unnoticed faults
      that occurred due to reconfiguration/onboarding.
      
      The function used the wrong method to obtain a buffer
      that is used to read the process's output.
      
      Change-Id: Ieadae1bb083454b195cbfe52b41582dc9dbbf80a
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13643] Leader crash and failover integration test · 5c3e122f
      yacovm authored
      
      
      This change set adds a test case that:
      
      1) Spawns 3 etcdraft OSNs
      2) Finds out who is the leader
      3) Kills it
      4) Makes sure one of the remaining 2 OSNs takes over.
      
      Change-Id: I6f42003ddd987b7927a0f06018682829844f3994
      Signed-off-by: yacovm <yacovm@il.ibm.com>
    • [FAB-13415] DRY up UpdateConsensusMetadata in nwo · 86cb2d8c
      yacovm authored
      
      
      This change set makes AddConsenter and RemoveConsenter use
      the consensus-specific method UpdateEtcdRaftMetadata instead
      of the generic UpdateConsensusMetadata, to remove
      code duplication.
      
      It also addresses a few nits in etcdraft_reconfig_test.
      
      Change-Id: I86d50fd80d4985df77474c054ce916f0d2fb62e7
      Signed-off-by: yacovm <yacovm@il.ibm.com>