
Hashring alignment and cache reads #26

Open
dolvany opened this issue Sep 15, 2017 · 29 comments

Comments

@dolvany (Contributor) commented Sep 15, 2017

I would like to consult the trifecta of graphite wisdom @jjneely @deniszh @grobian regarding some general questions.

First, I would like to understand how to align the carbon-c-relay hashring with the buckytools hashring given the following configurations.

cluster cache1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
    srv2:2053
    srv2:2054
  ;
buckyd -hash jump_fnv1a srv1 srv2

Would this result in aligned hashrings even though carbon-c-relay is sending to multiple cache processes on each server?

Does it make sense to have cache instances dedicated to writing and cache instances dedicated to reading? Would this make reads and writes more performant?

@deniszh (Contributor) commented Sep 15, 2017

I don't think that it will work like this.
Usually, people use a two-tier setup in that situation: one relay distributes metrics across the servers, and another relay on each host distributes metrics across the local carbon-cache instances for load balancing.
Or you can use go-carbon and get rid of the second tier; it's performant enough.
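For illustration, a two-tier layout along those lines might look roughly like this (hostnames and ports are placeholders). The front relay hashes across the storage servers, and a second relay on each storage server distributes metrics over its local carbon-cache instances:

# front relay (on the relay hosts)
cluster storage
  jump_fnv1a_ch
    srv1:2052
    srv2:2052
  ;
match *
  send to
    storage
  stop
  ;

# local relay on srv1, listening on port 2052
cluster caches
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
  ;
match *
  send to
    caches
  stop
  ;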

Does it make sense to have cache instances dedicated to writing and cache instances dedicated to reading? Would this make reads and writes more performant?

That makes no sense: if a carbon daemon gets no writes, it will have no cache to read from.

@dolvany (Contributor, Author) commented Sep 15, 2017

Hmm, judging by the docs, it seems that instance names may solve this...

cluster cache1
  jump_fnv1a_ch
    srv1:2053=a
    srv1:2054=b
    srv2:2053=c
    srv2:2054=d
  ;
buckyd -hash jump_fnv1a srv1:a srv1:b srv2:c srv2:d

After taking this for a test drive, it seems that buckyd doesn't like it: Error parsing hashring: strconv.ParseInt: parsing "a": invalid syntax

@jjneely (Owner) commented Sep 15, 2017

You need to give buckyd the exact same host/port/instance strings as you do for carbon-c-relay. So:

./buckyd -hash jump_fnv1a srv1:2053=a srv1:2054=b srv2:2053=c srv2:2054=d

@dolvany (Contributor, Author) commented Sep 15, 2017

Ah, thx @jjneely. The readme seems a bit misleading on this syntax. Does it require an update?

SERVER
SERVER:INSTANCE
SERVER:PORT:INSTANCE

Also, when I use buckyd srv1 I get this output for bucky servers.

-bash-4.2$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: jump_fnv1a:	  0:srv1
Number of replicas: 1
Found these servers:
	srv1

Is cluster healthy: true
-bash-4.2$

But when I use buckyd srv1:2053=a srv1:2054=b I get this output for bucky servers.

-bash-4.2$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: jump_fnv1a:	  0:srv1	  1:srv1
Number of replicas: 1
Found these servers:
	srv1
	srv1

Is cluster healthy: false
2017/09/15 18:44:33 Cluster is inconsistent.
-bash-4.2$

Not sure what is causing the cluster to go inconsistent or if it is something to be concerned about.

@deniszh (Contributor) commented Sep 15, 2017

You would need to run buckyd for every instance, so e.g. two buckyd daemons on the same port on srv1, which you can't do.
That's why I said:

I don't think that it will work like this.

@dolvany (Contributor, Author) commented Sep 15, 2017

@deniszh One buckyd per carbon-cache instance, not per server? So, would the typical deployment only run one carbon-cache per server?

@jjneely (Owner) commented Sep 15, 2017

I only run one carbon-cache / go-carbon daemon per server.

The way replication/load balancing works, I want to make sure I have two copies of the same metric on different servers, not assigned to two different daemons that happen to live on the same host. (I'll hopefully have some replication support in buckytools in the next month or so.)

In the far distant past I did run multiple carbon-cache daemons per server to handle my throughput, but the storage requirements grew so much that I had more disk IO than the ingestion could ever keep up with.

@dolvany (Contributor, Author) commented Sep 15, 2017

Thx, @jjneely. Let me provide some more transparency regarding my goal. Below is the carbon-c-relay config. I am not using a replication factor, just duplicating the metrics to two separate clusters. I would like to use bucky to manage each cluster independently. Is reducing the number of carbon-cache instances on each server to one the only reasonable way to integrate bucky?

cluster c1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
    srv2:2053
    srv2:2054
  ;
cluster c2
  jump_fnv1a_ch
    srv3:2053
    srv3:2054
    srv4:2053
    srv4:2054
  ;
match *
  send to
    c1
    c2
  stop
  ;

@jjneely (Owner) commented Sep 15, 2017

At this point, yes, that's the easiest way to that goal.

Although, I guess the real bug here is making bucky aware of multiple instances on the same physical host.

@dolvany (Contributor, Author) commented Sep 15, 2017

There are presumably two things going on here, @jjneely:

  1. Resolve a graphite key to a cluster member. It seems that this should work regardless of whether a cluster member appears twice (multiple carbon-cache instances).

  2. Enumerate the cluster members. This seems to be problematic if a cluster member appears more than once (multiple carbon-cache instances). Is it enough to just ensure that entries in the cluster member list are unique?

I suppose my thinking is more along the lines of ignoring the fact that multiple instances are on the same physical host, except for hashring purposes.

@dolvany (Contributor, Author) commented Sep 16, 2017

I made some tweaks to bucky client to support multi-instance. I removed the check to verify that the cluster members length equals the hashring length, since this would not be true if a cluster member has multiple hashring entries. I also removed duplicates from the servers slice. I have no idea if this is a breaking change for anything else, but bucky servers runs clean.

diff --git a/cmd/bucky/cluster.go b/cmd/bucky/cluster.go
index 4af0585..d22eabf 100644
--- a/cmd/bucky/cluster.go
+++ b/cmd/bucky/cluster.go
@@ -7,6 +7,7 @@ import (
 )

 import "github.com/jjneely/buckytools/hashing"
+import "github.com/krasoffski/gomill/unique"

 type ClusterConfig struct {
        // Port is the port remote buckyd daemons listen on
@@ -79,6 +80,7 @@ func GetClusterConfig(hostport string) (*ClusterConfig, error) {
                Cluster.Servers = append(Cluster.Servers, v.Server)
        }

+       Cluster.Servers = unique.Strings(Cluster.Servers)
        members := make([]*hashing.JSONRingType, 0)
        for _, srv := range Cluster.Servers {
                if srv == master.Name {
@@ -105,9 +107,9 @@ func isHealthy(master *hashing.JSONRingType, ring []*hashing.JSONRingType) bool
        // XXX: Take replicas into account
        // The initial buckyd daemon isn't in the ring, so we need to add 1
        // to the length.
-       if len(master.Nodes) != len(ring)+1 {
-               return false
-       }
+       // if len(master.Nodes) != len(ring)+1 {
+       //      return false
+       // }

        // We compare each ring to the first one
        for _, v := range ring {

@grobian (Contributor) commented Sep 16, 2017

I'd like to point out that, unlike carbon_ch, fnv1a_ch does include the port in its hash key. Since you use that hash, I think there should be no such thing as "duplicate" cluster members. @jjneely wrote this, IMO, in #26 (comment).

@dolvany (Contributor, Author) commented Sep 16, 2017

Let me see if my assumptions are correct, @grobian. Please let me know if any of this is amiss. Bucky client derives the list of cluster hosts from the destinations in the hashring. Regardless of hash, the same cluster host can appear more than once in the hashring. Bucky client doesn't seem to like it when it derives the same host more than once from the hashring. This raises a question. Can the same destination appear more than once in the hashring? It seems like it could provide a weighting factor for heterogeneous hardware. I am trying to figure out how to shoehorn carbon-c-relay and buckytools into a preexisting cluster that was scaled up with multiple instances of carbon-cache.

@dolvany (Contributor, Author) commented Sep 16, 2017

I wonder if this is solvable in carbon-c-relay with a cluster of clusters approach.

cluster srv1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
  ;
cluster srv2
  jump_fnv1a_ch
    srv2:2053
    srv2:2054
  ;
cluster c1
  jump_fnv1a_ch
    srv1
    srv2
  ;
match *
  send to
    c1
  stop
  ;
buckyd srv1 srv2

@deniszh (Contributor) commented Sep 16, 2017

Regardless of hash, the same cluster host can appear more than once in the hashring

Does it? Not really sure.

Can the same destination appear more than once in the hashring?

IMO no - by definition of hashring.

I am trying to figure out how to shoehorn carbon-c-relay and buckytools into a preexisting cluster which was scaled up with multiple instances of carbon-cache.

That's also a problem: I do not really understand what your problem is and what you're trying to achieve.

@dolvany (Contributor, Author) commented Sep 16, 2017

@deniszh, allow me to illustrate with a truncated example from the buckytools readme.

buckyd graphite010-g5:a graphite010-g5:b

It shows the same host, graphite010-g5, appearing multiple times in the hashring, once for each carbon-cache instance on the host. This is precisely the carbon-cache deployment that I have. The challenge I am having is that bucky servers fails the consistency check when I configure buckyd in this manner. Perhaps I am misunderstanding something fundamental here.

@grobian (Contributor) commented Sep 17, 2017

Could you perhaps describe your setup from the initial relay down to the carbon-cache processes? Getting a good idea about the flow of your metrics is key to get this right IMO.

@dolvany (Contributor, Author) commented Sep 17, 2017

Sure, @grobian. Metrics->load balancer->multiple VMs with carbon-c-relay->multiple physical boxes each running multiple instances of carbon-cache. Carbon-c-relay config is identical on all VMs--consistent hash to all carbon-cache instances. I believe this is all working as intended--each graphite key is sent to a specific carbon-cache instance. Now, I am trying to integrate the cluster management piece.

@grobian (Contributor) commented Sep 18, 2017

Just summing up what has been said above to ensure we're all on the same page:

  1. you use a jump hash to distribute metrics over your cluster
  2. you have two such clusters
  3. both clusters receive the same input (i.e. they are mirrors) and, because they are the same size, their distribution per server is identical (jump hash doesn't care about the server name, port or key, just the final ordering; see the sketch below)
  4. side note: I recently applied this fix to c-relay: grobian/carbon-c-relay@1c50590, which should bring it back to the documented ordering; both bucky and c-relay need to agree on the ordering to have good operations
  5. your clusters have multiple carbon-cache instances running on the same host, and this is directly visible in your cluster configuration (i.e. the instances affect your hash ring)
  6. you want to perform maintenance on your cluster using bucky

Due to 5, bucky and other tools get a tough job, because you probably share the /var/lib/carbon/whisper directory among the multiple instances. It also makes future scaling of instances on each server up or down impossible, because it would change the hash allocation (due to jump). To solve this, people typically use a c-relay on the storage server that simply any_of's all incoming metrics to a set of carbon-cache instances on the local host, thereby hiding all of this from tools like bucky. Your best start would be to implement this to be able to do 6, but it will cause metrics to move between srv1 and srv2 (and similarly for srv3 and srv4, of course).
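To illustrate points 3 and 4: a minimal Go sketch of jump consistent hashing (the Lamping & Veach algorithm), assuming the metric name is first reduced to a 64-bit key with fnv1a, which is roughly what jump_fnv1a_ch does. The hash only returns an index into the destination list, so c-relay and bucky must order that list identically for their rings to line up; the destination list and metric name below are placeholders, not the exact buckytools or c-relay code.

package main

import (
	"fmt"
	"hash/fnv"
)

// jump maps a 64-bit key to a bucket index in [0, numBuckets)
// using the Lamping & Veach jump consistent hash.
func jump(key uint64, numBuckets int) int {
	var b, j int64 = -1, 0
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int(b)
}

func main() {
	// Placeholder destinations: the index returned by jump() is a position
	// in this list, which is why both tools must agree on its ordering.
	destinations := []string{"srv1:2053", "srv1:2054", "srv2:2053", "srv2:2054"}

	h := fnv.New64a()
	h.Write([]byte("some.metric.name")) // placeholder metric key
	idx := jump(h.Sum64(), len(destinations))
	fmt.Println("metric maps to", destinations[idx])
}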

@dolvany (Contributor, Author) commented Sep 18, 2017

1, correct. 2, correct. 3, I am not familiar enough with the inner workings of the hashes to say whether I completely understand your point regarding the final ordering--an example would certainly clarify this for me. 4, awesome. 5, correct. 6, correct. To put this in terms of configuration, it seems you are suggesting the following (leaving the mirror cluster out for brevity). Will this achieve hashring alignment across c-relay and bucky?

Front Relay

cluster c1
  jump_fnv1a_ch
    srv1:2052
    srv2:2052
  ;
match *
  send to
    c1
  stop
  ;

Back Relay srv1:2052

cluster c1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
  ;
match *
  send to
    c1
  stop
  ;

Back Relay srv2:2052

cluster c1
  jump_fnv1a_ch
    srv2:2053
    srv2:2054
  ;
match *
  send to
    c1
  stop
  ;
buckyd srv1 srv2

Also, I am curious if the use of multiple carbon-cache instances per host is common enough to solve the problem without the use of a second layer of relays. It seems like it would be trivial to support two layers of hashing in a single c-relay instance. Thoughts, @grobian?

@grobian (Contributor) commented Sep 19, 2017

You want to avoid having multiple tiers of (c-)relays, is that correct? While I understand the rationale, it currently isn't possible, and I don't see it as a high priority to implement double-hashing or anything like that.
Your config is indeed how it would look. For performance and flexibility you could use any_of on the back relays instead of jump_fnv1a_ch. It shouldn't change anything for your situation.
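As a sketch, the back relay on srv1 with any_of instead of the consistent hash would look something like this (same placeholder ports as above):

cluster caches
  any_of
    srv1:2053
    srv1:2054
  ;
match *
  send to
    caches
  stop
  ;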

@dolvany (Contributor, Author) commented Sep 19, 2017

Sounds good, @grobian. Thanks for the guidance!

@azhiltsov commented

@dolvany we had 12 carbon-cache processes per host, with carbon-c-relay on the same host in front of them in order to distribute the load. At a certain point it stopped working performance-wise and we switched to go-carbon, which in our current setup can easily handle 300K points/sec, and with some tuning and external iSCSI storage up to 1000K points/sec sustained. It also eliminates carbon-c-relay on the host, and you will be able to reduce the number of destinations in your relay configs. It plays nice with bucky. Just have a look.

@grobian (Contributor) commented Sep 19, 2017

I would concur with azhiltsov's suggestion. carbon-cache.py isn't multi-threaded, hence running multiple instances in parallel. A c-relay in front of it is just a workaround; in reality it should have been multi-threaded itself. go-carbon solves that nicely (and avoids the need for a local c-relay).

@dolvany (Contributor, Author) commented Sep 20, 2017

@azhiltsov @grobian So, would go-carbon be fronted with c-relay? It looks like go-carbon is a replacement for carbon-cache. What would the design look like?

@dolvany (Contributor, Author) commented Sep 20, 2017

@grobian If I use any_of on the back relay, would this result in some misalignment with CARBONLINK_HOSTS?

@grobian (Contributor) commented Sep 20, 2017

sender -> c-relay -> go-carbon

wrt CARBONLINK_HOSTS, I think that doesn't work at all anyway, because jump_fnv1a_ch is not understood by graphite-web. This is the reason why we started carbonzipper. This "smart proxy" acts as a single carbon-cache to graphite-web; later we also replaced the latter with carbonapi.

@dolvany (Contributor, Author) commented Sep 20, 2017

@grobian But I could use carbon_ch on everything, and then CARBONLINK_HOSTS would align, with the caveat that the distribution may not be as even as with other hashes, yes?

@grobian (Contributor) commented Sep 20, 2017

No. Only if you use a single cluster with carbon_ch will CARBONLINK_HOSTS be able to predict where metrics are located. It understands replication IIRC, but it always probes in order; in other words, it's highly unsuitable for setups that need to be redundant/fault-tolerant and performant.
