Add support for HDFS federation #227

Open · wants to merge 1 commit into master
Conversation

j4ns8i commented Mar 24, 2020

These changes now enforce proper HDFS federation/HA configurations. Specifically:

  • Highly available (HA) clusters require
    1. a nameservice in dfs.nameservices
    2. namenode ids for that nameservice in dfs.ha.namenodes.NAMESERVICE
    3. rpc addresses for those namenode ids in dfs.namenode.rpc-address.NAMESERVICE.NNID
  • Non-HA but federated clusters require
    1. a nameservice in dfs.nameservices
    2. an rpc address for the namenode in dfs.namenode.rpc-address.NAMESERVICE
  • Non-HA and non-federated clusters require
    1. an rpc address for the namenode in dfs.namenode.rpc-address

HA and federated configuration takes precedence: if a property like dfs.nameservices is present, default clients will not fall back to a lone namenode rpc address defined only by dfs.namenode.rpc-address.
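For concreteness, the three tiers above might look like this in hdfs-site.xml. The property names are the standard Hadoop keys named in the list; the nameservice ids and hostnames are made up for illustration:

```xml
<configuration>
  <!-- Both nameservices must be declared here. -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <!-- ns1 is HA: namenode ids, then one rpc address per id. -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>nn2.example.com:8020</value>
  </property>
  <!-- ns2 is federated but non-HA: one address keyed by nameservice alone. -->
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>ns2-nn.example.com:8020</value>
  </property>
</configuration>
```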

I'm sorry I haven't written any additional tests; I haven't had the time to set up an image on which to run a minicluster. If you have an idea for easily setting up a testing environment I'd be happy to try it. I think the tests (or testing environment) might actually need some rework to match upstream hadoop client behavior.

I tried not to be too opinionated but some of these changes did break previous behavior. This was done to match the behavior I observed from the hadoop fs command, which I used as a generalization for upstream hadoop client behavior. For example, a namenode specified through dfs.namenode.rpc-address.NAMESERVICE.NNID will no longer be returned from Namenodes or DefaultNamenodes unless that nameservice and namenode id are present in dfs.nameservices and dfs.ha.namenodes.NAMESERVICE, respectively. This should close #225, however.

Let me know what you think.

colinmarc (Owner) commented

Hi @j4ns8i, sorry for the radio silence.

Unfortunately, breaking changes to the API here would require a major version bump.

Is the problem here that the different nameservices aren't available through HadoopConf, or is it that the commandline client doesn't work correctly in federated environments? If it's the latter, probably the most expedient way forward is to add this functionality to HadoopConf.Namenodes (without changing any signatures - I think that's fine in this case).

j4ns8i (Author) commented Jul 19, 2020

No worries at all 🙂

The problem I was facing was related to the different nameservices not working as I'd hoped from the HadoopConf generated from my configuration files. I'm importing this package into my own application rather than using the commandline client. The issue in particular, however, was not that the options weren't read from configuration, but that all namenodes found in the configuration files were being returned from the HadoopConf's Namenodes() function regardless of nameservice. This was causing my client to fail to look up files in one of my two nameservices, because the other one seemingly took precedence. I haven't dug deep enough to find out why that was happening.

While this is purely anecdotal, my application is currently running successfully using this PR's branch with the ability to distinguish between nameservices.

I appreciate the response and especially the effort that was put into developing and publishing this package - thank you!

colinmarc (Owner) commented Jul 19, 2020

Ok, then would you be willing to rejigger this PR a bit?

  • Change the implementation (but not signature) of Namenodes() to return the default fs namenodes
  • Add a test for the above
  • Add a NamenodesForNameservice(ns string) method that tracks the functionality currently in the PR
  • Add tests for it.

There are currently test fixtures in hadoop/testdata which you can copy/expand.
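If it helps, here's a rough sketch of the method shape proposed above, resolved against a flat property map. The `HadoopConf` map alias, helper logic, and error text are placeholders for illustration, not the library's actual implementation:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// HadoopConf stands in for the package's parsed-property map; the method
// body below is only a sketch of the proposed shape, not merged code.
type HadoopConf map[string]string

// NamenodesForNameservice returns the namenode rpc addresses configured
// for a single nameservice, leaving Namenodes() free for the default fs.
func (conf HadoopConf) NamenodesForNameservice(ns string) ([]string, error) {
	if ids, ok := conf["dfs.ha.namenodes."+ns]; ok {
		// HA: dfs.ha.namenodes.NS lists ids; each id has its own address.
		var addrs []string
		for _, id := range strings.Split(ids, ",") {
			if addr, ok := conf["dfs.namenode.rpc-address."+ns+"."+strings.TrimSpace(id)]; ok {
				addrs = append(addrs, addr)
			}
		}
		return addrs, nil
	}
	// Federated but non-HA: a single address keyed by the nameservice.
	if addr, ok := conf["dfs.namenode.rpc-address."+ns]; ok {
		return []string{addr}, nil
	}
	return nil, errors.New("no namenode rpc addresses found for nameservice " + ns)
}

func main() {
	conf := HadoopConf{
		"dfs.nameservices":                 "ns1",
		"dfs.ha.namenodes.ns1":             "nn1,nn2",
		"dfs.namenode.rpc-address.ns1.nn1": "a.example.com:8020",
		"dfs.namenode.rpc-address.ns1.nn2": "b.example.com:8020",
	}
	addrs, err := conf.NamenodesForNameservice("ns1")
	fmt.Println(addrs, err) // [a.example.com:8020 b.example.com:8020] <nil>
}
```

Tests against the fixtures in hadoop/testdata would then assert per-nameservice results rather than one flat list.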

j4ns8i (Author) commented Aug 3, 2020

I'll try to get to this at some point; I've just been tied up a bit recently.

xtrimf commented Feb 1, 2023

@j4ns8i should we wait for you? :)

Successfully merging this pull request may close: hdfs-site.xml with multiple clusters