
Re: HBase Short-Circuit Read Questions


On Fri, Jun 1, 2018 at 12:01 PM, Stack <stack@xxxxxxxxxx> wrote:

> On Fri, Jun 1, 2018 at 9:36 AM, Mike Drob <madrob@xxxxxxxxxxxx> wrote:
>
> > Hi folks, I was going through our docs looking at SCR setup and had some
> > confusion. Asking here before filing JIRA issues. After writing this, I'm
> > realizing the length got a bit out of hand. I don't want to split this into
> > several threads because I think the information is all related, but I may
> > have to do that if a single thread becomes difficult to follow.
> >
> >
> Thanks for taking a look. See inline below.
>
>
>
> >
> > The main docs link: http://hbase.apache.org/book.html#shortcircuit.reads
> >
> > 1)
> >
> > Docs claim: dfs.client.read.shortcircuit.skip.checksum = true so we don't
> > double checksum (HBase does its own checksumming to save on i/os; see
> > hbase.regionserver.checksum.verify for more on this).
> >
> > Code claims:
> >
> > https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/CommonFSUtils.java#L784-L788
> >
> > That if this property is set, then we log a warning?
> >
> > Unrelated: this is duplicated between CommonFSUtils and FSUtils; we will
> > need a JIRA to clean that up later.
> >
> > Also, there's a comment in
> > https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L689-L690
> > that claims we automatically disable it, which we do in the HFileSystem
> > constructor by setting the same dfs property in our conf to true.
> >
> > So I'm confused about whether we should be setting the property as the
> > docs claim, not setting it as FSUtils warns, or ignoring it and letting
> > the RS auto-set it.
> >
> >
>
> This is a classic font of confusion, with layers of attempts over time at
> simplification, auto-configuring, code migration, and poor docs (I believe
> I'm the author of a good bit of this mess). It would be great if it got a
> revamp informed by tinkering with the configs and an edit by a better
> writer than I.
>
>
I'm working on untangling this mess, but I just got lost in the weeds of
the argument on HBASE-6868.

I have to assume that this concern over double checksumming, or missing
checksums on remote files, or whatever else is going on in that issue, only
applies to truly ancient versions of Hadoop at this point? Do we think it's
safe to say that if SCR is enabled, we always want to enable HBase
checksums and skip HDFS checksums? That's what the docs appear to
recommend, but the code approaches it from the converse direction:

If HBase checksumming is enabled, we set
dfs.client.read.shortcircuit.skip.checksum to true and call
fs.setVerifyChecksum(false) in HFileSystem. The user doesn't even have the
option to override that. HBase checksumming is on by default, so either we
don't need to mention any of this in the docs, or we can mention turning on
HBase checksums and turning off HDFS checksums and then clarify that none
of it is actionable.
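For reference, here is a minimal sketch of the settings under discussion.
This assumes the current behavior described above, where HFileSystem forces
the HDFS-side values itself when HBase checksums are on; if that
auto-configuration stands, the explicit skip.checksum setting below is
redundant:

```xml
<!-- hbase-site.xml: HBase-side checksumming (this is the default) -->
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>

<!-- client-side HDFS conf: enable SCR; skip.checksum is what HFileSystem
     effectively sets for you so HDFS does not checksum a second time -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>true</value>
</property>
```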


> > 2)
> >
> > Docs claim: dfs.client.read.shortcircuit.buffer.size = 131072 Important
> to
> > avoid OOME — hbase has a default it uses if unset, see
> > hbase.dfs.client.read.shortcircuit.buffer.size; its default is 131072.
> >
> > This is very confusing: we should set the property to some value, because
> > if it's unset then we will use... the same value? This reads like needless
> > operator burden.
> >
> > Looking at the code, the default we really use is 64 * 1024 * 2 = 131072,
> > which matches the documented value.
> >
> > The default HDFS value is 1024 * 1024, which suggests that they're
> > expecting a value in the MB range and we're giving one in the KB range?
> > See:
> > https://github.com/apache/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java#L146-L147
> >
> > Just now, I'm realizing that the initial comment in the docs might mean to
> > tune it way down to avoid OOME; my initial reading was that we need to
> > increase the ceiling from whatever default setting comes in via HDFS. It
> > would be good to clarify this, and also to state what units the value is
> > in.
> >
> >
> Agree.
>
> IIRC, the intent was to set it way down from the usual default because
> hbase runs w/ many more open files than your typical HDFS client does.
>
Ok, we can update the docs to clarify that this is a value in bytes, that
the default HDFS value is 1 MB, that our default is 128 KB, and that the
total memory used will be the buffer size * the number of open file handles.
What's a reasonable first-order approximation for the number of files per RS
that will be affected by SCR? Hosted regions * column families? I don't
think this needs a code change, but the recommendation to set 131072 should
be removed.
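To make the sizing concrete, here is a rough back-of-the-envelope sketch.
The per-RS file count of regions * column families is the first-order
approximation proposed above, not something the code guarantees:

```python
# Rough estimate of RegionServer memory consumed by short-circuit read
# buffers, assuming one buffer per open store file (a heuristic from the
# discussion above, not a guarantee from the code).

HBASE_DEFAULT_BUFFER = 128 * 1024   # 131072 bytes, HBase's default
HDFS_DEFAULT_BUFFER = 1024 * 1024   # 1 MB, the stock HDFS client default

def scr_buffer_bytes(regions, column_families, buffer_size):
    """Estimate total bytes held in SCR buffers for one RegionServer."""
    open_files = regions * column_families  # first-order approximation
    return open_files * buffer_size

# 200 regions x 3 column families = 600 open files
print(scr_buffer_bytes(200, 3, HBASE_DEFAULT_BUFFER))  # 78643200 (~75 MB)
print(scr_buffer_bytes(200, 3, HDFS_DEFAULT_BUFFER))   # 629145600 (~600 MB)
```

The gap between those two numbers is why leaving the stock HDFS default in
place can hurt on a RegionServer with many open files.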

>
>
>
> > 3)
> >
> > Docs suggest: Ensure data locality. In hbase-site.xml, set
> > hbase.hstore.min.locality.to.skip.major.compact = 0.7 (meaning that
> > 0.7 <= n <= 1).
> >
> > I can't find anything else about this property in the docs. Digging
> > through the code, I find an oblique reference to HBASE-11195, but there's
> > no release note or documentation there, and reading the issue doesn't
> > help me understand how this operates either. It looks like there was
> > follow-on work done, but it would be useful to know how we arrived at 0.7
> > (it seems arbitrary) and how an operator could figure out whether that
> > setting is good for them or needs to slide higher/lower.
> >
> >
> I don't know anything of the above.
>
Will save this for later then.
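For anyone following along, my current understanding (an assumption from
reading HBASE-11195, not confirmed by the docs) is that this threshold
gates locality-driven major compactions: a store whose HDFS block locality
is already at or above the threshold skips the major compaction that would
otherwise be triggered to pull its blocks local. Something like:

```xml
<!-- hbase-site.xml: sketch, assuming the HBASE-11195 reading above.
     A store with block locality >= 0.7 skips the major compaction that
     would otherwise run purely to restore locality. -->
<property>
  <name>hbase.hstore.min.locality.to.skip.major.compact</name>
  <value>0.7</value>
</property>
```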


> Thanks,
> S
>
>
>
> >
> > Thanks,
> > Mike
> >
>