HBase Short-Circuit Read Questions


Hi folks, I was going through our docs on short-circuit read (SCR) setup and
ran into some confusion. I'm asking here before filing JIRA issues. Having
written this, I realize it got long. I'd rather not split it into several
threads because the information is all related, but I may have to if a
single thread becomes difficult to follow.


The main docs link: http://hbase.apache.org/book.html#shortcircuit.reads

1)

Docs claim: dfs.client.read.shortcircuit.skip.checksum = true so we don’t
double checksum (HBase does its own checksumming to save on i/os. See
hbase.regionserver.checksum.verify for more on this.

Code claims that if this property is set, we log a warning?

https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/CommonFSUtils.java#L784-L788

Unrelated: this check is duplicated between CommonFSUtils and FSUtils, so
we'll need a JIRA to clean that up later.

Also, there's a comment in
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L689-L690
that claims we automatically disable HDFS checksum verification, which we do
in the HFileSystem constructor by setting the same dfs property in our conf
to true.

So I'm unsure whether we should set the property as the docs claim, leave it
unset as FSUtils warns, or ignore it and let the RS auto-set it.
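
A minimal sketch of the check as described above, to make the conflict
concrete. This is not the HBase source: it assumes the warning fires only
when the property is true, and the class name, method name, warning text,
and the plain Map standing in for a Hadoop Configuration are all
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the check described above; not HBase source.
class SkipChecksumCheckSketch {
  static final String SKIP_CHECKSUM_KEY = "dfs.client.read.shortcircuit.skip.checksum";

  // Returns a warning message when the property is set to true (an assumption
  // about when the warning fires), and an empty Optional otherwise.
  static Optional<String> warningFor(Map<String, String> conf) {
    boolean skip = Boolean.parseBoolean(conf.getOrDefault(SKIP_CHECKSUM_KEY, "false"));
    if (skip) {
      return Optional.of(SKIP_CHECKSUM_KEY + " should not be set to true");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(warningFor(conf).isPresent()); // property unset
    conf.put(SKIP_CHECKSUM_KEY, "true");              // what the docs recommend
    System.out.println(warningFor(conf).isPresent());
  }
}
```

Under that assumption, an operator who follows the docs gets a warning, and
one who leaves the property unset does not.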

Also unrelated, there is a check in HFileSystem from HBASE-5885 for what I
think is HADOOP-9307, but we should be able to simplify some of that logic
now.

2)

Docs claim: dfs.client.read.shortcircuit.buffer.size = 131072 Important to
avoid OOME — hbase has a default it uses if unset, see
hbase.dfs.client.read.shortcircuit.buffer.size; its default is 131072.

This is very confusing: we are told to set the property to some value
because, if it's unset, we will use... the same value? This reads like
needless operator burden.

Looking at the code, the default we really use is 64 * 1024 * 2 = 131072,
which matches the documented value, so setting it explicitly changes nothing.

The default HDFS value is 1024 * 1024, which suggests that they're
expecting a value in the MB range and we're giving one in the KB range?
See:
https://github.com/apache/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java#L146-L147

Just now, I'm realizing that the initial comment in the docs might mean we
should tune it way down to avoid OOME. My initial reading was that we need
to raise the ceiling from whatever default comes in via HDFS. It would be
good to clarify this, and also to figure out what units the value is in.
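
A small sketch of the arithmetic and of the fallback order the docs
describe: use dfs.client.read.shortcircuit.buffer.size if it is set,
otherwise hbase.dfs.client.read.shortcircuit.buffer.size, otherwise
64 * 1024 * 2. This is not the HBase source. The class and method names and
the Map standing in for a Hadoop Configuration are hypothetical, and the
comparison assumes both defaults are in the same unit, which is still to be
confirmed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fallback described in the docs; not HBase source.
class ShortCircuitBufferSizeSketch {
  static final String DFS_KEY = "dfs.client.read.shortcircuit.buffer.size";
  static final String HBASE_KEY = "hbase." + DFS_KEY;
  static final int HBASE_DEFAULT = 64 * 1024 * 2; // 131072
  static final int HDFS_DEFAULT = 1024 * 1024;    // 1048576

  // Effective size: the explicit dfs value if present, else the
  // hbase-prefixed value, else the HBase default.
  static int effectiveSize(Map<String, String> conf) {
    String explicit = conf.get(DFS_KEY);
    if (explicit != null) {
      return Integer.parseInt(explicit);
    }
    return Integer.parseInt(conf.getOrDefault(HBASE_KEY, Integer.toString(HBASE_DEFAULT)));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(effectiveSize(conf));          // prints 131072
    System.out.println(HDFS_DEFAULT / HBASE_DEFAULT); // prints 8
  }
}
```

If both values are byte counts, the HBase default is one eighth of the HDFS
default, which fits the reading that the docs mean to tune the buffer down
rather than up.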

3)

Docs suggest: Ensure data locality. In hbase-site.xml, set
hbase.hstore.min.locality.to.skip.major.compact = 0.7 (Meaning that 0.7 <=
n <= 1)

I can't find anything else about this property in the docs. Digging through
the code, I find an oblique reference to HBASE-11195, but there's no release
note or documentation there, and reading the issue doesn't help me
understand how this operates either. It looks like there was follow-on work,
but it would be useful to know how we arrived at 0.7 (it seems arbitrary) and
how an operator could tell whether that setting is good for them or should
move higher or lower.


Thanks,
Mike