You could have run into an issue similar to HBASE-10349. Have you pre-split your table? If you pre-split your table into enough regions, you should not run into this issue.
HBase region split policy:
Split size is the number of regions that are on this server that all are of the same table, cubed, times 2x the region flush size OR the maximum region split size, whichever is smaller. For example, if the flush size is 128M, then after two flushes (256MB) we will split which will make two regions that will split when their size is 2^3 * 128M*2 = 2048M. If one of these regions splits, then there are three regions and now the split size is 3^3 * 128M*2 = 6912M, and so on until we reach the configured maximum filesize and then from there on out, we'll use that.
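Two values are compared: (1) the size derived from 2x the region flush size (the value of hbase.hregion.memstore.flush.size, default 1024*1024*128L, i.e. 128M), and (2) the maximum region split size. Whichever is reached first triggers the split.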
6 replies
Hagrid
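When Put operations are run against HBase in bulk, the table splits once it reaches the configured threshold. The split causes the regionserver to go offline, so its information cannot be read, and that triggers the exception.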
Hagrid
Pre-split the table into regions with matching rowkey ranges, and also set the maximum file size:
create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(9999-1000)/200}"}, MAX_FILESIZE => 4*1024**3}
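To see which split points the SPLITS expression in the create command produces, it can be run as plain Ruby outside the HBase shell. This is only a sketch: the helper name split_keys and its parameters are made up for illustration, and the arithmetic is copied from the command as posted.

```ruby
# Plain-Ruby sketch of the SPLITS expression from the create command.
# split_keys and its parameter names are hypothetical, not part of HBase.
def split_keys(count = 200, low = 1000, high = 9999)
  # Integer arithmetic, evaluated left to right exactly as in the shell command.
  (1..count).map { |i| "user#{low + i * (high - low) / count}" }
end

keys = split_keys
puts keys.length   # number of split points
puts keys.first    # lowest split key
puts keys.last     # highest split key
```

200 split points give 201 regions. Because every generated key has the same number of digits, the keys are already in the lexicographic order HBase uses for rowkeys.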
Hagrid
Split size is the number of regions that are on this server that all are of the same table, cubed, times 2x the region flush size OR the maximum region split size, whichever is smaller. For example, if the flush size is 128M, then after two flushes (256MB) we will split which will make two regions that will split when their size is 2^3 * 128M*2 = 2048M. If one of these regions splits, then there are three regions and now the split size is 3^3 * 128M*2 = 6912M, and so on until we reach the configured maximum filesize and then from there on out, we'll use that.
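Two values are compared: (1) the size derived from 2x the region flush size (the value of hbase.hregion.memstore.flush.size, default 1024*1024*128L, i.e. 128M), and (2) the maximum region split size. Whichever is reached first triggers the split.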
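For example, with a flush size of 128M the first split happens after two flushes (256M). That leaves two regions, and each of them splits once it grows to 2^3 * 128M * 2 = 2048M. The formula is:
(number of regions of this table on the server)^3 * flush size * 2
The advantage is that regions split early, without having to wait until they reach the maximum region split size.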
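The rule described above can be checked with a few lines of Ruby. This is a rough sketch of the formula as quoted, not HBase's actual implementation. The function name split_size is made up, and the 10G maximum is just an assumed value for illustration; use whatever your cluster is configured with.

```ruby
# Sketch of the quoted split-size rule; not HBase's actual code.
MB = 1024 * 1024

# region_count: regions of this table on the same regionserver
# flush_size:   hbase.hregion.memstore.flush.size, in bytes
# max_filesize: configured maximum region split size, in bytes
def split_size(region_count, flush_size, max_filesize)
  [region_count**3 * 2 * flush_size, max_filesize].min
end

flush_size   = 128 * MB        # default quoted in the post
max_filesize = 10 * 1024 * MB  # assumed 10G cap, for illustration only

(1..5).each do |n|
  puts "#{n} region(s): split at #{split_size(n, flush_size, max_filesize) / MB}M"
end
```

With these numbers the threshold grows as 256M, 2048M, 6912M, and from the fourth region on the cubic term exceeds the cap, so the maximum (10240M here) is used.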