The basic unit of Sharding(scalability and load balancing) is region,each region consist of a sequence of row keys, the regions are created and merged depending upon the configurable parameter
The table is initially created with a single region since the Hbase itself can not know in advance where to create the split points when the size of region grows more than what is configured it is auto-split into different regions from middle of the row key.(Is is again Configurable)
Whenever data in the HBase table is updated it is first written in the WAL(Write Ahead Log)File and stored in memstore(memory store) once the data in memory exceeds the limit the data is flushed into the HFile on disk,As the store file increases the region server compacts them into combined larger files.As the store file accumulates or after the compaction finishes a Region-Split request is queued to decide if Splitting the Region is required ,Since the HFiles are immutable the newly created region if created will not rewrite the data Instead, they will create small sym-link like files, named Reference files, which point to either top or bottom part of the parent store file according to the split point.
Thus Splitting is very fast as the newly created region will read from the original storage files until a compaction rewrites them all into separate one asynchronously
For further Reading I will suggest to go through the below link