Monday, October 7, 2019

HANA startup tuning – part 2

Starting with HANA 2.0 SPS4 there is a new option called fast restart. It allows the DB to keep the column store (the main area of the CS) in memory during a restart of the HDB. This is the difference to pmem: persistent memory also keeps the data after an OS restart or power outage, whereas fast restart only covers a restart of the database. The fast restart option is therefore a bridge technology that can be used independently of CPU and memory type. There are no additional costs to use it, so there is no reason not to use it.

The main area of the CS is kept in an area defined by a tmpfs, a standard temporary filesystem provided by the OS and backed by DRAM. The main area accounts for 90-95% of the data SAP HANA keeps in persistent memory. Fast restart reduces downtime but does not offer the cost efficiency that can be gained with persistent memory (PMEM).
Prerequisites:
  • SLES12 SP3
  • SLES12 SP4
  • SLES15
  • RHEL 7.4
  • RHEL 7.6
  • RHEL 8.0
  • HANA 2.0 SPS4+

Theory part

At first you may ask why this only applies to the main store of the CS.
And what is the difference between shared memory (SHM) and the temporary filesystem (tmpfs)?
Let’s start with shared memory. SAP decided to use SHM for the row store. Shared memory is memory that may be accessed simultaneously by multiple programs, either to let them communicate or to avoid redundant copies.
A tmpfs is like a RAM disk (ramfs), but it is backed by virtual memory and has a size limit. ramfs grows dynamically, tmpfs does not: it will not let you write more than the size you specified at mount time. Within that limit it uses the available memory of the OS to store data. This is nothing special, because the mechanism is pretty old. An early implementation was used on Solaris systems for the /tmp filesystem. After a reboot all files are gone.
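To make the size limit concrete, the following minimal sketch lists the tmpfs mounts of a Linux host together with the size= option they were mounted with. It assumes the usual /proc/mounts layout (device, mount point, type, options); the function name list_tmpfs is only illustrative.

```shell
#!/bin/sh
# Sketch: list tmpfs mounts and their size= option from a mounts table.
# Reads /proc/mounts by default (Linux); a file in the same format can be passed instead.
list_tmpfs() {
    awk '$3 == "tmpfs" {
        size = "-"                       # no size= option listed for this mount
        n = split($4, opts, ",")         # mount options are comma-separated
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^size=/) size = substr(opts[i], 6)
        printf "%-30s %s\n", $2, size    # mount point, size limit
    }' "${1:-/proc/mounts}"
}

list_tmpfs
```

A mount that shows "-" simply has no explicit size= option in the listing.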
Since HANA 1.0 SPS11 there is a similar option for the RS. As described in part 1 of this blog series, hdbrsutil can rescue and reuse these memory structures.

Practice part

Enough theory, let’s set up this new feature on an 8 TB test environment. One note in advance: this setup won’t take long.
#determine your numa node assignment
cat /sys/devices/system/node/node*/meminfo | grep MemTotal | awk 'BEGIN {printf "%10s | %20s\n", "NUMA NODE", "MEMORY GB"; while (i++ < 33) printf "-"; printf "\n"} {printf "%10d | %20.3f\n", $2, $4/1048576}'

#mount with size limitation
#add an entry to /etc/fstab if you want this setup permanently
mount tmpfs<sid>0 -t tmpfs -o mpol=prefer:0,size=xxxxG /hana/tmpfs0/<sid>
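With several NUMA nodes you repeat the mount once per node. The following sketch only prints the commands and the matching /etc/fstab lines instead of executing them, so you can review them first. The values for SID, NODES and SIZE_GB are placeholders (assumptions, not values from the test system), and the function name is illustrative.

```shell
#!/bin/sh
# Sketch only: print (do not execute) mkdir/mount commands and fstab lines,
# one tmpfs per NUMA node. All three values below are placeholders.
SID="${SID:-abc}"            # hypothetical SID
NODES="${NODES:-2}"          # number of NUMA nodes from the step above
SIZE_GB="${SIZE_GB:-1024}"   # hypothetical size limit per node in GB

print_tmpfs_setup() {
    node=0
    while [ "$node" -lt "$NODES" ]; do
        dir="/hana/tmpfs${node}/${SID}"
        opts="mpol=prefer:${node},size=${SIZE_GB}G"
        echo "mkdir -p ${dir}"
        echo "mount tmpfs${SID}${node} -t tmpfs -o ${opts} ${dir}"
        # fstab fields: device, mount point, type, options, dump, pass
        echo "fstab: tmpfs${SID}${node} ${dir} tmpfs ${opts} 0 0"
        node=$((node + 1))
    done
}

print_tmpfs_setup
```

Printing instead of executing keeps the sketch harmless on a productive host; run the printed commands as root once they look right.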
There are two parameters. Don’t be confused by the terms NVRAM or persistent memory in the documentation; they are also valid for the fast restart option.
For MAIN data fragments the in-memory storage location basepath must be defined as a configuration parameter in the [persistence] section of the global.ini file. Enter the basepath location in the basepath_persistent_memory_volumes parameter. All MAIN data fragments are stored at the location defined here. Multiple locations corresponding to NUMA nodes can be defined using a semi-colon as a separator (no spaces).
#add global.ini parameter basepath_persistent_memory_volumes in section [persistence]
/hana/tmpfs0/<sid>;/hana/tmpfs1/<sid>
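For more than two NUMA nodes the value gets long and easy to mistype. As a small sketch, the following script builds the semicolon-separated list (no spaces) and prints it as a global.ini fragment. SID and NODES are placeholder values and build_basepath is an illustrative helper, not an SAP tool.

```shell
#!/bin/sh
# Sketch: build the value for basepath_persistent_memory_volumes.
SID="${SID:-abc}"     # hypothetical SID
NODES="${NODES:-2}"   # number of NUMA nodes / tmpfs mounts

build_basepath() {
    value=""
    node=0
    while [ "$node" -lt "$NODES" ]; do
        # join with ';' only between entries, never with spaces
        value="${value:+${value};}/hana/tmpfs${node}/${SID}"
        node=$((node + 1))
    done
    echo "$value"
}

# print the fragment for the [persistence] section of global.ini
printf '[persistence]\nbasepath_persistent_memory_volumes = %s\n' "$(build_basepath)"
```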
The second parameter activates or deactivates the feature at a lower level (per table) or sets the default for the complete DB (parameter: table_default).
Please note that the basepath parameter cannot be changed online! A restart is required to activate it.
You can check whether your settings are working via the hdbindexserver trace after startup of the DB.

There is also a consistency check which can be scheduled regularly:
CALL CHECK_TABLE_CONSISTENCY('CHECK_PERSISTENT_MEMORY_CHECKSUM', NULL, NULL);

Summary


As you can see, the fast restart option speeds up the CS reload by a factor of 4-6. Currently there is no reason not to use this feature. Configuring one system takes about 30 minutes, provided you are not implementing it for the first time.
