Using Google Chrome?

Download my
free Chrome Extension, Power Notes Searcher, to make searching for and evaluating SAP notes, much easier.

Thursday, June 26, 2014

HANA DB Re-initialisation Without Reinstall

Scenario:  You have a small test system or PoC system and you want to revert or recreate the HANA DB like you've just opened the box and installed it from new, but without the hassle of the reinstall.

This is completely possible using the hdbnsutil command line program.
With HANA shutdown, connect to the HANA server via SSH as the <sid>adm Linux user, then run the hdbnsutil command line program as follows:

hana01:/usr/sap/H10/HDB10/exe> hdbnsutil -initTopology

checking for inactive nameserver ...
nameserver hana01:31001 not responding.
creating persistence ...
run as transaction master writing initial topology...
writing initial license: status check = 2

As you will see, it recreates the persistence layer (database) and also re-creates the license.
You will need to reinstall your HANA system license after the re-initialisation process is complete.
On a slow system, the process took approximately 2 minutes.

Due to the size of my system, I am unable to tell you if this process destroys any specific configuration.  From what I can tell, the existing global.ini, nameserver.ini and indexserver.ini are kept.

You should also note that the SYSTEM user password is reset to its default value of "master".
Plus, if you have enabled encryption, the reinitialised data volumes will be re-encrypted unless you de-check the checkbox on the Data Volume Encryption tab inside the Security tab, prior to reinitialisation.

Encrypting HANA Data Volumes

Out of the box, the HANA database is not encrypted.
For those businesses that demand encryption, from within HANA Studio you can activate the encryption of the HANA data volumes.
The HANA documentation supplied with SPS7, suggests that the recommended approach is to encrypt the data volumes as part of the HANA installation process.  This is due to the "Copy-on-write" method used to persist data on disk.  An already used database may still have unencrypted data pages in it, even after encryption is enabled.

You should note that the words "data volumes" means the location of the database files for the indexserver and the statistics server, which is usually something like "/hana/data/<SID>/mnt0000<n>/hdb0000<n>/*.dat".

Tip: You can check which values for "<n>" in the above, will be used, by checking the "Landscape Overview" tab and the "Volumes" tab within that.

- SAP recommend changing the SSFS encryption key after installation of the HANA system.  This key is used to store the data volume encryption key and others.
- Disk space of the data volume areas (we discuss this below anyway).
- Take a cold backup of the HANA DB area and config files.
- Access to the HANA Studio (you can do this with SQL access and hdbsql, but I don't show this).

Let's begin:

From within HANA Studio, open the "Security" tab and navigate to the "Data Volume Encryption" tab.
You will notice that encryption is not already enabled.
Tick the "Activate encryption of data volumes" tick box, then click the "Execute" button:

HANA volume encryption activation

HANA data volume encryption execute

The encryption process will start immediately ("Encryption Pending" then "Encryption Running"):

HANA volume encryption pending

The status is updated when each server process finishes:

HANA volume encryption running

Once the encryption process is completed successfully, the tick box is enabled again for you to be able to de-encrypt the volumes should you wish to reverse the process:

HANA volume encryption encrypted

My small (empty) HANA DB @ sps7, took approximately 10 minutes to encrypt.
The data volume sizes increased as part of the encryption process.
The indexserver data volume went from 324MB used to 341MB used:

HANA volume encryption indexserver size

HANA volume encryption indexserver size

Of more importance is the dramatic increase in the allocated (Total Size).
It's gone from 513MB to 896MB!  You will need to be aware of the disk space you may need before enabling encryption.

The statistics server is smaller and went from 80MB used to 101MB used:

Snap668 2014-06-26, 11_26_17


Again, notice that we've got an increased allocation from 320MB to 400MB.

Validate the encryption status using an SQL console:



What are the implications for encryption?

Well, the data volumes have been encrypted, but we've not seen anything about encrypting the logs or the backups.
Also, to preserve the performance of the HANA system in memory, the data is decrypted when read from the disk into memory.

Is encrypting the HANA DB transaction logs feasible?  Probably not.  There are various whitepapers detailing the issues of encryption of data stored on SSDs.  Since the majority of high performance appliance resellers use SSDs for the HANA DB transaction logs, the use of software layer encryption on SSDs is not worth the effort and would probably reduce the performance of the database.  Instead, using the SSD hardware layer encryption may prove useable, but this is only worth while if you think that the SSD could be physically stolen.  SSD hardware encryption doesn't prevent an intruder at the O/S level from seeing the data.

Is encrypting the HANA DB backups (data and logs) feasible?  Yes this is definitely something that should be employed.  There are 3rd party backup tools that will take a copy of the HANA backup files, then encrypt and store elsewhere.  But these tools may not support secure wipe, so the disk location of the backup files would potentially still contain the data.  Using backup tools certified for SAP with the BackInt interface would be better, since the data doesn't touch the disks, it's piped straight to the backup device via the "network".

There is possibly some slight performance impact from encrypting the data volumes.  The data files are written to at least every 5 minutes.  They are also read from during HANA start up.  These I/O operations will be the same speed as they were before, except there will be some additional software routines involved to perform the encryption, which will use slightly more memory and mean a slight delay between the data being read/written to/from the disk.

Be aware of bugs in the backup/restore process.
I've seen mention of bugs in other software products where an encrypted backup is void due to a software bug.  Ensure that you test your solution thoroughly.

SAP recommend that you change the Page Encryption Key (used to encrypt the data volumes) regularly as per your organisations standards.  This may yet again increase the allocated size.

- Data volume encryption is easy and fast in HANA.  There's not really any reason to not implement it.  Beware of implementing in an already populated database and ensure you change the keys regularly.
- Backups and transaction log volumes are not encrypted and for the logs there's a good reason why you may not want to.
- Performance in certain scenarios could be affected slightly.
- You should attempt to implement a supported Backint capable backup product with encryption, since the backups with this method don't touch the unencrypted disks.
- Encryption can be performed and validated at the SQL command level.
- Be aware that data volume encryption in HANA will require more disk space to be allocated to the data volumes, increasing the footprint of the HANA database by as much as 50%.

SAP HANA Backup Allocation Failed

During a HANA backup, you get an "Allocation Failed" error.
This is caused by a lack of memory.  If possible, increase the memory available to HANA by increasing the free memory at the O/S level (e.g. shutting down other HANA instances), or increase the global allocation limit.

Thursday, June 19, 2014

HANA OOM Error Tip #1 - Partition Tables Correctly

If your HANA system is regularly experiencing OOM (Out Of Memory) errors, then there are a number of things that you can do to try and reduce memory consumption.

Tip #1:  Partition Large Tables Correctly
If there are large Column Store tables in your HANA system, you should partition them.
Whilst this is an absolute must in a HANA scale-out scenario (for enabling multi-node parallelism), it might not be so obvious that it can also help in a single node HANA system.
Partitioning a column table means that only the required partitions of the table are loaded into memory when accessed, you would think.

Partitioning a large table into smaller chunks will therefore help to reduce the memory usage of the table during SQL queries and also during updates.
During updates, each partition gets its own delta cache area.

Choosing how to partition a table is slightly more difficult and will depend on whether the table is a custom table, SAP standard table or other.
Plus, you will need to know what and how queries or updates are executed against the table.  A period of monitoring is suggested, to enable to you collect the required information to be able to make a decision.

One thing you should try to do, is partition the table using the most logical, selective columns.
Read on for a simple example with a twist!

A simple example, a single node HANA system has 1 large column table T1.
The table is partitioned into ranges based on the date column INVOICEDATE:


As you can see, I've created 3 partitions by year:  2000 to 2001,  2001 to 2002 and 2002 to 2003.
This will actually create 4 partitions:  year 2000,  year 2001, year 2002 and year <OTHER>.

HANA Table Distribution

Insert 5 records into the table:

insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('1','2000-01-01')
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('2','2001-01-01')
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('3','2002-01-01')
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('4','2003-01-01')
insert into darryl.t1 (INVOICEREF,INVOICEDATE) values('5','2004-01-01')

Inside the Table Distribution tab, you will now see that the records have been inserted according to their values into the respective partitions (see Raw Record Count field on the right):

HANA Table Distribution

The last two records for year 2004 and 2003 are in the fourth partition.
You can also see that each partition has a Delta Size, and that the Delta Size for the fourth partition with the most records, is larger than the other partitions.
Unload the table from memory:

HANA Unload Table from memory

Refreshing the Table Distribution tab now shows the table partitions to have zero size in memory:

HANA Table Distribution

Now select the records for the years 2004 and 2005 only:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Refreshing the Table Distribution tab now shows the tables to have non-zero size in memory for ALL partitions!

HANA Table Distribution

All of the records from all of the partitions appear to be loaded!
What went wrong?
Well, it's simple, we didn't create an index on the column INVOICEDATE.
This forced HANA to scan through the entire table to access the required records, meaning that it needed to load them all into memory.

Let's create an index in INVOICEDATE:


Unload the table from memory:

HANA Unload Table from memory

Refreshing the Table Distribution tab now shows the tables to have zero size in memory:

HANA Table Distribution

Now re-execute the SELECT statement:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Once again, on the Table Distribution tab, we can see that it has accessed all partitions, AGAIN!:

HANA Table Distribution

What went wrong this time?  Well, HANA doesn't yet have any statistics on the table data, so it simply ignored the index.
If you now unload the table from memory once again (we haven't done anything else):

HANA Unload Table from memory

Now re-execute the SELECT statement:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Look at the Table Distribution tab:

HANA Table Distribution

You can see that HANA has now only accessed the final partition of the table.  The other partitions have not been loaded into memory.
At first I thought this feature might be due to statistics, so I tried removing them from the table T1 (drop statistics on T1;).  Then I retried the process of unloading and re-running the query.  This had no effect, HANA correctly went straight to the fourth partition.
This left me with one other option, the Plan Cache.

Clearing the Plan Cache using:


I then re-ran the test by unloading the table from memory:

HANA Unload Table from memory

Re-executing the SQL SELECT:

select * from darryl.t1 where invoicedate in ('2004-01-01','2003-01-01')

Look at the Table Distribution tab:

HANA Table Distribution

The Plan Cache was storing some form of execution plan statistics that meant that it was accessing the fourth partition straight away.
Each time the table is unloaded, the statistics from the existing Plan Cache remain and are re-used upon next execution of the query, which means HANA is able to go straight to the fourth partition.

Partitioning is a great way of parallelising access to a table in HANA.
It also serves to help reduce memory by only loading specific partitions into memory when they are required.
In order to effectively use partitioning, you need to partition on an indexed column.
The initial access of a newly partition table with a new index, does not enable the benefits of partition until the second subsequent access of the table due to the Plan Cache.  A method/process of pre-loading the execution plan/statistics into the cache is required.

Thursday, June 12, 2014

HANA - Emergency Shutdown

For testing purposes, you can perform an emergency shutdown of the HANA indexserver forcing it to simply crash out.
This will require a DB recovery (automatic) upon restart!

Use the hdbcons utility (as <sid>adm) to first activate the crash functionality, then invoke the crash:

> hdbcons
> crash activate
> crash emergencyshutdown

Inside the indexserver trace file (from diagnosis view in HANA Studio), you will be able to see the following text towards the end of the file "ExternalCommandHandler.cpp (00958)  NOTE: INTENTIONAL CRASH: errorExitWithoutCrashdump".

Once the restart is complete, you can verify that recovery was required from the indexserver trace file:

"LoggerImpl.cpp(00933) : Replayed 130944B (0MB) of log in 17.4729 seconds; 0.00714693MB/s; max known TID=50175"

Thursday, June 05, 2014

Native or Direct HANA Connection

When using the SAP HANA Studio, you can configure the connection to the HANA DB as NATIVE or DIRECT (from Window -> Preferences -> General -> Network Connections).
When NATIVE is selected, this means that the internet connection details of the native Operating System are used.

In Windows this is usually the same as the "Internet" settings configured through Windows Control Panel, or through Internet Explorer's options menu.

In this setting, HANA Studio will utilise whatever proxy server you have configured for internet access.
When DIRECT is selected, any proxy settings are controlled from within HANA Studio and any Operating System settings are ignored.

This is the reason that SAP say to select DIRECT whenever you experience connectivity issues.
Are there any performance benefits from DIRECT?   maybe.  Since the HANA Studio doesn't need to query for any proxy server settings in the Windows registry, it's possible that the initial connection could be established quicker.  Once connected it probably doesn't make any performance improvement over NATIVE.