Friday, August 29, 2014

Login storm against database caused exhausted library cache

One of our databases experienced massive contention in the shared pool, in form of wait events alerted as "library cache locks".

The database was very small indeed, so my natural instinct was to throw some more memory at the virtual host, and rearrange the memory parameters.

This turned out to be a misconception; the resources were sufficient for the instance to work properly.

The problem was caused by an incorrect password configuration on the application server.

What we could observe was:

  • A totally exhausted shared pool, caused by "library cache lock"
  • The SQL that seemed to be repeatedly executed was
    SELECT /*+ connect_by_filtering */
              privilege#, LEVEL
          FROM sysauth$
    CONNECT BY grantee# = PRIOR privilege# AND privilege# > 0
    START WITH grantee# = :1 AND privilege# > 0;
    
    
    SELECT privilege#
      FROM sysauth$
    WHERE (grantee# = :1 OR grantee# = 1) AND privilege# > 0;
    
  • The V$EVENT_NAME view showed that the wait event was accompanied by the additional information found in the columns parameter1 through parameter3, which turned out to be helpful further on:
    select  name, wait_class,parameter1,parameter2,parameter3
    from v$event_name
    where wait_class = 'Concurrency'
    and name = 'library cache lock';
    

    NAME WAIT_CLASS PARAMETER1 PARAMETER2 PARAMETER3
    library cache lock Concurrency handle address lock address 100*mode+namespace

    Further research showed that the problem was due to a built-in delay between failed login attempts in Oracle 11g:

    "The 'library cache lock' wait is seen due to the fact that the account status gets updated due to incorrect login.
    To prevent password guessing attack, there's a sleep() in the code when incorrect login attempts exceed count of 3.
    And because of this sleep() you see a wait on library cache, as the process is yet to release the lock."


  • In release 11.1.0.7, patch 7715339 was released to remove this delay.
  • In release 11.2.X, the DBA must set an event to remove the delay, as follows:

    alter system set events '28401 trace name context forever, level 1'; 
    

    According to Oracle, the purpose of the built-sleep is to make it harder to succeed in a "password guessing attack", particularly in cases where FAILED_LOGIN_ATTEMPTS is set to UNLIMITED. Oracle Development is pointing out that disabling the sleep-function is not recommended. A better solution is to set the FAILED_LOGIN_ATTEMPTS to a reasonable value.
    When the number of failed login attempts for a session hits the limit, the account will be locked. Subsequent logon attempts with incorrect password will then be rejected immediately without any contention in the library cache.

    See Bug 15882590 : 'LIBRARY CACHE LOCK' DURING WRONG PASSWORD LOGON ATTEMPTS on My Oracle Support (MOS) for further information.


  • 1 comment:

    1. Great article, really save our ass today! Thanks for sharing.

      ReplyDelete