Some time ago a
customer had problems with an application that generates many blocking
sessions. In the end all the blocking session grows up and exceeds the available
processes on the database so that the ORA-00020 maximum number of
processes exceeded error occurred. The customer
also asks me, to analyze the problem to know which statement is the root cause
of the problem.
The problem now is that the database preventing new
connections to that instance (normal users and for sysdba connection). Sure you
can now kill some session at OS level to get a process for your connection or
restart the database, but if you kill the wrong connection you may not find the
root cause of the problem.
[oracle@server1
prelim]$ sqlplus scott/tiger
SQL*Plus:
Release 12.1.0.1.0 Production on Sun Jun 15 16:08:35 2014
Copyright
(c) 1982, 2013, Oracle. All rights reserved.
ERROR:
ORA-00020:
maximum number of processes (300) exceeded
The same issue occur if we try to login as sysdba user.
[oracle@server1
DB12EE]$ sqlplus / as sysdba
SQL*Plus:
Release 12.1.0.1.0 Production on Sun Jun 15 16:11:40 2014
Copyright
(c) 1982, 2013, Oracle. All rights reserved.
ERROR:
ORA-00020:
maximum number of processes (300) exceeded
What we can do here is to use the Preliminary Parameter to start a
sqlplus session. Using a sqlplus preliminary connection you will be able to connect
to the database since no session is actually created, but you will have limited
access to the SGA. This will help in capturing diagnostic information like a
systemstate dump to aid in problem resolution.
[oracle@server1 DB12EE]$ sqlplus -prelim / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sun Jun 15
16:20:09 2014
Copyright (c) 1982, 2013, Oracle. All rights
reserved.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug hanganalyze 3
Statement processed.
SQL> oradebug TRACEFILE_NAME
/u01/app/oracle/diag/rdbms/db12ee/DB12EE_1/trace/DB12EE_1_ora_24689.trc
With 11.2.0.2 onwards oradebug hanganalyze will not produce output under a sqlplus „preliminary connection“.
If you do this as shown in this example oradebug seems to run successful but
when we look into the trace file generated we only see this message.
Processing
Oradebug command 'hanganalyze 3'
===============================================================================
HANG ANALYSIS:
ERROR: Cannot perform hang analysis dump
without a process
state
object and a session state object.
( process=(nil), sess=(nil) )
===============================================================================
Now we can make a
systemstate dump of the database. The trace files generated here are also helpful
to analyze a problem like here described. Also read the documentation which
level is suitable for your problem.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug dump systemstate 267
Statement processed.
SQL> oradebug TRACEFILE_NAME
/u01/app/oracle/diag/rdbms/db12ee/DB12EE_1/trace/DB12EE_1_ora_24689.trc
After you have created your
trace files you can try to kill some of the database session so a normal login
is possible. To kill a database session you can use „kill -9“ on Unix or
„orakill“ on windows. Once you have a connection to the database you can do
further analyze of the problem.
I want to talk about a situation that happened
to me a few days ago. I logged in to one of my databases and my connection was
hanged. My database was not accepting new connections. Now bound to the
database server and I checked the status of the listener.
# lsnrctl status
Everything was normal. The listener is
listening to my database. So I didn’t get any error from listener. I tried to
log in with “sqlplus / as sysdba” but I failed. A connection on the server was
also hanging. I checked the background processes.
# ps -ef | grep ora_
Everything was normal. And I learned something
I did not know until that day. The backdoor entry of the Oracle database! You
can enter to the database through the back door using SQL * Plus tool with
“Prelim” parameter:) Prelim, directly connects to the SGA but it does not
open a session.
You can connect to the database with Prelim as
following
# sqlplus -prelim / as sysdba
SQL>
SQL>
Or
# sqlplus /nolog
SQL> set _prelim on
SQL> conn / as sysdba
Prelim connection established
SQL> set _prelim on
SQL> conn / as sysdba
Prelim connection established
Now, you can analyze the SGA using oradebug
command.
Resolve huge archive
gap between PRIMARY and STANDBY Large gaps
A Physical Standby database synchs with
Primary by continuous applies of archive logs from a Primary Database.
When the logs are missing on standby
difference is huge (say more than 500 logs), you have to rebuild the standby
database from scratch.
Without rebuild standby database, as an
enhancement from 10g, an incremental backup created with BACKUP INCREMENTAL…
FROM SCN can be used to refresh the standby database.
Please use below query to find out archive gap
on Standby:
SELECT ARCH.THREAD# “Thread”, ARCH.SEQUENCE#
“Last Sequence Received”, APPL.SEQUENCE# “Last Sequence Applied”,
(ARCH.SEQUENCE# – APPL.SEQUENCE#) “Difference” FROM (SELECT THREAD# ,SEQUENCE#
FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT
THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT
THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT
THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE
ARCH.THREAD# = APPL.THREAD# ORDER BY 1;
Thread
Last Sequence Received
Last Sequence Applied
Difference
———- ———————- ———————
———-
1
8254
7954
300
Find the SCN on the PRIMARY:
SQL> select
current_scn from v$database;
CURRENT_SCN
———–
242671761
Find the SCN on the STANDBY:
SQL> select
current_scn from v$database;
CURRENT_SCN
———–
223771173
Clearly you can
see there is difference
Stop and shutdown the managed standby apply process:
SQL> alter
database recover managed standby database cancel;
Database altered.
Shutdown the
standby database
SQL> shut
immediate
On the primary, take an incremental backup from the SCN
number where the standby current value 223771173:
RMAN>
run { allocate channel c1 type disk format ‘/backup/%U.bkp’;
backup
incremental from scn 223771173 database;
}
On the primary, create a new standby controlfile:
SQL> alter
database create standby controlfile as ‘/backup/for_standby.ctl’;
Database altered.
Copy the standby controlfile to STANDBY and bring up the
standby instance in nomount status with standby controlfile:
SQL> startup
nomount
SQL> alter
database mount standby database;
Connect to RMAN on STANDBY, Catalog backup files to
RMAN using below commands:
$ rman target=/
RMAN> catalog
start with ‘/backup’;
PERFORM RECOVER:
RMAN> recover
database;
Start managed recovery process:
SQL> alter
database recover managed standby database disconnect from session;
Database altered.
Check the SCN’s
in primary and standby it should be close to each other.
Resolve
archive gap between PRIMARY and STANDBY less than 10-15 numbers
A Physical Standby database synchs with
Primary by continuous applies of archive logs from a Primary Database.
When the logs are missing on standby is less
than 10-15 numbers, We can simple ship the logs which are missing in the
standby database from primary database by using
SCP/FTP and then register the logfiles in standby to resolve the gap.
Please use below query to find out archive
gap on Standby:
SELECT ARCH.THREAD# “Thread”, ARCH.SEQUENCE#
“Last Sequence Received”, APPL.SEQUENCE# “Last Sequence Applied”,
(ARCH.SEQUENCE# – APPL.SEQUENCE#) “Difference” FROM (SELECT THREAD# ,SEQUENCE#
FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT
THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT
THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT
THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE
ARCH.THREAD# = APPL.THREAD# ORDER BY 1;
Thread Last Sequence Received
Last Sequence Applied Difference
———- ———————-
——————— ———-
1
9545
9535
10
SQL> SELECT MAX(SEQUENCE#) FROM
V$ARCHIVED_LOG WHERE APPLIED=’YES’;
MAX(SEQUENCE#)
————–
9535
COPY ARCHIVELOG FILE TO STANDBY FROM PRIMARY:
REGISTER LOGFILE AT STANDBY:
SQL> alter database register logfile
‘/log_location/log_file_n.arc’;
logfile registered
Repeat the same process for all the log files
which are missing at standby.
Manually Resolving Gaps - Oracle Data Guard
11gR2
For simplicity, Gap is a range of
archived redo logs that were created at a time when the standby database was
not available to receive them. In many practices it happen because the network
problems.
We have 3 methods to resolving this problem. I’ll dispatch it to two main methods,
We have 3 methods to resolving this problem. I’ll dispatch it to two main methods,
Manually resolving
Automatic resolving : Using log switched and
FAL configuration
1. Start with detecting gaps in the redo logs
by querying the v$archive_gap
SQL> select * from v$archive_gap;
Thread# low_sequence#
high_sequence#
-------- ---------------- ------------------
1 30 34
-------- ---------------- ------------------
1 30 34
The output indicates our standby
database is currently missing log files from sequence 30 to
34.
2. Issue following statement on primary database to locate the archived redo log files. Assuming
the local archive destination on primary is LOG_ARCHIVE_DEST_1
34.
2. Issue following statement on primary database to locate the archived redo log files. Assuming
the local archive destination on primary is LOG_ARCHIVE_DEST_1
SQL> select
name from v$archived_log where thread#=1 and dest_id=1 and sequence# between 30
and 34;
name
----------------------------------
----------------------------------
/u02/oraarchive/DB01/arch_t1_s30.dbf
/u02/oraarchive/DB01/arch_t1_s31.dbf
/u02/oraarchive/DB01/arch_t1_s32.dbf
/u02/oraarchive/DB01/arch_t1_s33.dbf
/u02/oraarchive/DB01/arch_t1_s34.dbf
3. Stop the
automatic recovery (MRP) of the data guard
SQL> alter
database recover managed standby database cancel;
4. Transfer
manually the archived log files shown on the step 2 to standby database
5. Register that archived log files on standby database
5. Register that archived log files on standby database
SQL> alter
database register logfile '/u02/oraarchive/DB01/arch_t1_s30.dbf';
SQL> alter
database register logfile '/u02/oraarchive/DB01/arch_t1_s31.dbf';
SQL> alter
database register logfile '/u02/oraarchive/DB01/arch_t1_s32.dbf';
SQL> alter
database register logfile '/u02/oraarchive/DB01/arch_t1_s33.dbf';
SQL> alter
database register logfile '/u02/oraarchive/DB01/arch_t1_s34.dbf';
6. Put the standby
database into automatic recovery managed mode
SQL>
alter database recover managed standby database disconnect from session;
7. Verify that the
gap gets resolved on standby database
SQL> select
sequence#, applied from v$archived_log order by sequence#;
now you figure it out, that your archived log files (ARCLs) has been synchronize with your primary database. As a DBA you must take a notice at log using this SQL statement
now you figure it out, that your archived log files (ARCLs) has been synchronize with your primary database. As a DBA you must take a notice at log using this SQL statement
SQL> select message from
v$dataguard_status where severity like 'Warning';
Also on your alert.log