Session using a database link hangs on “SQL*Net more data from dblink”

I have a client who recently move a database server from on-premise to a Cloud provider.  A database on this database server had a database link to their E-business database in the Oracle Cloud.  Since the move, any sessions in the database that use the database link to the E-business database would hang if the query was to return large dataset.

Below is selecting from dual over the database link that worked:

SQL> set timing on
SQL> set autotrace on
SQL> select * from dual@ebs;

D
-
X

Elapsed: 00:00:00.13

Execution Plan
----------------------------------------------------------
Plan hash value: 272002086

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |     1 |     2 |     2   (0)| 00:00:01 |        |
|   1 |  TABLE ACCESS FULL     | DUAL |     1 |     2 |     2   (0)| 00:00:01 | PWGPSI |
----------------------------------------------------------------------------------------

Note
-----
   - fully remote statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        511  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

But selecting from a table hangs:

SQL> select * from XXX_PER_PEOPLE_F@ebs;

Hangs!......

The session in the source database shows the wait “SQL*Net more data from dblink”:

       SID USERNAME       STATUS   MACHINE                         TERMINAL                       PROGRAM                                          EVENT                                        WAIT_CLASS                   WAIT_TIME SECONDS_IN_WAIT STATE

---------- -------------- -------- ------------------------------- ------------------------------ ------------------------------------------------ -------------------------------------------- ---------------------------- --------- --------------- -------------------

       144 XXXXXXXXX      ACTIVE   XXX                             unknown                        sqlplus@xxxxx.xxxxx.xxx (TNS V1-V3)              SQL*Net more data from dblink                Network                      0         341             WAITING

The session in the target database shows the wait “SQL*Net message from client”:

SID USERNAME EVENT                       SQL_ID   MACHINE       PROGRAM                        PROCESS 
--- -------- --------------------------- -------- ------------- ------------------------------ ---------- 
35  XXX      SQL*Net message from client XXX      XXX.XXX.XXX   oracle@XXX.XXX.XXX (TNS V1-V3) 15340

I traced the source session and could see the session hangs waiting for more data from the target (EBS) database:

WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920328443464

*** 2018-07-17 11:13:47.934

WAIT #2: nam='SQL*Net message from client' ela= 11336000 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920339779586
WAIT #0: nam='single-task message' ela= 179653 p1=0 p2=0 p3=0 obj#=-1 tim=1495920339960679
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920339989423
WAIT #0: nam='SQL*Net message from dblink' ela= 29064 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018526
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018982
WAIT #0: nam='SQL*Net message from dblink' ela= 30534 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049542
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049793
WAIT #0: nam='SQL*Net message from dblink' ela= 42171 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340091996
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=17 p3=0 obj#=-1 tim=1495920340092053
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340092253
WAIT #0: nam='SQL*Net message from dblink' ela= 28448 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120720
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120854
WAIT #0: nam='SQL*Net message from dblink' ela= 30884 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151761
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151816
WAIT #0: nam='SQL*Net message from dblink' ela= 28590 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180420
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180491
WAIT #0: nam='SQL*Net message from client' ela= 207 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180713
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180787
WAIT #0: nam='SQL*Net message from dblink' ela= 28635 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209434
WAIT #0: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209472
WAIT #0: nam='SQL*Net message from client' ela= 70 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209554
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209601
WAIT #0: nam='SQL*Net message from dblink' ela= 29357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340238969
WAIT #0: nam='SQL*Net more data from dblink' ela= 6 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239133
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239196
WAIT #0: nam='SQL*Net more data from dblink' ela= 124 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340239340
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239409
WAIT #0: nam='SQL*Net more data from dblink' ela= 93 driver id=1413697536 #bytes=3 p3=0 obj#=-1 tim=1495920340239528
WAIT #0: nam='SQL*Net more data from dblink' ela= 79 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239632
WAIT #0: nam='SQL*Net more data from dblink' ela= 28330 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268062
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268111
WAIT #0: nam='SQL*Net more data from dblink' ela= 168 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268297
WAIT #0: nam='SQL*Net more data from dblink' ela= 55 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268378
WAIT #0: nam='SQL*Net more data from dblink' ela= 69 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268483
WAIT #0: nam='SQL*Net more data from dblink' ela= 47 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268553
Hangs!...

After raising a SR with My Oracle Support (MOS) for the database we manage and for the one Oracle manage in the Oracle Cloud, it was concluded the package size (Session Data Unit) was larger then the allowable for a network component between the two databases.  They referred to MOS note:

Query on “bigtable” from remote Client hangs (certain queries or fields) (Doc ID 2104257.1)

“SYMPTOMS

Certain queries are hanging when run from some remote Clients.  However, other (smaller) queries are successful.

This is especially evident on queries that require more than 2kb of data to be returned.
Some examples:

select * from v$database; –> hangs
select count(*) from v$database; –> works

DESCRIBE with large data results –> hangs
DESCRIBE with small data results –> works

This might also show up with Database Links (DBLINK) as well.

CAUSE

A Network “security” device or setting (possibly local especially on a Microsoft Windows machine) is preventing or “altering” larger TCP packets from being transported across the network.
This in turn is causing the Client to wait on the Server for the data from the query, and the Server to wait on the Client (which thinks part of the packet is still on the way).

1. Check for settings like the DF (“Don’t Fragment”) bit being set.
2. Check for ALG SQL settings being enabled.

*Note: these causes are all external to Oracle so provided only as potential causes.

SOLUTION

Workaround
~~~~~~~~~
As a workaround (or test to prove this is or is not the issue) lower the SQL*Net SDU from the default size of 8192 to 1400 (see reference below for more details on this setting):

1. Add the following single line to the sqlnet.ora file on BOTH ends of the communication:
DEFAULT_SDU_SIZE = 1400

2. Restart the Listener(s) servicing the Database in question, make a new connection from the Client, and test the query that was hanging.

3. If this corrects the issue and allows queries to complete, then there is a network / system device or setting causing fragmentation, detention, or alteration of SQL packets mid-stream.”

Oracle Support set the SDU in the sqlnet.ora file on the target database server:
DEFAULT_SDU_SIZE = 1400

More info on DEFAULT_SDU_SIZE can be found here:

Database Net Services Reference -> 5 Parameters for the sqlnet.ora File -> DEFAULT_SDU_SIZE

And then restarted the listener.  I did the same for the source database server and then the session no longer hanged 🙂 :

SQL> select * from XXX_PER_PEOPLE_F@ebs;
...

1069 rows selected.

Elapsed: 00:00:02.57

Update

28/08/2018 The issue wasn’t resolved and the MTU of the network cards on both the source and target servers was also changed to 1400.  It could possibly been resolved by reducing the SDU further but it was decided to change the MTU on network cards.  In most cases the SDU change will fix the issue, but otherwise the MTU on network card can also resolve the issue as in this case.

If you found this blog post useful, please like as well as follow me through my various Social Media avenues available on the sidebar and/or subscribe to this oracle blog via WordPress/e-mail.

Thanks

Zed DBA (Zahid Anwar)

4 thoughts on “Session using a database link hangs on “SQL*Net more data from dblink”

      • i have similar issue we are trying to migrate the data from one db to another via db link. when the data set is large like 30M or so it will complete fetching data over the db link and hten insert in the calling db. Now it again does a query over the same dbl ink to get next set of 30 M and it will hang. Session needs to be killed. But if i end the session and call the next 30M in different script it will work. What is reason for that ? any workarounds ?

        Like

      • Hi Aarti,

        It’s hard to say without knowing the exact details. I’d say check the wait events on both source and target. Use tracing if necessary like I did. Failing this, raise an SR with Oracle Support to assist you.

        Hope that helps

        Kind Regards

        ZedDBA

        Like

Leave a comment