Oracle Switches CPU from Intel to AMD for Exadata Cloud Services

Working on Oracle Exadata Cloud Infrastructure X9M, I noticed the maximum OCPU count was much higher than on X8M. This made me curious, and I discovered that Oracle had switched the CPU from Intel to AMD:

Exadata X9M (Exadata Cloud Services)

[root@v1exacs01c1db01 ~]# more /proc/cpuinfo | grep -E "vendor_id|model name" | sort | uniq -c
8 model name : AMD EPYC 7J13 64-Core Processor
8 vendor_id : AuthenticAMD
[root@v1exacs01c1db01 ~]#

Compare this to on-premises Exadata, which still uses Intel processors:

Exadata X9M (on-premises)

[root@v1exadb01 ~]# more /proc/cpuinfo | grep -E "vendor_id|model name" | sort | uniq -c
76 model name      : Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
76 vendor_id       : GenuineIntel
[root@v1exadb01 ~]#

The previous-generation Exadata Cloud @ Customer also still uses Intel processors:

Exadata X8M (Exadata Cloud @ Customer)

[root@v1exacc01c1db01 ~]# more /proc/cpuinfo | grep -E "vendor_id|model name" | sort | uniq -c
6 model name      : Intel(R) Xeon(R) Platinum 8270CL CPU @ 2.70GHz
6 vendor_id       : GenuineIntel
[root@v1exacc01c1db01 ~]#

This change makes sense, as the latest Intel processor, the “Intel® Xeon® Platinum 8358 Processor”, has 32 cores and 64 threads.

Compare this with the latest AMD processor, the “AMD EPYC™ 7713”, which has 64 cores and 128 threads. I believe the AMD EPYC 7J13 in the Exadata Cloud Infrastructure X9M is of the same family.

This allows Oracle to offer 252 OCPUs for a Quarter Rack: 2 sockets x 64 cores x 2 database servers, less 2 cores per database server for KVM. This is roughly double what could be offered if Intel processors were used. And seeing as Oracle charges per OCPU, it’s a smart move that was quietly done 😉
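
The arithmetic behind that figure can be sketched as a small shell function. The function name and its argument order are mine, purely for illustration. The Intel comparison assumes the same 2-socket layout and the same 2-core KVM reservation per server, which is my assumption rather than a published Oracle figure.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Oracle tool): usable OCPUs for a rack.
# Args: sockets per server, cores per socket, database servers,
#       cores reserved per server for KVM.
usable_ocpus() {
  local sockets=$1 cores=$2 servers=$3 reserved=$4
  echo $(( sockets * cores * servers - reserved * servers ))
}

# Quarter Rack with 64-core AMD EPYC: 2 x 64 x 2 - 2 x 2
usable_ocpus 2 64 2 2    # prints 252

# Same layout with a 32-core Intel part (assumed, for comparison only)
usable_ocpus 2 32 2 2    # prints 124
```

252 against 124 is where the "double" comes from.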

If you found this blog post useful, please like as well as follow me through my various Social Media avenues available on the sidebar and/or subscribe to this oracle blog via WordPress/e-mail.

Thanks

Zed DBA (Zahid Anwar)

Oracle and Microsoft to Interconnect Oracle Cloud and Microsoft Azure

On the 5th of June 2019, Oracle and Microsoft made a joint announcement on the interconnection between Oracle Cloud and Microsoft Azure:

Oracle’s Press Release
Microsoft Press Release

The key aspects of the announcement are:

  1. A direct interconnect between Oracle Cloud and Azure Cloud, starting in Oracle’s Ashburn (North America) region and Azure Washington DC (US East) region, with plans to expand to additional regions in the future*.
  2. Unified identity and access management, via a unified single sign-on experience to manage resources across both Oracle Cloud and Azure.
  3. Supported deployment of custom applications and packaged Oracle applications (such as JD Edwards, PeopleSoft and E-Business Suite, typically referred to as Oracle Applications Unlimited) in Azure Cloud with the Oracle databases (such as RAC, Exadata, Autonomous Database) deployed in the Oracle Cloud.
  4. A collaborative support model for customers leveraging these new capabilities.

*With only one region available in both Oracle and Azure, this will allow for the failure of an Availability Domain in Oracle Cloud and/or an Availability Zone in Azure Cloud, but not the failure of a whole region in either.  So until more regions are added, Disaster Recovery will be limited to Availability Domains/Availability Zones:

oracle-azure-connectivity

Figure 4 from Oracle’s blog “Overview of the Interconnect Between Oracle and Microsoft”.

What does this mean?

In a nutshell, customers who have Microsoft Azure as their cloud platform of choice can now migrate their application tier to Microsoft Azure whilst migrating the database tier, i.e. the Oracle database (mandatory for Applications Unlimited), to the Oracle Cloud, without having to worry about the all-important latency (high-throughput, low-latency, as stated in Oracle’s blog post).  It is, however, unclear whether there will be any charge for outbound/inbound traffic between the clouds, but it does seem from the documentation and blog post that both Oracle’s dedicated FastConnect and Azure’s dedicated ExpressRoute are required, which are both fixed-rate products.  It also helps those customers who require the favourable database licensing on the Oracle Cloud; more info can be found in my blog post here.

This is certainly a step towards the trend of multi-cloud/hybrid-cloud platforms.

More info regarding this announcement can be found in Version 1’s blog post here.

Another interesting read regarding this announcement comes from SearchCloud Computing.

 

Thanks

Zed DBA (Zahid Anwar)

Oracle’s Autonomous Database (Cloud)

So yesterday I attended the “Autonomous Database GTM Roadmap Sales Workshop” at Oracle’s London office.  This training is for Oracle partners such as Version 1, which is one of Oracle’s strategic partners.

A lot of what is in this blog post is subject to Oracle’s Safe Harbour statement.

My Key Takeaways

1. Maturity

The Autonomous Database is still very new!  It’s like back in 2008 when the first Exadata Machine (V1) was launched: it was great, and it was a game changer for large Data Warehouses.  But it wasn’t suited to OLTP and, as with anything new, it had its fair share of “teething issues”.  However, having passed its 10-year anniversary last year and now on its 8th iteration, the X7, it’s a very mature product.  It’s suited to mixed workloads (since the 2nd iteration) and has gained so many new features over the years that it’s now a very compelling offering, if it suits your business needs.

The same goes for the Autonomous Database: at launch it was only suited to Data Warehouse workloads, just like the first Exadata Machine (however, another offering for OLTP became available soon after, see further on).  It’s not perfect and it has its fair share of “teething issues”.  However, come its 10-year anniversary, when all the features on the road map have been implemented, it will be a different story and another very compelling offering from Oracle, again if it suits your business needs.

2. Makeup

The makeup of the Autonomous Database in the Oracle Cloud is:

  1. Oracle’s extreme performance platform, Exadata, part of Oracle Engineered Systems
  2. A streamlined version of the 18c database, soon to be 19c
  3. Oracle Cloud Automated Data Centre Operations

This is the not so “secret sauce” 🙂

3. Infrastructure Offerings

So the Oracle Autonomous Database comes in 2 offerings:

  1. Serverless Exadata Cloud Infrastructure, which just means it’s shared.  This is for non-mission-critical workloads and has non-deterministic performance.  The minimum is 1 TB storage and 1 OCPU, and it’s the low-cost entry point.  Please Note: This is the ONLY offering at present (Jan 2019).
  2. Dedicated Exadata Cloud Infrastructure, which is, as the name suggests, dedicated.  This is for mission-critical workloads and has deterministic performance.  To be confirmed, but it is envisioned to be offered in the conventional Exadata sizes, i.e. quarter, half and full rack.  It ranges from a minimum of 1 TB storage and 1 OCPU up to all the OCPUs in the rack size provisioned.  It will have private networks, unlike the above offering, which is public.  Expected “soon”, so could be Q2 or Q3 of 2019.

4. Workload Offerings

Once you’ve selected between shared or dedicated, then you need to decide what type of workload as there are two products that apply the autonomous optimisations:

  1. Autonomous Data Warehouse (ADW), which optimises complex SQL, stores in columnar format and creates data summaries.  This was the only offering at launch.
  2. Autonomous Transaction Processing (ATP), which optimises response time, stores in row format and creates indexes autonomously.  Now also available.

The current offering doesn’t let you change between the two; however, it is on the road map to be able to convert from one to the other, for example if you want to test which works best for you or if, in hindsight, you made the wrong selection.

5. Automatic Indexing

This one is probably a contentious yet interesting topic!  We DBAs are used to the world of indexes and to “knowing” what’s right; however, the world has moved on, and AI and Machine Learning are taking laborious tasks away from us.  The Autonomous Database in ATP can analyse the workload and use AI and ML to work out, over a period of time, which indexes are needed, eventually achieving the same elapsed time for a workload.  The most interesting aspect is that it keeps only the indexes that are needed, giving a net reduction in indexes, which often get left behind and have little to no benefit.  There’s no denying we can know better and have a good set of indexes, with some redundant ones too, but how often is this reviewed to remove unused indexes and add new ones as queries change?  Automatic Indexing takes away that headache, with some volatility as it works out what is required.  I can really see the benefits here and see this becoming the norm, just as Automatic Undo Management is; who in this day and age manages undo segments?

6. Autonomous

The Autonomous Database is:

  1. Self-Driving: performs database maintenance tasks such as tablespace space management, etc., and automates upgrades and release updates.
  2. Self-Securing: automatically applies security patches online.  Out of the box, all data and network traffic is encrypted.
  3. Self-Repairing: can automatically detect and fix data issues, e.g. resolve block corruption using Active Data Guard, ensure high availability using Real Application Clusters (RAC) and, in the event of a disaster, use a Data Guard physical standby.

7. Is it for you?

Just talking Oracle platforms, there’s a spectrum of platforms, from most Manual to most Autonomous:

  1. Database on commodity hardware on-premises
  2. Database on Engineered Systems (Exadata) on-premises
  3. Database on Oracle Cloud Infrastructure (OCI)
  4. Exadata Cloud Services / Exadata Cloud @ Customer
  5. Autonomous Database Cloud Services

The more autonomous you go, the more you can focus on your business.

Anyone who’s interested in Autonomous Database, come talk to us 🙂

Thanks

Zed DBA (Zahid Anwar)

Oracle Database Backup Service Fails with: ORA-19511: – KBHS-00715: HTTP error occurred ‘oracle-error’ – ORA-29024

I discovered an Oracle Cloud Database Backup failing with:

Starting backup at 2018/09/08 20:00:04
current log archived
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 09/08/2018 20:00:07
ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-27023: skgfqsbi: media manager protocol error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
   KBHS-00715: HTTP error occurred 'oracle-error'
KBHS-00712: ORA-29024 received from local HTTP service

Recovery Manager complete.

Upon investigation I found the following metalink note:
RMAN Backup to Oracle Database Backup Cloud Service fails with KBHS-00715 ORA-29024 (Doc ID 2360941.1)

The “ORA-29024” is raised due to an incorrect certificate chain.  This issue was investigated in Bug 27402663; however, no fix is needed.  Later library versions include the trusted certificate workaround for the issue by default.
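
If you want a rough look at the certificate chain the storage endpoint presents, something like the sketch below may help. Both function names are hypothetical and not part of the backup module, and it assumes openssl is installed and the server has outbound HTTPS access. It only shows what the endpoint sends to this client; it does not inspect the wallet or library the backup module itself uses, so treat it as an indication rather than a diagnosis.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the TLS certificate chain of the
# backup service endpoint. Neither function is part of opc_install.jar.

# Extract the host name from a URL such as the -host value.
host_from_url() {
  local url=$1
  url=${url#*://}      # drop the scheme
  url=${url%%/*}       # drop the path
  echo "${url%%:*}"    # drop an explicit port, if any
}

# Print the certificates the server presents (needs outbound HTTPS access).
show_cert_chain() {
  local host
  host=$(host_from_url "$1")
  openssl s_client -connect "${host}:443" -servername "$host" -showcerts </dev/null
}

# Example (not run here):
#   show_cert_chain https://em2.storage.oraclecloud.com/v1/Storage-aXXX
```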

So the solution is to re-install the cloud backup module with the “-trustedCerts” option:

[oracle@V1LOEM ~]$ cd /u01/oracle_stage/cloud/
[oracle@V1LOEM cloud]$ . oraenv
ORACLE_SID = [oracle] ? EMREPOS
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 is /u01/app/oracle
[oracle@V1LOEM cloud]$ java -jar opc_install.jar -host https://em2.storage.oraclecloud.com/v1/Storage-aXXX -opcId 'oraclecloudbackup@version1.com' -opcPass 'xxx' -walletDir '/u01/oracle/opc_wallet' -libDir $ORACLE_HOME/lib -debug -trustedCerts
Oracle Database Cloud Backup Module Install Tool, build 2017-05-04
Debug: os.name        = Linux
Debug: os.arch        = amd64
Debug: os.version     = 3.8.13-98.1.2.el6uek.x86_64
Debug: file.separator = /
Debug: Platform = PLATFORM_LINUX64
Debug: OPC Account Verification: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><account name="Storage-aXXX"><container><name>oracle-data-storagea-1</name><count>944</count><bytes>52740693928</bytes><accountId><id>XXX</id></accountId><deleteTimestamp>0.0</deleteTimestamp><containerId><id>XXX</id></containerId></container></account>
Oracle Database Cloud Backup Module credentials are valid.
Debug: Certificate Success:
       Subject  : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
       Validity : Fri Nov 10 00:00:00 GMT 2006 - Mon Nov 10 00:00:00 GMT 2031
       Issuer   : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Oracle Database Cloud Backup Module wallet created in directory /u01/oracle/opc_wallet.
Oracle Database Cloud Backup Module initialization file /u01/app/oracle/product/12.1.0/dbhome_1/dbs/opcEMREPOS.ora created.
Downloading Oracle Database Cloud Backup Module Software Library from file opc_linux64.zip.
Debug: Temp zip file = /tmp/opc_linux648852138548086808899.zip
Debug: Downloaded 27342262 bytes in 13 seconds.
Debug: Transfer rate was 2103250 bytes/second.
Download complete.
Debug: Delete RC = true

Now test the latest Cloud Backup Module:

[oracle@V1LOEM cloud]$ rman target /

Recovery Manager: Release 12.1.0.2.0 - Production on Mon Sep 10 13:10:17 2018

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

connected to target database: EMREPOS (DBID=XXX)

RMAN> delete obsolete recovery window of 8 days device type sbt;

using target database control file instead of recovery catalog
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=794 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_2
channel ORA_SBT_TAPE_2: SID=785 device type=SBT_TAPE
channel ORA_SBT_TAPE_2: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_3
channel ORA_SBT_TAPE_3: SID=407 device type=SBT_TAPE
channel ORA_SBT_TAPE_3: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_4
channel ORA_SBT_TAPE_4: SID=1169 device type=SBT_TAPE
channel ORA_SBT_TAPE_4: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_5
channel ORA_SBT_TAPE_5: SID=416 device type=SBT_TAPE
channel ORA_SBT_TAPE_5: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_6
channel ORA_SBT_TAPE_6: SID=1167 device type=SBT_TAPE
channel ORA_SBT_TAPE_6: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_7
channel ORA_SBT_TAPE_7: SID=782 device type=SBT_TAPE
channel ORA_SBT_TAPE_7: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_8
channel ORA_SBT_TAPE_8: SID=1164 device type=SBT_TAPE
channel ORA_SBT_TAPE_8: Oracle Database Backup Service Library VER=12.2.0.2
Deleting the following obsolete backups and copies:
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           18192  30-AUG-18
  Backup Piece       18192  30-AUG-18          pptbsjl2_1_1
...
  Backup Piece       18213  01-SEP-18          qutc1sae_1_1

Do you really want to delete the above objects (enter YES or NO)? no

RMAN> exit

Recovery Manager complete.
[oracle@V1LOEM cloud]$

Everything working again 🙂

Now that you have the latest library version, the workaround is included by default and you can omit the -trustedCerts option:

[oracle@V1LOEM cloud]$ java -jar opc_install.jar -host https://em2.storage.oraclecloud.com/v1/Storage-aXXX -opcId 'oraclecloudbackup@version1.com' -opcPass 'xxx' -walletDir '/u01/oracle/opc_wallet' -libDir $ORACLE_HOME/lib -debug
Oracle Database Cloud Backup Module Install Tool, build 2017-05-04
Debug: os.name = Linux
Debug: os.arch = amd64
Debug: os.version = 3.8.13-98.1.2.el6uek.x86_64
Debug: file.separator = /
Debug: Platform = PLATFORM_LINUX64
Debug: OPC Account Verification: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><account name="Storage-aXXX"><container><name>oracle-data-storagea-1</name><count>944</count><bytes>52740693928</bytes><accountId><id>XXX</id></accountId><deleteTimestamp>0.0</deleteTimestamp><containerId><id>XXX</id></containerId></container></account>
Oracle Database Cloud Backup Module credentials are valid.
Debug: Certificate Success:
       Subject  : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
       Validity : Fri Nov 10 00:00:00 GMT 2006 - Mon Nov 10 00:00:00 GMT 2031
       Issuer   : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Oracle Database Cloud Backup Module wallet created in directory /u01/oracle/opc_wallet.
Oracle Database Cloud Backup Module initialization file /u01/app/oracle/product/12.1.0/dbhome_1/dbs/opcEMREPOS.ora created.
Downloading Oracle Database Cloud Backup Module Software Library from file opc_linux64.zip.
Debug: Temp zip file = /tmp/opc_linux647954567264150845107.zip
Debug: Downloaded 27342262 bytes in 15 seconds.
Debug: Transfer rate was 1822817 bytes/second.
Download complete.
Debug: Delete RC = true
[oracle@V1LOEM cloud]$

Now test again to ensure it is still working:

[oracle@V1LOEM cloud]$ rman target /

Recovery Manager: Release 12.1.0.2.0 - Production on Mon Sep 10 13:12:45 2018

Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.

connected to target database: EMREPOS (DBID=XXX)

RMAN> delete obsolete recovery window of 8 days device type sbt;

using target database control file instead of recovery catalog
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=794 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_2
channel ORA_SBT_TAPE_2: SID=785 device type=SBT_TAPE
channel ORA_SBT_TAPE_2: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_3
channel ORA_SBT_TAPE_3: SID=407 device type=SBT_TAPE
channel ORA_SBT_TAPE_3: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_4
channel ORA_SBT_TAPE_4: SID=1169 device type=SBT_TAPE
channel ORA_SBT_TAPE_4: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_5
channel ORA_SBT_TAPE_5: SID=416 device type=SBT_TAPE
channel ORA_SBT_TAPE_5: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_6
channel ORA_SBT_TAPE_6: SID=1167 device type=SBT_TAPE
channel ORA_SBT_TAPE_6: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_7
channel ORA_SBT_TAPE_7: SID=782 device type=SBT_TAPE
channel ORA_SBT_TAPE_7: Oracle Database Backup Service Library VER=12.2.0.2
allocated channel: ORA_SBT_TAPE_8
channel ORA_SBT_TAPE_8: SID=1164 device type=SBT_TAPE
channel ORA_SBT_TAPE_8: Oracle Database Backup Service Library VER=12.2.0.2
Deleting the following obsolete backups and copies:
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           18192  30-AUG-18
  Backup Piece       18192  30-AUG-18          pptbsjl2_1_1
...
  Backup Piece       18213  01-SEP-18          qutc1sae_1_1

Do you really want to delete the above objects (enter YES or NO)? no

RMAN> exit

Recovery Manager complete.
[oracle@V1LOEM cloud]$

Related Posts

Oracle Database Backup Service Fails with: ORA-19511: – KBHS-00715: HTTP error occurred ‘oracle-error’ – ORA-28750

Thanks

Zed DBA (Zahid Anwar)

Session using a database link hangs on “SQL*Net more data from dblink”

I have a client who recently moved a database server from on-premises to a cloud provider.  A database on this database server had a database link to their E-Business database in the Oracle Cloud.  Since the move, any session in the database that used the database link to the E-Business database would hang if the query returned a large dataset.

Selecting from dual over the database link worked:

SQL> set timing on
SQL> set autotrace on
SQL> select * from dual@ebs;

D
-
X

Elapsed: 00:00:00.13

Execution Plan
----------------------------------------------------------
Plan hash value: 272002086

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |     1 |     2 |     2   (0)| 00:00:01 |        |
|   1 |  TABLE ACCESS FULL     | DUAL |     1 |     2 |     2   (0)| 00:00:01 | PWGPSI |
----------------------------------------------------------------------------------------

Note
-----
   - fully remote statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        511  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

But selecting from a table hangs:

SQL> select * from XXX_PER_PEOPLE_F@ebs;

Hangs!......

The session in the source database shows the wait “SQL*Net more data from dblink”:

       SID USERNAME       STATUS   MACHINE                         TERMINAL                       PROGRAM                                          EVENT                                        WAIT_CLASS                   WAIT_TIME SECONDS_IN_WAIT STATE
---------- -------------- -------- ------------------------------- ------------------------------ ------------------------------------------------ -------------------------------------------- ---------------------------- --------- --------------- -------------------
       144 XXXXXXXXX      ACTIVE   XXX                             unknown                        sqlplus@xxxxx.xxxxx.xxx (TNS V1-V3)              SQL*Net more data from dblink                Network                      0         341             WAITING

The session in the target database shows the wait “SQL*Net message from client”:

SID USERNAME EVENT                       SQL_ID   MACHINE       PROGRAM                        PROCESS 
--- -------- --------------------------- -------- ------------- ------------------------------ ---------- 
35  XXX      SQL*Net message from client XXX      XXX.XXX.XXX   oracle@XXX.XXX.XXX (TNS V1-V3) 15340

I traced the source session and could see the session hangs waiting for more data from the target (EBS) database:

WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920328443464

*** 2018-07-17 11:13:47.934

WAIT #2: nam='SQL*Net message from client' ela= 11336000 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920339779586
WAIT #0: nam='single-task message' ela= 179653 p1=0 p2=0 p3=0 obj#=-1 tim=1495920339960679
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920339989423
WAIT #0: nam='SQL*Net message from dblink' ela= 29064 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018526
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018982
WAIT #0: nam='SQL*Net message from dblink' ela= 30534 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049542
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049793
WAIT #0: nam='SQL*Net message from dblink' ela= 42171 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340091996
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=17 p3=0 obj#=-1 tim=1495920340092053
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340092253
WAIT #0: nam='SQL*Net message from dblink' ela= 28448 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120720
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120854
WAIT #0: nam='SQL*Net message from dblink' ela= 30884 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151761
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151816
WAIT #0: nam='SQL*Net message from dblink' ela= 28590 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180420
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180491
WAIT #0: nam='SQL*Net message from client' ela= 207 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180713
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180787
WAIT #0: nam='SQL*Net message from dblink' ela= 28635 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209434
WAIT #0: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209472
WAIT #0: nam='SQL*Net message from client' ela= 70 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209554
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209601
WAIT #0: nam='SQL*Net message from dblink' ela= 29357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340238969
WAIT #0: nam='SQL*Net more data from dblink' ela= 6 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239133
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239196
WAIT #0: nam='SQL*Net more data from dblink' ela= 124 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340239340
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239409
WAIT #0: nam='SQL*Net more data from dblink' ela= 93 driver id=1413697536 #bytes=3 p3=0 obj#=-1 tim=1495920340239528
WAIT #0: nam='SQL*Net more data from dblink' ela= 79 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239632
WAIT #0: nam='SQL*Net more data from dblink' ela= 28330 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268062
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268111
WAIT #0: nam='SQL*Net more data from dblink' ela= 168 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268297
WAIT #0: nam='SQL*Net more data from dblink' ela= 55 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268378
WAIT #0: nam='SQL*Net more data from dblink' ela= 69 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268483
WAIT #0: nam='SQL*Net more data from dblink' ela= 47 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268553
Hangs!...
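
To see at a glance where the time goes in a trace like the one above, the WAIT lines can be summed per event. The following is a minimal sketch, assuming the trace file uses the same `WAIT #n: nam='...' ela= ...` layout shown here; the function name is my own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the ela= values (microseconds) per wait event
# from the WAIT lines of a SQL trace read on standard input.
summarise_waits() {
  awk -F"'" '
    /^WAIT #/ {
      event = $2                       # text between the single quotes
      if (match($3, /ela= *[0-9]+/)) {
        ela = substr($3, RSTART, RLENGTH)
        sub(/ela= */, "", ela)
        total[event] += ela
        count[event]++
      }
    }
    END {
      for (e in total)
        printf "%-32s waits=%d ela_us=%d\n", e, count[e], total[e]
    }
  ' | sort
}

# Example: summarise_waits < /path/to/tracefile.trc
```

In this trace the individual “SQL*Net more data from dblink” waits are tiny, which fits with the problem being a stall on the network rather than slow processing on either database.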

After raising an SR with My Oracle Support (MOS) for the database we manage and for the one Oracle manages in the Oracle Cloud, it was concluded that the packet size (Session Data Unit) was larger than a network component between the two databases allowed.  They referred to this MOS note:

Query on “bigtable” from remote Client hangs (certain queries or fields) (Doc ID 2104257.1)

“SYMPTOMS

Certain queries are hanging when run from some remote Clients.  However, other (smaller) queries are successful.

This is especially evident on queries that require more than 2kb of data to be returned.
Some examples:

select * from v$database; –> hangs
select count(*) from v$database; –> works

DESCRIBE with large data results –> hangs
DESCRIBE with small data results –> works

This might also show up with Database Links (DBLINK) as well.

CAUSE

A Network “security” device or setting (possibly local especially on a Microsoft Windows machine) is preventing or “altering” larger TCP packets from being transported across the network.
This in turn is causing the Client to wait on the Server for the data from the query, and the Server to wait on the Client (which thinks part of the packet is still on the way).

1. Check for settings like the DF (“Don’t Fragment”) bit being set.
2. Check for ALG SQL settings being enabled.

*Note: these causes are all external to Oracle so provided only as potential causes.

SOLUTION

Workaround
~~~~~~~~~
As a workaround (or test to prove this is or is not the issue) lower the SQL*Net SDU from the default size of 8192 to 1400 (see reference below for more details on this setting):

1. Add the following single line to the sqlnet.ora file on BOTH ends of the communication:
DEFAULT_SDU_SIZE = 1400

2. Restart the Listener(s) servicing the Database in question, make a new connection from the Client, and test the query that was hanging.

3. If this corrects the issue and allows queries to complete, then there is a network / system device or setting causing fragmentation, detention, or alteration of SQL packets mid-stream.”

Oracle Support set the SDU in the sqlnet.ora file on the target database server:
DEFAULT_SDU_SIZE = 1400

More info on DEFAULT_SDU_SIZE can be found here:

Database Net Services Reference -> 5 Parameters for the sqlnet.ora File -> DEFAULT_SDU_SIZE

Oracle Support then restarted the listener.  I did the same on the source database server, and the session no longer hung 🙂 :

SQL> select * from XXX_PER_PEOPLE_F@ebs;
...

1069 rows selected.

Elapsed: 00:00:02.57
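
Because the MOS note asks for the setting on BOTH ends, a quick check on each server can save a round of head-scratching. This is a minimal sketch with a hypothetical function name; the path in the example assumes the default location under ORACLE_HOME, and TNS_ADMIN may point somewhere else on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the DEFAULT_SDU_SIZE value from a sqlnet.ora
# file, or nothing if the parameter is not set. Comment lines are skipped.
sdu_from_sqlnet() {
  awk -F= '
    /^[[:space:]]*#/ { next }
    toupper($1) ~ /^[[:space:]]*DEFAULT_SDU_SIZE[[:space:]]*$/ {
      gsub(/[[:space:]]/, "", $2)
      print $2
    }
  ' "$1"
}

# Example (path is an assumption):
#   sdu_from_sqlnet "$ORACLE_HOME/network/admin/sqlnet.ora"
```

No output means the parameter is not set in that file, so the default applies.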

Update

28/08/2018 The issue wasn’t resolved, so the MTU of the network cards on both the source and target servers was also changed to 1400.  It could possibly have been resolved by reducing the SDU further, but it was decided to change the MTU on the network cards instead.  In most cases the SDU change will fix the issue, but otherwise changing the MTU on the network cards can also resolve it, as in this case.
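
One way to get a feel for what packet size the path will carry is to ping with the Don’t Fragment bit set. The sketch below assumes Linux with iputils ping (the -M do option) and an IPv4 path; the function names and the host name in the example are placeholders of mine. If ICMP is blocked between the servers, a failure here proves nothing either way.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for probing the path MTU with Linux iputils ping.

# ICMP payload that makes an IPv4 packet of exactly the given size:
# 20 bytes of IPv4 header plus 8 bytes of ICMP header are subtracted.
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Send one ping with the Don't Fragment bit set (-M do); it fails if a hop
# on the path cannot carry a packet of the given total size.
probe_mtu() {
  local host=$1 mtu=$2
  ping -c 1 -W 2 -M do -s "$(icmp_payload_for_mtu "$mtu")" "$host" >/dev/null 2>&1
}

# Example (host name is a placeholder):
#   for mtu in 1500 1450 1400; do
#     probe_mtu ebs-db.example.com "$mtu" && echo "$mtu ok" || echo "$mtu blocked"
#   done
```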

Thanks

Zed DBA (Zahid Anwar)