OGB Appreciation Day : Exadata X8M

What is OGB Appreciation Day?

The Oracle Groundbreakers (OGB) Appreciation Day formally known as OTN Appreciation Day and ODC Appreciation Day, is a great initiative by Tim Hall aka Oracle-Base.com.  Where we take the opportunity to say thanks to the Oracle Community which includes but not limited to ACEs, Java Champions, Ambassadors and all those who have the Groundbreakers spirit #ThanksOGB 🙂

I wonder what will be the name will be next year 😉

More info on Oracle Groundbreakers Community can be found here:
About Oracle Groundbreakers Community

When is it?

This year, it is on Thursday 10th October 2019 and I have to confess I totally missed it and thus my post is a few days out but I didn’t want to do disservice to the spirit of the initiative.

You can see my previous post here:
2017 – ODC Appreciation Day : Oracle Exadata Database Machine
2018 – ODC Appreciation Day : Oracle dcli Utility

You can see a summary of previous years blog posts here:
2016 – OTN Appreciation Day : Summary
2017 – ODC Appreciation Day 2017 : It’s a Wrap (#ThanksODC)
2018 – ODC Appreciation Day 2018 : It’s a Wrap (#ThanksODC)
2019 – OGB Appreciation Day 2019 : It’s a Wrap (#ThanksOGB) – this year

My Contribution : Exadata X8M

When I was at Oracle Open World 2019 a few weeks ago, Larry Ellison (CTO of Oracle) announced the new Exadata X8M:

IMG_6715

Key point being in-memory performance utilising persistence memory and RDMA Network over Converged Ethernet (RoCE), which I will detail later on in this blog post.

Larry also boasted the Exadata X8M storage is 50x faster then AWS and 100x faster then Azure All flash storage:

IMG_6716

Following the announcement I attended another 2 sessions with Juan Loaiza (Executive Vice President, Mission Critical Database Technologies, Oracle) and Kothanda Umamageswaran (Senior Vice President, Exadata Development)/Gavin Parish (Senior Principal Product Manager, Exadata Development), who gave more details on the Exadata X8M:

IMG_6880

The keys changes are:

  1. 100Gb/Sec RoCE internal fabric
  2. 1.5TB Persistent Memory per storage server/cell

IMG_6881

RoCE Networking

IMG_6882

RoCE stand for RDMA (Remote Direct Memory Access) over Converged Ethernet, which initially from the start of Exadata had been over InfiniBand, however Oracle stated Ethernet has caught up and surpassed InfiniBand giving 100Gb/sec throughput as opposed to 40Gb/sec with InfiniBand which is 2.5 times faster:

IMG_6883

IMG_6910

RoCE uses InfiniBand RDMA software on top of Ethernet, so includes all the optimisation and allows for backwards compatibility:

IMG_6911

Also mentioned is the smart network prioritisation which can prioritise critical database messages such as transaction commits, cache fusion over backups, etc using Class of Service:

IMG_6914

An another nice addition is instance failure detected through use of RoCE, because if all 4 ports don’t respond it confirmed server failure and instantly evicted from cluster:

IMG_6915

Persistent Memory

IMG_6884

The Exadata X8M uses Intel Optane DC persistent memory a new silicon technology that capacity, performance and cost is between DRAM and flash:

IMG_6885

In the Exadata X8M, the persistent memory is shared, just as disks and flash are.  So you get all the benefit of aggregated performance, redundancy, etc:

IMG_6887

The benefit of RoCE with persistent memory is the Persistent Memory Data Accelerator, that allows the database to use RDMA instead of I/O bypassing network and IO software, interrupts, context switches:

IMG_6919

Another benefit of persistent memory is the Memory Commit Accelerator, which like Smart Flash Logging, uses persistent memory to further speed up log writes by 8x using oersistent memory as a buffer which is flushed to flash or disk later on:

IMG_6920

Smart capacity management of persistent memory, so primaries on persistent memory and secondary on flash, which is automatically moved to persistent memory when primary is unavailable:

IMG_6921

If Exadata was not fast enough, all this innovation has lead to the “Worlds Fastest Database Machine” with a astonishing 16 million IOPS with less then 19 microseconds:

IMG_6909

For more information on Exadata X8M can be found here.

Finally Happy OGB Appreciation Day! #ThanksOGB #ThanksODC #ThanksOTN 🙂

If you found this blog post useful, please like as well as follow me through my various Social Media avenues available on the sidebar and/or subscribe to this oracle blog via WordPress/e-mail.

Thanks

Zed DBA (Zahid Anwar)

Advertisements

Session using a database link hangs on “SQL*Net more data from dblink”

I have a client who recently move a database server from on-premise to a Cloud provider.  A database on this database server had a database link to their E-business database in the Oracle Cloud.  Since the move, any sessions in the database that use the database link to the E-business database would hang if the query was to return large dataset.

Below is selecting from dual over the database link that worked:

SQL> set timing on
SQL> set autotrace on
SQL> select * from dual@ebs;

D
-
X

Elapsed: 00:00:00.13

Execution Plan
----------------------------------------------------------
Plan hash value: 272002086

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |     1 |     2 |     2   (0)| 00:00:01 |        |
|   1 |  TABLE ACCESS FULL     | DUAL |     1 |     2 |     2   (0)| 00:00:01 | PWGPSI |
----------------------------------------------------------------------------------------

Note
-----
   - fully remote statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        511  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

But selecting from a table hangs:

SQL> select * from XXX_PER_PEOPLE_F@ebs;

Hangs!......

The session in the source database shows the wait “SQL*Net more data from dblink”:

       SID USERNAME       STATUS   MACHINE                         TERMINAL                       PROGRAM                                          EVENT                                        WAIT_CLASS                   WAIT_TIME SECONDS_IN_WAIT STATE

---------- -------------- -------- ------------------------------- ------------------------------ ------------------------------------------------ -------------------------------------------- ---------------------------- --------- --------------- -------------------

       144 XXXXXXXXX      ACTIVE   XXX                             unknown                        sqlplus@xxxxx.xxxxx.xxx (TNS V1-V3)              SQL*Net more data from dblink                Network                      0         341             WAITING

The session in the target database shows the wait “SQL*Net message from client”:

SID USERNAME EVENT                       SQL_ID   MACHINE       PROGRAM                        PROCESS 
--- -------- --------------------------- -------- ------------- ------------------------------ ---------- 
35  XXX      SQL*Net message from client XXX      XXX.XXX.XXX   oracle@XXX.XXX.XXX (TNS V1-V3) 15340

I traced the source session and could see the session hangs waiting for more data from the target (EBS) database:

WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920328443464

*** 2018-07-17 11:13:47.934

WAIT #2: nam='SQL*Net message from client' ela= 11336000 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920339779586
WAIT #0: nam='single-task message' ela= 179653 p1=0 p2=0 p3=0 obj#=-1 tim=1495920339960679
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920339989423
WAIT #0: nam='SQL*Net message from dblink' ela= 29064 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018526
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340018982
WAIT #0: nam='SQL*Net message from dblink' ela= 30534 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049542
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340049793
WAIT #0: nam='SQL*Net message from dblink' ela= 42171 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340091996
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=17 p3=0 obj#=-1 tim=1495920340092053
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340092253
WAIT #0: nam='SQL*Net message from dblink' ela= 28448 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120720
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340120854
WAIT #0: nam='SQL*Net message from dblink' ela= 30884 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151761
WAIT #0: nam='SQL*Net message to dblink' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340151816
WAIT #0: nam='SQL*Net message from dblink' ela= 28590 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180420
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180491
WAIT #0: nam='SQL*Net message from client' ela= 207 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340180713
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340180787
WAIT #0: nam='SQL*Net message from dblink' ela= 28635 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209434
WAIT #0: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209472
WAIT #0: nam='SQL*Net message from client' ela= 70 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1495920340209554
WAIT #0: nam='SQL*Net message to dblink' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340209601
WAIT #0: nam='SQL*Net message from dblink' ela= 29357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340238969
WAIT #0: nam='SQL*Net more data from dblink' ela= 6 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239133
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239196
WAIT #0: nam='SQL*Net more data from dblink' ela= 124 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340239340
WAIT #0: nam='SQL*Net more data from dblink' ela= 37 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340239409
WAIT #0: nam='SQL*Net more data from dblink' ela= 93 driver id=1413697536 #bytes=3 p3=0 obj#=-1 tim=1495920340239528
WAIT #0: nam='SQL*Net more data from dblink' ela= 79 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340239632
WAIT #0: nam='SQL*Net more data from dblink' ela= 28330 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268062
WAIT #0: nam='SQL*Net more data from dblink' ela= 4 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=1495920340268111
WAIT #0: nam='SQL*Net more data from dblink' ela= 168 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268297
WAIT #0: nam='SQL*Net more data from dblink' ela= 55 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1495920340268378
WAIT #0: nam='SQL*Net more data from dblink' ela= 69 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268483
WAIT #0: nam='SQL*Net more data from dblink' ela= 47 driver id=1413697536 #bytes=4 p3=0 obj#=-1 tim=1495920340268553
Hangs!...

After raising a SR with My Oracle Support (MOS) for the database we manage and for the one Oracle manage in the Oracle Cloud, it was concluded the package size (Session Data Unit) was larger then the allowable for a network component between the two databases.  They referred to MOS note:

Query on “bigtable” from remote Client hangs (certain queries or fields) (Doc ID 2104257.1)

“SYMPTOMS

Certain queries are hanging when run from some remote Clients.  However, other (smaller) queries are successful.

This is especially evident on queries that require more than 2kb of data to be returned.
Some examples:

select * from v$database; –> hangs
select count(*) from v$database; –> works

DESCRIBE with large data results –> hangs
DESCRIBE with small data results –> works

This might also show up with Database Links (DBLINK) as well.

CAUSE

A Network “security” device or setting (possibly local especially on a Microsoft Windows machine) is preventing or “altering” larger TCP packets from being transported across the network.
This in turn is causing the Client to wait on the Server for the data from the query, and the Server to wait on the Client (which thinks part of the packet is still on the way).

1. Check for settings like the DF (“Don’t Fragment”) bit being set.
2. Check for ALG SQL settings being enabled.

*Note: these causes are all external to Oracle so provided only as potential causes.

SOLUTION

Workaround
~~~~~~~~~
As a workaround (or test to prove this is or is not the issue) lower the SQL*Net SDU from the default size of 8192 to 1400 (see reference below for more details on this setting):

1. Add the following single line to the sqlnet.ora file on BOTH ends of the communication:
DEFAULT_SDU_SIZE = 1400

2. Restart the Listener(s) servicing the Database in question, make a new connection from the Client, and test the query that was hanging.

3. If this corrects the issue and allows queries to complete, then there is a network / system device or setting causing fragmentation, detention, or alteration of SQL packets mid-stream.”

Oracle Support set the SDU in the sqlnet.ora file on the target database server:
DEFAULT_SDU_SIZE = 1400

More info on DEFAULT_SDU_SIZE can be found here:

Database Net Services Reference -> 5 Parameters for the sqlnet.ora File -> DEFAULT_SDU_SIZE

And then restarted the listener.  I did the same for the source database server and then the session no longer hanged 🙂 :

SQL> select * from XXX_PER_PEOPLE_F@ebs;
...

1069 rows selected.

Elapsed: 00:00:02.57

Update

28/08/2018 The issue wasn’t resolved and the MTU of the network cards on both the source and target servers was also changed to 1400.  It could possibly been resolved by reducing the SDU further but it was decided to change the MTU on network cards.  In most cases the SDU change will fix the issue, but otherwise the MTU on network card can also resolve the issue as in this case.

If you found this blog post useful, please like as well as follow me through my various Social Media avenues available on the sidebar and/or subscribe to this oracle blog via WordPress/e-mail.

Thanks

Zed DBA (Zahid Anwar)

VMware Expert Database Workshop Program Oracle Edition – Day 3

Day 3 kicked off again with another early start at 7am, yawn 🙂

Again, it was a very intense day, with lots of presentations and then ending with video interviews of each attendee:

  • Dean Bolton from VLSS, talked to us about “License Fortress from VLSS”
    • Very interesting product, where VLSS can advise you how to run your Oracle on VMware, then protect you from Oracle with the License Fortress guarantee, that has lawyers ready to defend you in case of license compliance backed by an insurance policy if required.  So the customer is never at risk when they take out the “License Fortress from VLSS”.
    • Plug for my current employer who also offer License Audit Consulting.
  • Chris Rohan from VMware, talked to us about “VMware vSphere Core & SDDC – Networking – NSX & VCNS”
  • Marcus Thordal from Brocade, talked to us about “Brocade and VMware Technology and the VMware Solutions Lab”
    • Interestingly Brocade have worked out a way to tag network packets, so you can identify which VM guest is causing network traffic
    • Also had a good example of how NVMe is causing need to higher network bandwidth
  • Simon Guyennet from VMware, talked to us about “Emerging Products” & “VMware Integrated Containers and Oracle”
    • Lot of NDA stuff, so those interested in this area, keep a look out, some interesting stuff coming soon 🙂
  • Mike Adams from VMware, talked to us about “The CPBU, vSphere and Friends, and the Experts Program”
    • Key take away that was not NDA, is VMware on AWS that is currently available to select customers and will be Generally Available soon 🙂
  • Somu Rajarathinam and Ron Ekins from Pure Storage gave a Technical Session
  • Feidhlim O’Leary from VMware, talked to us about “High Availability and Disaster Recovery in the SDDC”
  • Alain Geenrits from Blue Medora, talked to us about “Management & Monitoring – Blue Medora and Oracle on vSphere”
  • Daniel Hesselink from License Consulting, talked to us about “License Audit with License Consulting”
  • The duo Sudhir Balasubramanian and Mohan Potheri, talked to us about “vSphere HA or Oracle RAC, SRM or Data Guard, they are all complimentary when Oracle is run in the SDDC”
    • I enjoyed the labs from this duo, with their “good cop, bad cop” style 🙂

The workshop ended with a short video interview, where we were each asked to introduce ourselves, answer a few questions about the workshop and Pure Storage.  I’m not the best at this sort of things, so I don’t think I’ll end up in the marketing video of the event, but time will tell 🙂

My OCM buddy Yvonne Murphy, then gave a few of us an extended tour of the Global Support Services (GSS) whilst we waited for the shuttle back to the hotel.

Then it was a quick chauffeured ride to the airport and a short flight back home to Manchester.

Another great day that concluded the workshop, it increased my knowledge of VMware and gave me a great opportunity to network with Oracle Database Experts from around Europe 🙂

Many thanks to VMware and Pure Storage for organising this workshop and allowing me to be a part of it 🙂

My tweets for the day can be seen here.

The VMware Expert Database Workshop Program hashtag is #VMWORA

My related Blog Posts

VMware Expert Database Workshop Program Oracle Edition
VMware Expert Database Workshop Program Oracle Edition – Day 1
VMware Expert Database Workshop Program Oracle Edition – Day 2

Other related Blog Posts

Tim Hall (Oracle Base) – VMware Expert Database Workshop Program Oracle Edition
Tim Hall (Oracle Base) – VMware Workshop – The Journey Begins
Tim Hall (Oracle Base) – VMware Workshop – Day 1
Tim Hall (Oracle Base) – VMware Workshop – Day 2
Tim Hall (Oracle Base) – VMware Workshop – Day 3
Michael Corey (Columnist) – VMware Experts Program Oracle Edition
Michael Corey (Columnist) – Day 1 VMware Experts Program Oracle Edition
Michael Corey (Columnist) – Day 2 VMware Experts Program Oracle Edition

If you found this blog post useful, please like as well as follow me through my various Social Media avenues available on the sidebar and/or subscribe to this oracle blog via WordPress/e-mail.

Thanks

Zed DBA (Zahid Anwar)