Redundant and Duplicate Indexes are different !!

Being a full-time MySQL consultant by profession I get opportunity to work with several interesting customers and issues, Recently during our usual “MySQL Performance HealthCheck & Audit Service” I have noticed several duplicate as well as redundant Indexes in customer MySQL infrastructure, This has affected MySQL performance badly. So I thought it’s worth explaining implications of Redundant and Duplicate Indexes in a separate blog. 

Redundant and Duplicate Indexes are not same !!  
Technically Redundant and Duplicate Indexes are completely different. Let me explain this with an very simple example, Consider table “TAB1” with index  on columns (COL1,COL2) another index on (COL1), now this is what we call “Redundant Index” because index on (COL1,COL2) can also be used as an index on (COL1) alone However an index on  (COL2,COL1) wouldn’t be redundant neither would an index on (COL2) because COL2 is not a leftmost prefix of (COL1,COL2).When you have duplicate indexes (Duplicate Indexes are indexes in same set of columns in the same order), MySQL maintains them separately and eventually optimizer consider each of them when it optimizes queries. The only way to address   Redundant and Duplicate Indexes is to drop them as these multiple indexes in same table is an huge maintenance cost! I personally use “pt-duplicate-key-checker” of Percona Toolkit to identify Redundant/Duplicate Indexes 

Redundant and Duplicate indexes are expensive in Optimal MySQL Operations but never drop them without understanding clearly how application use MySQL indexes ! 

Storing numbers wisely in MySQL !

Generally we have two kinds of numbers, Whole numbers and real numbers (numbers with Fractional units). When we are using whole numbers the natural choice is TINYINT, SMALLINT, MEDIUMINT, INT and BIGINT. These datatypes require 8,16,24,32,64 bits of storage space accordingly.

MySQL allows you to specify “width” of integer types, like INT(25). This doesn’t mean /restrict legal range of values but actually specifies the number of characters MySQL’s interactive tools will reserve for display purposes, INT(1) is same as INT(20) for storage and computational purposes. 

Real numbers can have both fractional and decimal numbers which are very large and don’t fit in “BIGINT”. The “FLOAT” and “DOUBLE” support approximate calculation with standard floating point math.

The “DECIMAL” type is good for storing exact fractional numbers. From MySQL 5.0 and newer, the “DECIMAL” type support exact math! Due to lower precision level “FLOATING-POINT” used during MySQL 4.1 days used to give weird result..

“FLOATING-POINT” math is much faster because CPU performs computations natively. “FLOATING-POINT” type typically use less space than “DECIMAL” to store same range of values. MySQL uses “DOUBLE” for its internal calculations on “FLOATING-POINT” types dues to  its greater precision compared to “FLOAT"

Considering expensive computational cost you should choose “DECIMAL” only when you need exact results for fractional numbers, for example, financial data. It makes more sense to use “BIGINT” for high volume transactions which are not so critical or store them in multiples of smallest fractions of currency! 

Profiling MySQL by Response Time

Being an independent MySQL consultant is fun and equally challenging, I very regularly get the opportunity of doing low-level MySQL Performance Health Check, Diagnostics and Forensics. My customers equally are curious in knowing how optimal is their MySQL infrastructure ? Many times I have heard people defining “Performance” as executing queries optimally using minimum system resources! I contradict this thought by asking in return, If you have bought expensive high processing power infrastructure why not use it to maximum? At MinervaSQL we define performance as total time consumed to complete a task so for us Performance is Response Time . In real life this is very scalable method too because MySQL Server is busy executing queries and how much time consumed by each query leads us to Response Time  and our goal remains reducing the Response Time , Here we measure where the time is consumed.
This also brings most powerful principle of performance optimization “Things which cannot be measured can never be maximised !” . Measure where response time is spent and "never ever” aggregate here, target each query independently. Measure every slow query and monitor what is the impact in the server, Calculate percentage of time spend in waiting and executing of each query.

Profiling MySQL for Response Time   
Accuracy of profiling is key in calculating Response Time, During profiling we measure where MySQL is spending time? In effect here we are measuring how much time is spend in execution ? and how much time spend by waiting ?
When we are doing profiling of MySQL, There are two interesting matrices to notice:

1. Queries are continuously executing using available infrastructure so this is execution time profiling
2. Queries are waiting for tasks to get completed or for resources , This is called wait time profiling

Both of these above mentioned matrices are very important for us to measure the time spend by every query and the results will guide us to conclude where we should be spending time during tuning.

I always list down the queries that can bring down big impact to MySQL performance, If a query consumes only 5% of total time of execution then tuning that specific query can gain only in 5% of improvement (Amdahl’s law)

The "lost time” calculation during response time analysis is very important, It is total wall clock time went unaccounted during execution time. Let’s consider overall CPU time for execution of query is 60 seconds and profiling tools subtasks adds upto 50 seconds then “lost time” here is 10 seconds!

When doing MySQL profiling two of my favourite tools are SHOW PROFILE (Jeremy Cole) and PT-QUERY-DIGEST of Percona Toolkit developed by Percona  

I recommend my customers to collect the response time of each query and measure every single activity that consume time which include even application level response time, It is not just MySQL causes performance bottleneck every time and it’s very straightforward to diagnose whether MySQL is the problem. 

Monitoring InnoDB Buffer Pool activity

Even the most optimal MySQL Database Design and SQL engineering get trapped with Performance Bottlenecks due to Disk I/O Operations. So let’s see where Disk I/O Performance Bottlenecks starts from ? 

Disk I/O Performance Diagnostics should be the first step here, Performance Engineering is driven by methodical Diagnostics & Forensics activity. Here the objective is limited to measure MySQL I/O activity and  so I am using  Unix/Linux utility “top” (sometimes things looks much better when they are simple).  

If your MySQL table is cached in “InnoDB Buffer Pool” then query execution will be optimal with minimal or no disk intervention, This is one major reason why we tune “INNODB_BUFFER_POOL_SIZE” option upto 85% of available physical memory. 
The larger the InnoDB Buffer Pool more the InnoDB will start like in-memory database. InnoDB operates using the variation of the least recently used (LRU) algorithm. InnoDB uses “midpoint insertion strategy” to evicts the least recently used block and adds the new block to the middle of the list.   

Disk seeks are a huge performance bottleneck. The problem becomes more severe  when the amount of data starts to grow so large and effective caching becomes almost impossible. Always consider using disks with better seek time. You can also consider symlinking files to different disks or striping the disks to reduce seek time.  
Monitoring InnoDB Buffer Pool

We have many proprietary and free/open source InnoDB performance monitoring platforms but to measure “Buffer Pool” activity SHOW ENGINE INNODB STATUS command would be sufficient, If you closely examine “BUFFER POOL AND MEMORY” section of results, there are several matrices pertaining to the to operation of the buffer pool LRU algorithm, I have explained them below:

  1. Old database pages: The number of pages in the old sublist of the buffer pool.
  2. Pages made young, not young: The number of old pages that were moved to the head of the buffer pool (the new sublist), and the number of pages that have remained in the old sublist without being made new.
  3. youngs/s non-youngs/s: The number of accesses to old pages that have resulted in making them young or not. This metric differs from Pages made young, not young  in two ways. First, it relates only to old pages. Second, it is based on number of accesses to pages and not the number of pages. (There can be multiple accesses to a given page, all of which are counted.) 
  4. young-making rate:  Hits that cause blocks to move to the head of the buffer pool.
  5. not : Hits that do not cause blocks to move to the head of the buffer pool (due to the delay not being met).

INNODB_BUFFER_POOL_STATS table reports very much the same buffer pool information that is provided by command SHOW ENGINE INNODB STATUS output.

Per second averages provided in InnoDB Monitor output are based on the elapsed time between the current time and the last time InnoDB Monitor output was reported 


MySQL Performance Optimization Checklist

I always thought of maintaining the checklist for MySQL Performance Optimization, This actually helps me to draw the first line of action items and developing scope at advanced stages.   

Optimizing at the Database Level
The most important factor in making a database application fast is its basic design:
  • Are the tables structured properly? In particular, do the columns have the right data types, and does each table have the appropriate columns for the type of work? For example, applications that perform frequent updates often have many tables with few columns, while applications that analyze large amounts of data often have few tables with many columns.
  • Are the right indexes in place to make queries efficient?
  • Are you using the appropriate storage engine for each table, and taking advantage of the strengths and features of each storage engine you use? In particular, the choice of a transactional storage engine such as InnoDB or a nontransactional one such as MyISAM can be very important for performance and scalability.
  • Does each table use an appropriate row format? This choice also depends on the storage engine used for the table. In particular, compressed tables use less disk space and so require less disk I/O to read and write the data. Compression is available for all kinds of workloads with InnoDB tables, and for read-only MyISAM tables.
  • Does the application use an appropriate locking strategy? For example, by allowing shared access when possible so that database operations can run concurrently, and requesting exclusive access when appropriate so that critical operations get top priority. Again, the choice of storage engine is significant. The InnoDB storage engine handles most locking issues without involvement from you, allowing for better concurrency in the database and reducing the amount of experimentation and tuning for your code.
  • Are all memory areas used for caching sized correctly? That is, large enough to hold frequently accessed data, but not so large that they overload physical memory and cause paging. The main memory areas to configure are the InnoDB buffer pool, the MyISAM key cache, and the MySQL query cache.

Optimizing at the Hardware Level
Any database application eventually hits hardware limits as the database becomes more and more busy. A DBA must evaluate whether it is possible to tune the application or reconfigure the server to avoid these bottlenecks, or whether more hardware resources are required. System bottlenecks typically arise from these sources:

  • Disk seeks. It takes time for the disk to find a piece of data. With modern disks, the mean time for this is usually lower than 10ms, so we can in theory do about 100 seeks a second. This time improves slowly with new disks and is very hard to optimize for a single table. The way to optimize seek time is to distribute the data onto more than one disk.
  • Disk reading and writing. When the disk is at the correct position, we need to read or write the data. With modern disks, one disk delivers at least 10–20MB/s throughput. This is easier to optimize than seeks because you can read in parallel from multiple disks.
  • CPU cycles. When the data is in main memory, we must process it to get our result. Having large tables compared to the amount of memory is the most common limiting factor. But with small tables, speed is usually not the problem.
  • Memory bandwidth. When the CPU needs more data than can fit in the CPU cache, main memory bandwidth becomes a bottleneck. This is an uncommon bottleneck for most systems, but one to be aware of.

MVCC & Isolation Levels Explained in MySQL

Every RDBMS  got their own method of implementing Multi Version Concurrency Control (MVCC)   to provide concurrent access to their respective Database Infrastructure.

How MySQL implemented MVCC?
MySQL keeps the copy of everything it changed in the form of undo record inside undo log so every time a record is changed in MySQL the current (old) version of data is transferred to undo log. Every record in MySQL maintains a reference to most recent undo record, It’s called rollback pointer or ROLL_PTR and every undo record carries reference to previous undo record, This chain goes long till the under record exist in undo log

Isolation Levels Explained!  

  • READ UNCOMMITTED: Expect transaction inconsistencies in this isolation level because it always read the newest record in index which isn’t even committed, This causes “dirty read"
  • READ COMMITTED: Read committed records only  based on the current maximum transaction ID at statement start. This look like good fit many times and many database systems (including Oracle) have default isolation level “ READ COMMITTED” but there is something to worry (not that serious though !) about here, the statement to statement user will see new data because the locks that did not match the scan are released after the statement completes  
  • REPEATABLE READ: This address the concern in READ COMMITTED isolation level by guaranteeing the consistent view of data through out the transaction because every lock acquired during transaction is held for the entire duration of the transaction 
  • SERIALIZATION: This make every transaction isolated, as though all transactions are executed serially. InnoDB implicitly converts all plain SELECT statements to “SELECT … LOCK IN SHARE MODE. 

Paging and Swapping explained simple

Often I get questions regarding difference between "Paging" and "Swapping" so decided to blog it here very simple!

"Swapping" occurs when whole process is transferred to disk,while paging is when some part of process is transferred to disk while rest is still in physical memory. There are two utilities to monitor memory, called vmstat (for bsd, etc),sar( for system V, etc).

"Page-ins" and "page-outs" are pages moved in and out from physical memory to disk, "swap-ins" and "swap-outs" are processes moved in and out of disk.

Configuring InnoDB for Performance

InnoDB still remains as the most favourite storage engine in MySQL ecosystem for its flexibility.   I personally don't recommend anyone to tune MySQL performance limited to optimal my.cnf parameters rather you can optimize and scale MySQL much better by writing cost efficient SQL  but this blog post is about highly tuneable InnoDB parameters in your my.cnf to use available system resources optimally so I have my favourite ones listed below: 

You can never miss indoor_buffer_pool_size,  Here you have data, SQL, schema & indexes cached: Being generous (in an dedicated MySQL server go upto 80%) in setting this value address you expensive disk I/O operations . 

innodb_log_file_size  sets the size of the redo logs. The redo logs ensure writes are fast and durable and also during crash recovery. Up to MySQL 5.1, it was hard to adjust, as you wanted both large redo logs for good performance and small redo logs for fast crash recovery but the good news is now crash recovery performance has improved a lot since MySQL 5.5 so you can now have good write performance and fast crash recovery. Until MySQL 5.5 the total redo log size was limited to 4GB (the default is to have 2 log files). This has been lifted in MySQL 5.6.Starting with innodb_log_file_size is 512M (giving 1GB of redo logs) should give you plenty of room for writes. If your application is write-intensive and you are using MySQL 5.6, you can start with innodb_log_file_size  with 4G.

The are times you face ‘Too many connections’ error due to max_connections is too low. It is very frequent and common the application does not close connections to the database correctly, you seriously need much more than the default 151 connections. The main drawback of high values for max_connections (like 1000 or more) is that the server will become unresponsive if for any reason it has to run 1000 or more active transactions. Using  application level connection pooling or a thread pool at the MySQL level can address this issue. 

innodb_file_per_table directs InnoDB if it should store data and indexes in the shared tablespace (innodb_file_per_table = OFF) or in a separate .ibd file for each table (innodb_file_per_table= ON). Having a file per table allows you to reclaim space when dropping, truncating or rebuilding a table. It is also needed for some advanced features such as compression. However it does not provide any performance benefit. The main scenario when you do NOT want file per table is when you have a very high number of tables (say 10k+). MySQL 5.6, the default value is ON so you have nothing to do in most cases. For previous versions, you should set it to ON prior to loading data as it has an effect on newly created tables only. 

innodb_flush_log_at_trx_commit default setting of 1 means that InnoDB is fully ACID compliant. It is the best value when your primary concern is data safety, for instance on a master. However it can have a significant overhead on systems with slow disks because of the extra fsyncs that are needed to flush each change to the redo logs. Setting it to 2 is a bit less reliable because committed transactions will be flushed to the redo logs only once a second, but that can be acceptable on some situations for a master and that is definitely a good value for a replica. 0 is even faster but you are more likely to lose some data in case of a crash: it is only a good value for a replica. 

innodb_flush_method setting controls how data and logs are flushed to disk. Popular values are O_DIRECT when you have a hardware RAID controller with a battery-protected write-back cache and fdatasync (default value) for most other scenarios. sysbench is a good tool to help you choose between the 2 values.

innodb_log_buffer_size is the size of the buffer for transactions that have not been committed yet. The default value (1MB) is usually fine but as soon as you have transactions with large blob/text fields, the buffer can fill up very quickly and trigger extra I/O load. Constantly monitor the Innodb_log_waits status variable and if it is not 0, increase innodb_log_buffer_size. 

MySQL Show Profile

The MySQL SHOW PROFILE is the community contribution from Jeremy Cole, included from MySQL 5.1 and newer. It is also among the most favourite query profiling tool available in GA release of MySQL. By default it is disabled but can be enabled for session with command below:

mysql> SET profiling = ON; 

So by enabling profiling you can measure the elapsed time and few other performance matrices when state of query execution changes. Every time you issue query to MySQL Server, It records the profiling information in temporary table and will also assign the SQL statement and integer identifier, starting with 1. I have explained this in an example below 

mysql> show profiles;

| Query_ID | Duration   | Query                 |
|        1 | 0.00065225 | select * from student |
|        2 | 0.03845225 | show status           |
2 rows in set, 1 warning (0.00 sec)

In the above example you will notice Query_ID, Duration and Query. To get now the detailed profiling of the specific query please follow the steps below 

mysql> show profile for query 1;

| Status               | Duration |
| starting             | 0.000061 |
| checking permissions | 0.000013 |
| Opening tables       | 0.000084 |
| init                 | 0.000026 |
| System lock          | 0.000021 |
| optimizing           | 0.000010 |
| statistics           | 0.000018 |
| preparing            | 0.000017 |
| executing            | 0.000007 |
| Sending data         | 0.000149 |
| end                  | 0.000029 |
| query end            | 0.000125 |
| closing tables       | 0.000033 |
| freeing items        | 0.000033 |
| cleaning up          | 0.000029 |

Now we have a challenge, i.e we can't sort this info. so let's switch the "SHOW PROFILE" command to corresponding information_schema table   

mysql>  select state, sum(duration) as Total_R,

    ->  Round(
    ->  100*sum(duration) /
    ->   (select sum(duration)
    -> from information_schema.profiling
    -> where query_id=@query_id
    -> ), 2) as Pct_R,
    -> count(*) as Calls,
    ->  sum(duration) / Count(*) as "R/Call"
    ->  from information_schema.profiling
    -> where query_id=@query_id
    ->  group by state
    ->  order by Total_R desc;
| state                | Total_R  | Pct_R | Calls | R/Call       |
| Sending data         | 0.000149 | 22.75 |     1 | 0.0001490000 |
| query end            | 0.000125 | 19.08 |     1 | 0.0001250000 |
| Opening tables       | 0.000084 | 12.82 |     1 | 0.0000840000 |
| starting             | 0.000061 |  9.31 |     1 | 0.0000610000 |
| freeing items        | 0.000033 |  5.04 |     1 | 0.0000330000 |
| closing tables       | 0.000033 |  5.04 |     1 | 0.0000330000 |
| end                  | 0.000029 |  4.43 |     1 | 0.0000290000 |
| cleaning up          | 0.000029 |  4.43 |     1 | 0.0000290000 |
| init                 | 0.000026 |  3.97 |     1 | 0.0000260000 |
| System lock          | 0.000021 |  3.21 |     1 | 0.0000210000 |
| statistics           | 0.000018 |  2.75 |     1 | 0.0000180000 |
| preparing            | 0.000017 |  2.60 |     1 | 0.0000170000 |
| checking permissions | 0.000013 |  1.98 |     1 | 0.0000130000 |
| optimizing           | 0.000010 |  1.53 |     1 | 0.0000100000 |
| executing            | 0.000007 |  1.07 |     1 | 0.0000070000 |
15 rows in set (0.04 sec)

MySQL adaptive hash index explained

An optimization for InnoDB tables that can speed up lookups using = and IN operators, by constructing a hash index in memory. MySQL monitors index searches for InnoDB tables, and if queries could benefit from a hash index, it builds one automatically for index pages that are frequently accessed. In a sense, the adaptive hash index configures MySQL at runtime to take advantage of ample main memory, coming closer to the architecture of main-memory databases. This feature is controlled by the innodb_adaptive_hash_index  configuration option. Because this feature benefits some workloads and not others, and the memory used for the hash index is reserved in the buffer pool, typically you should benchmark with this feature both enabled and disabled.
The hash index is always built based on an existing InnoDB secondary index, which is organized as a B-tree structure. MySQL can build a hash index on a prefix of any length of the key defined for the B-tree, depending on the pattern of searches against the index. A hash index can be partial; the whole B-tree index does not need to be cached in the buffer pool.
In MySQL 5.6 and higher, another way to take advantage of fast single-value lookups with InnoDB tables is to use the memcached interface to InnoDB. 


Sizing innodb_buffer_pool_size

It's not unusual I have noticed many customers running innodb_buffer_pool_size with default 8M but if you have dedicated MySQL box then please give generously memory to innodb buffer pool, I am not actually demanding 95% of your available memory here ;)  It will be great if you can provide 15% larger than size of your database (always account for future growth!). Actually buffer pool is not just limited with data pages but also adaptive hash indexes . This now also leads to think about how much memory required for your systems operations optimally? You system processes, page tables etc. all need memory so please plan proactively here! I work only on 64 bit dedicated Linux/MySQL boxes so will provision 20% of memory for optimal system operations (But I regularly monitor si/so matrices of vmstat to measure swapping and expect these numbers zero or near to zero)  

Double buffering is never appreciated !
We don't want OS to cache what innodb is doing already! Innodb cache is anytime more efficient when compared to OS cache because there is no copying, due to adaptive hash indexes. You have to bypass OS buffering innodb by setting innodb_flush_method=O_DIRECT on Linux and FreeBSD.

Optimal sizing of innodb_buffer_pool_size is a major activity in every MySQL Infrastructure to ensure available system resources are used judiciously ! 

MySQL NUMA and Excessive Swapping

Excessive swapping always is a serious concern from both MySQL Performance & Scalability, I personally expect si/so columns in VMSTAT are zero or close to zero but things go wired very often in complex & large MySQL Infrastructure from swapping perspective. Linux swap-out a large amount of memory despite no real memory crunching pressure, I don't buy swap disabled strategy because kernel assumes there is swap space and will go cranky during real memory spikes and there is nothing to swap out. Though you can run MySQL in memory using -memlock but then you experience most weirdest events so I go only with swap "enabled" but will regularly monitor VMSTAT..

How Percona improved BUFFER POOL scalability?

The InnoDB buffer pool is a well known point of contention when many queries are executed concurrently. In XtraDB, the global mutex protecting the buffer pool has been split into several mutexes to decrease contention.
This feature splits the single global InnoDB buffer pool mutex into several mutexes:
flush_state_mutexflushing state of dirty blocks
LRU_list_mutexLRU lists of blocks in buffer pool
flush_list_mutexflush list of dirty blocks to flush
free_list_mutexlist of free blocks in buffer pool
zip_free_mutexlists of free area to treat compressed pages
zip_hash_mutexhash table to search compressed pages
The goal of this change is to reduce mutex contention, which can be very impacting when the working set does not fit in memory.

TOP Explained

1° Row — top

top - 22:23:21 up 10:05,  4 users,  load average: 0.00, 0.01, 0.05

indicates in order:
  • current time (22:23:21)
  • uptime of the machine (up 0 days, 10:05)
  • users sessions logged in (4 users)
  • average load on the system (load average: 0.00, 0.01, 0.05) the 3 values refer to the last minute, five minutes and 15 minutes.

2° Row – task
 Tasks: 172 total,   2 running, 167 sleeping,   3 stopped,   0 zombie
The second row gives the following information:
  • Processes running in totals (172 total)
  • Processes running (2 running)
  • Processes sleeping (167 sleeping)
  • Processes stopped (3 stopped)
  • Processes waiting to be stoppati from the parent process (0 zombie)
3° Row – cpu
 Cpu(s):  0.7%us,  0.7%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
The third line indicates how the cpu is used. If you sum up all the percentages the total will be 100% of the cpu. Let’s see what these values indicate in order:
  • Percentage of the CPU for user processes (0.7%us)
  • Percentage of the CPU for system processes (0.7%sy)
  • Percentage of the CPU processes with priority upgrade nice (0.0%ni)
  • Percentage of the CPU not used (98,7%id)
  • Percentage of the CPU processes waiting for I/O operations(0.0%wa)
  • Percentage of the CPU serving hardware interrupts (0.0% hi — Hardware IRQ
  • Percentage of the CPU serving software interrupts (0.0% si — Software Interrupts
  • The amount of CPU ‘stolen’ from this virtual machine by the hypervisor for other tasks (such as running another virtual machine) this will be 0 on desktop and server without Virtual machine. (0.0%st — Steal Time)

4° and 5° Rows – memory usage

Mem:   1018564k total,   939864k used,    78700k free,      348k buffers
Swap:  2064380k total,     4392k used,  2059988k free,   596160k cached
The fourth and fifth rows respectively indicate the use of physical memory (RAM) and swap. In this order: Total memory in use, free, buffers cached. 

Following Rows — Processes list
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                   
 1601 haldaemo  20   0 36900 2472 1584 S  0.3  0.2   0:02.71 hald                                                                                                       
 1734 root      20   0  310m 1116  800 S  0.3  0.1   0:17.42 VBoxService                                                                                                
 2323 root      20   0  191m  32m 6640 S  0.3  3.2   0:14.30 Xorg

And as last thing ordered by CPU usage (as default) there are the processes currently in use. Let’s see what information we can get in the different columns:
  • PID – l’ID of the process(1734)
  • USER – The user that is the owner of the process (root)
  • PR – priority of the process (20)
  • NI – The “NICE” value of the process (0)
  • VIRT – virtual memory used by the process (310m)
  • RES – physical memory used from the process (1116m)
  • SHR – shared memory of the process (800)
  • S – indicates the status of the process: S=sleep R=running Z=zombie (S)
  • %CPU – This is the percentage of CPU used by this process (0.3)
  • %MEM – This is the percentage of RAM used by the process (0.1)
  • TIME+ –This is the total time of activity of this process (0:17.42)
  • COMMAND – And this is the name of the process 

How do I monitor Oracle 11g RAC Infrastructure with V$ Views

V$CLUSTER_INTERCONNECTS displays one or more interconnects that are being used for cluster communication.
V$CONFIGURED_INTERCONNECTS displays all the interconnects that Oracle is aware of. This view attempts to answer the question of where Oracle found the information about a specific interconnect.
V$DLM_MISC displays miscellaneous DLM statistics.
V$DLM_CONVERT_LOCAL displays the elapsed time for the local lock conversion operation.
V$DLM_CONVERT_REMOTE displays the elapsed time for the remote lock conversion operation.
V$GES_ENQUEUE view describes all locks currently known to lock manager.
V$GES_BLOCKING_ENQUEUEview describes all locks currently known to lock manager that are being blocked or blocking others. The output of this view is a subset of the output from V$GES_ENQUEUE.
V$DLM_LOCKS view describes all locks currently known to lock manager that are being blocked or blocking others. The output of this view is a subset of the output from V$DLM_ALL_LOCKS.
V$DLM_RESS view displays information of all resources currently known to the lock manager.
V$GLOBAL_BLOCKED_LOCKS view displays global blocked locks.

Oracle 12c Health Check using Trace Files and the Alert Log

Monitoring Errors with Trace Files and the Alert Log

Each server and background process can write to an associated trace fileWhen an internal error is detected by a process, it dumps information about the error to its trace file. Some of the information written to a trace file is intended for the database administrator, and other information is for Oracle Support Services. Trace file information is also used to tune applications and instances

The alert log is a chronological log of messages and errors, and includes the following items:
  • All internal errors (ORA-00600), block corruption errors (ORA-01578), and deadlock errors (ORA-00060) that occur
  • Administrative operations, such as CREATEALTER, and DROP statements and STARTUPSHUTDOWN, and ARCHIVELOG statements
  • Messages and errors relating to the functions of shared server and dispatcher processes
  • Errors occurring during the automatic refresh of a materialized view
  • The values of all initialization parameters that had nondefault values at the time the database and instance start 
Oracle Database uses the alert log to record these operations as an alternative to displaying the information on an operator's console (although some systems also display information on the console). If an operation is successful, a "completed" message is written in the alert log, along with a timestamp.
The alert log is maintained as both an XML-formatted file and a text-formatted file. You can view either format of the alert log with any text editor or you can use the ADRCI utility to view the XML-formatted version of the file with the XML tags stripped.
Check the alert log and trace files of an instance periodically to learn whether the background processes have encountered errors. For example, when the log writer process (LGWR) cannot write to a member of a log group, an error message indicating the nature of the problem is written to the LGWR trace file and the alert log. Such an error message means that a media or I/O problem has occurred and should be corrected immediately.
Oracle Database also writes values of initialization parameters to the alert log, in addition to other important statistics.
The alert log and all trace files for background and server processes are written to the Automatic Diagnostic Repository, the location of which is specified by the DIAGNOSTIC_DEST initialization parameter. The names of trace files are operating system specific, but each file usually includes the name of the process writing the file (such as LGWR and RECO).

Controlling the Size of an Alert Log

To control the size of an alert log, you must manually delete the file when you no longer need it. Otherwise the database continues to append to the file.
You can safely delete the alert log while the instance is running, although you should consider making an archived copy of it first. This archived copy could prove valuable if you should have a future problem that requires investigating the history of an instance.

Controlling the Size of Trace Files

You can control the maximum size of all trace files (excluding the alert log) using the initialization parameter MAX_DUMP_FILE_SIZE. You can set this parameter in the following ways:
  • A numerical value specifies the maximum size in operating system blocks. The specified value is multiplied by the block size to obtain the limit.
  • A number followed by a K, M, or G suffix specifies the file size in kilobytes, megabytes, or gigabytes.
  • UNLIMITED, which is the default, specifies no limit.
Oracle Database can automatically segment trace files based on the limit you specify with the MAX_DUMP_FILE_SIZE initialization parameter. When a limit is reached, the database renames the current trace file using a sequential number, and creates an empty file with the original name.

The MAX_DUMP_FILE_SIZE Parameter and Trace File Segmentation
MAX_DUMP_FILE_SIZE SettingTrace File Segmentation
Trace files are not segmented.
Larger than 15M
Trace files are segmented on a boundary that is 1/5 of the MAX_DUMP_FILE_SIZE setting. Trace files with sizes that are less than this boundary in size are not segmented. For example, if the MAX_DUMP_FILE_SIZEsetting is 100M, then the boundary is 20 MB (1/5 of 100 MB).
15M or less
Trace files are not segmented.
There can be up to five segments, but the total combined size of the segments cannot exceed the MAX_DUMP_FILE_SIZE limit. When the combined size of all segments of the trace file exceeds the specified limit, the oldest segment after the first segment is deleted, and a new, empty segment is created. Therefore, the trace file always contains the most recent trace information. The first segment is not deleted because it might contain relevant information about the initial state of the process.
Segmentation improves space management for trace files. Specifically, segmentation enables you to manage trace files in the following ways:
  • You can purge old trace files when they are no longer needed.
  • You can diagnose problems with smaller trace files and isolate trace files that must be packaged for the incident packaging service (IPS).

Monitoring a Database with Server-Generated Alerts

A server-generated alert is a notification from the Oracle Database server of an impending problem. The notification may contain suggestions for correcting the problem. Notifications are also provided when the problem condition has been cleared.
Alerts are automatically generated when a problem occurs or when data does not match expected values for metrics, such as the following:
  • Physical Reads Per Second
  • User Commits Per Second
  • SQL Service Response Time
Server-generated alerts can be based on threshold levels or can issue simply because an event has occurred. Threshold-based alerts can be triggered at both threshold warning and critical levels. The value of these levels can be customer-defined or internal values, and some alerts have default threshold levels which you can change if appropriate. For example, by default a server-generated alert is generated for tablespace space usage when the percentage of space usage exceeds either the 85% warning or 97% critical threshold level. Examples of alerts not based on threshold levels are:
  • Snapshot Too Old
  • Resumable Session Suspended
  • Recovery Area Space Usage
An alert message is sent to the predefined persistent queue ALERT_QUE owned by the user SYS. Cloud Control reads this queue and provides notifications about outstanding server alerts, and sometimes suggests actions for correcting the problem. The alerts are displayed on the Cloud Control Database Home page and can be configured to send email or pager notifications to selected administrators. If an alert cannot be written to the alert queue, a message about the alert is written to the Oracle Database alert log.
Background processes periodically flush the data to the Automatic Workload Repository to capture a history of metric values. The alert history table andALERT_QUE are purged automatically by the system at regular intervals.

Server-Generated Alerts Data Dictionary Views

The following data dictionary views provide information about server-generated alerts.
DBA_THRESHOLDSLists the threshold settings defined for the instance
DBA_OUTSTANDING_ALERTSDescribes the outstanding alerts in the database
DBA_ALERT_HISTORYLists a history of alerts that have been cleared
V$ALERT_TYPESProvides information such as group and type for each alert
V$METRICNAMEContains the names, identifiers, and other information about the system metrics
V$METRICContains system-level metric values
V$METRIC_HISTORYContains a history of system-level metric values