TOP Explained

1st Row – top

top - 22:23:21 up 10:05,  4 users,  load average: 0.00, 0.01, 0.05

indicates, in order:
  • current time (22:23:21)
  • uptime of the machine (up 10:05, i.e. 10 hours and 5 minutes)
  • user sessions logged in (4 users)
  • average load on the system (load average: 0.00, 0.01, 0.05); the three values are averages over the last 1, 5, and 15 minutes
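The same load averages can be read without running top. As a sketch (Linux-specific, since it reads /proc):

```shell
# The first three fields of /proc/loadavg are the 1-, 5- and 15-minute
# load averages, exactly the values top prints; the remaining fields are
# running/total tasks and the most recently assigned PID.
cut -d ' ' -f 1-3 /proc/loadavg
```
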

2nd Row – tasks
 Tasks: 172 total,   2 running, 167 sleeping,   3 stopped,   0 zombie
The second row gives the following information:
  • Total number of processes (172 total)
  • Processes running (2 running)
  • Processes sleeping (167 sleeping)
  • Processes stopped (3 stopped)
  • Zombie processes, i.e. terminated processes waiting to be reaped by their parent (0 zombie)
3rd Row – CPU
 Cpu(s):  0.7%us,  0.7%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
The third line indicates how the CPU is being used. The percentages sum to 100% of the CPU. In order, the values indicate:
  • Percentage of CPU used by user processes (0.7%us)
  • Percentage of CPU used by system processes (0.7%sy)
  • Percentage of CPU used by processes with a modified nice priority (0.0%ni)
  • Percentage of CPU sitting idle (98.7%id)
  • Percentage of CPU waiting for I/O operations (0.0%wa)
  • Percentage of CPU serving hardware interrupts (0.0%hi – Hardware IRQ)
  • Percentage of CPU serving software interrupts (0.0%si – Software Interrupts)
  • Percentage of CPU "stolen" from this virtual machine by the hypervisor for other tasks, such as running another virtual machine. This will be 0 on desktops and servers that are not virtualized (0.0%st – Steal Time)

4th and 5th Rows – memory usage

Mem:   1018564k total,   939864k used,    78700k free,      348k buffers
Swap:  2064380k total,     4392k used,  2059988k free,   596160k cached
 
The fourth and fifth rows show the use of physical memory (RAM) and swap respectively, in this order: total, used, free, and buffers (fourth row) or cached (fifth row).

Following Rows — Processes list
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                   
 1601 haldaemo  20   0 36900 2472 1584 S  0.3  0.2   0:02.71 hald                                                                                                       
 1734 root      20   0  310m 1116  800 S  0.3  0.1   0:17.42 VBoxService                                                                                                
 2323 root      20   0  191m  32m 6640 S  0.3  3.2   0:14.30 Xorg

Finally, ordered by CPU usage by default, comes the list of running processes. The columns provide the following information:
  • PID – The ID of the process (1734)
  • USER – The user that is the owner of the process (root)
  • PR – priority of the process (20)
  • NI – The “NICE” value of the process (0)
  • VIRT – virtual memory used by the process (310m)
  • RES – physical (resident) memory used by the process (1116)
  • SHR – shared memory of the process (800)
  • S – indicates the status of the process: S=sleep R=running Z=zombie (S)
  • %CPU – This is the percentage of CPU used by this process (0.3)
  • %MEM – This is the percentage of RAM used by the process (0.1)
  • TIME+ – The total CPU time consumed by this process (0:17.42)
  • COMMAND – The name of the process (VBoxService)
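For scripting, a rough equivalent of top's process list can be produced with ps. A sketch, assuming GNU/Linux procps (the column keywords and --sort flag are not portable to every Unix):

```shell
# pid/user/priority/nice/virtual/resident/state/%cpu/%mem/time/command,
# sorted by CPU usage like top's default, top 5 processes plus the header.
ps -eo pid,user,pri,ni,vsz,rss,stat,pcpu,pmem,time,comm --sort=-pcpu | head -6
```
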

How do I monitor Oracle 11g RAC Infrastructure with V$ Views


V$CLUSTER_INTERCONNECTS displays one or more interconnects that are being used for cluster communication.
V$CONFIGURED_INTERCONNECTS displays all the interconnects that Oracle is aware of. This view attempts to answer the question of where Oracle found the information about a specific interconnect.
V$DLM_MISC displays miscellaneous DLM statistics.
V$DLM_CONVERT_LOCAL displays the elapsed time for the local lock conversion operation.
V$DLM_CONVERT_REMOTE displays the elapsed time for the remote lock conversion operation.
V$GES_ENQUEUE describes all locks currently known to the lock manager.
V$GES_BLOCKING_ENQUEUE describes all locks currently known to the lock manager that are blocked or blocking others. The output of this view is a subset of the output from V$GES_ENQUEUE.
V$DLM_LOCKS describes all locks currently known to the lock manager that are blocked or blocking others. The output of this view is a subset of the output from V$DLM_ALL_LOCKS.
V$DLM_RESS displays information about all resources currently known to the lock manager.
V$GLOBAL_BLOCKED_LOCKS view displays global blocked locks.
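As a sketch of how the interconnect views are typically queried (run as a privileged user on a RAC instance; the column list is trimmed to the commonly used ones):

```sql
-- Which interconnects is the cluster actually using,
-- and where did Oracle learn about each one?
SELECT name, ip_address, is_public, source FROM v$cluster_interconnects;
SELECT name, ip_address, is_public, source FROM v$configured_interconnects;
```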



Oracle 12c Health Check using Trace Files and the Alert Log

Monitoring Errors with Trace Files and the Alert Log

Each server and background process can write to an associated trace file. When an internal error is detected by a process, it dumps information about the error to its trace file. Some of the information written to a trace file is intended for the database administrator, and other information is for Oracle Support Services. Trace file information is also used to tune applications and instances.

The alert log is a chronological log of messages and errors, and includes the following items:
  • All internal errors (ORA-00600), block corruption errors (ORA-01578), and deadlock errors (ORA-00060) that occur
  • Administrative operations, such as CREATE, ALTER, and DROP statements, and STARTUP, SHUTDOWN, and ARCHIVELOG statements
  • Messages and errors relating to the functions of shared server and dispatcher processes
  • Errors occurring during the automatic refresh of a materialized view
  • The values of all initialization parameters that had nondefault values at the time the database and instance started
Oracle Database uses the alert log to record these operations as an alternative to displaying the information on an operator's console (although some systems also display information on the console). If an operation is successful, a "completed" message is written in the alert log, along with a timestamp.
The alert log is maintained as both an XML-formatted file and a text-formatted file. You can view either format of the alert log with any text editor or you can use the ADRCI utility to view the XML-formatted version of the file with the XML tags stripped.
Check the alert log and trace files of an instance periodically to learn whether the background processes have encountered errors. For example, when the log writer process (LGWR) cannot write to a member of a log group, an error message indicating the nature of the problem is written to the LGWR trace file and the alert log. Such an error message means that a media or I/O problem has occurred and should be corrected immediately.
Oracle Database also writes values of initialization parameters to the alert log, in addition to other important statistics.
The alert log and all trace files for background and server processes are written to the Automatic Diagnostic Repository, the location of which is specified by the DIAGNOSTIC_DEST initialization parameter. The names of trace files are operating system specific, but each file usually includes the name of the process writing the file (such as LGWR and RECO).
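The ADR locations can be found from inside the database; a sketch for 11g and later, using V$DIAG_INFO:

```sql
-- Where the alert log and trace files live for this instance.
SELECT name, value
FROM   v$diag_info
WHERE  name IN ('Diag Trace', 'Diag Alert', 'Default Trace File');
```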

Controlling the Size of an Alert Log

To control the size of an alert log, you must manually delete the file when you no longer need it. Otherwise the database continues to append to the file.
You can safely delete the alert log while the instance is running, although you should consider making an archived copy of it first. This archived copy could prove valuable if you should have a future problem that requires investigating the history of an instance.


Controlling the Size of Trace Files

You can control the maximum size of all trace files (excluding the alert log) using the initialization parameter MAX_DUMP_FILE_SIZE. You can set this parameter in the following ways:
  • A numerical value specifies the maximum size in operating system blocks. The specified value is multiplied by the block size to obtain the limit.
  • A number followed by a K, M, or G suffix specifies the file size in kilobytes, megabytes, or gigabytes.
  • UNLIMITED, which is the default, specifies no limit.
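For example, the limit can be inspected and changed at runtime (SQL*Plus syntax; the 50M value is purely illustrative):

```sql
SHOW PARAMETER max_dump_file_size

-- Limit trace files to 50 MB without restarting the instance.
ALTER SYSTEM SET max_dump_file_size = '50M' SCOPE = BOTH;
```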
Oracle Database can automatically segment trace files based on the limit you specify with the MAX_DUMP_FILE_SIZE initialization parameter. When a limit is reached, the database renames the current trace file using a sequential number, and creates an empty file with the original name.

The MAX_DUMP_FILE_SIZE Parameter and Trace File Segmentation
  • UNLIMITED – Trace files are not segmented.
  • Larger than 15M – Trace files are segmented on a boundary that is 1/5 of the MAX_DUMP_FILE_SIZE setting. Trace files smaller than this boundary are not segmented. For example, if the MAX_DUMP_FILE_SIZE setting is 100M, then the boundary is 20 MB (1/5 of 100 MB).
  • 15M or less – Trace files are not segmented.
There can be up to five segments, but the total combined size of the segments cannot exceed the MAX_DUMP_FILE_SIZE limit. When the combined size of all segments of the trace file exceeds the specified limit, the oldest segment after the first segment is deleted, and a new, empty segment is created. Therefore, the trace file always contains the most recent trace information. The first segment is not deleted because it might contain relevant information about the initial state of the process.
Segmentation improves space management for trace files. Specifically, segmentation enables you to manage trace files in the following ways:
  • You can purge old trace files when they are no longer needed.
  • You can diagnose problems with smaller trace files and isolate trace files that must be packaged for the incident packaging service (IPS).

Monitoring a Database with Server-Generated Alerts

A server-generated alert is a notification from the Oracle Database server of an impending problem. The notification may contain suggestions for correcting the problem. Notifications are also provided when the problem condition has been cleared.
Alerts are automatically generated when a problem occurs or when data does not match expected values for metrics, such as the following:
  • Physical Reads Per Second
  • User Commits Per Second
  • SQL Service Response Time
Server-generated alerts can be based on threshold levels or can issue simply because an event has occurred. Threshold-based alerts can be triggered at both threshold warning and critical levels. The value of these levels can be customer-defined or internal values, and some alerts have default threshold levels which you can change if appropriate. For example, by default a server-generated alert is generated for tablespace space usage when the percentage of space usage exceeds either the 85% warning or 97% critical threshold level. Examples of alerts not based on threshold levels are:
  • Snapshot Too Old
  • Resumable Session Suspended
  • Recovery Area Space Usage
An alert message is sent to the predefined persistent queue ALERT_QUE owned by the user SYS. Cloud Control reads this queue and provides notifications about outstanding server alerts, and sometimes suggests actions for correcting the problem. The alerts are displayed on the Cloud Control Database Home page and can be configured to send email or pager notifications to selected administrators. If an alert cannot be written to the alert queue, a message about the alert is written to the Oracle Database alert log.
Background processes periodically flush the data to the Automatic Workload Repository to capture a history of metric values. The alert history table and ALERT_QUE are purged automatically by the system at regular intervals.


Server-Generated Alerts Data Dictionary Views

The following data dictionary views provide information about server-generated alerts.
  • DBA_THRESHOLDS – Lists the threshold settings defined for the instance
  • DBA_OUTSTANDING_ALERTS – Describes the outstanding alerts in the database
  • DBA_ALERT_HISTORY – Lists a history of alerts that have been cleared
  • V$ALERT_TYPES – Provides information such as group and type for each alert
  • V$METRICNAME – Contains the names, identifiers, and other information about the system metrics
  • V$METRIC – Contains system-level metric values
  • V$METRIC_HISTORY – Contains a history of system-level metric values
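A sketch of checking current alerts from SQL*Plus using these views:

```sql
-- What is alerting right now, and what does Oracle suggest doing about it?
SELECT reason, object_type, object_name, suggested_action
FROM   dba_outstanding_alerts;

-- Recently cleared alerts, most recent first.
SELECT reason, resolution
FROM   dba_alert_history
ORDER  BY creation_time DESC;
```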

MySQL - Lock wait timeout exceeded - try restarting transaction

Symptoms

Caused by: java.sql.BatchUpdateException: Lock wait timeout exceeded; try restarting transaction
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2007)
    at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1443)
    at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeBatch(NewProxyPreparedStatement.java:1723)
    at net.sf.hibernate.impl.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:54)
    at net.sf.hibernate.impl.BatcherImpl.executeBatch(BatcherImpl.java:128)
    at net.sf.hibernate.impl.SessionImpl.executeAll(SessionImpl.java:2436)
    at net.sf.hibernate.impl.SessionImpl.execute(SessionImpl.java:2391)
    at net.sf.hibernate.impl.SessionImpl.flush(SessionImpl.java:2259)
    at net.sf.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:61)
    at org.springframework.orm.hibernate.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:510)


Cause

The problem usually happens when a query is taking too long (maybe because the server is too busy).


Resolution

Make sure the database tables use the InnoDB storage engine and the READ-COMMITTED transaction isolation level.
If the above configuration is correct, try increasing the innodb_lock_wait_timeout variable on the database server.
Restart the MySQL database service for the configuration to take effect.
If the steps above don't help, run the queries below to double-check the configuration:
show variables like '%wait_timeout%';
show variables like '%tx_isolation%';
SELECT @@GLOBAL.tx_isolation;
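For example, the timeout can be raised without a restart (the 120-second value is purely illustrative; SET GLOBAL affects only connections opened after the change):

```sql
-- Raise the InnoDB lock wait timeout from the default 50 seconds.
SET GLOBAL innodb_lock_wait_timeout = 120;
SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait_timeout';
```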

Monitoring Locks Proactively in PostgreSQL

For PostgreSQL < 9.2 (pg_stat_activity uses the procpid and current_query columns):
  SELECT bl.pid                 AS blocked_pid,
         a.usename              AS blocked_user,
         ka.current_query       AS blocking_statement,
         now() - ka.query_start AS blocking_duration,
         kl.pid                 AS blocking_pid,
         ka.usename             AS blocking_user,
         a.current_query        AS blocked_statement,
         now() - a.query_start  AS blocked_duration
  FROM pg_catalog.pg_locks bl
  JOIN pg_catalog.pg_stat_activity a  ON a.procpid = bl.pid
  JOIN pg_catalog.pg_locks kl         ON kl.transactionid = bl.transactionid AND kl.pid != bl.pid
  JOIN pg_catalog.pg_stat_activity ka ON ka.procpid = kl.pid
  WHERE NOT bl.granted;
For PostgreSQL >= 9.2:
  SELECT bl.pid                 AS blocked_pid,
         a.usename              AS blocked_user,
         ka.query               AS blocking_statement,
         now() - ka.query_start AS blocking_duration,
         kl.pid                 AS blocking_pid,
         ka.usename             AS blocking_user,
         a.query                AS blocked_statement,
         now() - a.query_start  AS blocked_duration
  FROM pg_catalog.pg_locks bl
  JOIN pg_catalog.pg_stat_activity a  ON a.pid = bl.pid
  JOIN pg_catalog.pg_locks kl         ON kl.transactionid = bl.transactionid AND kl.pid != bl.pid
  JOIN pg_catalog.pg_stat_activity ka ON ka.pid = kl.pid
  WHERE NOT bl.granted;


Setting log_lock_waits together with the related deadlock_timeout parameter helps analyze even intermittent locks that only cause problems occasionally: slow lock acquisitions appear in the database logs for later analysis.
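A sketch of enabling this logging without a server restart (ALTER SYSTEM requires PostgreSQL 9.4 or later; on older versions, edit postgresql.conf instead):

```sql
-- Log any lock wait that lasts longer than deadlock_timeout (here one second).
ALTER SYSTEM SET log_lock_waits = on;
ALTER SYSTEM SET deadlock_timeout = '1s';
SELECT pg_reload_conf();
```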

Oracle Data Dictionary Views to Monitor Performance

  • V$LOCK – Lists the locks currently held by Oracle Database and outstanding requests for a lock or latch
  • DBA_BLOCKERS – Displays a session if it is holding a lock on an object for which another session is waiting
  • DBA_WAITERS – Displays a session if it is waiting for a locked object
  • DBA_DDL_LOCKS – Lists all DDL locks held in the database and all outstanding requests for a DDL lock
  • DBA_DML_LOCKS – Lists all DML locks held in the database and all outstanding requests for a DML lock
  • DBA_LOCK – Lists all locks or latches held in the database and all outstanding requests for a lock or latch
  • DBA_LOCK_INTERNAL – Displays a row for each lock or latch that is being held, and one row for each outstanding request for a lock or latch
  • V$LOCKED_OBJECT – Lists all locks acquired by every transaction on the system
  • V$SESSION_WAIT – Lists the resources or events for which active sessions are waiting
  • V$SYSSTAT – Contains session statistics
  • V$RESOURCE_LIMIT – Provides information about current and maximum global resource utilization for some system resources
  • V$SQLAREA – Contains statistics about the shared SQL area, with one row for each SQL string. Also provides statistics about SQL statements that are in memory, parsed, and ready for execution
  • V$LATCH – Contains statistics for nonparent latches and summary statistics for parent latches

Oracle DBA_HIST Views Explained

DBA_HIST_ACTIVE_SESS_HISTORY displays the history of the contents of the in-memory active session history for recent system activity

DBA_HIST_BASELINE displays information about the baselines captured on the system, such as the time range of each baseline and the baseline type

DBA_HIST_BASELINE_DETAILS displays details about a specific baseline

DBA_HIST_BASELINE_TEMPLATE displays information about the baseline templates used by the system to generate baselines

DBA_HIST_DATABASE_INSTANCE displays information about the database environment

DBA_HIST_DB_CACHE_ADVICE displays historical predictions of the number of physical reads for the cache size corresponding to each row

DBA_HIST_DISPATCHER displays historical information for each dispatcher process at the time of the snapshot

DBA_HIST_DYN_REMASTER_STATS displays statistical information about the dynamic remastering process

DBA_HIST_IOSTAT_DETAIL displays historical I/O statistics aggregated by file type and function

DBA_HIST_SHARED_SERVER_SUMMARY displays historical information for shared servers, such as shared server activity, common queues and dispatcher queues

DBA_HIST_SNAPSHOT displays information on snapshots in the system

DBA_HIST_SQL_PLAN displays the SQL execution plans

DBA_HIST_WR_CONTROL displays the settings for controlling AWR

How do we monitor SQL Server deadlocks?

Windows Performance Monitor
Object: SQLServer:Locks
Counter: Number of Deadlocks/sec
Instance: _Total
This provides all the deadlocks that have happened on your server since the last restart. We can look at this counter using the following SQL statement:
SELECT cntr_value AS NumOfDeadLocks
FROM sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Locks'
AND counter_name = 'Number of Deadlocks/sec'
AND instance_name = '_Total'

Trace Flags 1204 and 1222
Trace flag 1204 has existed since at least SQL Server 2000, and trace flag 1222 was introduced in SQL Server 2005. Both write deadlock information to the SQL Server ERRORLOG.

SQL Server Profiler / Server-Side Trace
Event Class: Locks. Event Name: Deadlock Graph. Gives an XML deadlock graph that is very easy to read, making it simple to figure out what is going on.

Extended Events
The new way to do monitoring in SQL Server. Extended Events will eventually replace SQL Server Profiler altogether (SQL Server Profiler is on the deprecated feature list). It produces the same XML graph as SQL Server Profiler and has a lighter performance impact.

System Health
This is a "new" default trace, but unlike the old default trace it is not limited in the amount of information tracked and can be modified: the system health session is built on Extended Events. Unlike the default trace, System Health tracks the deadlocks that have occurred recently, so we can pull deadlock information from it for analysis instead of implementing our own Extended Events monitoring.
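As a sketch, the trace-flag and System Health approaches look like this (the system_health XPath query is a commonly used pattern; adjust to your SQL Server version):

```sql
-- Enable deadlock reporting to the ERRORLOG for all sessions (-1 = global).
DBCC TRACEON (1204, 1222, -1);

-- Pull recent deadlock graphs from the system_health Extended Events session.
SELECT xed.event_data.query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM (
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets st
    JOIN sys.dm_xe_sessions s ON s.address = st.event_session_address
    WHERE s.name = 'system_health'
      AND st.target_name = 'ring_buffer'
) AS tab
CROSS APPLY tab.target_data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]')
    AS xed (event_data);
```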

Most common latch contention in Oracle 11g

 Library cache/shared pool latch contention - This is usually caused when an application issues high volumes of SQL which are non-sharable due to an absence of bind variables. The CURSOR_SHARING parameter can often be used to alleviate this form of contention.

Cache buffer chains contention - This is usually associated with very high logical read rates and "hot" blocks within the database (sometimes index blocks). After tuning SQL to reduce logical IO and eliminate repetitive reads of the same information, partitioning is often a possible solution.

Oracle Performance Diagnostics and Forensics from Systems Operations Level

Checking System CPU Utilization

Oracle statistics report CPU utilization only of Oracle sessions, whereas every process running on your system affects the available CPU resources. Effort spent tuning non-Oracle factors can thus result in better Oracle performance.
Use operating system monitoring tools to see what processes are running on the system as a whole. If the system is too heavily loaded, check the memory, I/O, and process management areas described later in this section.
Tools such as sar -u on many UNIX-based systems enable you to examine the level of CPU utilization on your entire system. CPU utilization in UNIX is described in statistics that show user time, system time, idle time, and time waiting for I/O. A CPU problem exists if idle time and time waiting for I/O are both close to zero (less than 5%) at a normal or low workload.
Performance Monitor is used on NT systems to examine CPU utilization. It provides statistics on processor time, user time, privileged time, interrupt time, and DPC time. (NT Performance Monitor is not the same as Performance Manager, which is an Oracle Enterprise Manager tool.)
Attention: This section describes how to check system CPU utilization on most UNIX-based and NT systems. For other platforms, please check your operating system documentation.

Memory Management

Check the following memory management issues:
Paging and Swapping. Use the appropriate tools (such as sar or vmstat on UNIX or Performance Monitor on NT) to investigate the cause of paging and swapping, should they occur.
Oversize Page Tables. On UNIX systems, if the processing space becomes too large, it may result in the page tables becoming too large. This is not an issue on NT systems.

I/O Management

Check the following I/O management issues:
Thrashing. Make sure that your workload fits in memory so that the machine is not thrashing (swapping and paging processes in and out of memory). The operating system allocates fixed slices of time during which CPU resources are available to your process. If the process spends a large portion of each time slice checking that it can run and that all needed components are in memory, it may be using only 50% of the allotted time to actually perform work.

Client/Server Round Trips. The latency of sending a message may result in CPU overload. An application often generates messages that need to be sent through the network over and over again. This results in a lot of overhead that must be completed before the message is actually sent. To alleviate this problem you can batch the messages and perform the overhead only once, or reduce the amount of work. For example, you can use array inserts, array fetches, and so on.

Different Types of Oracle Performance Bottlenecks





  • Memory contention
    Memory contention occurs when processes require more memory than is available. When this occurs, the system pages and swaps processes between memory and disk.
  • Disk I/O contention
    Disk I/O contention is caused by poor memory management, poor distribution of tablespaces and files across disks, or a combination of both.
  • CPU contention
    Although the UNIX kernel usually allocates CPU resources effectively, many processes compete for CPU cycles and this can cause contention. If you installed Oracle9i in a multiprocessor environment, there might be a different level of contention on each CPU.
  • Oracle resources contention
    Contention is also common for Oracle resources such as locks and latches.

PostgreSQL Duplicate Indexes


Duplicate indexes are a serious pain in a PostgreSQL infrastructure: they waste space and add write overhead. It is always a good idea to drop them once found. The script below helps you find duplicate indexes in PostgreSQL:


SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
       (array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
       (array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM (
  SELECT indexrelid::regclass AS idx,
         (indrelid::text ||E'\n'|| indclass::text ||E'\n'|| indkey::text ||E'\n'||
          coalesce(indexprs::text,'') ||E'\n'|| coalesce(indpred::text,'')) AS key
  FROM pg_index
) sub
GROUP BY key
HAVING count(*) > 1
ORDER BY sum(pg_relation_size(idx)) DESC;

When we should recompile execution plans in SQL Server

Certain changes in a database can cause an execution plan to be either inefficient or invalid, based on the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks the plan as not valid. A new plan must then be recompiled for the next connection that executes the query. The conditions that invalidate a plan include the following:
  • Changes made to a table or view referenced by the query (ALTER TABLE and ALTER VIEW).
  • Changes made to a single procedure, which would drop all plans for that procedure from the cache (ALTER PROCEDURE).
  • Changes to any indexes used by the execution plan.
  • Updates on statistics used by the execution plan, generated either explicitly from a statement, such as UPDATE STATISTICS, or generated automatically.
  • Dropping an index used by the execution plan.
  • An explicit call to sp_recompile.
  • Large numbers of changes to keys (generated by INSERT or DELETE statements from other users that modify a table referenced by the query).
  • For tables with triggers, if the number of rows in the inserted or deleted tables grows significantly.
  • Executing a stored procedure using the WITH RECOMPILE option.
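For example, a single object can be flagged for recompilation explicitly (the procedure name here is hypothetical):

```sql
-- Drop all cached plans for this procedure; the next execution recompiles it.
EXEC sp_recompile N'dbo.usp_GetOrders';
```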

SQL Server Execution Plan Caching and Reuse

SQL Server has a pool of memory that is used to store both execution plans and data buffers. The percentage of the pool allocated to either execution plans or data buffers fluctuates dynamically, depending on the state of the system. The part of the memory pool that is used to store execution plans is referred to as the procedure cache.
SQL Server execution plans have the following main components:
  • Query Plan – The bulk of the execution plan is a re-entrant, read-only data structure that can be used by any number of users. This is referred to as the query plan. No user context is stored in the query plan. There are never more than one or two copies of the query plan in memory: one copy for all serial executions and another for all parallel executions. The parallel copy covers all parallel executions, regardless of their degree of parallelism.
  • Execution Context – Each user currently executing the query has a data structure that holds the data specific to their execution, such as parameter values. This data structure is referred to as the execution context. Execution context data structures are reused: if a user executes a query and one of the structures is not being used, it is reinitialized with the context for the new user.

PostgreSQL plugins for Nagios

check_database_pgsql 
leo.unstable.be/projects/nagios_plugins/
This simple plugin checks whether it is possible to connect to the PostgreSQL database (on localhost) and runs a small query.

check_pgpool-II.pl 
This plugin checks how many backends are connected to PgPool-II and whether their status is "down".

check_pgpool2 
This plugin checks pgpool2 status. It uses pcp_node_info internally to find out node status.

check_pgsql_connections 
pgfoundry.org/projects/nagiosplugins/
Watch the percentage and number of connections available on a PostgreSQL server.

check_pgsql_partitions.pl 
check_partitions checks whether partitions have been pre-generated up to an explicit date. The plugin only handles partitions whose names follow a syntax like table_name_prefix_YYYYMMDD | fooYYYYMM | 123blubbYYYY or just end with a date where there are n ...

check_pgsql_queries 
pgfoundry.org/projects/nagiosplugins/
Summarizes the types of queries (SELECT ,INSERT ,DELETE ,UPDATE ,ALTER ,CREATE ,TRUNCATE, VACUUM, COPY) running on a PostgreSQL server. This warns about long running queries.

check_pgsql_waiting_queries
pgfoundry.org/projects/nagiosplugins/
Monitor if any queries are in a waiting or blocked state on a PostgreSQL server. This uses SNMP to check the status via the ps command.

check_postgres 
bucardo.org/
Check many things for a Postgres database, including connectivity, version, disk space used, table and index bloat, database size, relation size, locks, number of connections, transaction wraparound, and more.


check_postgresql_unused_db
www.rummandba.com/2013/06/nagios-script-to-find-unused-datab
This plugin finds the number of unused databases, that is, databases that were created but have not been accessed or modified for a certain time.

check_psql_query 
Checks a query result on PostgreSQL.

check_slony 
github.com/kakikubo/check_slony
This plugin checks Slony-I cluster status (time lag and events). It is a rewrite of check_slony.pl in Ruby.

pg_db_cache_hits 
www.phonax.com/download/pg_db_cache_hits.shtml
A PostgreSQL plugin that checks the database's cache hits percentage.

pg_db_size 
www.phonax.com/download/pg_db_size.shtml
The pg_db_size plugin checks for the on disk size of any given PostgreSQL database.

Postgres Database Monitoring Wizard www.nagios.com/products/nagiosxi
Postgres Database Monitoring Wizard A Nagios XI wizard for monitoring various performance metrics of a Postgres database - including connection status, database size, table sizes, relation sizes, and remaining sequences.

Postgres Query Monitoring Wizard www.nagios.com/products/nagiosxi
Postgres Query Monitoring Wizard A Nagios XI wizard for monitoring Postgres SQL queries.


Postgres Server Monitoring Wizard www.nagios.com/products/nagiosxi
Postgres Server Monitoring Wizard A Nagios XI wizard for monitoring various performance metrics of a Postgres database server - including connection status, backend connections, and WAL files.

PostgreSQL cache hit ratio (bash + psql) 
www.ropardo.de/
A simple plugin for monitoring PostgreSQL cache hit ratios for a whole PG-cluster

MySQL TABLE Maintenance Operations

CHECK TABLE
This command checks a table for errors. It supports MyISAM and InnoDB tables.
REPAIR TABLE
If errors are reported during the CHECK TABLE process, use REPAIR TABLE to fix them.
ANALYZE TABLE
This command analyzes and stores the key distribution for a table. During the operation, the table is locked with a READ lock. If your application performs a lot of DELETE and UPDATE operations, over time you might need to run ANALYZE TABLE to refresh the key distribution and improve the performance of queries against the table.
OPTIMIZE TABLE
If you perform a large number of DELETE operations, you might need to run OPTIMIZE TABLE to reclaim unused space and defragment the data file. This command is also suitable if you perform heavy UPDATE modifications on your table; it helps improve the performance of statements that use the table.
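A sketch of a periodic maintenance pass (the table name is hypothetical):

```sql
CHECK TABLE mydb.orders;            -- look for corruption
ANALYZE TABLE mydb.orders;          -- refresh key distribution statistics
OPTIMIZE TABLE mydb.orders;         -- reclaim space after heavy DELETE/UPDATE
```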

PostgreSQL VACUUM

VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables.

With no parameter, VACUUM processes every table in the current database that the current user has permission to vacuum. With a parameter, VACUUM processes only that table.

VACUUM ANALYZE performs a VACUUM and then an ANALYZE for each selected table. 

Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use. This form of the command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained. However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table. VACUUM FULL rewrites the entire contents of the table into a new disk file with no extra space, allowing unused space to be returned to the operating system. This form is much slower and requires an exclusive lock on each table while it is being processed.

When the option list is surrounded by parentheses, the options can be written in any order. Without parentheses, options must be specified in exactly the order shown above. The parenthesized syntax was added in PostgreSQL 9.0; the unparenthesized syntax is deprecated.
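For example, using the parenthesized syntax (PostgreSQL 9.0+; the table name is hypothetical):

```sql
-- Reclaim space and refresh planner statistics in one pass,
-- reporting progress for the table.
VACUUM (VERBOSE, ANALYZE) orders;
```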