AlienVault R&D Labs Portal. Get the latest news from our research.
Header

Forensic database performance optimizations

March 14th, 2008 | Posted by DK in Tuning - (Comments Off)

Remember the couple of posts I made back in November in the tuning section ? Well, I finally got the time to look into this issue again and have made some interesting discoveries the last couple of days. I’m really enjoying this.

The following table illustrates some comparisons between a stock Base 1.3.9 (ossim patched) and the tuned rewrite I’ve got running right now. These optimizations are now part of our appliance offering (updates for already deployed ones on the way) and will be released to the public afterwards. Thanks to everybody that has been helping me on this, specially to the people at #ossim in freenode ;-).

BREAK

Just for statistical issues, the current contents of the database are (links are broken of course):


Sensors/Total:
470 / 1692

Unique Alerts:
844
Categories: 21
Total Number of Alerts:
1620567

  • Src IP addrs: 106821
  • Dest. IP addrs: 34411
  • Unique IP links 178934
  • Source Ports: 63254
    • TCP ( 61782)  UDP ( 46206)
  • Dest Ports: 36514
    • TCP ( 25886)  UDP ( 22436)
Traffic Profile by Protocol

TCP
(67%)
UDP
(32%)
ICMP
(1%)


Portscan Traffic
(0%)

Here are the results of some tests I’ve made, you’ll have to trust my word on these :-). The testing host is a Pentium M 2.0Ghz with 1GB RAM, with query caching disabled. If you want me to post some specific test or logs as proof of concept please don’t hesitate to ask.

Loading Times (in seconds)
ACID/BASE 1.3.9 (ossim patched) AV Tuned ACID/BASE
Front Page 185 1
Unique Alerts 44 4
Source IPs 32 3
Source IPs (order by count(dest)) 76 6
Dest Ports (order by ocurrences) 105 6
Random Event Detail 42 1
Next Event Detail 23 1

 

Those numbers are with a couple of days worth of data on a Mysql 5.0.32, not taking other optimizations into account which vastly improve that performance on the applliances such as:

Well, just wanted to share those numbers since forensic database performance has been a real pita for many years and I’m really happy it’s starting to work right now on large amounts of data.

Happy spring break to everyone. Aaah, and check the GSOC page on monday, hopefully we’re eligible as a mentoring organization for this years summer of code, check out our ideas here.

DK

Mr Wolf Wannabe.

More Posts - Website

MySQL performance tuning applied to OSSIM. Case 1.

November 30th, 2007 | Posted by DK in Tuning - (Comments Off)

I’d like to share my first actual success on mysql tuning, after having spent a couple of days reading everything I could about the matter (and still waiting for the books to arrive).

From what I’ve seen a very important point on DB optimization is the right table design, followed by the right queries and finally optimizing DB parameters. Since I don’t know enough yet about optimal DB design I’ll skip that phase (tho I’ll definetively accomplish it during the next weeks/months) and examining some queries.

After enabling log_slow_queries, one of the first queries popping out continuously was the following:

SELECT *, inet_ntoa(src_ip) as aux_src_ip, inet_ntoa(dst_ip) as aux_dst_ip FROM event_tmp order by id desc limit 1;

Ugly, ain’t it ?

BREAK

A little bit of explanation about the event_tmp table and how we use it may come handy to understand this.

 

Introduction

After stumbling across Digg spy some time ago, it seemed like a nifty feature to add to ossim. A real time event viewer. So that’s what we started to do.

Shortly after starting we already had performance issues, since basically we had to aggregate lots of information from many unrelated tables, and do this every second. So we wrote a cache table:

And this it how it looks like:

(image removed, broken link, I'm very sorry. DK.)
mysql> desc event_tmp;
+-----------------+------------------+------+-----+-------------------+-------+
| Field           | Type             | Null | Key | Default           | Extra |
+-----------------+------------------+------+-----+-------------------+-------+
| id              | bigint(20)       | NO   | PRI |                   |       |
| timestamp       | timestamp        | NO   |     | CURRENT_TIMESTAMP |       |
| sensor          | text             | NO   |     |                   |       |
| interface       | text             | NO   |     |                   |       |
| type            | int(11)          | NO   |     |                   |       |
| plugin_id       | int(11)          | NO   |     |                   |       |
| plugin_sid      | int(11)          | NO   |     |                   |       |
| plugin_sid_name | varchar(255)     | YES  |     | NULL              |       |
| protocol        | int(11)          | YES  |     | NULL              |       |
| src_ip          | int(10) unsigned | YES  |     | NULL              |       |
| dst_ip          | int(10) unsigned | YES  |     | NULL              |       |
| src_port        | int(11)          | YES  |     | NULL              |       |
| dst_port        | int(11)          | YES  |     | NULL              |       |
| priority        | int(11)          | YES  |     | 1                 |       |
| reliability     | int(11)          | YES  |     | 1                 |       |
| asset_src       | int(11)          | YES  |     | 1                 |       |
| asset_dst       | int(11)          | YES  |     | 1                 |       |
| risk_a          | int(11)          | YES  |     | 1                 |       |
| risk_c          | int(11)          | YES  |     | 1                 |       |
| alarm           | tinyint(4)       | YES  |     | 1                 |       |
| filename        | varchar(255)     | YES  |     | NULL              |       |
| username        | varchar(255)     | YES  |     | NULL              |       |
| password        | varchar(255)     | YES  |     | NULL              |       |
| userdata1       | varchar(255)     | YES  |     | NULL              |       |
| userdata2       | varchar(255)     | YES  |     | NULL              |       |
| userdata3       | varchar(255)     | YES  |     | NULL              |       |
| userdata4       | varchar(255)     | YES  |     | NULL              |       |
| userdata5       | varchar(255)     | YES  |     | NULL              |       |
| userdata6       | varchar(255)     | YES  |     | NULL              |       |
| userdata7       | varchar(255)     | YES  |     | NULL              |       |
| userdata8       | varchar(255)     | YES  |     | NULL              |       |
| userdata9       | varchar(255)     | YES  |     | NULL              |       |
+-----------------+------------------+------+-----+-------------------+-------+

This table would have all the needed information so we could write a nice scrolling real time event viewer.

The actual implementation is like a ring buffer, you specify how many events you want to keep in that table at max and the server will take care that the table doesn’t get too big. (10000 being the default).

 

The code

Getting back to the previous query, it was our quick & dirty attempt at getting the last row out of that table. Let’s see what it actually does:

mysql> explain SELECT *, inet_ntoa(src_ip) as aux_src_ip, inet_ntoa(dst_ip) as aux_dst_ip FROM event_tmp order by id desc limit 1;
+----+-------------+-----------+-------+---------------+---------+---------+------+-------+-------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref  | rows  | Extra |
+----+-------------+-----------+-------+---------------+---------+---------+------+-------+-------+
|  1 | SIMPLE      | event_tmp | index | NULL          | PRIMARY | 8       | NULL | 10001 |       |
+----+-------------+-----------+-------+---------------+---------+---------+------+-------+-------+
1 row in set (0.00 sec)

Well, that’s pretty bad. Having to traverse 10000 rows in order to get one out of it ? what if our tmp table had millions ? no good.
So, rethinking this, since id is autoincremental anyway, we could just fetch the highest one…

mysql> explain select max(id) from event_tmp;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

Much better. No single row needs to be accessed since we just use the index/key column. So let’s finish up the query simulating the original one:

mysql> explain select *, inet_ntoa(src_ip) as aux_src_ip, inet_ntoa(dst_ip) as aux_dst_ip  from event_tmp where id = (select max(id) from event_tmp);
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+------------------------------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref   | rows | Extra                        |
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+------------------------------+
|  1 | PRIMARY     | event_tmp | const | PRIMARY       | PRIMARY | 8       | const |    1 |                              |
|  2 | SUBQUERY    | NULL      | NULL  | NULL          | NULL    | NULL    | NULL  | NULL | Select tables optimized away |
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+------------------------------+
2 rows in set (0.00 sec)

mysql>

Fantastic, now we only need to query 1 row, regardless of how many tmp rows we might have in there. We could easily remove the limiting code from the server and just get a cleanup process chop the table every once in a while.

 

Conclusion

If you’re an SQL expert you might not be impressed by this at all, but for me who I’m just taking my first steps deeper into all of this it’s been a nice feeling of accomplishment, and an extra motivation push for further delving into this matter.

BTW, I’m trying out the MySQL Enterprise Dashboard and am seriously considering buying support from them next year, it’s been very helpful so far. I’d like to post a screenshot but I don’t think I’m allowed by the EULA I (of course) haven’t read, so check out the following link: Automating MySQL best practices management.

DK

Mr Wolf Wannabe.

More Posts - Website

MySQL Performance Tuning

November 28th, 2007 | Posted by DK in Tuning - (Comments Off)

I’ve finally decided to learn everything I could about MySQL performance tuning; we’re working on highly tuned appliances and this is a must for high-traffic environments.

I’d like to share my first findings on interesting stuff and encourage comments on the matter, which seems as deep as any science.

These last days we’ve discussing about this at the office and we couldn’t agree about the type of database configuration using MySQL was optimal for the broadest range of users.

It’s much easier to tune everything if you already know the exact environment, available hardware and so on, rather than trying to tune a database for a broad range of people going to install a product.

BREAK

Enough of introduction anyway, I want to get to the bottom of this issue once and for all, so for a beginning I ordered these three books:

Besides the books, which should arrive in less than a week, I’ve been googling quite a lot. I really have to try out mirroring an event table into memory from where events are being fetched and writing a copy into disk. I also see that a lot of index and key optimization can be done in all the involved databases.

Anyway, it’s too early to reach a conclusion, in the meantime here’s a very interesting piece of reading about index optimizations.

Other interesting options:

Anyway, if you’ve got an opinion, suggestion or any sort of input about all of this, please share it on the comments section or send me a mail and I’ll post conclusions here.

DK

Mr Wolf Wannabe.

More Posts - Website