I did a little spelunking around our system and I have some suggestions for the logging infrastructure. We have enough hosts and complexity that log analysis will help us know when something is misconfigured or flapping in a weird way.
1. Logs in /var/log/hosts on log1 are not consistently named: sometimes they are reported with IPs, sometimes with the short hostname, sometimes with the FQDN. This needs to be made consistent.
2. We need to make sure we clean up old logs from the above, too.
3. The structure of the log dir doesn't seem to match what we'd normally see in /var/log on any host. Logs go into a different dir per day, which is great, but it'd be good if rsyslog laid out each day's dir with the same file structure as a normal set of logs, so normal log analysis tools will work on it.
4. I installed pflogsumm on log1 so I could do a little postfix mail log analysis, and found some issues that way too. Regularly generating these reports, especially the error reports, would help us figure out what we need to improve. We are clearly sending/redelivering A LOT more mail than we're receiving, so bumping our smtp process count would help.
5. Grouping the logs by type of service would also help us look at group/service trending and issues, especially if an issue is only popping up on one box.
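For point 1, a quick way to see how mixed the names are is to sort the directory names into IP, short hostname, and FQDN. This is only a rough sketch: the function names and the sample hosts are made up for illustration, and only /var/log/hosts on log1 comes from the setup described above.

```python
import ipaddress
from collections import defaultdict


def classify_name(name):
    """Classify a host directory name as 'ip', 'short', or 'fqdn'."""
    try:
        ipaddress.ip_address(name)  # accepts IPv4 and IPv6 literals
        return "ip"
    except ValueError:
        pass
    # Assumption: anything with a dot that is not an IP is treated as an FQDN.
    return "fqdn" if "." in name else "short"


def group_by_kind(names):
    """Group directory names by the kind of name they use."""
    groups = defaultdict(list)
    for name in names:
        groups[classify_name(name)].append(name)
    return dict(groups)
```

On log1 this could be fed with os.listdir("/var/log/hosts"); more than one key in the result means the naming is mixed.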
Just some initial thoughts.
-sv
On Thu, 14 Jan 2010, Seth Vidal wrote:
- logs in /var/log/hosts on log1 are not consistently named - sometimes they are being reported with IPs, sometimes with short hostname, sometimes with FQDN. It needs to be made consistent.
Now that we control reverse lookups this should be easy.
- we need to make sure we clean up old logs from the above, too.
I asked smooge to look into this this morning :)
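As a starting point for the cleanup, here is a sketch that only lists the day directories past a retention window and deletes nothing. It assumes the /var/log/hosts/<host>/<YYYY>/<MM>/<DD> layout seen on log1; the function name and the retention period are hypothetical, since no retention policy has been agreed on.

```python
import glob
import os
from datetime import date, timedelta


def expired_day_dirs(root, keep_days, today):
    """Return day directories under root/<host>/<YYYY>/<MM>/<DD> older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    pattern = os.path.join(root, "*", "[0-9]" * 4, "[0-9]" * 2, "[0-9]" * 2)
    expired = []
    for path in sorted(glob.glob(pattern)):
        year, month, day = path.split(os.sep)[-3:]
        try:
            logged_on = date(int(year), int(month), int(day))
        except ValueError:
            continue  # not a real calendar date, so leave it alone
        if logged_on < cutoff and os.path.isdir(path):
            expired.append(path)
    return expired
```

Something like expired_day_dirs("/var/log/hosts", 90, date.today()) would give a list to review before anything gets removed.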
- the structure of the log dir doesn't seem to match what we'd normally
see in /var/log on any host. They are being logged as a different dir per day, which is great, but it'd be good if rsyslog was putting in the same file structure as a normal set of logs so normal log analysis tools will work on it
Where would /var/log/messages on bastion from 2009-03-01 exist?
- Grouping the logs by type of service would also help look at group/service trending and issues, especially if an issue is only popping up on one box.
We can probably do this with symlinks
-Mike
On Thu, 14 Jan 2010, Mike McGrath wrote:
Now that we control reverse lookups this should be easy.
- we need to make sure we clean up old logs from the above, too.
I asked smooge to look into this this morning :)
- the structure of the log dir doesn't seem to match what we'd normally
see in /var/log on any host. They are being logged as a different dir per day, which is great, but it'd be good if rsyslog was putting in the same file structure as a normal set of logs so normal log analysis tools will work on it
Where would /var/log/messages on bastion from 2009-03-01 exist?
Where? On log1? /var/log/hosts/bastion02/2009/03/01/messages
what I'm proposing is that in any given day the dir structure look the same as what we would normally find in /var/log on any given machine.
- Grouping the logs by type of service would also help look at group/service trending and issues, especially if an issue is only popping up on one box.
We can probably do this with symlinks
Not really - you'd want one merged /var/log structure for all of the app servers, for example.
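To make the "merged" idea concrete, here is a rough sketch of interleaving the same log file from several hosts into one time-ordered stream. It assumes each line starts with the traditional 15-character syslog timestamp ("Mar  1 12:00:00") and that each host's lines are already in time order; the function names and host names are hypothetical, and the year has to be passed in because that timestamp format does not carry one.

```python
import heapq
from datetime import datetime


def parse_ts(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp of a line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def merge_host_logs(logs_by_host, year):
    """Yield (timestamp, host, line) tuples in time order across all hosts."""
    def tagged(host, lines):
        for line in lines:
            yield (parse_ts(line, year), host, line.rstrip("\n"))

    # heapq.merge assumes every input stream is already sorted.
    streams = [tagged(host, lines) for host, lines in sorted(logs_by_host.items())]
    return heapq.merge(*streams)
```

The values in logs_by_host can be open file objects, so the merge stays lazy and does not need to hold a whole day of logs in memory.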
-sv
On Thu, 14 Jan 2010, Seth Vidal wrote:
On Thu, 14 Jan 2010, Mike McGrath wrote:
Now that we control reverse lookups this should be easy.
- we need to make sure we clean up old logs from the above, too.
I asked smooge to look into this this morning :)
- the structure of the log dir doesn't seem to match what we'd normally
see in /var/log on any host. They are being logged as a different dir per day, which is great, but it'd be good if rsyslog was putting in the same file structure as a normal set of logs so normal log analysis tools will work on it
Where would /var/log/messages on bastion from 2009-03-01 exist?
Where? On log1? /var/log/hosts/bastion02/2009/03/01/messages
what I'm proposing is that in any given day the dir structure look the same as what we would normally find in /var/log on any given machine.
So /var/log/hosts/bastion03/2009/03/01/var/log/messages ?
/me is confused :-/
-Mike
On Thu, 14 Jan 2010, Mike McGrath wrote:
Where? On log1? /var/log/hosts/bastion02/2009/03/01/messages
what I'm proposing is that in any given day the dir structure look the same as what we would normally find in /var/log on any given machine.
So /var/log/hosts/bastion03/2009/03/01/var/log/messages ?
/me is confused :-/
no
/var/log/hosts/bastion02/2009/03/01/messages
The idea is that any given day directory has a dir/file structure that matches what an admin expects to see in /var/log on the local machine.
That also means that tools which expect the structure of /var/log will find the same file structure in any given day dir.
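Put another way, the mapping from a path on the local machine to its place on log1 would look roughly like the sketch below. The function name and constant are made up for illustration; the root and the host/year/month/day layout are the ones from the example path above.

```python
import posixpath
from datetime import date

CENTRAL_ROOT = "/var/log/hosts"  # log root on log1, as in the example path


def central_path(host, day, local_path):
    """Map a path under /var/log on a host to its location in that day's dir."""
    rel = posixpath.relpath(local_path, "/var/log")
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{local_path} is not under /var/log")
    # The day dir mirrors /var/log, so the relative path is kept unchanged.
    return posixpath.join(CENTRAL_ROOT, host, day.strftime("%Y/%m/%d"), rel)
```

So central_path("bastion02", date(2009, 3, 1), "/var/log/messages") yields /var/log/hosts/bastion02/2009/03/01/messages, and a tool that expects a /var/log tree can simply be pointed at the day dir.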
-sv
On Thu, Jan 14, 2010 at 12:29:17PM -0500, Seth Vidal wrote:
On Thu, 14 Jan 2010, Mike McGrath wrote:
Where? On log1? /var/log/hosts/bastion02/2009/03/01/messages
what I'm proposing is that in any given day the dir structure look the same as what we would normally find in /var/log on any given machine.
So /var/log/hosts/bastion03/2009/03/01/var/log/messages ?
/me is confused :-/
no
/var/log/hosts/bastion02/2009/03/01/messages
The idea is that any given day directory has a dir/file structure that matches what an admin expects to see in /var/log on the local machine.
That also means that tools which expect the structure of /var/log will find the same file structure in any given day dir.
Do give a shout when and if the dir structure changes, esp. if you think it might affect the stats I pull from the HTTP proxies' logs on log1.
On Thu, 14 Jan 2010, Paul W. Frields wrote:
Do give a shout when and if the dir structure changes, esp. if you think it might affect the stats I pull from the HTTP proxies' logs on log1.
This has nothing to do with http logs
-sv
On Thu, Jan 14, 2010 at 04:31:47PM -0500, Seth Vidal wrote:
This has nothing to do with http logs
Thanks for the clarification!
On Fri, Jan 15, 2010 at 7:31 AM, Seth Vidal <skvidal@fedoraproject.org> wrote:
This has nothing to do with http logs
Surely under this suggestion it should be /var/log/hosts/app01/2009/03/01/httpd/error_log ?
infrastructure@lists.fedoraproject.org