Linux Format forums
Help, discussion, magazine feedback and more
 

[SOLVED] Infuriating network problem

 
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Sun Jul 11, 2010 4:15 pm    Post subject: [SOLVED] Infuriating network problem

I've recently updated my home server from openSUSE 10.3 to Ubuntu Lucid Server.

I replaced the motherboard with an Intel DQ35JO that I had lying around. I've dropped in a Core 2 Quad and 4GB of RAM.

Additionally I added a 3Ware/AMCC 9650-2LP RAID controller for the disks, which runs off one of the PCI-E x1 slots.

From the "old" server I brought across an Intel Pro 1000 PT Dual Gigabit PCI-E Server Adaptor which is utilising the mobo's PCI-E x16 slot. On the old motherboard (an Asus) this worked without issue.

The problem I'm having is that the NIC seems to be going into some kind of sleep mode once other computers on the network disconnect. Booting up any of the machines directly attached to the same gigabit hub (a Netgear ProSafe GS116) reinstates the connection.

The server logs don't show any indication that the network link is dropping at any point - it just seems to be waiting for a signal from any LAN-connected device. If I try to connect using any wireless devices through the wireless router (Netgear DGN2000) I get no response.

Once it's up and running it works flawlessly - but I didn't have any of these problems under openSUSE with the older motherboard.

If it helps - the hub is connected to the router by a single cable. All other wired network connections are made through the gigabit hub. Internet and wireless links are made through the router via the single link.

The server is used for web (public facing development server), emails (SMTP/IMAP), NFS/Samba (internal network only) and VPN.

It really is driving me mad - I'm considering replacing the motherboard to see if that has any effect so any help you guys could give would be REALLY appreciated.

T.

Edit: I forgot to mention it's using the 1.0.2-k2 driver. I've downloaded the latest e1000e driver, 1.2.8 - I'll install that later and see if it makes a difference. I'll post the result on here in case anyone else has a similar problem...
_________________
If at first you don't succeed, call it v1.0


Last edited by OnlyTheTony on Tue Jul 20, 2010 9:59 am; edited 1 time in total
Dutch_Master
LXF regular


Joined: Tue Mar 27, 2007 2:49 am
Posts: 2430

Posted: Mon Jul 12, 2010 1:05 am

Things to consider:
1) static IP, no DHCP
2) longer lease times
3) new kernel

My tuppence :)
ollie
Moderator


Joined: Mon Jul 25, 2005 12:26 pm
Posts: 2749
Location: Bathurst NSW Australia

Posted: Mon Jul 12, 2010 12:14 pm

Check the BIOS for power settings and check the power management settings in YaST to ensure the network Wake On LAN (WOL) is turned off. This is what shuts down the ethernet connection.

Ref: http://www.lesswatts.org/tips/ethernet.php
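For anyone wanting to check the Wake On LAN state from the command line rather than through a GUI, here is a minimal Python sketch that reads the "Wake-on:" line from `ethtool` output. The interface name `eth0` and the sample text are assumptions for illustration only; they are not taken from the server in this thread.

```python
import re

# Illustrative excerpt of `ethtool eth0` output. The interface name and
# the flag values are assumptions, not output from the poster's server.
SAMPLE_ETHTOOL_OUTPUT = """\
Settings for eth0:
        Supports Wake-on: pumbg
        Wake-on: g
        Link detected: yes
"""


def parse_wake_on(ethtool_output):
    """Return the active Wake-on flags (e.g. 'g' or 'd'), or None if absent."""
    for line in ethtool_output.splitlines():
        # Anchored at the start of the line, so "Supports Wake-on:" is skipped.
        match = re.match(r"\s*Wake-on:\s*(\S+)", line)
        if match:
            return match.group(1)
    return None


def wol_enabled(ethtool_output):
    """'d' means Wake On LAN is disabled; any other flag set means it is on."""
    flags = parse_wake_on(ethtool_output)
    return flags is not None and flags != "d"


if __name__ == "__main__":
    print("Wake-on flags:", parse_wake_on(SAMPLE_ETHTOOL_OUTPUT))
    print("WOL enabled:", wol_enabled(SAMPLE_ETHTOOL_OUTPUT))
```

On a real machine the text would come from running `ethtool` against the interface as root, and `ethtool -s eth0 wol d` should switch WOL off until the next reboot.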
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Mon Jul 12, 2010 1:47 pm

Thanks for your answers guys.

It turns out it was much simpler(?).

Intel motherboards have a BIOS-based "lights out" management system, "Intel ME", which prioritises the onboard LAN adapter - for obvious reasons. Once I disabled "Intel ME" and switched the onboard LAN off it worked a treat. The connection has been fine ever since!

Can't believe I wasted a week trying to fix that!!!
_________________
If at first you don't succeed, call it v1.0
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Tue Jul 13, 2010 1:51 pm

Okay.. I was wrong.

Even updating the driver to 1.1.2 (1.2.8 wouldn't build) hasn't solved the problem. Shortly after network connections are removed (either an IMAP connection or NFS) the server's network connection just drops. There's nothing in the syslogs or even dmesg. Only re-establishing a connection from another desktop machine restarts it!

Dutch_Master - thanks for your input but it's already running on a static IP and I've updated the kernel to the latest one in the repos.

The network hardware is the same as I used under openSUSE 10.3 - so it's either an Ubuntu bug or a problem with the motherboard (which is nearly 2 years old so it's possible).

I'm open to any other suggestions here!!
_________________
If at first you don't succeed, call it v1.0
wyliecoyoteuk
LXF regular


Joined: Sun Apr 10, 2005 11:41 pm
Posts: 3440
Location: Birmingham, UK

Posted: Tue Jul 13, 2010 3:07 pm

Could be a keepalive issue?
http://en.wikipedia.org/wiki/Keepalive
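To make the keepalive idea concrete: TCP keepalive has the kernel send small probes on an otherwise idle connection, so a dead peer gets noticed instead of the connection hanging. Below is a minimal Python sketch of turning it on for a single socket. The timing values are illustrative assumptions, not recommendations, and nothing in this thread confirms that keepalive is the actual cause here.

```python
import socket


def make_keepalive_socket(idle=60, interval=10, count=5):
    """Create a TCP socket with keepalive probes enabled.

    idle     - seconds of silence before the first probe (assumed value)
    interval - seconds between probes (assumed value)
    count    - unanswered probes before the connection is dropped (assumed)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Portable switch: ask the kernel to probe idle connections.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, so guard each one.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    print("keepalive enabled:", bool(enabled))
    s.close()
```

On Linux the system-wide defaults live under /proc/sys/net/ipv4/ (tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes), and services such as IMAP or NFS servers usually have their own settings for this rather than needing code changes.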
_________________
The sig between the asterisks is so cool that only REALLY COOL people can even see it!

*************** ************
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Wed Jul 14, 2010 3:57 pm

It could be.

I've been doing some research and there were a lot of threads about the e1000e driver closing the connection - as of yet nobody's posted a solution.

To see whether it's an OS or driver issue I've deactivated the (expensive) Intel adapter and I'm trying the onboard gigabit LAN to see if that maintains the connection overnight.

If it is a keepalive issue what would I look for and how could I get around it?
_________________
If at first you don't succeed, call it v1.0
nelz
Site admin


Joined: Mon Apr 04, 2005 12:52 pm
Posts: 8450
Location: Warrington, UK

Posted: Wed Jul 14, 2010 4:16 pm

Why are you using an external driver and not the e1000e driver in the kernel?
_________________
"Insanity: doing the same thing over and over again and expecting different results." (Albert Einstein)
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Wed Jul 14, 2010 8:08 pm

Because the e1000e driver in the kernel was outdated so I installed a new version in the hope it would solve the timeout problem.

It didn't.

The thing is, the problem still exists whether I use the Intel LAN adapter or the Realtek onboard - which makes me suspect it's an issue with Ubuntu rather than any of the hardware or drivers.

I'm considering ditching Ubuntu for something like CentOS to see if this removes the problem.

It's a total pain - I never had this problem under openSUSE 10.3 - the only reason I "upgraded" was because the install failed and I thought I'd take the opportunity to rebuild. I wish I hadn't.

If it's any help - dmesg returns:

[37402.040022] NETDEV WATCHDOG: eth3 (r8169): transmit queue 0 timed out

and the last few lines are:

[37402.080072] r8169: eth3: link up
[37426.080071] r8169: eth3: link up
[37456.080069] r8169: eth3: link up
[37498.080070] r8169: eth3: link up
[37540.080057] r8169: eth3: link up
[37582.080064] r8169: eth3: link up
[37624.080063] r8169: eth3: link up
[37666.080065] r8169: eth3: link up
[37708.080068] r8169: eth3: link up
[37750.080066] r8169: eth3: link up

As you can see it's not reporting the link as being down - just constantly coming back up.
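The pattern in that excerpt is easier to see with the gaps between the "link up" messages worked out. Here is a small Python sketch that does this for the dmesg lines quoted above; the function names are just illustrative, and the regular expression assumes the standard `[seconds.microseconds]` timestamp prefix.

```python
import re

# The dmesg lines quoted in the post above.
DMESG_EXCERPT = """\
[37402.040022] NETDEV WATCHDOG: eth3 (r8169): transmit queue 0 timed out
[37402.080072] r8169: eth3: link up
[37426.080071] r8169: eth3: link up
[37456.080069] r8169: eth3: link up
[37498.080070] r8169: eth3: link up
[37540.080057] r8169: eth3: link up
[37582.080064] r8169: eth3: link up
[37624.080063] r8169: eth3: link up
[37666.080065] r8169: eth3: link up
[37708.080068] r8169: eth3: link up
[37750.080066] r8169: eth3: link up
"""

# Matches "[  123.456789] message" and captures the timestamp and the message.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def link_up_times(log_text):
    """Return the timestamps (seconds since boot) of every 'link up' line."""
    times = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2).endswith("link up"):
            times.append(float(match.group(1)))
    return times


def intervals(times):
    """Return the gaps between consecutive timestamps, rounded to seconds."""
    return [round(later - earlier) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    print(intervals(link_up_times(DMESG_EXCERPT)))
```

For this excerpt the gaps come out as 24, 30 and then a steady 42 seconds, so the link is being re-established on a regular cycle following the transmit queue timeout rather than at random.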

My internet connection keeps dropping for some weird reason and I'm also wondering if the two are linked. I'm frustrated and confused.
_________________
If at first you don't succeed, call it v1.0
wyliecoyoteuk
LXF regular


Joined: Sun Apr 10, 2005 11:41 pm
Posts: 3440
Location: Birmingham, UK

Posted: Wed Jul 14, 2010 8:42 pm

There seem to be a lot of posts saying that this is an APIC-related bug.

https://bugs.launchpad.net/ubuntu/+source/grub/+bug/574281
_________________
The sig between the asterisks is so cool that only REALLY COOL people can even see it!

*************** ************
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Wed Jul 14, 2010 8:59 pm

Wylie, doing more digging I've come across that too. I've added "noapic" to the boot parameters and restarted. I've also reinstated the Intel card - I'll see how that goes....

Edit: Further research indicates that kernel 2.6.34 doesn't have this problem - so just waiting for that to hit the repos now.
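After adding a boot parameter such as "noapic" it is worth confirming that the running kernel actually picked it up, since an edit to the bootloader config that was never regenerated is an easy thing to miss. A minimal Python sketch follows; the sample command line is hypothetical, and the simple whitespace split does not handle quoted values containing spaces.

```python
# Hypothetical kernel command line, for illustration only.
SAMPLE_CMDLINE = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet noapic"


def kernel_params(cmdline):
    """Split a kernel command line into a dict.

    Flags without a value (like 'noapic') map to True;
    key=value pairs map to the value as a string.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    params = kernel_params(SAMPLE_CMDLINE)
    print("noapic active:", "noapic" in params)
```

On the server itself, passing the result of `read_cmdline()` to `kernel_params()` instead of the sample string checks the live boot parameters.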
_________________
If at first you don't succeed, call it v1.0
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Thu Jul 15, 2010 11:38 am

Still doing it!

I've decided to back up everything and switch distros to CentOS over the weekend, as the errors I'm getting on Ubuntu don't seem to be present on that!

Fingers crossed...
_________________
If at first you don't succeed, call it v1.0
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Sun Jul 18, 2010 11:37 am

After 2 x motherboards, 2 x distros and several nights of sitting around until 2am sobbing I may have found the culprit.

It was nothing to do with the server at all - it would appear to be a problem with my desktop machine. The LAN adapter was always active and didn't appear to be sending a disconnect signal to the server, which consequently hung waiting for a response.

I've replaced the onboard LAN with a PCIe x4 dual Marvell LAN adapter - let's see if this solves the problem.

Oh, and I went back to Ubuntu because CentOS, whilst good, was far too slow on my hardware.
_________________
If at first you don't succeed, call it v1.0
OnlyTheTony
LXF regular


Joined: Mon Jan 08, 2007 11:51 am
Posts: 303

Posted: Tue Jul 20, 2010 10:01 am

*****SOLVED*****

It turns out that it was the LAN adapter on the desktop PC that was causing the problem. I've had no trouble with network connectivity since changing the onboard LAN for a PCIe card.

Thanks to everyone who offered potential solutions.
_________________
If at first you don't succeed, call it v1.0
All times are GMT


Copyright 2011 Future Publishing, all rights reserved.

