Saturday, 30 May 2009

Analyze a blue screen

I was having fun the other day with some virtual servers on my laptop when I noticed I was late for a meeting, so I quickly shut down the laptop, and just as it was about to finish, it blue-screened. I let it finish creating the memory dump, and off to the meeting I went.

Later that day, when I had five minutes, I booted the laptop again and started to look at what had caused my blue screen.

To analyze a blue screen there are three simple steps:
1) Download Debugging Tools for Windows, plus the Symbols Pack if working offline, or set the symbol path to the online symbol server, for example SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols

2) Open the dump file and run !analyze -v, or kb for shorter output.

3) Switch from processor 0 to processor 1 using ~1, and repeat for however many processors you have.


So once you've installed the Debugging Tools for Windows you need the symbol pack, or if you're connected you can use the online symbols at http://msdl.microsoft.com/download/symbols
I always like to use the online ones, as I know these are more up to date and it saves me needing another 200 MB to 600 MB of disk space.

I then opened the memory.dmp file, normally located under c:\windows or c:\winnt depending on the version of Windows you have. It may be under another directory if you changed the install location or the memory dump location, but the default is %SystemRoot%\MEMORY.DMP
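As a rough sketch of the two paths mentioned above, the snippet below builds a symbol-server path in the SRV*cache*URL form and the default dump location. The function names are made up for illustration, and c:\websymbols is just the cache directory I happen to use; any writable folder should do. The resulting symbol path string is what you would paste into the debugger's symbol path setting (or pass to the .sympath command).

```python
import os
from pathlib import PureWindowsPath

# Public Microsoft symbol server, as quoted in the post.
SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir: str, server: str = SYMBOL_SERVER) -> str:
    """Build a symbol-server path of the form SRV*<local cache>*<server URL>."""
    return f"SRV*{cache_dir}*{server}"


def default_dump_path(system_root: str) -> str:
    """Return the default kernel dump location, MEMORY.DMP under %SystemRoot%."""
    return str(PureWindowsPath(system_root) / "MEMORY.DMP")


if __name__ == "__main__":
    # The cache directory is an assumption; pick any local folder.
    print(symbol_path(r"c:\websymbols"))
    # Fall back to C:\Windows when SystemRoot is not set (for example, off Windows).
    print(default_dump_path(os.environ.get("SystemRoot", r"C:\Windows")))
```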


Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: http://msdl.microsoft.com/download/symbols;C:\Windows\Symbols SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Kernel Version 6000 MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 6000.16830.x86fre.vista_gdr.090302-1506
Machine Name:
Kernel base = 0x82000000 PsLoadedModuleList = 0x82111e10
Debug session time: Thu May 28 16:40:54.534 2009 (GMT+2)
System Uptime: 1 days 7:51:45.536
Loading Kernel Symbols
...............................................................
................................................................
..........................................................
Loading User Symbols

Loading unloaded module list
...............................................
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck A, {0, 1b, 0, 8202915c}

Probably caused by : ndis.sys ( ndis!ndisAcquireMiniportPnPEventLock+60 )

Followup: MachineOwner
---------

As per the prompt, I typed !analyze -v
and got the details of what was running at the moment of the blue screen.
In this example the cause, as you can see below, was ndisAcquireMiniportPnPEventLock. Casting my mind back to the point when I was turning off the laptop, I realized I had picked it up from the docking station, so the network card changed just seconds before the blue screen, and this was the cause.


*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00000000, memory referenced
Arg2: 0000001b, IRQL
Arg3: 00000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 8202915c, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: 00000000

CURRENT_IRQL: 1b

FAULTING_IP:
nt!KeWaitForSingleObject+1b5
8202915c 803902 cmp byte ptr [ecx],2

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0xA

PROCESS_NAME: System

TRAP_FRAME: a2e2da94 -- (.trap 0xffffffffa2e2da94)
ErrCode = 00000000
eax=00000000 ebx=a654ee30 ecx=00000000 edx=82132300 esi=a654ed78 edi=a654ee00
eip=8202915c esp=a2e2db08 ebp=a2e2db58 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!KeWaitForSingleObject+0x1b5:
8202915c 803902 cmp byte ptr [ecx],2 ds:0023:00000000=??
Resetting default scope

LAST_CONTROL_TRANSFER: from 8202915c to 8208fdc4

STACK_TEXT:
a2e2da94 8202915c badb0d00 82132300 82090fe6 nt!KiTrap0E+0x2ac
a2e2db58 81e0ed7b 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1b5
a2e2db84 81eda107 00b520e8 a2e2dbf8 85b520e8 ndis!ndisAcquireMiniportPnPEventLock+0x60
a2e2dc20 81e2b231 85b520e8 00000000 00000000 ndis!ndisPnPNotifyAllTransports+0xa2
a2e2dca4 81ee7749 85b520e8 00000000 00000000 ndis!ndisDevicePnPEventNotifyFiltersAndAllTransports+0xc5
a2e2dcf8 81ee7b5f 8549bdb8 8549be4c 00000004 ndis!ndisSetPower+0x5ef
a2e2dd20 82050b86 8549be4c 83e4db30 00000000 ndis!ndisPowerDispatch+0x1a3
a2e2dd7c 8222553c 87166db0 a2e26680 00000000 nt!PopIrpWorker+0x40f
a2e2ddc0 820915fe 82050773 87166db0 00000000 nt!PspSystemThreadStartup+0x9d
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16


STACK_COMMAND: kb

FOLLOWUP_IP:
ndis!ndisAcquireMiniportPnPEventLock+60
81e0ed7b 8b4dfc mov ecx,dword ptr [ebp-4]

SYMBOL_STACK_INDEX: 2

SYMBOL_NAME: ndis!ndisAcquireMiniportPnPEventLock+60

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: ndis

IMAGE_NAME: ndis.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 4549b2fd

FAILURE_BUCKET_ID: 0xA_ndis!ndisAcquireMiniportPnPEventLock+60

BUCKET_ID: 0xA_ndis!ndisAcquireMiniportPnPEventLock+60

Followup: MachineOwner

Still, I wasn't 100% sure this was the only problem. As I'm lucky enough to have a dual-core laptop, I needed to check the other processor in case it was running something at that time as well, so I used the ~1 command to switch to the other core. By the way, processors count from zero up, so the second processor is 1.

I ran !analyze -v again

1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00000000, memory referenced
Arg2: 0000001b, IRQL
Arg3: 00000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 8202915c, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: 00000000

CURRENT_IRQL: 0

FAULTING_IP:
nt!KeWaitForSingleObject+1b5
8202915c 803902 cmp byte ptr [ecx],2

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0xA

PROCESS_NAME: System

LAST_CONTROL_TRANSFER: from 823a94a3 to 8208191a

STACK_TEXT:
88757928 823a94a3 ffd050f0 00000040 8431d648 nt!READ_REGISTER_ULONG+0x6
88757948 823a98e5 88757988 8206f94d 00000000 hal!HalpQueryHpetCount+0x4b
88757950 8206f94d 00000000 820709e5 00000001 hal!HalpHpetQueryPerformanceCounter+0x1d
88757958 820709e5 00000001 0000003c 8875c2a4 nt!EtwpGetPerfCounter+0x8
88757988 8206f5e2 0000003c 887579d0 887579b0 nt!EtwpReserveTraceBuffer+0xce
88757a1c 8206f41b 00040007 00000000 0000002b nt!EtwpTraceMessageVa+0x187
88757a40 8d36456e 00040007 ffffffff 0000002b nt!WmiTraceMessage+0x22
88757a68 8d36555c 00040007 ffffffff 00000020 smb!WPP_SF__guid_+0x20
88757aa8 8d36593f 848594e8 00000002 00000000 smb!SmbBatchedSetBindingInfo+0x152
88757ac0 8c339a32 84425868 84425848 87b137f8 smb!SmbAddressDeletion+0x5d
88757aec 8c339f01 8c33c1a0 84425828 00000000 TDI!TdiNotifyPnpClientList+0x132
88757b10 8c33a2f4 84ac0850 00000000 8ee95338 TDI!TdiExecuteRequest+0x175
88757b48 8c33a547 00425828 0000000c 88757bd4 TDI!TdiHandleSerializedRequest+0x1aa
88757b58 8ee8e11a 84425828 00000010 88757c98 TDI!TdiDeregisterNetAddress+0xf
88757bd4 8ee8e513 85009270 00000000 874b3938 tdx!TdxProcessAddressChangeRoutine+0x22e
88757bf0 829a62a6 00000000 88757c98 88757ca0 tdx!TdxNaAddressChangeEvent+0x7d
88757c58 8eec8460 88757c8c 823a4f00 85b0d908 NETIO!NsiParameterChange+0x73
88757cf8 8eec9860 846438c0 8749e9e4 88757d2c tcpip!IppNotifyAddressChangeAtPassive+0x12c
88757d08 829a14d1 846438c0 820fde7c 873bae58 tcpip!IppCompartmentNotificationWorker+0x11
88757d2c 8218c87c 873bae58 8749e9e4 8749d610 NETIO!NetiopIoWorkItemRoutine+0x2f
88757d44 82078fc0 8749d610 00000000 83e9d828 nt!IopProcessWorkItem+0x2d
88757d7c 8222553c 8749d610 8875c680 00000000 nt!ExpWorkerThread+0xfd
88757dc0 820915fe 82078ec3 00000001 00000000 nt!PspSystemThreadStartup+0x9d
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16


STACK_COMMAND: kb

FOLLOWUP_IP:
smb!WPP_SF__guid_+20
8d36456e 83c420 add esp,20h

SYMBOL_STACK_INDEX: 7

SYMBOL_NAME: smb!WPP_SF__guid_+20

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: smb

IMAGE_NAME: smb.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 45b6cc3e

FAILURE_BUCKET_ID: 0xA_smb!WPP_SF__guid_+20

BUCKET_ID: 0xA_smb!WPP_SF__guid_+20

Followup: MachineOwner
---------


On the second processor I could only see the SMB (Server Message Block) driver handling a network address being removed. Happy that it was the network card, I left it there: I had unplugged the network card too fast.
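Reading a long STACK_TEXT by eye gets tedious, so here is a small sketch that pulls out the module names in the order they appear on the stack. It assumes each frame line ends in the module!function+0xoffset form shown in the output above; the function name and the sample text are only for illustration.

```python
import re

# A STACK_TEXT line ends with module!function+0xoffset, e.g. "ndis!ndisSetPower+0x5ef".
FRAME_RE = re.compile(r"(?P<module>\w+)!(?P<function>\w+)(?:\+0x[0-9a-fA-F]+)?\s*$")


def modules_in_stack(stack_text: str) -> list[str]:
    """Return module names in stack order (top frame first), without repeats."""
    seen = []
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group("module") not in seen:
            seen.append(match.group("module"))
    return seen


# First four frames from the processor 0 stack in this post.
SAMPLE = """\
a2e2da94 8202915c badb0d00 82132300 82090fe6 nt!KiTrap0E+0x2ac
a2e2db58 81e0ed7b 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1b5
a2e2db84 81eda107 00b520e8 a2e2dbf8 85b520e8 ndis!ndisAcquireMiniportPnPEventLock+0x60
a2e2dc20 81e2b231 85b520e8 00000000 00000000 ndis!ndisPnPNotifyAllTransports+0xa2
"""

if __name__ == "__main__":
    print(modules_in_stack(SAMPLE))
```

The first module that isn't nt is usually the one worth looking at first, which matches what the debugger reported for processor 0.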

However, the steps are the same for any crash dump analysis on Windows, and on servers remember to check all the processors on your system.

Now to recap:
1) Download Debugging Tools for Windows, plus the Symbols Pack if working offline, or set the symbol path to the online symbol server at http://msdl.microsoft.com/download/symbols

2) Open the dump file and run !analyze -v, or kb for shorter output.

3) Switch from processor 0 to processor 1 using ~1, and repeat for however many processors you have.

Note: modules starting with nt are part of the Windows kernel.
NDIS is the Windows library for network drivers.

Sadly, knowing what caused your blue screen doesn't always help you, as it might be something like a driver that hasn't been updated yet, so you're still left waiting. However, at least you know what you're waiting for.

I hope after reading this you'll fear the blue screen a little less and even see it as a challenge, not something to be scared of.

Saturday, 16 May 2009

Going Green

Most companies don't yet have any form of energy policy covering computers and their operation. A few companies have a basic policy of turning off workstations, but this is just a start, and most employees don't follow it closely.

So here is how you can begin to improve the energy rating of your network.

Consolidation of servers coupled with cloud computing is an effective way to reduce power consumption by reducing the number of physical devices, but this isn't all you can do.

So I'm going to save you some time and give you a few points where you can make changes to reduce the energy consumption of your network.

Consider replacing older hardware with more energy-efficient hardware, such as solid state drives for laptops. Where possible, fit workstations with solid state drives or change over to terminal-based sessions, as this negates the need for local drives and reduces memory requirements, thus saving energy. It also offers better security, as there is no data stored locally if the workstation is stolen.

Disable all but the most basic of screen savers, as these heavy graphical applications increase the load on the graphics card and CPU and boost energy consumption.

Allow inactive devices, laptops and workstations to sleep or hibernate by policy.

In server farms, enabling dynamic processor switching can also save a large amount of energy, as few of us use the CPU at 90% all of the time.

Consolidate switches and disable inactive ports for both power and security reasons.

If all these points are followed you could lower the total energy consumption by 30 to 40 percent.

Wednesday, 6 May 2009

Cisco logical interfaces

Cisco routers, just like the switches, support VLANs, and you can put many of them onto one physical interface. Here is how it can be done.

Remove the IP address from the physical interface and turn it on:

no ip address
no shutdown

Create a logical interface to be assigned to one of the VLANs

interface fastethernet 0/0.X

You can change 'fastethernet' to the interface type you have and '0/0' to the interface number that you are using.
X represents the logical interface number. Since this has no real value, I tend to use the number of the VLAN so that it's easier to follow.
For example, for the logical interface that you will use for VLAN 5, use 'int fastethernet 0/0.5'. This way, you will easily know which interface refers to which VLAN.

Assign the logical interface to a VLAN number

Use encapsulation XXX Y, where XXX is the encapsulation type you are using for the VLANs (e.g. isl, or dot1q, which is 802.1Q and the most commonly used) and Y is the VLAN number that this logical interface will be assigned to.

Example:
interface fastethernet0/0.5
encapsulation dot1q 5


Now you have the interface but still no IP address.
Assigning an IP address to the logical interface is easy; it's the same as assigning an IP address to a physical interface:

ip address 192.168.2.254 255.255.255.0

Now repeat the steps for each VLAN that you want. Below, as an example, I've created three, for VLANs 5, 10 and 15. Note that the encapsulation comes first, as the router expects the VLAN to be set on a logical interface before it will accept an IP address.

interface fastethernet0/0.5
encapsulation dot1q 5
ip address 192.168.5.254 255.255.255.0


interface fastethernet0/0.10
encapsulation dot1q 10
ip address 192.168.10.254 255.255.255.0


interface fastethernet0/0.15
encapsulation dot1q 15
ip address 192.168.15.254 255.255.255.0
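
If you have more than a handful of VLANs, typing these blocks by hand gets boring. Here is a minimal sketch that prints the same kind of block for each VLAN. The function name, the 192.168.<vlan>.254 addressing scheme and the default parent interface are just assumptions taken from the example above; adjust them to your own network. It emits the encapsulation line before the IP address, since the router generally expects the VLAN to be set on a logical interface first.

```python
def subinterface_config(vlan: int, parent: str = "fastethernet0/0",
                        ip_template: str = "192.168.{vlan}.254",
                        mask: str = "255.255.255.0") -> str:
    """Return the config lines for one 802.1Q logical interface, numbered after its VLAN."""
    return "\n".join([
        f"interface {parent}.{vlan}",
        f"encapsulation dot1q {vlan}",
        f"ip address {ip_template.format(vlan=vlan)} {mask}",
    ])


if __name__ == "__main__":
    # Print one block per VLAN, separated by a blank line.
    for vlan in (5, 10, 15):
        print(subinterface_config(vlan))
        print()
```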


Configure static or dynamic routing in the way you need it.
You treat the logical interfaces exactly the same way you treat physical interfaces when doing the routing, so really this isn't that hard.

If you would like some VLANs (i.e., networks) not to participate in the routing, you can either not include them in the routing protocol or not assign a logical interface to them.

Configure access-lists in the way you find appropriate to filter the traffic going from one VLAN to another, and apply them to the logical interfaces the same way you apply them to physical interfaces. You might not want the VLANs to see one another at all, or only in one direction, depending on what you need.

A common setup is that the management VLAN can see the others, but the others cannot see the management VLAN or one another, except for some needed services.

Some things not to leave out or forget about:
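
To give a feel for what such a filter looks like, here is a rough sketch that generates a numbered extended access-list stopping one VLAN's network from reaching another, plus the lines to apply it inbound on a logical interface. The function names, the ACL number 105 and the networks are hypothetical; the only real work is turning a prefix into the wildcard mask the access-list expects.

```python
import ipaddress


def block_vlan_acl(acl_number: int, source: str, destination: str) -> list[str]:
    """Build an extended ACL that denies IP from source to destination and permits the rest."""
    src = ipaddress.ip_network(source)
    dst = ipaddress.ip_network(destination)
    return [
        # Access-lists use wildcard masks (the inverse of the subnet mask).
        f"access-list {acl_number} deny ip {src.network_address} {src.hostmask} "
        f"{dst.network_address} {dst.hostmask}",
        f"access-list {acl_number} permit ip any any",
    ]


def apply_inbound(interface: str, acl_number: int) -> list[str]:
    """Return the lines that attach the ACL to traffic entering the given interface."""
    return [f"interface {interface}", f"ip access-group {acl_number} in"]


if __name__ == "__main__":
    # Hypothetical example: stop VLAN 5 from reaching VLAN 10.
    for line in block_vlan_acl(105, "192.168.5.0/24", "192.168.10.0/24"):
        print(line)
    for line in apply_inbound("fastethernet0/0.5", 105):
        print(line)
```

Keep in mind that a plain access-list like this is stateless, so it also drops replies from VLAN 5 to connections started in VLAN 10.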

If you plan to let routing updates pass through the router from one VLAN to another, you may need to turn off split horizon on the interface. Split horizon forbids an update that came in on one interface from going back out of the same interface. It is normally enabled by default on Ethernet interfaces, but you can check with show ip interface to be sure.

no ip split-horizon

Don't forget the access-lists. Without them there would not be much point in doing VLANs and inter-VLAN routing, because everyone would be able to communicate with everyone else.

Lastly, nearly all switches support trunks on FastEthernet ports, but not on the older 10 Mbps Ethernet ports.