Kwakkelflap: tools for the IT pro

Monday, April 14, 2008

Weather Station Crash

I was checking the backlinks to my website when I noticed this link.

I ran the software successfully for a month but within days of going on holiday the weather station software would inexplicably randomly crash every couple of hours but the next day it seemed to sort itself. This was short lived as within 12 hours of leaving to start my holiday it crashed and remained so until I returned. On my return I tried a number of things to improve the reliability including daily re-boots of the computer but it did not stop the intermittent crashes. I searched the internet and found a program called Watchdog - O - Matic from Kwakkelflap. This program monitors running programs looking for problems. I tried the trial version and found it successfully restarted the program after a crash. I purchased the full version and have been very pleased [with] it. I could now run my weather station reliably 24 hours a day.


I think it's a great example of someone using Watchdog - O - Matic.


Saturday, February 02, 2008

Vista Reliability Monitor crash information

I'm sure everyone has programs crashing on their computer. Our Watchdog – O – Matic application can help out. But how can you check which programs crashed, and when?

Windows Vista has a program called 'Reliability and Performance Monitor' which tracks the crashes on your system. You can track how stable your computer is, based on the number of crashes. Its graph includes information on various system failures.



So all you need to do is check the Reliability Monitor to discover which programs are crashing and create a new watchdog for each crashing program. With Watchdog – O – Matic these programs won't crash anymore so you won't see these items in the Reliability Monitor. This proves that your system will be more reliable and stable.


Tuesday, January 22, 2008

Watchdog - O - Matic demo

I just finished creating a demo for Watchdog - O - Matic. It's very small and only shows the basic functionality. But it should give the user an idea of what to expect from the program.

You can see the demo here.

I will be creating more demos in the near future. I think they are an excellent tool to demonstrate the application.


Monday, October 01, 2007

Debug message

It's only natural that I run Watchdog - O - Matic all the time and watch as many different programs as possible. I use my own program, and it helps me to stress test the application.

One of the programs I'm checking is Windows Live Messenger 8.1. I noticed that Messenger would show some strange debug messages from time to time: “ThumbPosition got called! That never happens.” Apparently it does happen. It happens every time I use the scroll bar to scroll my contact list.



It's funny to see some of the debug messages of the programs you're watching.


Monday, September 24, 2007

Weird crash problem

The new Watchdog – O – Matic version has been released for a while now. I noticed I had a strange crash problem in the new version when closing the program. It only happened on Vista 64 bit. No problems on XP 32 and 64 bit, and no problems on Vista 32 bit.

So I fired up ye olde debugger to locate the problem. The crash seemed very erratic. It crashed at a certain point for no obvious reason, and when I disabled that code, it crashed somewhere else. The Microsoft debugger couldn't help me find the problem. Then I stumbled on an innocent looking line of code at the beginning of the program:

m_pszRegistryKey = "SOFTWARE\\Kwakkelflap\\Watchdog";


This is a string that's part of the CWinApp class that you have to initialize in your application if you want to read something from the registry. It seems that in Vista x64 there is a problem freeing the memory allocated by CWinApp when you close the program causing the crash. So I simply disabled this and everything works like a charm.

Things like this prove once more that a program like Watchdog – O – Matic is very useful for everyone. I bet there are a lot of crashes and unexpected system behavior out there caused by the shift to Vista (and 64 bit operating systems).


Thursday, September 13, 2007

Small progress update

Things are looking good for the new Watchdog - O - Matic release. I've received 1 bug report. Most users shouldn't encounter it and I'll be releasing a fix later this week (probably over the weekend). The 64 bit updates of Service - O - Matic and Sniff - O - Matic are moving along nicely. I'll probably be able to release Sniff - O - Matic 1.07 in a few days. Service - O - Matic will have to wait because I also need to implement some new features.

One of the disadvantages of the 64 bit versions is that the setup program is more than twice as big. The Watchdog - O - Matic trial went from 2 MB to 4.3 MB. That's still very small compared to other downloads if you ask me (the download takes less than a minute with a decent connection). But my server bandwidth usage is through the roof. I've contacted my web host and they assured me that it won't be a problem. Fingers crossed.


Sunday, September 02, 2007

Watchdog - O - Matic 5.01 released

It took a while, but I'm finally ready with the new Watchdog – O – Matic release. I did a lot of testing on several operating systems, because the number of changes I made in this version is huge.

I switched to a new compiler and created 64bit versions of the application. This required more work and changes than I thought, so thorough testing was needed. I also moved some basic features from the professional to the standard version making the standard version a more complete product. It didn't make sense to leave these features out.

All of this has some influence on the standard version. The biggest change is that I no longer support Windows 95, 98 and Me. Supporting these old Windows versions was keeping some cool features out of the standard version. And our web visitor logs show that the number of people still using these versions is very low.

What's up next? Creating 64bit versions for Sniff – O – Matic and Service – O – Matic. Wish me luck.


Thursday, August 09, 2007

x64 Conversion

I'm still working on 64bit versions of Watchdog - O - Matic. Right now, I have a beta version that's working on my Vista 64bit and XP 64bit. But there are still some problems on other test machines that need to be corrected.

All in all, creating a 64bit version isn't that hard, except if you have to do some low level stuff. One of the things I had problems with was the system to detect the command line parameters of a running program. You see, when the watchdog checks a running program, it needs to know the parameters. Otherwise the program will not restart with the same parameters if it crashes. Windows has a function GetCommandLine() which returns the command line of the current process. So in order to know the parameters of another process, I need to write into the target process memory, create a remote thread that executes the GetCommandLine() function, wait for the target process to execute the remote thread, and read the target process memory to handle the result. You can imagine that creating a 64bit function and running it from a 32bit application is lots of fun.

One of the biggest challenges however is creating a 64bit disassembler so we can mail disassembly info of the crash. This is a huge task without much gain, so I think that the first 64bit versions won't have this option on board. The primary concern is creating a 64bit version so people with a 64bit operating system can actually use the watchdog.

If anyone is willing to test the beta version, drop me a line.


Friday, July 20, 2007

x64 Versions and setup

I'm porting our Watchdog application to VC++ 2005. It's a lot of work, and the end result should make no difference for the user. But it's a step I'd have to take eventually.

One advantage of this port is that I'll be able to create 64bit versions of my applications. Some people think it's too early to worry about 64bit, but I don't believe it is. Every new PC sold nowadays is capable of 64bit. And I see a rise in people using a 64bit operating system. Also, it doesn't hurt to be prepared. It's something every micro-ISV will have to do eventually, so why not do this early on?

One thing I don't want is a separate download for the 32 and 64 bit versions. The installer should check the operating system and install the 32bit or 64bit version. Maybe I'll add an option to install the 32bit version on a 64bit system for compatibility. So I checked Inno Setup and apparently they have everything I need (again). Simply set that x86 and x64 are allowed and your setup will automatically detect a 32 or 64bit install. Then, separate the 32bit and 64bit files and add a check with the IsWin64 function to install the correct file. That's all. The setup itself stays 32bit, but that shouldn't be a problem.

It will take some time to create the 64bit versions, and I'll need to test the new system on 32bit and 64bit operating systems. But you can expect 64bit versions of the Kwakkelflap programs in the near future.


Tuesday, July 17, 2007

Try, catch or miss?

I'm porting my applications from VC++ 6.0 to VC++ 2K5. Vista and VC++ 6.0 are not the best of friends. I just finished porting Fping, where I had some linking problems because I overloaded the printf() function. In the end, I simply renamed the function and each call to printf(). Next on my list was a simple MFC GUI application: the crash test application that comes with Watchdog – O – Matic.

The intent of this program is to generate first chance and second chance exceptions. To generate a first chance exception I have this code:

CDialog* pDialog = NULL;
try
{
    pDialog->Detach();
}
catch(CException* e)
{
    e->Delete();
}
catch(...)
{
}


Everything works fine when compiled with VC++ 6.0. The exception is handled. But when I compile this with VC++ 2K5, the exception isn't handled, and I have a second chance exception (a crash). Vista displays a dialog box, and Watchdog – O – Matic detects a crash instead of a first chance exception. I'm still looking for a solution. Hope there aren't many of these problems when converting...


Tuesday, May 08, 2007

Finding crash information using the MAP file

This is an article previously published on The Code Project. It might be a little outdated, but it's still a good read.

Introduction

Programming neat applications is one thing. But when a user informs you your software has crashed, you know it's best to fix this before adding other features. If you're lucky enough, the user will have a crash address. This will go a long way in solving the problem. But how can you determine what went wrong, using this crash address?

Creating a MAP file

Well, first of all, you'll need a MAP file. If you don't have one, it will be nearly impossible to find where your application crashed using the crash address. So first, I'll show you how to create a good MAP file. For this, I will create a new project (MAPFILE). You can do the same, or adjust your own project. I create a new project using the Win32 Application option in VC++ 6.0, selecting the 'typical "Hello World!" application' to keep the size of the MAP file reasonable for explanation.

Once created we need to adjust the project settings for the release version. In the C/C++ tab, select "Line Numbers Only" for Debug Info.



Many people forget this, but you'll need this option if you want a good MAP file. This will not affect your release in any way. Next is the Link tab. Here you need to select the "Generate mapfile" option. Also, type the switches /MAPINFO:LINES and /MAPINFO:EXPORTS in the Project Options edit box.



Now, you're ready to compile and link your project. After linking, you will find a .map file in your intermediate directory (together with your exe).

Reading the MAP file

After all this dull work, now comes the neat part: how to read the MAP file. We'll do this by using a crash example. So first: how to crash your application. I did this by adding these two lines at the end of the InitInstance() function:

char* pEmpty = NULL;
*pEmpty = 'x'; // This is line 119


I'm sure you can find other instructions which will crash your application. Now recompile and link. If you start the application, it will crash and you'll get a message like this: 'The instruction at "0x004011a1" referenced memory at "0x00000000". The memory could not be "Written".'

Now, it's time to open the MAP file with Notepad or something similar. Your MAP file will look like this:

The top of the MAP file contains the module name, the timestamp indicating the link of the project, and the preferred load address (which will probably be 0x00400000 unless you're using a dll). After the header comes the section information that shows which sections the linker brought in from the various OBJ and LIB files.

MAPFILE

Timestamp is 3df6394d (Tue Dec 10 19:58:21 2002)

Preferred load address is 00400000

Start Length Name Class
0001:00000000 000038feH .text CODE
0002:00000000 000000f4H .idata$5 DATA
0002:000000f8 00000394H .rdata DATA
0002:0000048c 00000028H .idata$2 DATA
0002:000004b4 00000014H .idata$3 DATA
0002:000004c8 000000f4H .idata$4 DATA
0002:000005bc 0000040aH .idata$6 DATA
0002:000009c6 00000000H .edata DATA
0003:00000000 00000004H .CRT$XCA DATA
0003:00000004 00000004H .CRT$XCZ DATA
0003:00000008 00000004H .CRT$XIA DATA
0003:0000000c 00000004H .CRT$XIC DATA
0003:00000010 00000004H .CRT$XIZ DATA
0003:00000014 00000004H .CRT$XPA DATA
0003:00000018 00000004H .CRT$XPZ DATA
0003:0000001c 00000004H .CRT$XTA DATA
0003:00000020 00000004H .CRT$XTZ DATA
0003:00000030 00002490H .data DATA
0003:000024c0 000005fcH .bss DATA
0004:00000000 00000250H .rsrc$01 DATA
0004:00000250 00000720H .rsrc$02 DATA


After the section information, you get the public function information. Notice the "public" part. If you have static-declared C functions, they won't show up in the MAP file. Fortunately, the line numbers will still reflect the static functions. The important parts of the public function information are the function names and the information in the Rva+Base column, which is the starting address of the function.

Address Publics by Value Rva+Base Lib:Object

0001:00000000 _WinMain@16 00401000 f MAPFILE.obj
0001:000000c0 ?MyRegisterClass@@YAGPAUHINSTANCE__@@@Z 004010c0 f MAPFILE.obj
0001:00000150 ?InitInstance@@YAHPAUHINSTANCE__@@H@Z 00401150 f MAPFILE.obj
0001:000001b0 ?WndProc@@YGJPAUHWND__@@IIJ@Z 004011b0 f MAPFILE.obj
0001:00000310 ?About@@YGJPAUHWND__@@IIJ@Z 00401310 f MAPFILE.obj
0001:00000350 _WinMainCRTStartup 00401350 f LIBC:wincrt0.obj
0001:00000446 __amsg_exit 00401446 f LIBC:wincrt0.obj
0001:0000048f __cinit 0040148f f LIBC:crt0dat.obj
0001:000004bc _exit 004014bc f LIBC:crt0dat.obj
0001:000004cd __exit 004014cd f LIBC:crt0dat.obj
0001:00000591 __XcptFilter 00401591 f LIBC:winxfltr.obj
0001:00000715 __wincmdln 00401715 f LIBC:wincmdln.obj
//SNIPPED FOR BETTER READING
0003:00002ab4 __FPinit 00408ab4
0003:00002ab8 __acmdln 00408ab8

entry point at 0001:00000350

Static symbols

0001:000035d0 LeadUp1 004045d0 f LIBC:memmove.obj
0001:000035fc LeadUp2 004045fc f LIBC:memmove.obj
//SNIPPED FOR BETTER READING
0001:00000577 __initterm 00401577 f LIBC:crt0dat.obj
0001:0000046b _fast_error_exit 0040146b f LIBC:wincrt0.obj
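
If you'd rather process the public function entries with a small tool than by eye, a minimal C++ sketch could look like this. It assumes every entry has the layout shown above (section:offset, name, Rva+Base). MapSymbol and ParseSymbolLine() are names made up for this illustration and are not part of any Microsoft tool.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical record for one public function entry of the MAP file.
struct MapSymbol {
    std::uint32_t rvaPlusBase;  // value of the Rva+Base column
    std::string name;           // (decorated) function name
};

// Parses a line such as
//   0001:00000150 ?InitInstance@@YAHPAUHINSTANCE__@@H@Z 00401150 f MAPFILE.obj
// Returns false for headers, section lines and anything else that doesn't match.
bool ParseSymbolLine(const std::string& line, MapSymbol& out) {
    std::istringstream in(line);
    std::string sectionOffset, name, address;
    if (!(in >> sectionOffset >> name >> address)) return false;

    // The first column must look like "0001:00000150".
    if (sectionOffset.size() != 13 || sectionOffset[4] != ':') return false;

    // The third column must be a complete hexadecimal number.
    char* end = nullptr;
    unsigned long value = std::strtoul(address.c_str(), &end, 16);
    if (end == address.c_str() || *end != '\0') return false;

    out.rvaPlusBase = static_cast<std::uint32_t>(value);
    out.name = name;
    return true;
}
```

Feeding every line of the MAP file through ParseSymbolLine() and keeping the ones that return true gives you a list of functions with their start addresses.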


The public function part is followed by the line information (you got this if you used the /MAPINFO:LINES in the Link tab and selected the "Line numbers" in the C/C++ tab). After this, you will get the export information if your project contains exported functions and you included /MAPINFO:EXPORTS in the link tab.

Line numbers for .\Release\MAPFILE.obj(F:\MAPFILE\MAPFILE.cpp) segment .text

24 0001:00000000 30 0001:00000004 31 0001:0000001b 32 0001:00000027
35 0001:0000002d 53 0001:00000041 40 0001:00000047 43 0001:00000050
45 0001:00000077 47 0001:00000088 48 0001:0000008f 52 0001:000000ad
53 0001:000000b3 71 0001:000000c0 80 0001:000000c3 81 0001:000000c8
82 0001:000000ff 86 0001:00000114 88 0001:00000135 89 0001:00000145
102 0001:00000150 108 0001:00000155 110 0001:00000188 122 0001:0000018d
115 0001:0000018e 116 0001:0000019a 119 0001:000001a1 121 0001:000001a8
122 0001:000001ae 135 0001:000001b0 143 0001:000001cc 172 0001:000001ee
175 0001:0000020d 149 0001:00000216 157 0001:0000022c 175 0001:00000248
154 0001:00000251 174 0001:0000025f 175 0001:00000261 151 0001:0000026a
174 0001:00000287 175 0001:00000289 161 0001:00000294 164 0001:000002a8
165 0001:000002b6 166 0001:000002d8 174 0001:000002e7 175 0001:000002e9
169 0001:000002f2 174 0001:000002fa 175 0001:000002fc 179 0001:00000310
186 0001:0000031e 193 0001:0000032e 194 0001:00000330 188 0001:00000333
183 0001:00000344 194 0001:00000349


Now we will look up where the crash occurred. First, we'll determine which function contains the crash address. Look in the "Rva+Base" column and search for the first function with an address bigger than the crash address. The preceding entry in the MAP file is the function that had the crash. In our example our crash address is 0x004011a1. This is between 0x00401150 and 0x004011b0, so we know the crash function is ?InitInstance@@YAHPAUHINSTANCE__@@H@Z. Any function name that starts with a question mark is a C++ decorated name. To translate the name, pass it as a command-line parameter to the Platform SDK program UNDNAME.EXE (in the bin dir). You won't need to do this most of the time, as you can usually figure it out just by looking at it (here: InitInstance() in MAPFILE.obj).
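
The function lookup can also be sketched in a few lines of C++. This is only an illustration: MapSymbol, FindFunction() and ExampleSymbols() are hypothetical names, and the table holds just the first five entries of the example MAP file, sorted by their Rva+Base value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one public function entry of the MAP file.
struct MapSymbol {
    std::uint32_t rvaPlusBase;  // value of the Rva+Base column
    std::string name;           // (decorated) function name
};

// Returns the last symbol that starts at or before crashAddress,
// or nullptr when the address lies before the first symbol.
// The symbols must be sorted by rvaPlusBase.
const MapSymbol* FindFunction(const std::vector<MapSymbol>& symbols,
                              std::uint32_t crashAddress) {
    assert(std::is_sorted(symbols.begin(), symbols.end(),
        [](const MapSymbol& a, const MapSymbol& b) {
            return a.rvaPlusBase < b.rvaPlusBase;
        }));

    // First entry with an address bigger than the crash address...
    auto next = std::upper_bound(symbols.begin(), symbols.end(), crashAddress,
        [](std::uint32_t address, const MapSymbol& s) {
            return address < s.rvaPlusBase;
        });
    if (next == symbols.begin()) return nullptr;

    // ...and the preceding entry is the function that crashed.
    return &*(next - 1);
}

// The first public functions of the example MAP file.
std::vector<MapSymbol> ExampleSymbols() {
    return {
        {0x00401000, "_WinMain@16"},
        {0x004010c0, "?MyRegisterClass@@YAGPAUHINSTANCE__@@@Z"},
        {0x00401150, "?InitInstance@@YAHPAUHINSTANCE__@@H@Z"},
        {0x004011b0, "?WndProc@@YGJPAUHWND__@@IIJ@Z"},
        {0x00401310, "?About@@YGJPAUHWND__@@IIJ@Z"},
    };
}
```

Calling FindFunction(ExampleSymbols(), 0x004011a1) returns the InitInstance entry, which matches the manual lookup.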

This is a big step for bug tracking. But it gets even better: we can find out on which line the crash occurred! We need to do some basic hexadecimal mathematics, so people who can't do this without a calculator: now is the time to use it. The first step is the following calculation: crash_address - preferred_load_address - 0x1000
The addresses in the line information are offsets from the beginning of the first code section, so we need to do this calculation. Subtracting the preferred load address is logical, but why do we need to subtract another 0x1000? After subtracting the load address we have an offset from the beginning of the binary, but the first part of the binary isn't the code section! The first part of the binary is the Portable Executable (PE) header, which takes up 0x1000 bytes here. Mystery solved. In our example, this is: 0x004011a1 - 0x00400000 - 0x1000 = 0x1a1
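
For those who prefer code over a calculator, the same subtraction as a small C++ sketch. CodeSectionOffset() and kHeaderSize are hypothetical names, and the 0x1000 header size is the value from this example, so treat it as an assumption for other binaries.

```cpp
#include <cassert>
#include <cstdint>

// Size of the part of the binary in front of the first code section.
// 0x1000 is the value from the example; it is an assumption elsewhere.
constexpr std::uint32_t kHeaderSize = 0x1000;

// Converts a crash address into an offset from the start of the code section.
std::uint32_t CodeSectionOffset(std::uint32_t crashAddress,
                                std::uint32_t preferredLoadAddress,
                                std::uint32_t headerSize = kHeaderSize) {
    // An address below the code section can't be looked up in the line info.
    assert(crashAddress >= preferredLoadAddress + headerSize);
    return crashAddress - preferredLoadAddress - headerSize;
}
```

CodeSectionOffset(0x004011a1, 0x00400000) gives 0x1a1, the same value as the manual calculation.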

Now it's time to look in the line information section of the MAP file. The lines are shown like this: 30 0001:00000004. The first number is the line number, the second number is the offset from the beginning of the code section in which this line occurred. If we want to look for our line number, we just have to do the same thing we did for the function: determine the first occurrence of a bigger offset than the one we just calculated. The crash occurred in the preceding entry. In our example: 0x1a1 is before 0x1a8. So our crash occurred on line 119 in MAPFILE.CPP.
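
The line lookup works the same way as the function lookup, so it can be sketched like this. LineEntry, FindLine() and InitInstanceLines() are hypothetical names. The table contains only the entries for InitInstance() from the listing above, in the order the linker wrote them (sorted by offset, not by line number).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one "line offset" pair of the line information.
struct LineEntry {
    unsigned line;         // source line number
    std::uint32_t offset;  // offset from the start of the code section
};

// Returns the source line that contains the given offset, or 0 when the
// offset lies before the first entry. The entries must be sorted by offset.
unsigned FindLine(const std::vector<LineEntry>& entries, std::uint32_t offset) {
    assert(std::is_sorted(entries.begin(), entries.end(),
        [](const LineEntry& a, const LineEntry& b) {
            return a.offset < b.offset;
        }));

    // First entry with a bigger offset than the one we calculated...
    auto next = std::upper_bound(entries.begin(), entries.end(), offset,
        [](std::uint32_t value, const LineEntry& e) {
            return value < e.offset;
        });
    if (next == entries.begin()) return 0;

    // ...and the crash occurred in the preceding entry.
    return (next - 1)->line;
}

// The line information of InitInstance() from the example MAP file.
std::vector<LineEntry> InitInstanceLines() {
    return {
        {102, 0x150}, {108, 0x155}, {110, 0x188}, {122, 0x18d},
        {115, 0x18e}, {116, 0x19a}, {119, 0x1a1}, {121, 0x1a8},
        {122, 0x1ae},
    };
}
```

FindLine(InitInstanceLines(), 0x1a1) returns 119, the line with the NULL pointer write.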

Keeping track of MAP files

Each release has its own MAP file. It's not a bad idea to include the MAP file with the exe distribution. This way, you can be certain you have the correct MAP file for this exe. You could keep every MAP file with every exe on your own system, but we all know this might cause trouble later on. The MAP file doesn't contain any information you wouldn't want the user to see (except maybe class and function names). A user would have no use for it, but at least you can ask for the MAP file if you don't have a copy yourself.


Wednesday, April 04, 2007

Comparing to the competition

I was checking the most popular download sites to see if they updated our Watchdog - O - Matic program when I saw I had a new competitor. They've been selling a shareware program that tries to do the same thing as Watchdog - O - Matic since January. So I was wondering how their program performed. It's only natural that competitors check each other’s program and compare theirs with it. What's not so frequent is that someone posts test results and their thoughts about it online. Yes, I know this might help my competitors fix their problems. I'm such a helpful guy. I will try to give you an honest opinion, but I might be biased sometimes.

It's easy to see that their program is made with mine as an example. Several features in the application are named the same way. And even parts of the manual and website are copied and changed a little.

E.g. I had an option before 5.00 (removed because it's obsolete in the new version):
"Don't start if already running but watch the active process"
Their option is:
"Don't start another copy if already running"

They do have a nice looking manual though. This is a part where I still need some improvements. Personally, I think it's not necessary to spend my time writing huge manuals IF you create a GUI that's easy to use. This is why I have been adding tooltips so the user doesn't have to refer to the manual all the time.

As far as I can tell, the new program doesn't actually 'debug' the target, but is watching the CPU and Memory activity. This is similar to other competitors, and I'm expecting the same problems with regards to detecting crashes. I configured the program to check my own crash application, and these are the results.

Crash in the main thread: a crash dialog appears requiring a user action, and a 'hanging application' is detected. The crash application is frozen by Windows, which is why the program detects a not responding application. The program tries to close our application, but is unable to do that due to the dialog (even with the terminate option enabled). So our crash test program just sits there. It crashed, and it stays dead while the program is waiting to restart the crashed application.



They acknowledge the problem in their forum, and suggest that the user should disable the crash dialogs (although this doesn't work with the steps they suggest). Users shouldn't change settings like these if it can be avoided. Certainly if it is a setting outside of your own application (e.g. Windows).

A crash in a different thread is not detected. Apparently the program only checks if the main thread is running. Unfortunately, this isn't enough. You need to dig a lot deeper if you want to detect problems in multithreaded applications. Otherwise, you won't even notice that there is a problem and the target application isn't running like it should.

The 'not responding check' is working. The program checks the CPU usage and the message pump of the main thread, which is sufficient. This is exactly how we check for hanging applications.

CPU usage: when an application is using 100% on a dual core, then the program measures a 50% usage so no problem is detected. I haven't tried the detection on a single core CPU, but I suspect that this will work. I do wonder what will happen if your application is using 100% for a limited time, which might be normal behaviour. It's probably an 'all or nothing' option that can't be tuned to detect normal from abnormal CPU usage.
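
The dual core case comes down to simple arithmetic: a single thread that saturates one of N cores shows up as only 100/N percent when usage is measured across the whole system. A minimal sketch of a check that compensates for this is shown below. UsesFullCore() and its 95% default threshold are hypothetical names and values chosen for this illustration; this is not how either program is known to work.

```cpp
#include <cassert>

// True when the process keeps at least one core's worth of CPU time busy.
// systemWidePercent is the usage measured across all cores (0 to 100).
bool UsesFullCore(double systemWidePercent, unsigned coreCount,
                  double thresholdPercent = 95.0) {
    assert(coreCount > 0);
    // Scale the system-wide figure back to "percent of a single core".
    return systemWidePercent * coreCount >= thresholdPercent;
}
```

With this check, 50% on a dual core counts as a full core, while 50% on a single core does not.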

Memory usage works, although the program seems to check the virtual memory size instead of real memory usage.

While using the program, I discovered some strange things in the user interface. E.g. viewing your application statistics displays the following screen, which is probably a debug window?



It's not all bad. They do have an interesting feature that's been on my todo list for a while. This will be an incentive to implement new features and clean up that todo list of mine.


Tuesday, April 03, 2007

Watchdog - O - Matic 5.00 released

The new Watchdog - O - Matic release is ready. A lot has changed in the interface. It should be easier to create new software watchdogs and use the program in general. The previous version had grown completely out of proportion. I noticed many users didn't understand the system. And I have to admit, taking a step back looking at it that it was indeed confusing.

I moved some features around as well. The regular version can now watch programs automatically, and I've added a cool scheduling feature in the professional version. I also added more tooltips for different options.

I noticed the improvements when editing the manual. I was able to remove all the 'notes' and 'tips' scattered throughout the manual. These were items where a user had to pay attention, or something might have gone wrong. So it's a good thing I don't need those anymore.

The downside? The system isn't directly compatible with the older versions. Therefore, we disabled the auto-update to 5.0. If you have an older version, and you want to update to 5.0, drop me a mail at support@kwakkelflap.com with your purchase information. I will send you a new download link as soon as possible.


Friday, March 23, 2007

Spring Cleaning

Right now, I'm working on a new version of Watchdog - O - Matic. While I was implementing a new feature, I saw some things I had wanted to change for a long time but never got around to doing. In particular, the UI and the way users create watchdogs were not clear and could be a lot easier. So I'm rewriting a lot of code, implementing almost everything in a different way. At almost 40000 lines of code right now, this is nothing to scoff at. Luckily the core debugging code can remain unchanged. The major overhaul is nearly complete, but I still need to implement the new features. I'm eager to release this version though, as it is a huge improvement for the user. The deadline is a month from now (this includes testing), and due to the huge improvements, we'll move to version 5.0.
