• About Me

    • Patrick Burma


I recently wrote a script to use with TestTrack. The program uses the C# WebRequest class to post commands to the TestTrack CGI. The purpose is to set up a way to automatically publish certain TestTrack reports to a directory or web site.
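As a rough sketch of what posting a command with WebRequest can look like: the URL, the form field names, and the output file name below are hypothetical placeholders, not the real TestTrack CGI interface, so treat this as an outline rather than the actual script.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

class CgiPostSketch
{
    // Posts a form-encoded body to a URL and returns the response body as a string.
    static string Post(string url, string formBody)
    {
        byte[] data = Encoding.UTF8.GetBytes(formBody);

        WebRequest request = WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = data.Length;

        // Write the form fields into the request body.
        using (Stream body = request.GetRequestStream())
        {
            body.Write(data, 0, data.Length);
        }

        // Read the whole HTML response into one string.
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical URL and field names, for illustration only.
        string html = Post("http://testtrack.example.com/cgi-bin/placeholder.cgi",
                           "command=report&id=1");

        // Hypothetical output file; a real script would write to the publish directory.
        File.WriteAllText("report.html", html);
    }
}
```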

An interesting thing I discovered while writing this program is that there does not seem to be a very good way to parse HTML. I asked one of the experienced developers here, who suggested trying to treat the HTML response like XML. That didn’t work, since HTML is not structured as strictly as XML; it’s a little too loosey-goosey.
I did some searching around the web and found a guy who had written his own HTML parsing classes. That looked interesting, but seemed way too complex for what I wanted to do. In the end, it looked to me like creating your own classes really is the best solution if you need to parse HTML in a very structured and proper way.
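To illustrate why the treat-it-like-XML suggestion falls over, here is a small sketch using XmlDocument. The two markup strings are made-up samples, not real TestTrack output: the first leaves its br and p tags unclosed, the way ordinary HTML often does, and the second is the same content written as well-formed XHTML.

```csharp
using System;
using System.Xml;

class HtmlAsXmlSketch
{
    // Returns true when the markup is well-formed enough for the XML parser.
    static bool TryLoadAsXml(string markup)
    {
        try
        {
            new XmlDocument().LoadXml(markup);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Made-up samples: typical loose HTML versus well-formed XHTML.
        string looseHtml = "<html><body><p>Open defects<br><b>12</b></body></html>";
        string strictXhtml = "<html><body><p>Open defects<br /><b>12</b></p></body></html>";

        Console.WriteLine(TryLoadAsXml(looseHtml));   // False
        Console.WriteLine(TryLoadAsXml(strictXhtml)); // True
    }
}
```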

I ended up converting the HTTP response into a string, then tearing the string apart until I had what I wanted. A trick I learned was to use the Replace() function to swap a semi-unique character sequence in the response for a “^”, which then makes it easy to Split() the string into an array. I can then clean up the array strings easily. I used a lot of IndexOf() and Remove() string functions as well. Starting with the HTML from an entire page, I would remove everything before what I wanted and everything after what I wanted, then clean up what’s in between or Split() it into an array if needed.
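A minimal sketch of that trim-then-split routine is below. The sample HTML, the choice of the closing td tag as the sequence to replace, and the helper name Between are all assumptions made for illustration; the real report pages and delimiters will differ.

```csharp
using System;

class ReportScraperSketch
{
    // Returns the text between startMarker and endMarker, or "" if either is missing.
    static string Between(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return "";

        // Remove everything up to and including the start marker.
        string tail = html.Remove(0, start + startMarker.Length);

        int end = tail.IndexOf(endMarker, StringComparison.Ordinal);
        if (end < 0) return "";

        // Remove everything from the end marker onward.
        return tail.Remove(end);
    }

    static void Main()
    {
        // Made-up sample page standing in for a real response.
        string html = "<html><body><h1>Report</h1><table>" +
                      "<tr><td>Open</td><td>12</td></tr>" +
                      "</table></body></html>";

        string row = Between(html, "<tr>", "</tr>");

        // Swap a semi-unique sequence for "^", then split on it.
        string[] cells = row.Replace("</td>", "^").Split('^');

        // Clean up each piece and skip the empty trailing entry.
        foreach (string cell in cells)
        {
            string text = cell.Replace("<td>", "").Trim();
            if (text.Length > 0) Console.WriteLine(text);
        }
    }
}
```

With this sample input the loop prints Open and then 12. The “^” only works as a separator if it never appears in the page itself, so it is worth checking for that before relying on it.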

That approach works, but it is very messy/ugly/kludgey. I can see how XHTML might make this whole act easier. My first attempt was to use Regex, and after I developed a migraine I decided to just start ripping apart the strings. This is probably a more lowbrow approach, but it is more appealing to my senses than regex.

You can see the TestTrack Report Publishing script here.

TestTrack_Automated_Report_Publishing
