Wednesday, May 27, 2009

c++: regular expression tester with boost

#include <iostream>
#include <string>
#include <boost/regex.hpp>  // point this to your Boost.Regex lib

using namespace std;

int main( ) 
{
    std::string s, sre;
    boost::regex re;

    while(true)
    {
        cout << "Expression: ";
        // Stop on "quit" or when input ends (EOF), so the loop cannot spin forever
        if (!(cin >> sre) || sre == "quit")
        {
            break;
        }
        cout << "String:     ";
        cin >> s;

        try
        {
            // Set up the regular expression for case-insensitivity
            re.assign(sre, boost::regex_constants::icase);
        }
        catch (boost::regex_error& e)
        {
            cout << sre << " is not a valid regular expression: \""
            << e.what() << "\"" << endl;
            continue;
        }
        if (boost::regex_match(s, re))
        {
            cout << re << " matches " << s << endl;
        }
    }
}
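
Note that regex_match only succeeds when the whole string matches the expression; regex_search is the call that finds a match anywhere inside it. Here is a minimal, non-interactive sketch of the difference. It uses std::regex from C++11 instead of Boost.Regex (an assumption made so it builds without Boost; the two calls have the same names in both libraries), and the helper names matchesWhole and matchesAnywhere are made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true only if the WHOLE of `text` matches `pattern`,
// ignoring case. This mirrors the regex_match call in the tester above.
bool matchesWhole(const std::string& pattern, const std::string& text)
{
    std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(text, re);
}

// Hypothetical helper: true if ANY part of `text` matches `pattern`,
// ignoring case.
bool matchesAnywhere(const std::string& pattern, const std::string& text)
{
    std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_search(text, re);
}
```

So matchesWhole("abc", "xxABCxx") is false while matchesAnywhere("abc", "xxABCxx") is true. If the tester seems to reject an expression that "obviously" matches, this is usually why.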

apache: worker vs prefork models

Apache's Prefork Model

This Multi-Processing Module (MPM) implements a non-threaded, pre-forking web server that handles requests in a manner similar to Apache 1.3. It is appropriate for sites that need to avoid threading for compatibility with non-thread-safe libraries. It is also the best MPM for isolating each request, so that a problem with a single request will not affect any other.
[http://httpd.apache.org/docs/2.0/mod/prefork.html]



Apache's Worker Model

This Multi-Processing Module (MPM) implements a hybrid multi-process multi-threaded server. By using threads to serve requests, it is able to serve a large number of requests with less system resources than a process-based server. Yet it retains much of the stability of a process-based server by keeping multiple processes available, each with many threads.

The most important directives used to control this MPM are ThreadsPerChild, which controls the number of threads deployed by each child process, and MaxClients, which controls the maximum total number of threads that may be launched.
[http://httpd.apache.org/docs/2.0/mod/worker.html]
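
The same worker documentation describes the maximum number of active child processes as MaxClients divided by ThreadsPerChild. A tiny sketch of that arithmetic follows; the function name childProcesses and the sample values are made up for illustration, and it assumes MaxClients is a whole multiple of ThreadsPerChild.

```cpp
// Hypothetical helper: number of worker child processes implied by the two
// directives, computed as MaxClients / ThreadsPerChild.
// Assumes maxClients is a whole multiple of threadsPerChild.
unsigned childProcesses(unsigned maxClients, unsigned threadsPerChild)
{
    if (threadsPerChild == 0)
    {
        return 0;  // guard added for the sketch only: avoid dividing by zero
    }
    return maxClients / threadsPerChild;
}
```

For example, childProcesses(150, 25) gives 6: six processes with 25 threads each cover 150 simultaneous clients.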




For example, sites that need a great deal of scalability can choose to use a threaded MPM like worker, while sites requiring stability or compatibility with older software can use prefork.
[http://httpd.apache.org/docs/2.0/mpm.html]


I compiled two builds of Apache 2.2.4 on Solaris 10 (06/06, on a crappy U10, but...): one using the prefork MPM (configured --with-mpm=prefork) and the other using the worker MPM (configured --with-mpm=worker). Prefork is supposed to generally be better for single or dual CPU systems, and worker is supposed to be generally better for multi-CPU systems.

So for this setup, the worker MPM was almost twice as fast as the prefork.
I'm going to run these same tests on a multi-cpu server and see what the results look like.
[http://www.camelrichard.org/apache-prefork-vs-worker]




On most Unixes, the worker MPM results in considerable performance enhancements over the prefork MPM, and it results in much greater scalability, since threads are a lot cheaper (less memory and CPU to create and run) than forked processes.
[http://www.onlamp.com/pub/a/apache/2004/06/17/apacheckbk.html]

Tuesday, May 26, 2009

c++: a simple boost::regex example

void piMozyAuthUnitTest::testBoostRegex1()
{
    std::string s, sre;
    boost::regex re;
    boost::cmatch matches;

    while(true)
    {
        cout << "Expression: ";
        // Stop on "quit" or when input ends (EOF), so the loop cannot spin forever
        if (!(cin >> sre) || sre == "quit")
            break;

        cout << "String:     ";
        cin >> s;

        try
        {
            // Assignment and construction initialize the FSM used
            // for regexp parsing
            re = sre;
        }
        catch (boost::regex_error& e)
        {
            cout << sre << " is not a valid regular expression: \""
            << e.what() << "\"" << endl;
            continue;
        }
        // if (boost::regex_match(s.begin(), s.end(), re))
        if (boost::regex_match(s.c_str(), matches, re))
        {
            // matches[0] contains the original string.  matches[n]
            // contains a sub_match object for each matching
            // subexpression
            // size() returns an unsigned type, so use size_t to avoid a
            // signed/unsigned comparison
            for (size_t i = 1; i < matches.size(); i++)
            {
                // sub_match::first and sub_match::second are iterators that
                // refer to the first and one past the last chars of the
                // matching subexpression
                string match(matches[i].first, matches[i].second);
                cout << "\tmatches[" << i << "] = " << match << endl;
            }
        }
        else
        {
            cout << "The regexp \"" << re << "\" does not match \"" << s << "\"" << endl;
        }
    }

}
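
For trying the subexpression part without typing at a prompt, here is a small sketch that returns the captured groups as strings. It uses std::regex and std::smatch from C++11 in place of boost::regex and boost::cmatch (an assumption, so that it builds without Boost), and the function name captureGroups is made up for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of each capture group when the
// whole of `text` matches `pattern`, or an empty vector when it does not.
std::vector<std::string> captureGroups(const std::string& pattern,
                                       const std::string& text)
{
    std::vector<std::string> groups;
    std::regex re(pattern);
    std::smatch matches;

    if (std::regex_match(text, matches, re))
    {
        // matches[0] is the whole match; groups start at index 1
        for (std::size_t i = 1; i < matches.size(); ++i)
        {
            groups.push_back(matches[i].str());
        }
    }
    return groups;
}
```

Calling captureGroups("(\\d+)-(\\d+)", "555-1234") yields two entries, "555" and "1234". sub_match::str() does the same job as building a string from the first and second iterators by hand.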

c++: simple string substitution using find_first_of() and replace()

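// loginURL: replace "id" with "auth" in hostname (so that "id.domain" becomes "auth.domain")

string x = loginURL;
// find("//") looks for the two-character sequence; find_first_of("//") only
// looks for a single '/', which happens to give the same position here
size_t sep = x.find("//");
if (sep != string::npos)
{
    size_t pos1 = sep + 2;
    // search for the dot from the start of the hostname, not from the start of the URL
    size_t pos2 = x.find_first_of('.', pos1);
    if (pos2 != string::npos)
    {
        x.replace(pos1, pos2 - pos1, "auth");
    }
}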
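
The same substitution can be wrapped in a function that leaves the URL alone when it has no "//" or no dot after it. This is only a sketch: the name replaceFirstHostLabel and the example URL are made up for illustration, and it swaps whatever sits between "//" and the first following dot, not just the literal "id".

```cpp
#include <string>

// Hypothetical helper: replaces the first label of the hostname (the text
// between "//" and the next '.') with `label`. Returns the URL unchanged
// when either delimiter is missing.
std::string replaceFirstHostLabel(const std::string& url,
                                  const std::string& label)
{
    std::string result = url;

    std::string::size_type sep = result.find("//");
    if (sep == std::string::npos)
    {
        return result;  // no scheme separator found
    }

    std::string::size_type start = sep + 2;
    std::string::size_type dot = result.find_first_of('.', start);
    if (dot == std::string::npos)
    {
        return result;  // no dot after the scheme separator
    }

    result.replace(start, dot - start, label);
    return result;
}
```

With this, replaceFirstHostLabel("https://id.example.com/login", "auth") gives "https://auth.example.com/login", and a string without "//" comes back untouched.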