Friday, December 31, 2010

groovy: get your classpath right!

Here’s some basic, basic Groovy code with a simple SQL query to retrieve some data from an Oracle DB:
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:oracle:thin:@hostname:1521:testdb", "user", "pass", "oracle.jdbc.driver.OracleDriver")

def query = "select * from tablename where name='Beta 1'"

row = sql.firstRow(query)

println row

However, despite various attempts, I couldn’t even establish a DB connection because of this annoyingly persistent exception:
Caught: java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver
        at sql1.run(sql1.groovy:13)

So of course, I immediately googled the rather-descriptive error message, and quickly learnt that Groovy didn't know where to find the Oracle JDBC driver. With a few more searches, I discovered that the missing driver lives inside a jar called something like "ojdbc14.jar".

I found where the jar was on my file system:
“C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0” contained ojdbc14-10.2.0.1.0.jar, which seemed like a good candidate for the jar containing the required Oracle JDBC driver. Looked like I was making progress, good.

Now how could I tell Groovy/Java to search this path? I figured that I had to include the jar’s file-system path in my Java CLASSPATH, so that Groovy would know where to find the missing classes.

Okay, so I added “C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0” to my CLASSPATH environment variable, but that didn’t seem to work.

Still no good :-(

Hmmm, "groovy -h" indicates that I can specify a classpath on the command line with the -cp or -classpath parameters. So let's try that:
$ groovy -classpath C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0  sql1.groovy
Caught: java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver
at sql1.run(sql1.groovy:2)

Dammit work already! :P

I realized that I must be missing something very basic. It had to be a total n00b Java/Groovy problem. C’mon it can’t really be so hard to specify a path can it?!?

So I started googling how to set the Java CLASSPATH correctly, and then I stumbled on why it wasn't working for me:


JARs on the classpath
The Java compiler and run-time can search for classes not only in separate files, but also in `JAR' archives. A JAR file can maintain its own directory structure, and Java follows exactly the same rules as for searching in ordinary directories. Specifically, `directory name = package name'. Because a JAR acts as a directory itself, to include a JAR file in the class search path, the path must reference the JAR itself, not the directory that contains the JAR. This is a very common error.

Suppose I have a JAR jarclasses.jar in the directory /jarclasses. For the Java compiler to look for classes in this jar, we need to specify:

javac -classpath /jarclasses/jarclasses.jar
and not merely the directory jarclasses.

Thank you, http://www.roseindia.net/java/java-classpath.shtml, for saving me from a lot of frustration!
Voila! Finally it works!
$ groovy -classpath C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0\ojdbc14-10.2.0.1.0.jar sql1.groovy
[NAME:Summary, PATH:Snapshot, MAPPINGKEY:26, VERSION:v9]

It’s even easier with the CLASSPATH environment variable set correctly like this:
CLASSPATH=C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0\ojdbc14-10.2.0.1.0.jar;

Now I don’t even have to type the –classpath parameter out on the command line each time:
$ set | findstr CLASSPATH
CLASSPATH=C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0\ojdbc14-10.2.0.1.0.jar;C:\Program Files\Java\jre6\lib\ext\QTJava.zip;.

$ groovy sql1.groovy
[NAME:Summary, PATH:Snapshot, MAPPINGKEY:26, VERSION:v9]

Tuesday, December 28, 2010

windows: setting environment variables from the command line

The standard route is rather slow and painful, even on Win7:
“MyComputer | Properties | Advanced System Settings | Environment Variables”… that’s a lot of clicks just to get to the point where you can add/edit an environment variable.

Thankfully, there’s a much easier alternative via the command line.

But first, do you know where the environment variables are actually stored? Until now I thought they lived in some system config file, but it turns out they're stored in the registry itself (which, technically, is a system config file).
  • The logged-in user’s env variables go here:
    HKEY_CURRENT_USER\Environment
  • The machine-wide env variables go here:
    HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment
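You can read these registry values programmatically too. Here's a minimal Python sketch (using the standard winreg module, as it is named in Python 3) that dumps the current user's environment variables straight from the registry:

import winreg

# open HKEY_CURRENT_USER\Environment read-only
with winreg.OpenKey(winreg.HKEY_CURRENT_USER, "Environment") as key:
    num_values = winreg.QueryInfoKey(key)[1]   # (subkeys, values, last_modified)
    for i in range(num_values):
        name, value, value_type = winreg.EnumValue(key, i)
        print("%s = %s" % (name, value))
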
The setx utility lets you change env variables for either the user or the machine. The good old set command lets you add/edit env variables too, but there are a few major differences from setx:
  • setx will permanently change the value of an env variable. The changes made by set last only for the current session.
  • Changes made by setx are not immediately visible – that is, they are not available to the current session. However, set’s changes are immediately visible.
  • You can’t delete env variables with setx, like you can with set. For example:
    set someVar=
    will delete the environment variable ‘someVar’ for the current session, but you can’t do something like this with setx.

    Instead, you have to use the “reg” utility to delete environment variables (that may or may not have been created with setx):
    REG delete HKCU\Environment /V someVar
Okay, so how do you create a permanent environment variable via setx? It’s pretty straightforward:
  • To set an env variable for the current user:
    SETX <Variable> <Value>
  • To set an env variable for the machine (i.e. globally):
    SETX <Variable> <Value> -m
Example:
setx  CLASSPATH  C:\Users\ambars\.m2\repository\com\oracle\ojdbc14\10.2.0.1.0\ojdbc14-10.2.0.1.0.jar;C:\Program Files\Java\jre6\lib\ext\QTJava.zip;.

source: http://ss64.com/nt/setx.html

Wednesday, November 24, 2010

performance appraisals: the critical incident method

What Is The Critical Incident Method of Performance Appraisal?

The critical incident method of performance appraisal involves identifying and describing specific events (or incidents) where the employee did something really well or something that needs improvement. It's a technique based on the description of the event, and does not rely on the assignment of ratings or rankings, although it is occasionally coupled with a ratings-type system.

The use of critical incidents is more demanding of the manager since it requires more than ticking off things on a form -- the manager must actually write things out. On the other hand critical incidents can be exceedingly useful in helping employees improve since the information in them is more detailed and specific than in methods that involve rating employees.

Some managers encourage employees to record their own critical incidents (where the employee excelled, situations that did not go well). That's an interesting variation that places more responsibility with the employee, and also does not require the manager to have been present when the incident occurred.

Generally, it's important that incidents be recorded AS THEY OCCUR, and not written at or around the annual performance review. Delaying the recording of critical incident reports (either good incidents or not so good) means a loss of detail and accuracy.

 

source

Tuesday, November 2, 2010

coming soon! the awesome galaxy tab!

a thing of beauty...
The iPad seems to be taking forever to arrive on Indian shores. The so-called "Global iPad Launch" (that happened in May 2010) left out India egregiously and Apple fanboys here have been waiting ever since. Nearly 6 months on, there isn't so much as an iPad launch date on the Apple India website. So India doesn't seem to figure much in Apple's plans.

Which is GREAT as far as Desi Droidheads are concerned :)

Samsung has always given India a high priority, and the launch of the awesome Android 2.2-powered Galaxy Tab is no exception. According to Samsung Mobile India, the Galaxy Tab is launching on 10th November - almost a week from now!

The make-or-break metric for a country like India is price, and this too seems to be in the Galaxy Tab's favor. It's priced at Rs 38,000, which isn't exactly cheap, but it's not exorbitant either.

the Layar reality browser in action
The iPad 64GB (with WiFi and 3G) version costs $829 in the US (which roughly translates to Rs 37,000), so it will definitely retail for more than 40k, if and when it gets here. In fact, a quick ebay.in search reveals that the iPad 64GB (+WiFi +3G) is already available in India (US purchases being re-sold here, presumably) for around Rs. 50,000. The better price point for the Galaxy Tab, coupled with its earlier launch, should entrench Samsung well on Indian soil to take on the iPad (assuming the iPad even gets here :)

Check out this nice side-by-side comparison of the two tablets by PCWorld. And here's a beautiful tech spec comparison that might help you decide which is better for you.

In my mind, the Galaxy Tab is a clear winner thanks to its:
  • higher pixel density
  • front-and-rear cameras
    (hence video-calling, videoconferencing ability, which the iPad sorely lacks)
  • expandable microSD storage
  • smaller, lighter frame
    (hence a more comfortable form-factor)
  • better price point

Way to go, Samsung!

(Galaxy Tab pictures courtesy of Samsung)

Monday, November 1, 2010

android: already leaving iphone in the dust?

I couldn't be happier for my favorite mobile platform.

According to this news article on the Wall Street Journal, as far as Q3 2010 sales go, Android has finally leapt ahead of all other smartphone platforms.

According to that same article, Mr. Jobs isn't happy and is railing against Android over how developers need to release various versions of their apps to support the multitude of Android variants in the market. Well, yeah, that is a bit of an issue, but some developers work around it by simply targeting the earliest common Android releases: Donut (1.6) or Cupcake (1.5). This way you're a bit limited in features, but you're assured of forward-compatibility. Also, I don't think too many developers are really bothered by this issue, at least going by the explosive growth of the Android Market :)

Ever since I got my Eclair (Android 2.1) smartphone a couple of months ago, I've known that this is going to be the future of mobile computing. I bet Samsung's Galaxy Tab (which has generated huge pre-release buzz) will only reinforce my hopes for the platform.

Woohoo, Android!

Saturday, October 16, 2010

windows: which processes have loaded xyz.dll?

Example: find all processes that have loaded msvcrt.dll:
C:\Users\ambars>tasklist /m /fi "modules eq msvcrt.dll"


Image Name                     PID Modules                                     
========================= ======== ============================================
csrss.exe                      748 ntdll.dll, CSRSRV.dll, basesrv.DLL,         
                                   USP10.dll, msvcrt.dll, sxssrv.DLL, sxs.dll, 
                                   RPCRT4.dll, CRYPTBASE.dll                   

wininit.exe                    800 ntdll.dll, kernel32.dll, KERNELBASE.dll,    
                                   USER32.dll, GDI32.dll, LPK.dll, USP10.dll,  
                                   msvcrt.dll, RPCRT4.dll, sechost.dll,        
                    
csrss.exe                      808 ntdll.dll, CSRSRV.dll, basesrv.DLL,              
                                   USP10.dll, msvcrt.dll, sxssrv.DLL, sxs.dll, 
                                   RPCRT4.dll, CRYPTBASE.dll  
[SNIP/]

Thursday, October 14, 2010

vim: how to record and replay macros

To record a macro:
  1. Start recording: press ‘q’
  2. Choose a macro register: press ‘a’ to select ‘a’ as a location to save the macro to. You will see “recording” at the bottom left of the vim window.
  3. Perform editing actions: for example, suppose you want to delete any line containing the string "Stage:". You can do this by pressing:
    Esc
    /Stage:
    <Enter>
    dd
    

  4. Stop recording: press ‘q’
To replay a macro:
  1. Choose a macro register: In our case, we want the macro we just saved to register ‘a’.
  2. Repeat the saved macro: by pressing “@[register_name]” which in our case is:
    @a

  3. Multiple-repeat: press “[count]@[register_name]”, for example:
    8@a

linux: which process is listening on port X?

Discovered a new tool, ss, to view "socket statistics". From the man page:

Name
ss - another utility to investigate sockets

Synopsis
ss [options] [ FILTER ]

Description
ss is used to dump socket statistics. It allows showing information similar to netstat. It can display more TCP and state information than other tools.

[root@g2aqa3br1.qai ~]# ss -t
State      Recv-Q Send-Q      Local Address:Port          Peer Address:Port
ESTAB      0      0               127.0.0.1:56227            127.0.0.1:6802
ESTAB      0      0               127.0.0.1:56228            127.0.0.1:6802
ESTAB      0      0            172.29.8.131:38140          10.230.6.27:ldaps
ESTAB      0      0            172.29.8.131:38142          10.230.6.27:ldaps

reference: http://linux.die.net/man/8/ss


 


[root@g2aqa3br1.qai ~]# netstat -plunt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 ::ffff:172.29.8.131:1098    :::*                        LISTEN      18572/java
tcp        0      0 :::1099                     :::*                        LISTEN      18572/java
tcp        0      0 :::80                       :::*                        LISTEN      26695/httpd
tcp        0      0 :::22                       :::*                        LISTEN      7327/sshd
tcp        0      0 :::443                      :::*                        LISTEN      26695/httpd
udp     2616      0 0.0.0.0:514                 0.0.0.0:*                               6898/syslogd
[root@g2aqa3br1.qai ~]# ps 26695
PID TTY      STAT   TIME COMMAND
26695 ?        SNs    0:00 /opt/ec/apache2/bin/httpd -d /opt/ec/apache2 
-f /opt/ec/broker/conf/httpd.conf -k start -DSSL


reference:
http://www.cyberciti.biz/faq/find-out-which-service-listening-specific-port

Wednesday, October 13, 2010

linux: my usual ~/.bashrc file

This is the typical ~/.bashrc file I use, especially on cygwin/mintty on Windows boxes. It's work-in-progress and keeps evolving.

Note the handy little shell function, tailHelpAlert, that was designed to run on cygwin, on the Windows box where HelpAlert is running. It determines which is the correct (i.e. most recent) HelpAlert log, and then tails it. You don't have to know which build of HelpAlert you're running, or which temp directory the logs are going to right now.

Another handy little utility is the diskhoggers alias which is a cute little bit of nixcraft (works only on RPM-based distros, of course) to determine which packages (i.e. RPMs) are hogging the most disk space. Very handy when you're critically short of HDD real estate and want to remove junk and clutter quickly.

 

# ################
# My Section:
# ################

# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
export HISTSIZE=5000
export HISTFILESIZE=2000
export HISTLENGTH=50000

# don't put duplicate lines in the history. See bash(1) for more options
# ... or force ignoredups and ignorespace
export HISTCONTROL=ignoredups:ignorespace


# http://serverfault.com/questions/72456/stop-bash-tab-completion-from-thinking-i-want-to-cd-into-svn-directories

# Stop bash tab completion from thinking I want to cd into .svn directories
export FIGNORE=svn



alias ls='ls -hF --color=tty'                 # classify files in colour
alias dir='ls --color=auto --format=vertical'
alias vdir='ls --color=auto --format=long'
alias ll='ls -l'                              # long list
alias la='ls -A'                              # all but . and ..
alias l='ls -CF'                              #
# shortcut to see which RPMs are taking the most disk space
alias diskhoggers='rpm -qa --qf "%10{SIZE}\t%{NAME}\n" | sort -n'

# shell prompt
export PS1="[\e[2;33m\u@mintty\e[m \e[0;33m\t\e[m \w] \$ "


#function to tail the correct HA logs automatically
function tailHelpAlert
{

# NOTE: correct the following logs path for your system:
g2aLogsDir=/cygdrive/c/Users/ambars/AppData/Local/Temp/CitrixLogs/GoToAssist/

#find out the correct build number
build_dir=`cd $g2aLogsDir; ls -1t | head -n 1`

cd $g2aLogsDir/$build_dir

targetdir=`ls -1t | head -n 1`

cd $targetdir
tail -f GoToAssist*

}

Friday, September 24, 2010

favorite shell prompt in BASH

 

Here’s my typical prompt in mintty/cygwin:

[ambars@mintty 11:54:28 ~] $

 

And here’s the code: simply append to ~/.bashrc:

export PS1="[\u@mintty \e[0;33m\t\e[m \w] \$ "

Source:
Bash Shell PS1: 10 Examples to Make Your Linux Prompt like Angelina Jolie

Thursday, September 23, 2010

windows: powertools that replace the plain old netstat command

  • currports: powerful, easy-to-use and free! Can filter processes. A perfect replacement for port explorer.
  • tcpview (sysinternals)
  • procmon (sysinternals)
  • port explorer (Trialware, old favorite. Development has long stopped since the parent company seems to be dead. Also redundant now, thanks to the above free options)

Sunday, September 5, 2010

network basics: how NAT works

source: http://en.wikipedia.org/wiki/Port_address_translation
 

In A Nutshell : Example

  • A host at private IP address 192.168.0.2 on the private network may ask for a connection to a remote host on the public network.
  • The initial packet has the source address 192.168.0.2:15345.
  • The PAT device (which we assume has a public IP of 1.2.3.4) may arbitrarily translate this source address:port pair to 1.2.3.4:16529, making an entry in its internal table recording that public port 16529 is being used for a connection by 192.168.0.2:15345 on the private network.
  • When a packet is received from the public network by the PAT device for address 1.2.3.4:16529, the packet is forwarded to 192.168.0.2:15345.

Port Address Translation (PAT)

Port Address Translation (PAT) is a feature of a network device that translates TCP or UDP communications made between hosts on a private network and hosts on a public network. It allows a single public IP address to be used by many hosts on a private network, which is usually a Local Area Network or LAN.
A PAT device transparently modifies IP packets as they pass through it. The modifications make all the packets that it sends to the public network from the multiple hosts on the private network appear to originate from a single host (the PAT device) on the public network.

Translation of the Endpoint

With PAT, all communication sent to external hosts actually contains the external IP address and port information of the PAT device instead of the internal host IPs or port numbers.
  • When a computer on the private (internal) network sends a packet to the external network, the PAT device replaces the internal IP address in the source field of the packet header (sender's address) with the external IP address of the PAT device. It then assigns the connection a port number from a pool of available ports, inserting this port number in the source port field (much like the post office box number), and forwards the packet to the external network. The PAT device then makes an entry in a translation table containing the internal IP address, original source port, and the translated source port. Subsequent packets from the same connection are translated to the same port number.
  • The computer receiving a packet that has undergone PAT establishes a connection to the port and IP address specified in the altered packet, oblivious to the fact that the supplied address is being translated (analogous to using a post office box number).
  • A packet coming from the external network is mapped to a corresponding internal IP address and port number from the translation table, replacing the external IP address and port number in the incoming packet header (similar to the translation from post office box number to street address). The packet is then forwarded over the inside network. Otherwise, if the destination port number of the incoming packet is not found in the translation table, the packet is dropped or rejected because the PAT device doesn't know where to send it.
PAT will only translate IP addresses and ports of its internal hosts, hiding the true endpoint of an internal host on a private network.
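To make the translation-table bookkeeping above a bit more concrete, here's a toy Python sketch of the outbound/inbound mapping. It is purely illustrative (a real PAT device does this per-packet inside the network stack), and the addresses and ports below are just the made-up example values from the nutshell above:

# toy PAT translation table: (private_ip, private_port) <-> public_port
PUBLIC_IP = "1.2.3.4"

nat_table = {}          # public_port -> (private_ip, private_port)
reverse_table = {}      # (private_ip, private_port) -> public_port
next_port = 16529       # next free public port to hand out

def translate_outbound(private_ip, private_port):
    """Rewrite the source of an outgoing packet, remembering the mapping."""
    global next_port
    key = (private_ip, private_port)
    if key not in reverse_table:
        reverse_table[key] = next_port
        nat_table[next_port] = key
        next_port += 1
    return PUBLIC_IP, reverse_table[key]

def translate_inbound(public_port):
    """Map a packet arriving at PUBLIC_IP:public_port back to the private host."""
    return nat_table.get(public_port)   # None means no mapping: drop the packet

print(translate_outbound("192.168.0.2", 15345))   # ('1.2.3.4', 16529)
print(translate_inbound(16529))                   # ('192.168.0.2', 15345)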

Friday, August 13, 2010

gdb: source code location mapping on startup

When you run gdb, you usually have to add source code directories using the “dir” command. This is okay if your code lies within a single directory, but if the code you’re debugging jumps across files in various directories, you’ll end up running “dir” several times, and this can be quite cumbersome.
The solution isn’t as simple as specifying the top-level (root) directory where your source code is checked out, because gdb won’t try to find matching source files recursively in the directory tree. gdb only checks the current directory (i.e. where you launched gdb from), and other directories specified via “dir” for matching source files.
Fortunately, there’s an elegant and simple solution to this problem:
To add directories to gdb automatically, so that you don't have to point out the source code dirs manually each time you start gdb, just specify the “substitute-path” like this in your .gdbinit file:
set substitute-path /sandbox/builds/appframework_dev/ /data/source/branches/appframework_dev

This tells gdb that the source files it was initially expecting at "/sandbox/builds/appframework_dev" (the location of the source code on the build machine, where you got the binaries that you are debugging) are mapped to the local directory (on your test machine) at "/data/source/branches/appframework_dev".

See this relevant discussion on stackoverflow for this and other approaches to the problem.

EDIT: it appears that I had already blogged this little nugget of gdb goodness last year, along with a few more interesting gdb tidbits

Thursday, August 5, 2010

python: calling c++ functions from python

 

Why would you want to do this? To test C++ APIs and SDKs by consuming the API/SDK code from Python. If you already have an automation test suite in Python, you can write and maintain test cases easily in Python, calling C++ functions as required, instead of writing C++ code to consume C++ libraries, which can get painful: in my experience, the effort and time needed to write and maintain C++ is much higher than for Python.

There are at least three different ways to create a Python binding to a C++ library: ctypes, SWIG, and Boost.Python are the usual candidates.

 

According to this discussion on stackoverflow, ctypes has a few advantages (including simplicity) over the other options:

ctypes has the advantage that you don't need to satisfy any compile time dependency on python, and your binding will work on any python that has ctypes, not just the one it was compiled against.
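Here's a minimal ctypes sketch to give you the flavour. It assumes you have already built a shared library (the name libmath_utils.so and the function add() are made up for this example) that exposes a function with C linkage, e.g. extern "C" int add(int, int):

import ctypes

# load the (hypothetical) shared library built from the C++ code
lib = ctypes.CDLL("./libmath_utils.so")

# declare the argument and return types so ctypes marshals them correctly
lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
lib.add.restype = ctypes.c_int

print(lib.add(2, 3))   # prints 5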

c++: generic pointers and void *

When a variable is declared as a pointer to type void, it is known as a generic pointer. A pointer to void can store the address of any data type. Since you cannot have a variable of type void, the pointer will not point to any data of its own and therefore cannot be dereferenced directly. It is still a pointer, though; to use it, you just have to cast it to another kind of pointer first. Hence the term generic pointer.
This is very useful when you want a pointer to point to data of different types at different times.
Here is some code using a void pointer:
#include <cstdio>

int main()
{
    int i;
    char c;
    void *the_data;

    i = 6;
    c = 'a';

    the_data = &i;
    printf("the_data points to int %d\n", *(int*)the_data);

    the_data = &c;
    printf("the_data points to char %c\n", *(char*) the_data);

    return 0;
}
source

Tuesday, August 3, 2010

c++: const functions

Declaring a member function with the const keyword specifies that the function is a "read-only" function that does not modify the object for which it is called.
To declare a constant member function, place the const keyword after the closing parenthesis of the argument list. The const keyword is required in both the declaration and the definition. A constant member function cannot modify any data members or call any member functions that aren't constant.
// constant_member_function.cpp
class Date
{
 public:
   Date( int mn, int dy, int yr );
   int getMonth() const;     // A read-only function
   void setMonth( int mn );  // A write function can't be const

 private:
   int month;
};

int Date::getMonth() const
{
   return month;        // Doesn't modify anything
}


Monday, August 2, 2010

python: understanding the "with" statement/keyword

Consider some typical boilerplate code that opens a file (whose filename is passed as a command-line argument) in read-only mode, reads the file line-by-line and performs some operation on each line thus read:

fileIN = open(sys.argv[1], "r")
line = fileIN.readline()
while line:
    [do something with line]
    line = fileIN.readline()

From Python 2.5 onwards, this can be done more easily via the "with" statement:
with open(sys.argv[1], "r") as fileIN: 
    for line in fileIN: 
        [do something with line] 

The "with" statement reduces the line count from 5 to 3, and also makes the code more readable and human-friendly. It's more pythonic and intuitive.

source

HTTP: Chunked Encoding

In chunked encoding, the content is broken up into a number of chunks, each of which is prefixed by its size in bytes. A zero-size chunk indicates the end of the response message. If a server is using chunked encoding, it must set the Transfer-Encoding header to "chunked".

Chunked encoding is not the same thing as the Content-Encoding header: Content-Encoding is an entity-body header, whereas transfer-encodings are a property of the message, not of the entity-body. ("Entity-body" refers to the body or payload [e.g. a JPG image] of an HTTP request [e.g. a POST or PUT request] or response.)

 

Q: When is chunked encoding really useful?

A: Chunked encoding is useful when a large amount of data is being returned to the client and the total size of the response may not be known until the request has been fully processed. An example of this is generating an HTML table of results from a database query. If you wanted to use the Content-Length header you would have to buffer the whole result set before calculating the total content size. However, with chunked encoding you could just write the data one row at a time back to the client. At the end, you could write a zero-sized chunk when the end of the SQL query is reached.

This is the HTTP header that is sent by the server:

Transfer-Encoding: chunked

In the HTTP 1.1 specification, chunked is the only transfer-coding that all HTTP/1.1 implementations are required to understand.
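To make the wire format concrete, here's a tiny Python sketch (just an illustration, not a real HTTP server) that frames a list of strings as a chunked body the way a server would:

def chunked_body(pieces):
    """Frame an iterable of strings as an HTTP/1.1 chunked message body."""
    out = []
    for piece in pieces:
        # each chunk: <size in hex>\r\n<data>\r\n
        out.append("%x\r\n%s\r\n" % (len(piece), piece))
    # a zero-size chunk marks the end of the body
    out.append("0\r\n\r\n")
    return "".join(out)

print(chunked_body(["<tr><td>row 1</td></tr>", "<tr><td>row 2</td></tr>"]))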

 

source

apache: what are .htaccess files typically used for?

.htaccess files allow us to make configuration changes on a per-directory basis. .htaccess files work with the Apache Web Server on both Linux/Unix and Windows operating systems. In Apache, the format of .htaccess is the same as that of the server's global configuration file.

There are several things that developers, site owners and webmasters can do with .htaccess files, for example:

  • Prevent directory browsing
  • Redirect visitors from one page or directory to another
  • Password protection for sensitive directories
  • URL-Rewriting (e.g. rewriting long, unwieldy URLs to shorter ones)
  • Change the default index page of a directory
  • Prevent hot-linking of images from your website
  • per-directory cache control

 

 sources:

http://www.bloghash.com/2006/11/beginners-guide-to-htaccess-file-with-examples/

http://en.wikipedia.org/wiki/Htaccess

Saturday, July 31, 2010

python: using the glob module to perform simple wildcard expansion on a filesystem

import glob
glob.glob('c:\\music\\_singles\\*.mp3')

['c:\\music\\_singles\\a_time_long_forgotten_con.mp3',
'c:\\music\\_singles\\hellraiser.mp3',
'c:\\music\\_singles\\kairo.mp3',
'c:\\music\\_singles\\long_way_home1.mp3',
'c:\\music\\_singles\\sidewinder.mp3',
'c:\\music\\_singles\\spinning.mp3']
glob.glob('c:\\music\\_singles\\s*.mp3')

['c:\\music\\_singles\\sidewinder.mp3',
'c:\\music\\_singles\\spinning.mp3']



You can also get a list of all the .mp3 files across every subdirectory of "c:\music" with a single call to glob, by using two wildcards at once. One wildcard is "*.mp3" (to match .mp3 files), and the other is within the directory path itself, to match any subdirectory of "c:\music", like this:



glob.glob('c:\\music\\*\\*.mp3')



source

web services vs REST

The fundamental difference, therefore, between REST and document-style Web services is how the service consumer knows what to expect out of the service. Web services have contracts, defined in WSDL. Since Web services focus on the service, rather than on the resource, the consumer has clear visibility into the behavior of the various operations of the service, whereas in REST's resource-oriented perspective, we have visibility into the resources, but the behavior is implicit, since there is no contract that governs the behavior of each URI-identified resource.

For a very good introductory article on REST, check out the article here.

Thursday, July 29, 2010

how much physical memory is supported by 64 bit editions of Windows 7?

While the maximum RAM limit for 32-bit Windows 7 editions is 4GB, when it comes to the 64-bit editions, the amount of memory that the OS can address depends on which edition you are running.

Here are the upper RAM limits for the different editions of Windows 7:

  • Starter: 8GB
  • Home Basic: 8GB
  • Home Premium: 16GB
  • Professional: 192GB
  • Enterprise: 192GB
  • Ultimate: 192GB

source

Saturday, July 24, 2010

python: sharing global variables across modules

The canonical way to share information across modules within a single program is to create a special configuration module (often called config or cfg). Just import the configuration module in all modules of your application; the module then becomes available as a global name. Because there is only one instance of each module, any changes made to the module object get reflected everywhere.

For example:

File: config.py

x = 0  # Default value of the 'x' configuration setting

File: mod.py
import config 
config.x = 1

File: main.py
import config
import mod
print config.x

Module variables are also often used to implement the Singleton design pattern, for the same reason.

source

python: local vs global scope

What are the rules for local and global variables in Python?


In Python, variables that are only referenced (and not assigned a value) inside a function are implicitly global.

If a variable is assigned a value anywhere within the function’s body, it is assumed to be local, and you need to explicitly declare it with the "global" keyword if you want the assignment to affect the module-level (global) variable instead.
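
A minimal illustration of these rules (the variable and function names are made up for the example):

counter = 0  # a module-level (global) variable

def read_only():
    # 'counter' is only referenced here, so Python looks it up in the global scope
    print(counter)

def increment_wrong():
    # the assignment makes 'counter' local to this function, so referencing
    # it before the assignment raises UnboundLocalError
    counter += 1

def increment_right():
    global counter        # explicitly rebind the module-level name
    counter += 1

read_only()        # prints 0
increment_right()
read_only()        # prints 1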

python: if __name__ == "__main__"

This is a simple "trick" so that python files can be used as both importable/reusable modules and also as standalone python programs.

From another perspective, this answers the common interview question,
"What considerations do you take while writing a python module? Or do you just write a module like a regular python program and then import it blindly? Don't you have to take some extra care/precautions while writing a python module?"


See this link for a simple and illustrative example:
http://pyfaq.infogami.com/tutor-what-is-if-name-main-for
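
Here's a minimal sketch of the idiom (the function name is made up):

def main():
    print("running as a standalone program")

if __name__ == "__main__":
    # this branch runs only when the file is executed directly,
    # not when it is imported as a module
    main()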

linux: my ~/.screenrc file

This is the .screenrc file that has evolved as my current favorite over time - both for cygwin and native linux shells.

# the default shell when creating a new screen window
# ~/.bashrc is invoked whenever creating a new screen window
# so all my bash aliases are available from within screen too
shell -${SHELL}


# The shelltitle specified is an auto-title that would expect the prompt and the typed command to look something like the following:
#  $ fortune
# (it looks after the '$ ' for the command name). The window status would show the name "fortune" while the command was running, and revert to "bash" upon completion.
shelltitle "$ |bash"


#avoid startup message
startup_message off


hardstatus alwayslastline


#hardstatus string '%{= mK}%-Lw%{= KW}%50>%n%f* %t%{= mK}%+Lw%< %{= kG}%-=%D %d %M %Y %c:%s%{-}'
hardstatus string '%{= kG}[ %{G}%H %{g}][%= %{= kw}%?%-Lw%?%{r}(%{W}%n*%f%t%?(%u)%?%{r})%{w}%?%+Lw%?%?%= %{g}][%{B} %d/%m %{W}%c %{g}]'


# Use the function keys to immediately switch to corresponding windows
# (instead of having to press ^a0, ^a1 etc - which also still work, btw)
bindkey -k k1 select 1
bindkey -k k2 select 2
bindkey -k k3 select 3
bindkey -k k4 select 4
bindkey -k k5 select 5
bindkey -k k6 select 6
bindkey -k k7 select 7
bindkey -k k8 select 8
bindkey -k k9 select 9
bindkey -k k; select 10
bindkey -k F1 select 11
bindkey -k F2 select 12



# go to next/previous window using ctrl+ right and left arrows
# this interferes with word-back and word-ahead on ubuntu, hence disabled:
# bindkey ^[[1;5D prev
# bindkey ^[[1;5C next
# set the scrollback buffer to hold the last 5000 lines
defscrollback 5000

Thursday, July 22, 2010

python: how to get size of an object

use sys.getsizeof:

>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
14
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48



source

Wednesday, June 23, 2010

python challenge: room 3

python challenge: room 3

here's my solution:
import re

infile = open('c:\\python26\\MyProgs\\inputfile_room3.txt' , 'r')
instring = infile.read()
infile.close()


p = re.compile( r'[^A-Z][A-Z]{3}(?P<answer>[a-z])[A-Z]{3}[^A-Z]') 
print p.findall(instring)


#solution is "linkedlist"



solution url:
http://www.pythonchallenge.com/pc/def/linkedlist.php

Tuesday, June 22, 2010

python: list comprehensions - examples

There are excellent examples from Guido's site, perfect for a quick review of "list comprehensions":


>>> vec = [ 2, 4, 6 ]
>>> [ 3*x for x in vec ]
[6, 12, 18]
>>> [ 3*x for x in vec if x > 3 ]
[12, 18]
>>> [ [x , " squared is " , x**2] for x in vec if x > 1 ]
[[2, ' squared is ', 4], [4, ' squared is ', 16], [6, ' squared is ', 36]]

>>> #tuple generation
...
>>> [ (x, x**3) for x in vec ]
[(2, 8), (4, 64), (6, 216)]

>>> #generate a list whose elements are the values of pi (22/7), rounded to 1 through 6 decimal places
...
>>> [ str( round(22/7.0 , i) ) for i in range(1,7) ]
['3.1', '3.14', '3.143', '3.1429', '3.14286', '3.142857']

python challenge: room 2

challenge from room 2



my solution:
import re

infile = open('c:\\python26\\MyProgs\\inputfile.txt' , 'r')
instring = infile.read()
infile.close()

#find all non-word characters (and underscores) ...
p = re.compile('\W|_') 

# ... and remove them from instring, 
# so that only word characters are left in instring
print p.sub('', instring) 



official solution

python: to do

  • finish pythonchallenge
  • learn python 3.0
  • http://www.techinterviews.com/python-interview-questions-and-answers
  • http://www.geekinterview.com/Interview-Questions/Programming/Python
  • list comprehension
  • filter
  • lambda
  • map
  • list generators / generators

Wednesday, June 2, 2010

questions to ask the interviewer!

WORK METHODOLOGY AND WORK STYLE

  • What kind of software engineering methodology is practised here? Agile, Iterative, Waterfall?

  • Do you have something like a "Quick Response Team" that monitors and handles critical/production issues?

  • Is quality control here more proactive or reactive? Do you have PCVs (PreCheckinVerifications)? Do testers have test cases ready before the dev drop? Do testers automate some of these major test cases prior to the dev drop? Do testers automate at all?

  • What is the timeframe for a typical major release/milestone? (We used to have a significant milestone in 6-8 weeks in Pi; minor milestones would be done in 1-2 weeks, depending on the urgency of the bugfix/release.)

  • Level of whitebox testing: do testers here look at code? Do you encourage them to?

  • Can QAs attend dev design discussions? If not, then can the QA get a design dump as soon as the design is finalized?

  • Do devs write their own unit tests for each feature that they develop? Does QA accept the dev drop only after all unit tests have passed?

  • What kind of source control do you use? CVS, SVN, Git, Mercurial, Bazaar?

  • How do you feel about documentation? What kind/level of documentation do you follow for:
    - business requirements
    - software requirements specifications (SRS)
    - software design spec
    - coding (code commenting standards, wiki pages, etc)
    - testing (overall test plan for a given team+milestone, feature test spec, test suites, test cases)
    - defects (defects db,

CULTURE AND ENVIRONMENT
  • TRICKY, HARD-HITTING QUESTION: Do you encourage
    employees to work on technical projects of their own? Do you support
    them by giving them, say, half a day per week to pursue their personal
    projects? Otherwise how do you expect employees to be innovative and
    passionate about technology? (i.e. citing examples of projects on
    DechoLabs - the URL Security Tester that would pass null and bogus
    parameters to the URLs in the hope of causing a crash e.g.)

    Another example is the TestLabAutomation initiative (Django project) that Biju and I had started.

  • How seriously does your company take CSR (Corporate Social Responsibility)? What initiatives and programs does your company currently have for this?


  • Do programmers have a quiet work environment? If I want to have a discussion with my cube-mate, do I have to schedule a meeting in a meeting room?

  • How is the work environment here? Rigid/formal or open/relaxed/casual?

  • What are the working hours like? How strictly are work hours enforced?

  • What are the recreational facilities? What do employees do here when they're not working? Is there a TT table? Is there a gym?



references:
http://stackoverflow.com/questions/329289/really-wow-them-in-the-interview

Tuesday, May 4, 2010

awk: kill all pimgr jobs in one go

Unfortunately, the /usr/local/pi/pimgr/bin/pimgr-jobs utility doesn't provide a "kill-all" option that automagically kills all the jobs that are currently scheduled to run.

Here's some simple scriptfoo that does the trick rather well:
for i in `gawk --re-interval '$1 ~ /^[0-9]{1,4}/ {print $1}' 1`
do 
./pimgr-jobs --jobid=$i  kill-delete
done


Per this excellent link, plain old awk doesn't support braces/curly brackets for denoting the number of occurrences, so you have to use gawk instead, with the '--re-interval' option as shown above. Whoa.



Friday, April 30, 2010

linux: how to see which packages are taking the most disk space

Here's a little scriptfoo that's very useful when you're running out of disk space. When you need to reclaim hard drive real estate urgently by removing those useless "Engineering and Scientific" packages, run diskhoggers. Of course, it only works on RPM-based systems, so it's useless on Ubuntu/Debian etc, but there should be equivalents (e.g. using dpkg and/or synaptic apt-get) on those systems too.

alias diskhoggers='rpm -qa --qf "%10{SIZE}\t%{NAME}\n" | sort -n' 


Here's an example run on my system:

[root@noumPC ~]# diskhoggers | tail -n 30
15370931 bcel
15585349 gutenprint
15615584 gnome-applets
16021416 gcc
16574947 webkitgtk
16742780 nautilus
17773959 vim-common
18113087 firefox
18147692 libgweather
19693980 libicu
20174285 python
20250203 ghostscript
22082141 fedora-release-notes
22312309 kernel-devel
22371021 python-lxml
24697430 xulrunner
25032963 foomatic-db-ppds
25455524 libpurple
26376798 eclipse-jdt
29624088 perl
32984533 eclipse-platform
46820396 qt-x11
49233851 google-chrome-beta
49484310 valgrind
50371329 libgcj
80240654 kernel
84381565 kernel
85572888 java-1.6.0-openjdk
111799012 glibc-common
246952321 java-1.5.0-gcj-javadoc

linux: extremely useful BASH scriptfoo for investigating coretraces:

grep -E "(#0 |#1 )" [[FILENAME]]  | grep -vE "(kernel_vsyscall|pthread_cond_timedwait|nanosleep)"
[/name_of_coretrace_file][/filename]

Example:
[root@domain ~]# grep -E "(#0 |#1 )" coretrace_4.2_upgrade_crash | grep -vE "(kernel_vsyscall|pthread_cond_timedwait|nanosleep)"
#0  0x0040abc8 in _IO_vfscanf_internal () from /lib/libc.so.6
#1  0x0041a451 in vsscanf () from /lib/libc.so.6
#1  0x003afdeb in read () from /lib/libpthread.so.0
#1  0x003afdeb in read () from /lib/libpthread.so.0
#1  0x003afa0e in __lll_mutex_lock_wait () from /lib/libpthread.so.0
#0  0x00130244 in _fini () from /lib/libSegFault.so
#1  0x0011e5b2 in _dl_fini () from /lib/ld-linux.so.2
#1  0x003afdeb in read () from /lib/libpthread.so.0
#1  0x003afdeb in read () from /lib/libpthread.so.0
#1  0x003afdeb in read () from /lib/libpthread.so.0
#0  0x0040f948 in _IO_vfscanf_internal () from /lib/libc.so.6
#1  0x0041a451 in vsscanf () from /lib/libc.so.6
#1  0x003ea3d6 in kill () from /lib/libc.so.6
#1  0x003af365 in sem_timedwait () from /lib/libpthread.so.0


Now you know which are the interesting threads - the ones that are not doing kernel_vsyscall() or pthread_cond_timedwait() or nanosleep() are printed as the output of this command-chain.

Thursday, April 29, 2010

vim: my ~/.vimrc file

set number
set incsearch
set shiftwidth=4
set tabstop=4
set autoindent
set smartindent
set paste
set expandtab
set showcmd
 
" set search highlighting on
set hls
 
set scrolloff=2
 
" Quit without fuss on :Q
:command -nargs=0 Quit :qa!

" Write without fuss on :W
:command -nargs=0 Write :w 
 
 
" fix the vim+backspace problem in cygwin - might NOT be needed on native linux shells!
set backspace=indent,eol,start
 
" set syntax highlighting on (for all possible file types)
syntax on
 
" always show current cursor position (row, column) at bottom right
set ruler
 
" choose colors that look good on a dark background, if possible
" set background=dark

" set more suitable colors for the line numbers
highlight LineNr gui=NONE guifg=black    guibg=grey 
highlight LineNr cterm=NONE ctermfg=darkgrey  ctermbg=grey


" This highlights the background in a subtle red for text that goes over the 80 column limit
" http://stackoverflow.com/questions/235439/vim-80-column-layout-concerns
" press F3 to toggle 80 column overlength highlighting
let ColHL='off'
highlight OverLength ctermbg=darkred ctermfg=white guibg=#592929
" match OverLength /\%81v.\+/

function! Toggle80ColumnHighlight()
    if g:ColHL == 'on'
        match OverLength //
        let g:ColHL='off'
    elseif g:ColHL == 'off'
        match OverLength /\%81v.\+/
        let g:ColHL='on'
    endif
endfunction

nnoremap <F3> :call Toggle80ColumnHighlight()<CR>

" mark text after column 80 ( >= vim7.3 )
" set colorcolumn=80


" function to show color scheme in use
" source: http://stackoverflow.com/questions/2419624/how-to-tell-which-colorscheme-a-vim-session-currently-uses
function! ShowColorSchemeName()
    try
        echo g:colors_name
    catch /^Vim:E121/
        echo "default
    endtry
endfunction

" set a better search highlight colors
" http://stackoverflow.com/questions/7103173/vim-how-to-change-the-highlight-color-for-search-hits-and-quickfix-selection
highlight Search cterm=NONE ctermfg=white ctermbg=darkblue
highlight Search gui=NONE   guifg=white  guibg=darkblue
" set better incremental search highlight colors
highlight IncSearch cterm=NONE ctermfg=darkgreen ctermbg=grey
highlight IncSearch gui=NONE   guifg=darkgreen  guibg=grey

Sunday, April 11, 2010

linux: how to safely add an ssh-agent with default key upon login

# safely add key to ssh-agent on login

test=`ps -ef | grep ssh-agent | grep -v grep | awk '{print $2}' | xargs`

if [ "$test" = "" ]
then
# there is no agent running
if [ -e "$HOME/agent.sh" ]
then
# remove the old file
rm -f $HOME/agent.sh
fi
# start a new agent
ssh-agent | grep -v echo &> $HOME/agent.sh
ssh-add /root/.ssh/id_rsa
fi

test -e $HOME/agent.sh && source $HOME/agent.sh



Adapted from: http://drupal.star.bnl.gov/STAR/blog-entry/jeromel/2009/feb/06/how-safely-start-ssh-agent-bashrc

Friday, April 9, 2010

linux: mutt is so much better than mail

Here's how to send an email from the linux command line, attach files to it, and have it sent to multiple recipients! Much better than having to paste the contents of a config file or text file into the body of an email and then use the plain old "mail" command.

echo | mutt  -s interesting_logs_please_check  -a syslogs.tar.gz admin@domain.org, user1@domain.org, user2@domain.org


references:

http://www.shelldorado.com/articles/mailattachments.html
(excellent)

"Multiple recipients may also be specified by separating each address with the , delimiter"
(tested this and it works)
http://www.freebsd.org/doc/handbook/mail-agents.html

Thursday, March 11, 2010

awk: regular expressions and group submatch capture

b=`ssh root@registration.authinfra.net 'rpm -qg pi'`

echo $b | awk 'match($0, "pi-multihome-[[:digit:]].[[:digit:]]-([[:digit:]]*)", a) {print a[1]}'

OUTPUT: 125324

Monday, March 8, 2010

vim: mapping the F2 key to a sequence of commands

Macros are a great way to automate tasks in vim. You can record a sequence of keystrokes, assign them to a register, and then playback the macro saved in that register any number of times.

However, sometimes you come across a macro that is so useful that you use it frequently. Perhaps you use it across several vim sessions. You wish you could create a keyboard shortcut for this macro, and then save it in your ~/.vimrc file for anytime usage.

This is how you can do that, using vim’s map command:
map <F2> 0i\<Esc>A\n\<Esc>j

This mapping will bind the F2 key to perform this sequence of operations:

  1. go to the hard BOL (beginning of line):  0
  2. insert the '\' character:  i\
  3. go back to command mode:  <Esc>
  4. append \n\ to the end of the line:  A\n\
  5. go back to command mode:  <Esc>
  6. go down one line (so that you can just press F2 again on the next line, to continue):  j

Thursday, March 4, 2010

gdb: set a conditional breakpoint

(gdb) break LinkedList<int>::remove
Breakpoint 1 at 0x29fa0: file main.cc, line 52.
(gdb)
(gdb) condition 1 item_to_remove==1


Another way to accomplish this:

break LinkedList<int>::remove if (item_to_remove==1)


source:
http://www.cs.cmu.edu/~gilpin/tutorial/#3.4

Friday, February 12, 2010

python: call function stored in string variable

In the following, op1 and op2 are strings containing function names (e.g. 'stop', 'enable', 'disable') to be invoked:

import sys

# placeholder values, just so the example runs standalone
name1 = 'someidentity'
domain = 'example.com'

def start():
    print 'INFO|start()|starting identity %s.%s' % (name1, domain)
def stop():
    print 'INFO|stop()|stopping identity %s.%s' % (name1, domain)
def enable():
    print 'INFO|enable()|enabling identity %s.%s' % (name1, domain)
def disable():
    print 'INFO|disable()|disabling identity %s.%s' % (name1, domain)

function_mapper = { 'stop':stop,
                    'start':start,
                    'enable':enable,
                    'disable':disable
                  }


def main():
    newlist = ['start', 'disable', 'enable']
    print '\n\n', newlist

    for op2 in newlist:
        for op1 in newlist:
            function_mapper[op1]()
            function_mapper[op2]()
            print "\n"


if __name__ == '__main__':
    sys.exit(main())

python: make an HTTP request with any kind of HTTP method (put, delete, head) using urllib2

Sometimes you don't want to sit and migrate all your existing codebase from urllib/urllib2 to httplib/httplib2, just because the former doesn't support HTTP PUT or HTTP DELETE methods out-of-the-box. Well, here's a workaround for just that problem:
import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler)
request = urllib2.Request('http://example.org', data='your_put_data')
request.add_header('Content-Type', 'your/contenttype')
request.get_method = lambda: 'PUT'
url = opener.open(request)


Thanks to stackoverflow yet again for solving another programming conundrum :)

Wednesday, February 3, 2010

linux: debugging with gdb uses SIGTRAP internally

SIGTRAP is used as a mechanism for a debugger to be notified when the
process it's debugging hits a breakpoint.

A typical way for something like GDB to use it would be something like
this:

- The user asks gdb to set a breakpoint at a certain address in the
  target process.  gdb uses ptrace to replace the instruction at that
  address with the "int3" instruction, which generates a debug
  exception.  It also uses ptrace to ask that the process be stopped
  when SIGTRAP is raised.
- When the target process hits that address, the exception is
  generated.  The kernel treats this as raising a SIGTRAP signal.  The
  process is stopped and gdb is notified.
- gdb lets the user examine the state of the target process.  When the
  user is ready to continue, gdb replaces the int3 with the instruction
  that had originally been there, and uses ptrace to tell the kernel to
  restart the target process from that instruction.  AFAIK it would also
  normally tell the kernel not to deliver the SIGTRAP signal to the
  process, since by default that would kill it.  

So it would normally be irrelevant how you are handling SIGTRAP (SIG_IGN or SIG_DFL or a handler), because the target will never know it occurred.

(Discovered this information thanks to bug 31715)

source