QXmpp 0.3.0 Release

I’m glad to release the next version of QXmpp, version 0.3.0. It comes packed with a number of new features, bug fixes and improvements.

Here is a list of new features in QXmpp 0.3.0; please look at the changelog for the exhaustive list:

  • XEP-0153: vCard-Based Avatars
  • XEP-0202: Entity Time
  • Managers for all the XEPs
  • More examples:
    example_9_vCard: vCard handling
    GuiClient: Graphical chat client, test bench for QXmpp functionalities
    example_8_server: Server capability
  • Server framework: yes, you can write a server now; look at the example example_8_server.
  • Add support for DNS SRV lookups, meaning you can connect to nearly all servers using only a JID and a password. No need to explicitly specify the server information.
  • Add QXMPP_VERSION and QXmppVersion() for compile and run time version checks.
  • Improve code documentation coverage and quality.
  • Completely remove dependency on QtGui, making it easier to write console applications.

QXmpp 0.3.0 Release Details:

Project Page | Changelog | Readme | API Documentation | Download

As usual, thanks to the authors, community and the users who have been driving the project.

About QXmpp:

QXmpp is a cross-platform C++ XMPP client (and server!) library based on Qt. It is an open source project licensed under the LGPL. As of today, the project is around two years old.

Regular Expressions

Regular Expressions (commonly referred to as RegEx or RegExp) are coded strings which represent text patterns. They can be used to search for and replace patterns in text. A pattern defined by a regular expression can be used in regular expression processors like grep, sed, awk, Notepad++, Visual Studio etc. to search, replace or modify text.

Programming languages also support regular expressions. They are more commonly found in scripting languages like Perl, Python, Ruby etc., but compiled languages such as Java and C++ (using Boost or Qt) also support them.

Regular Expressions Syntax and Rules

A regular expression is a string of text characters. A few of these characters have a special meaning: they perform operations like grouping, quantification, negation etc. The rest of the characters are normal and match themselves.

List of special characters:  "  \  [  ]  ^  -  ?  .  *  +  |  (  )  $  /  {  }  %  <  >

To use one of these characters literally in a regex, escape it with a backslash (\) or, in tools that support it, enclose it in quotes.

Normal Text

  • a matches a
  • hello matches hello
  • . is a special character; it matches any character except a newline
  • \. matches .

Groups

Round brackets or parentheses () are used to create a group.

  • (a|b|c) matches a or b or c
  • (colour|color) matches colour or color

Ranges

Square brackets [] specify a set or range of characters; exactly one character from the set is matched.

  • [abc] matches a or b or c
  • [^abc] matches any character except a, b and c
  • [a-c] matches a or b or c, range from a to c
  • [A-C] matches A or B or C, range from A to C
  • [a-cA-C] matches a or b or c or A or B or C
  • [0-5] matches a digit from 0 to 5
  • [a-zA-Z0-9] matches one character of alphanumeric text

Quantifiers

A quantifier specifies how many times the preceding item is repeated.

  • a* zero or more, matches the empty string, a, aa, aaa, aaaa, …
  • a+ one or more, matches a, aa, aaa, aaaa, …
  • a? zero or one, matches the empty string or a
  • a{3} exactly three, matches aaa
  • a{3,} three or more, matches aaa, aaaa, aaaaa, …
  • a{3,6} three to six, matches aaa, aaaa, aaaaa, aaaaaa

E.g.
[a-zA-Z0-9]{8} matches alphanumeric text of length 8

Shorthands

Shorthands exist for commonly used character classes. For example, the digit character class has the shorthand \d, which is short for [0-9]. Each lowercase shorthand has an associated uppercase shorthand with the opposite meaning. Thus [\D] matches any character that is not a digit, and is equivalent to [^\d].

  • \d a digit, short for [0-9]
  • \D a non-digit, short for [^0-9]
  • \w a word character, short for [a-zA-Z_0-9]
  • \W a non-word character, short for [^a-zA-Z_0-9]
  • \s a whitespace character (space, tab or line break), short for [ \t\n\x0b\r\f]
  • \S a non-whitespace character, short for [^\s]

Anchors

  • ^ start of the subject, ^a makes sure that a occurs at the beginning of the subject
  • $ end of the subject, a$ makes sure that a occurs at the end of the subject
  • \b word boundary, \ba\b makes sure that a is a whole word in the subject
  • \B non-boundary, \Ba\B makes sure that a is not a whole word
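The anchors can be tried out with the regular expression support in the C++ standard library. This is only a sketch: the helper names containsWholeWord and startsWith are made up for illustration, and both assume that the text passed in contains no regex special characters.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `word` occurs in `subject` as a whole word.
// Assumes `word` contains no regex special characters.
bool containsWholeWord(const std::string& subject, const std::string& word)
{
    const std::regex pattern("\\b" + word + "\\b");
    return std::regex_search(subject, pattern);
}

// Hypothetical helper: true when `subject` begins with `literal`.
// Assumes `literal` contains no regex special characters.
bool startsWith(const std::string& subject, const std::string& literal)
{
    const std::regex pattern("^" + literal);
    return std::regex_search(subject, pattern);
}
```

With these helpers, cat is found as a whole word in "a cat sat" but not in "concatenate".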

Examples

  • .at matches any three-character string ending with at, e.g. hat, cat and bat
  • [hc]at matches hat or cat
  • [hc]?at matches hat or cat or at
  • 0[xX][A-Fa-f0-9]+ matches a hexadecimal number, e.g. 0x2AB7
  • [A-Za-z_][A-Za-z_0-9]* matches an identifier in a programming language
  • cal[ae]nd[ae]r matches calendar and its common misspellings, e.g. calender
  • colou?r matches color or colour
  • gr[ae]y matches grey or gray
  • \d+ matches unsigned integers
  • -\d+ matches negative integers
  • -{0,1}\d+ matches integers
  • \d*\.{0,1}\d+ matches positive real numbers
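Several of the patterns above can be checked from C++ using std::regex from the standard library. The following is a minimal sketch; matchesWhole and normalizeColour are illustrative names, not part of any library. Note that every backslash in a pattern has to be doubled inside a C++ string literal.

```cpp
#include <regex>
#include <string>

// Returns true when the entire text matches the pattern
// (std::regex_match never accepts a partial match).
bool matchesWhole(const std::string& text, const std::string& pattern)
{
    return std::regex_match(text, std::regex(pattern));
}

// Search and replace: rewrites every "colour" or "color" as "color".
std::string normalizeColour(const std::string& text)
{
    static const std::regex colour("colou?r");
    return std::regex_replace(text, colour, "color");
}
```

For example, matchesWhole("0x2AB7", "0[xX][A-Fa-f0-9]+") accepts the hexadecimal literal, while matchesWhole("bat", "[hc]at") rejects bat.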

Debugging Console for GuiClient/QXmpp

The QXmpp repository contains many examples. One of them, GuiClient, is a full-fledged graphical XMPP client. I developed it to test the functionality and usability of the QXmpp API. This example supports most of the XEPs (XMPP Extension Protocols) supported by QXmpp.

To ease debugging, I have added a console dialog that displays all the sent and received stanzas. Have a look at the Debugging Console.

 

Figure: Debugging Console

 

  • The Settings button launches this dialog.
  • The user can turn logging on and off with the Enable check box.
  • Sent and received messages are color coded to tell them apart.
  • The Clear button clears the text area.

Implementing XOR Encryption

At times, you need a simple encryption and decryption functionality to secure sensitive information. In my case, I needed to store passwords on disk to implement the “Remember Password” functionality of the GuiClient example of QXmpp.

Rather than pulling in a third-party library when you need only very basic crypto functionality, I suggest using XOR encryption. It is easy to implement, and I present my implementation of the algorithm in Qt C++ below.

XOR (Exclusive OR) encryption, or the XOR cipher, is a simple symmetric encryption algorithm. It operates on the principle that XORing data twice with the same key yields the original data.

The first XOR of the data with the key gives the encrypted data. Decryption XORs the encrypted data with the same key.

EncryptedData  = Data ^ Key
Data =  EncryptedData ^ Key

Data: data to be encrypted
Key: secret key or password
EncryptedData: data after encryption
^ represents Exclusive-OR (XOR) operation

Example:

Let us use two binary numbers for Data and Key.

Data  = 01101
Key   = 10101

Encrypt:

EncryptedData = Data ^ Key
EncryptedData = 01101 ^ 10101
EncryptedData = 11000

Decrypt:

Data = EncryptedData ^ Key
Data = 11000 ^ 10101
Data = 01101

Implementation of XOR Encryption in Qt C++:

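#include <QByteArray>

// XORs every byte of data with the key; the key is repeated when it is
// shorter than the data. Calling it twice with the same key restores the data.
QByteArray calculateXor(const QByteArray& data, const QByteArray& key)
{
    if (key.isEmpty())
        return data; // nothing to XOR with

    QByteArray result;
    result.reserve(data.length());
    for (int i = 0, j = 0; i < data.length(); ++i, ++j)
    {
        if (j == key.length())
            j = 0; // repeat the key if key.length() < data.length()
        result.append(static_cast<char>(data.at(i) ^ key.at(j)));
    }
    return result;
}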
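For readers without Qt at hand, the same idea can be sketched with the standard library alone. This is an illustrative port, not the code used in GuiClient: std::string stands in for QByteArray, and the name xorWithKey is made up.

```cpp
#include <cstddef>
#include <string>

// Illustrative port of the routine above: XORs each byte of data with the
// key, cycling through the key when it is shorter than the data.
std::string xorWithKey(const std::string& data, const std::string& key)
{
    if (key.empty())
        return data; // nothing to XOR with

    std::string result;
    result.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
        result.push_back(static_cast<char>(data[i] ^ key[i % key.size()]));
    return result;
}
```

Applying the function twice with the same key gives back the input, which is exactly the Encrypt/Decrypt round trip shown in the binary example.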

Serving HTML Documentation from Google Code / SVN

I found a nice way of hosting documentation on Google Code. Google Code can easily be set up to host HTML documentation generated by Doxygen, Javadoc, RDoc etc.

It is always good to have the documentation of a project online. Everyone can refer to it on the web, and developers can point to any piece of it (class, function etc.) by using the respective link in their communication.

The documentation and the project should be hosted at the same location. A two-piece solution, one site for the project and another for the documentation, results in poor usability for developers and users. One URL for the project with everything (issue tracker, wiki, revision control, downloads and documentation) available there sounds great!

Google Code doesn’t provide explicit hosting for the documentation. Using the method described in this article completes the project hosting on Google Code by providing a way to serve HTML from it.

I have been using this method for my project QXmpp since the 0.2.0 release. The resulting documentation URL is listed below.

The key idea behind this solution is that the Subversion (svn) server sends each file with the MIME type recorded in its svn:mime-type property. When you view a raw file on Google Code, the browser therefore renders HTML/CSS as a web page instead of showing it as plain text, provided the proper mime-type has been set.

  • URL to view raw files under svn: http://qxmpp.googlecode.com/svn/
  • URL of the documentation: http://qxmpp.googlecode.com/svn/doc/HEAD/html/index.html

The mime-type must be set correctly. In my case, Doxygen generates the following types of files:

  • *.css  = svn:mime-type=text/css
  • *.html = svn:mime-type=text/html
  • *.js   = svn:mime-type=text/javascript
  • *.gif  = svn:mime-type=image/gif
  • *.png  = svn:mime-type=image/png

The auto-props feature of SVN can be used to set the mime-types automatically. Once the auto-props rules are defined in the SVN client config file, the client sets the mime-types automatically when files are added.
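As a sketch, the relevant part of the client config file (typically ~/.subversion/config on Unix-like systems) for the file types listed above could look like the fragment below. The patterns are only the ones my Doxygen output needs; adjust them to what your generator produces.

```
[miscellany]
enable-auto-props = yes

[auto-props]
*.css  = svn:mime-type=text/css
*.html = svn:mime-type=text/html
*.js   = svn:mime-type=text/javascript
*.gif  = svn:mime-type=image/gif
*.png  = svn:mime-type=image/png
```

Auto-props take effect only when files are added or imported; files that are already under version control keep their existing properties.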

You can see that the documentation files are stored under SVN, so the documentation inherently gets version control. It is versioned the same way the source code is tagged and versioned, and you can host documentation for all the versions of your project. For example, the documentation of the 0.2.0 release and of the bleeding-edge SVN HEAD are both available: QXmpp-0.2.0 and QXmpp-HEAD.

Steps to serve documentation from Google Code Subversion:

  1. Generate the documentation locally.
  2. Assign the correct svn:mime-type as per the file extension. This step is not required if you are using auto-props feature of SVN.
  3. Check in the documentation files at the desired path.
  4. Load the path to documentation in your browser to see the outcome. The path should be like http://[project-name].googlecode.com/svn/[svn-path]
  5. Put the steps above in a script and add an entry for it in the crontab/scheduler to automate the update.

Advice:

  • Use auto-props
  • Choose the path of doc outside of trunk to keep your ohloh stats clean and other developers happy.
  • Update the documentation daily for the HEAD version.

To conclude: it is good to have documentation, and it should be served from the project hosting site itself to give a one-piece solution and a better user experience. This method of serving documentation can be used until Google comes up with an explicit solution. The documentation is versioned automatically, so you get documentation for all the versions of your software.

C++ Surprise: switch-case declaration-without-initialization

I have been programming in C++ for a long time and it keeps surprising me. The C++ programming language is full of surprises. Lately, I found an interesting one.

The Surprise!

I never declare new variables/objects in case statements. I had always assumed it was not allowed. If I ever need new variables/objects in a case statement, I use braces. Braces define a valid scope for the new variables; the new variables are not visible outside the braces. Without braces, variables declared in a case statement are visible in the succeeding cases as well, and their declarations can be skipped if the switch jumps directly to those cases.

The surprise is that declaration is possible! But only for a very specific case: the declaration without initialization. int var; is one such example.

The declaration without initialization is possible only for POD types (plain old data: basic types, C structs built from basic types, pointers, enums etc.). Therefore, to be precise, a declaration in a case statement is possible only for POD types and without initialization.

Declaration without initialization [Allowed, Surprise!]

  • int count;
  • float length;
  • int* ptr;

Declaration with initialization [Not allowed as expected]

  • int count = 20;
  • float length = 6.7;
  • int* ptr = 0;
  • std::string str;       // involves call to default constructor (initialization)
  • std::string str2("manjeet");

Example

In the following code, I have marked the statements Valid/Invalid as per g++ and MSVC. Let us look at the unexpected and expected statements.

std::string str1("test");

switch(i)
{
case 0:
 int var1;                  // VALID

 int var2 = 22;             // INVALID

 int var3;                  // VALID
 var3 = 22;                 // VALID

 str1 = "test";             // VALID, defined before switch statement
 std::string str2("test");  // INVALID
 break;
case 1:
 break;
case 2:
 break;
default:
 break;
}

Surprise:

  • Line#6: Valid, int var1;
  • Line#8: Invalid, int var2 = 22; if Line#6 is valid, one would expect this to be valid too.

Expected:

  • Line#11 & 13: Valid, usual assignments.
  • Line#14: Invalid, declaration of an object. It is not allowed, very much expected.

Explanation

As per the C++ Standard ISO/IEC-14882-2003, section 6.7, paragraph 3:

It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps from a point where a local variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has POD type (3.9) and is declared without an initializer (8.5).

Therefore, according to the above rule, jumping past a declaration with initialization is not allowed. The only exception to this rule is the declaration of a POD type, because POD types can be declared without initialization.

int a;  // declaration without initialization

For non-POD types, declaration without initialization is never possible. The object can’t be declared without being initialized because the constructor is always called.

std::string str;    // declaration that includes initialization, constructor is called

Rationale Behind the Rule

A question comes to mind:

Why is jumping past a declaration-without-initialization allowed?

I don’t have the answer. But it might have something to do with the following:

  • An initialization runs when control passes through the declaration, whereas an assignment is an ordinary statement that can run at any later point. Jumping over the declaration would leave the variable in scope without its initial value.
  • If initialization does not take place, the destructor should not be called. But destruction always takes place when the object goes out of scope, and destruction without construction is not sound. Therefore execution must not jump over an initialization.

Conclusion

  • Declaration of POD types without initialization is allowed in a case statement.
  • Declaration of non-POD types can be done only inside braces.
  • In a switch-case, all the case statements are in the same scope.
  • The switch-case is nothing but a collection of goto and labels.
  • goto-label jump is not allowed if jump skips declaration with initialization.
  • Good practice would be to always use braces after case statements if declarations are involved.
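The points above can be seen together in one small function. This is a minimal sketch written for this article (the function describe and its return values are made up): case 0 uses braces so that it can declare and initialize both a std::string and an int, while case 1 relies on the exception and declares an int without an initializer.

```cpp
#include <string>

// Hypothetical example function illustrating the rules listed above.
std::string describe(int code)
{
    std::string result;
    switch (code)
    {
    case 0:
    {
        // The braces limit the scope, so no later label can jump past
        // these initializations.
        std::string label("zero");
        int width = 4;
        result = label + ":" + std::to_string(width);
        break;
    }
    case 1:
        int scratch;   // allowed: basic type, declared without an initializer
        scratch = 7;   // plain assignment, not an initialization
        result = std::to_string(scratch);
        break;
    default:
        // scratch is in scope here, but its declaration was jumped over
        result = "other";
        break;
    }
    return result;
}
```

Removing the braces from case 0 makes the function ill-formed, because the jumps to case 1 and default would then bypass the initializations of label and width.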

References

  • C++ Standard ISO/IEC-14882-2003
  • http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=171
  • http://www.fnal.gov/docs/working-groups/fpcltf/Pkg/ISOcxx/doc/POD.html
  • http://www.parashift.com/c++-faq-lite/intrinsic-types.html#faq-26.7

XML Namespaces

Namespaces

Namespaces are used to prevent name conflicts or name collisions by placing names inside another name. This other name, i.e. the name of the abstract container, is called a namespace.

Example:

  • farming:crop
  • image_editing:crop

Here the identifier crop has more than one meaning. Using this identifier without a namespace is ambiguous. Placing each meaning in its own namespace (or category), farming and image_editing, gives unique names and hence prevents naming collisions.

More Examples:

  • computer_keyboard:key
  • cryptography:key
  • lock:key
  • surface:flat
  • real_estate:flat

In Programming Languages

Namespaces exist in programming languages as well, where they are used to prevent collisions between identifiers such as variables, classes, functions etc.

For example, two libraries, lib1 and lib2, might use the same name, say getVersion(), for a function which returns the version of the library.

A call to this function is ambiguous in an environment where both libraries are used. This can be prevented by placing these functions in separate namespaces.

#include <string>

namespace lib1   // declares the function in the namespace lib1
{
 std::string getVersion();
}

namespace lib2   // declares the function in the namespace lib2
{
 std::string getVersion();
}

Now users can make the following calls without any ambiguity.

//
// unambiguous calls by using qualified names
//
std::string lib1Version = lib1::getVersion(); // refers to lib1

std::string lib2Version = lib2::getVersion(); // refers to lib2

XML Namespaces

Coming back to the topic of XML namespaces: identifiers in XML, such as elements and attributes, can also face name conflicts. In the following example the first element represents a key press on a keyboard, whereas the second element represents a key in the context of cryptography.

<key>
 <name>D</name>
</key>

<key>
 <private>AdfgkloPjad</private>
 <public>hpkrpioHml</public>
</key>

Declaration

A namespace is declared by using the XML reserved attribute xmlns. The declaration can be made on any element of the XML document. Two syntaxes are available to define the attribute:

  • xmlns:prefix=URI
  • xmlns=URI

The first one associates the prefix with the URI; prefix: can then be prepended to elements or attributes within the scope of the namespace declaration. The second one declares a default namespace: all unprefixed elements within the scope of the declaration belong to the given URI. Unprefixed attributes are not affected by a default namespace. Let’s look at both methods with examples.

xmlns:prefix=URI, Defining Prefix based Namespaces

XML attribute

xmlns:lock="http://www.dahiya.co.in/lock-key"

declares the prefix lock and associates it with the namespace URI "http://www.dahiya.co.in/lock-key". The prefix lock: can now be used on the current element and its child elements.

<lock:key xmlns:lock="http://www.dahiya.co.in/lock-key">
 <lock:name>D</lock:name>
</lock:key>

<crypto:key xmlns:crypto="http://www.dahiya.co.in/crypto-key">
 <crypto:private>AdfgkloPjad</crypto:private>
 <crypto:public>hpkrpioHml</crypto:public>
</crypto:key>

The declaration can be done in the parent elements as well because the declaration has a scope of current elements and children.

<root xmlns:lock="http://www.dahiya.co.in/lock-key" xmlns:crypto="http://www.dahiya.co.in/crypto-key">

<lock:key>
 <lock:name>D</lock:name>
</lock:key>

<crypto:key>
 <crypto:private>ASKDIMBJ</crypto:private>
 <crypto:public>ASKDIMBJ</crypto:public>
</crypto:key>

</root>

xmlns=URI, Default Namespaces

XML attribute

xmlns="http://www.dahiya.co.in/lock-key"

defines the default namespace for elements that lack a namespace prefix. Unprefixed attributes do not pick up the default namespace. The default namespace has a scope of the current element and its child elements.

<key xmlns="http://www.dahiya.co.in/lock-key">
 <name>D</name>
</key>

<key xmlns="http://www.dahiya.co.in/crypto-key">
 <private>ASKDIMBJ</private>
 <public>ASKDIMBJ</public>
</key>
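What a namespace-aware parser does with these declarations can be sketched in a few lines of C++. This is a simplification written for illustration, not taken from any XML library: expandName is a hypothetical helper, the bindings map stands for the declarations currently in scope (an empty prefix denotes the default namespace), and the "{URI}local" notation is just a common way of writing an expanded name.

```cpp
#include <map>
#include <string>

// Hypothetical helper: resolves a qualified element name such as "lock:key"
// against the namespace bindings in scope and returns "{URI}local".
std::string expandName(const std::string& qname,
                       const std::map<std::string, std::string>& bindings)
{
    const std::string::size_type colon = qname.find(':');
    const bool hasPrefix = (colon != std::string::npos);
    const std::string prefix = hasPrefix ? qname.substr(0, colon) : "";
    const std::string local = hasPrefix ? qname.substr(colon + 1) : qname;

    const auto it = bindings.find(prefix);
    if (it == bindings.end())
        return qname; // no binding in scope; a real parser rejects an unbound prefix
    return "{" + it->second + "}" + local;
}
```

With lock bound to the lock-key URI and the default namespace bound to the crypto-key URI, lock:key and key resolve to two different expanded names even though both have the local name key.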

Redefining Namespaces

A namespace can be redefined by using the same syntaxes with a new URI.

  • xmlns=New-URI
  • xmlns:prefix=New-URI

Unsetting Namespaces

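Namespaces can be unset by using empty URIs.

  • xmlns=""
  • xmlns:prefix=""

Note that after unsetting a prefix-based namespace, the prefix must not be used. Also note that unsetting a prefix this way is permitted only by Namespaces in XML 1.1; version 1.0 allows an empty value only for the default namespace.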

Namespace Names

The XML namespaces specification identifies a namespace by a URI reference, but it does not require that the URI be a URL or that it point to anything. The use of URLs (a type of URI) in the form of http:// is nevertheless quite common. The given URL is not required to host anything; the whole idea is to have a unique namespace name. A domain name is a unique ID, so namespaces created from a domain name come out unique. In the above examples I have used my domain to create namespaces.

  • http://www.dahiya.co.in/lock-key
  • http://www.dahiya.co.in/crypto-key

A little off topic but worth mentioning: Java has a convention of using the reversed domain name to name packages. The motive is to have a unique name for every package, which prevents package name conflicts if everyone follows the convention.

  • com.sun.team1.network
  • com.sun.team2.network
  • com.apple.quicktime.v2
  • edu.cmu.cs.bovik.cheese

XML Namespaces in Real Use

XMPP uses XML namespaces. The following IQ stanzas use the default namespace on the query element to differentiate the various types of IQ stanzas.

<iq type='get' from='romeo@montague.net/orchard' to='plays.shakespeare.lit' id='info1'>
 <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

<iq type='get' from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony' id='version_1'>
 <query xmlns='jabber:iq:version'/>
</iq>

<iq from='romeo@montague.net/orchard' id='last1' to='juliet@capulet.com' type='get'>
 <query xmlns='jabber:iq:last'/>
</iq>

QXmpp 0.2.0 Release

Last year, I founded an open source project, QXmpp. It is an XMPP client library based on Qt, licensed under the LGPL. The project is now more than a year old. The very first public release, QXmpp 0.1.0, was made on June 14, 2009. And last Sunday, we released QXmpp 0.2.0.

QXmpp 0.2.0 comes with numerous features (many XEPs and new authentication schemes), many bug fixes, architectural improvements and Doxygen documentation. Have a look at the Changelog for a complete list of new features and changes in this release.

Thanks to the authors, the group and our users who have contributed in the form of patches, bug reports and suggestions.

QXmpp 0.2.0 Release:

Project Page | Changelog | Readme | API Documentation | Download