Monday, July 30, 2012

implementation of Redis strings (sds.c)


Hacking Strings (copied from redis.io, with two functions added to this article)

The implementation of Redis strings is contained in sds.c (sds stands for Simple Dynamic Strings).
The C structure sdshdr declared in sds.h represents a Redis string:
struct sdshdr {
    long len;
    long free;
    char buf[];
};
The buf character array stores the actual string.
The len field stores the length of the string held in buf, not counting the terminating null byte. This makes obtaining the length of a Redis string an O(1) operation.
The free field stores the number of additional bytes available for use.
Together the len and free fields can be thought of as holding the metadata of the buf character array.

Creating Redis Strings

A new data type named sds is defined in sds.h to be a synonym for a character pointer:
typedef char *sds;
There are two inline functions in sds.h, sdslen and sdsavail, which return a Redis string's length and its available space.


static inline size_t sdslen(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr))); // get the sds structure address
    return sh->len;
}



static inline size_t sdsavail(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->free;
}



The sdsnewlen function defined in sds.c creates a new Redis string:
sds sdsnewlen(const void *init, size_t initlen) {
    struct sdshdr *sh;

    sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
#ifdef SDS_ABORT_ON_OOM
    if (sh == NULL) sdsOomAbort();
#else
    if (sh == NULL) return NULL;
#endif
    sh->len = initlen;
    sh->free = 0;
    if (initlen) {
        if (init) memcpy(sh->buf, init, initlen);
        else memset(sh->buf,0,initlen);
    }
    sh->buf[initlen] = '\0';
    return (char*)sh->buf;
}
Remember that a Redis string is a variable of type struct sdshdr, yet sdsnewlen returns a character pointer!
That's a trick and needs some explanation.
Suppose I create a Redis string using sdsnewlen like below:
sdsnewlen("redis", 5);
// Callers should normally use sdsnew instead of this:
// sdsnew calculates the length of the init string for us,
// which reduces the risk of passing the wrong size.
This creates a new variable of type struct sdshdr, allocating memory for the len and free fields as well as for the buf character array.
sh = zmalloc(sizeof(struct sdshdr)+initlen+1); // initlen is length of init argument.
After sdsnewlen successfully creates a Redis string the result is something like:
-----------
|5|0|redis|
-----------
^    ^
sh   sh->buf
sdsnewlen returns sh->buf to the caller.
What do you do if you need to free the Redis string pointed by sh?
You want the pointer sh but you only have the pointer sh->buf.
Can you get the pointer sh from sh->buf?
Yes, with pointer arithmetic. Notice from the ASCII art above that if you subtract the size of two longs from sh->buf you get the pointer sh.
The size of two longs happens to be the size of struct sdshdr, because the flexible array member buf adds nothing to sizeof.
Look at the sdslen function to see this trick at work:

size_t sdslen(const sds s) {
    struct sdshdr *sh = (void*) (s-(sizeof(struct sdshdr)));
    return sh->len;
}
Knowing this trick, you can easily go through the rest of the functions in sds.c.
The Redis string implementation is hidden behind an interface that accepts only character pointers. Users of Redis strings need not care about how it's implemented and can treat a Redis string as a character pointer.
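
The header-before-buffer trick can also be modelled outside C. The following is only a rough Python sketch of the layout, not Redis code: a bytearray stands in for the allocated memory, an integer offset stands in for the char pointer, and the names HEADER, sds_new and sds_len are made up for this illustration.

```python
import struct

# Two native C longs (len, free), mirroring the fields of struct sdshdr.
HEADER = struct.Struct("@ll")

def sds_new(init: bytes):
    """Allocate header + bytes + terminating null; return (memory, offset of buf)."""
    memory = bytearray(HEADER.size + len(init) + 1)
    HEADER.pack_into(memory, 0, len(init), 0)   # len = initlen, free = 0
    memory[HEADER.size:HEADER.size + len(init)] = init
    # The caller only gets the position of buf, like the char pointer in C.
    return memory, HEADER.size

def sds_len(memory: bytearray, s: int) -> int:
    """Step back over the header to read len, as sdslen does with s - sizeof(struct sdshdr)."""
    length, _free = HEADER.unpack_from(memory, s - HEADER.size)
    return length

memory, s = sds_new(b"redis")
print(sds_len(memory, s), bytes(memory[s:s + 5]))
```

The caller never touches the header directly; every helper recovers it from the buffer position, which is the same idea the C functions rely on.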



Thursday, July 26, 2012

git branch (translation, unfinished)

Git branch

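Almost every version control system supports branches, which means we can create a new branch to develop a feature while avoiding messing up the mainline code. In many version control systems, creating a branch is a laborious process that usually requires creating a new directory; for a large project, just creating a branch can wear you out.

What is a Branch?

To better understand Git's branch management, we first need to look at how Git stores data. Git does not store only the changes made each time; it stores a series of snapshots. When you commit, Git stores a commit object. This object contains a pointer to the snapshot of the staged content, the author of the commit, the commit message, and zero or more pointers to the parent commit(s) of this commit.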

Tuesday, July 24, 2012

open a new tab in gnome terminal

#!/bin/sh

DB1BUGS="mysql xxx"
DB1HW="mysql xxx"
LiveBUGS="mysql xxx"
LiveHW="mysql xxx"
LocalBUGS="mysql xxx"
LocalHW="mysql xxx"

CMD="gnome-terminal --maximize"

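# Add one --tab option per connection; the loop variable holds a variable NAME.
for cmd in DB1HW DB1BUGS LiveHW LiveBUGS LocalHW LocalBUGS
    do
        # Indirect lookup: expand the variable whose name is stored in $cmd
        eval cmd_str='$'$cmd
        CMD="$CMD --tab --title $cmd -e \"$cmd_str\""
    done

# eval re-parses $CMD so that each quoted -e "..." stays a single argument
eval "$CMD"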

exit

Monday, July 16, 2012

Friday, July 13, 2012

import

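import can take the following forms:
  • import A (the recommended form in general)
        Imports A into the current namespace.
        A variant is import A.B.C.D; in that case A, B and C must be packages, while D can only be a module or a package.

  • from A import B
        Imports B into the current namespace. Rebinding B afterwards does not affect A.B.

  • X = __import__('X') (for when the name of X is only known at run time)

  • from A import *
        Brings all public names of A into the current namespace. Public names here are the names that do not start with _ (an underscore).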
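
A small sketch of two of the forms above, using standard library modules (math, json and os.path are just convenient stand-ins for A, B and X): rebinding a name obtained with from ... import leaves the module attribute alone, and a module whose name is only known at run time can be loaded with __import__ or importlib.import_module.

```python
import importlib
import math
import os
from math import pi

# Rebinding the local name does not touch the attribute on the module.
pi = 3
print(math.pi)  # still the real value of pi

# The module name is only known at run time (here it is just a string).
module_name = "json"
json_module = __import__(module_name)
print(json_module.loads("[1, 2, 3]"))

# For dotted names, __import__ returns the top-level package, while
# importlib.import_module returns the submodule itself.
top_level = __import__("os.path")
submodule = importlib.import_module("os.path")
print(top_level is os, submodule is os.path)
```

For dotted names importlib.import_module is usually the more convenient of the two, since it hands back the module that was actually asked for.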


import vs loading

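No matter how many times a module is imported, it is loaded only once. If the module contains executable code at the top level (code not wrapped in a function or class), that code runs only when the module is first loaded; later imports do not run it again.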
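
The load-once behaviour can be checked with a throwaway module. This is a minimal sketch: the module name load_once_demo and its one-line body are invented for the demonstration, and the file is written to a temporary directory.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "load_once_demo"  # hypothetical module created just for this demo
captured = io.StringIO()

with tempfile.TemporaryDirectory() as tmp:
    # The module's only top-level statement announces that it is being executed.
    Path(tmp, MODULE_NAME + ".py").write_text('print("top-level code running")\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        with contextlib.redirect_stdout(captured):
            first = importlib.import_module(MODULE_NAME)   # loads and executes the file
            second = importlib.import_module(MODULE_NAME)  # answered from sys.modules
    finally:
        sys.path.remove(tmp)

runs = captured.getvalue().count("top-level code running")
print(runs, first is second)
```

Both imports return the same module object, and the top-level print fires only for the first one.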



What does python do to import a Module?
1, Check sys.modules to see whether the module has already been imported; if so, Python uses the existing module object as is.
2, Otherwise, create a new, empty module object.
3, Insert that module object into sys.modules.
4, Load the module code object.
5, Execute the module code object in the new module's namespace. All variables assigned by the code will be available via the module object.
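
The five steps above can be sketched in a few lines. This is a simplified illustration, not the real import machinery: the function name simple_import is made up, and the source code is passed in as a string instead of being located on disk by the import system.

```python
import sys
import types

def simple_import(name, source):
    """Toy version of the five steps; real imports find the source via sys.path."""
    # 1. Reuse the module if it has already been imported.
    if name in sys.modules:
        return sys.modules[name]
    # 2. Create a new, empty module object.
    module = types.ModuleType(name)
    # 3. Register it in sys.modules before running any of its code.
    sys.modules[name] = module
    # 4. Load (here: compile) the module code object.
    code = compile(source, f"<{name}>", "exec")
    # 5. Execute the code in the module's namespace.
    try:
        exec(code, module.__dict__)
    except BaseException:
        del sys.modules[name]  # do not leave a half-initialised module behind
        raise
    return module

demo = simple_import("import_steps_demo", "answer = 6 * 7\n")
again = simple_import("import_steps_demo", "answer = 0\n")  # step 1 short-circuits
print(demo.answer, again is demo)
```

Registering the module before executing it (step 3 before step 5) is what lets circular imports find a partially initialised module instead of recursing forever.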


Importing * From a Package

The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don’t see a use for importing * from their package.


If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements. Consider this code:
import sound.effects.echo
import sound.effects.surround
from sound.effects import *
In this example, the echo and surround modules are imported in the current namespace because they are defined in the sound.effects package when the from...import statement is executed. (This also works when __all__ is defined.)
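
Both cases can be tried on throwaway packages. The sketch below is only an illustration under stated assumptions: the package names star_demo_all and star_demo_plain, the reverse submodule, and the helpers make_package and star_import are all invented here, and the packages are created in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def make_package(root, name, init_source):
    """Create a package with three submodules: echo, surround, reverse."""
    pkg = Path(root, name)
    pkg.mkdir()
    (pkg / "__init__.py").write_text(init_source)
    for sub in ("echo", "surround", "reverse"):
        (pkg / f"{sub}.py").write_text(f"NAME = {sub!r}\n")

def star_import(package):
    """Run 'from package import *' in a fresh namespace and list the public names."""
    namespace = {}
    exec(f"from {package} import *", namespace)
    return sorted(key for key in namespace if not key.startswith("_"))

with tempfile.TemporaryDirectory() as tmp:
    make_package(tmp, "star_demo_all", '__all__ = ["echo", "surround"]\n')
    make_package(tmp, "star_demo_plain", "")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # With __all__: exactly the listed submodules are imported.
        with_all = star_import("star_demo_all")
        # Without __all__: only submodules that were already loaded show up.
        importlib.import_module("star_demo_plain.echo")
        without_all = star_import("star_demo_plain")
    finally:
        sys.path.remove(tmp)

print(with_all, without_all)
```

In the first package the list in __all__ decides what is imported; in the second, echo appears only because it was loaded explicitly before the star import.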

Thursday, July 12, 2012

upgrade bugzilla to 4.2 from 3.6

Today I decided to upgrade my local Bugzilla from 3.6 to 4.2, and sure enough I ran into lots of errors and headaches. These are the problems I hit:

1, XML-RPC support was disabled because the required modules failed to load. The fix was easy: I just installed all the modules. In case you are interested in how I found this: we use Bugzilla's XML-RPC interface to update our data, and after several failed attempts and a double check of our own code, we concluded it was a Bugzilla installation issue.

2, The DB schema didn't match. First I deleted the record in the bz_schema table, then ran checksetup.pl, and then more errors emerged. Basically they were foreign key constraint issues, so I deleted those foreign keys from the related tables. Believe me, that's a huge task.

3, There was some invalid data in the DB already; if that's the case for you too, correct those records or delete them.


Monday, July 9, 2012

opera interview

General:
1, Regular expression
2, how to retrieve the main body of an HTML page
3, write a B-tree (also the B* and B+ variants)
4, how to implement a long HTTP connection if proxy doesn't support connection header


PY related:
1, import details
2, yield details
3, meta programming


MySQL related:
1, MySQL index