Wednesday, September 12, 2012

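Installing Puppet

I've recently been learning to use Puppet for configuration management. The host operating system is CentOS, so installing Puppet went fairly smoothly: ready-made packages are available, and a few yum install commands were enough. A few Ruby dependencies are needed as well, mainly ssl and xmlrpc/client.

The main problem I ran into was:

dnsdomainname: Unknown host

Cause: by default the puppet agent looks for a server named puppet. You can either add a DNS CNAME record for puppet, or specify the puppet master's hostname on every agent.

Solution:

1, add the local machine's hostname to the hosts file
2, add puppet as a name for 127.0.0.1

You can also run the dnsdomainname command to check whether the host's own hostname fails to resolve.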
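
The hosts-file fix above can be checked with a small script. This is only a sketch: it parses hosts-style text and reports which names it defines. The sample content and the node1.example.com host are made up for illustration; on a real machine you would read /etc/hosts instead.

```python
def names_in_hosts(hosts_text):
    """Return {name: address} for every name defined in hosts-style text."""
    names = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *aliases = line.split()
        for alias in aliases:
            names.setdefault(alias, address)  # the first definition wins
    return names


# Hypothetical hosts content after applying both fixes.
SAMPLE_HOSTS = """\
127.0.0.1    localhost puppet
192.168.1.20 node1.example.com node1   # this machine's own hostname
"""

mapping = names_in_hosts(SAMPLE_HOSTS)
for name in ("puppet", "node1.example.com", "missing-host"):
    print(name, "->", mapping.get(name, "not in hosts"))
```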

Monday, August 27, 2012

Connecting virsh to a remote VM host from a Mac OS X box

Just steps here:

1, brew install libvirt
2, generate ssh key, and copy your public key to your remote vm host
3, virsh -c qemu+ssh://name@host/system

There is an issue when connecting to a remote Debian/Ubuntu host: virsh fails with an error about a hangup event on the socket. The Mac client assumes the libvirt socket resides under /usr/local/lib, but on the remote host it is actually at /var/run/libvirt/libvirt-sock. So when connecting to a Debian/Ubuntu KVM host, pass the socket path as a parameter:

qemu+ssh://user@remote-host/system?socket=/var/run/libvirt/libvirt-sock

Bingo, that's it.

Reference: http://jedi.be/blog/2011/09/13/libvirt-fog-provider/

Wednesday, August 8, 2012

custom template filter

Django comes with many filters and tags, but we can also create our own and make them available in a template using

{% load %}

First create a new directory named templatetags at the same level as views.py and models.py, and don't forget to touch a new __init__.py file in the newly created directory. Then create a new module file in it; the module's name is the name you will use to load the filters later (pick it carefully to avoid collisions with existing tag and filter libraries). For example, create a file extra_filters.py in the templatetags directory, then load it in your template with:

{% load extra_filters %}

For the new filter library to be valid, the module must contain a module-level variable named register that is a template.Library instance:

from django import template
register = template.Library()

Then register the filter, either by calling register.filter directly or by using it as a decorator. With the decorator you can omit name='cut', and the filter's name defaults to the function's name, here cut.

# call form, used after cut has been defined
register.filter('cut', cut)

# or the decorator form
@register.filter(name='cut')
def cut(value, arg):
    pass

Filters and auto-escaping

When creating custom filters, give some thought to how the filter will interact with Django's auto-escaping. Three types of strings can be passed around inside the template code:

Raw strings

Python native str/unicode types. If Django's auto-escaping is on, they are escaped on output; otherwise they are left unchanged.

Safe strings

They are of type SafeString/SafeUnicode, which share the same parent class, SafeData. They are commonly used for output that contains raw HTML and for which the necessary escaping has already been done.

Strings marked as 'needing escaping'

These strings are always escaped on output, regardless of whether auto-escaping is on or off. They are escaped only once, however.

Template filters fall into one of two situations:
1, The filter doesn't introduce any HTML-unsafe characters. Django will take care of the escaping; all you need to do is set is_safe to True,

@register.filter(is_safe=True)
def my_filter(value):
    return value

This flag tells Django that if a safe string is passed into the filter, the result is still safe; if a non-safe string is passed in, Django escapes the result for you when necessary. Be careful: setting this flag to True coerces the filter's return value to a string, so don't use it for filters that should return other Python types such as bool or list.

2, Alternatively, your filter code can manually take care of any necessary escaping. This is necessary when you're introducing new HTML markup into the result. You want to mark the output as safe from further escaping so that your HTML markup isn't escaped further, so you'll need to handle the input yourself.

To mark the output as a safe string, use django.utils.safestring.mark_safe().

Be careful, though. You need to do more than just mark the output as safe. You need to ensure it really is safe, and what you do depends on whether auto-escaping is in effect. The idea is to write filters that can operate in templates where auto-escaping is either on or off in order to make things easier for your template authors.

In order for your filter to know the current auto-escaping state, set the needs_autoescape flag to True when you register your filter function. (If you don't specify this flag, it defaults to False). This flag tells Django that your filter function wants to be passed an extra keyword argument, called autoescape, that is True if auto-escaping is in effect and False otherwise.

For example, let's write a filter that emphasizes the first character of a string:

from django.utils.html import conditional_escape
from django.utils.safestring import mark_safe

@register.filter(needs_autoescape=True)
def initial_letter_filter(text, autoescape=None):
    first, other = text[0], text[1:]
    if autoescape:
        esc = conditional_escape
    else:
        esc = lambda x: x
    result = '<strong>%s</strong>%s' % (esc(first), esc(other))
    return mark_safe(result)

The needs_autoescape flag and the autoescape keyword argument mean that our function will know whether automatic escaping is in effect when the filter is called. We use autoescape to decide whether the input data needs to be passed through django.utils.html.conditional_escape or not. (In the latter case, we just use the identity function as the "escape" function.) The conditional_escape() function is like escape() except it only escapes input that is not a SafeData instance. If a SafeData instance is passed to conditional_escape(), the data is returned unchanged.

Finally, in the above example, we remember to mark the result as safe so that our HTML is inserted directly into the template without further escaping.

There's no need to worry about the is_safe flag in this case (although including it wouldn't hurt anything). Whenever you manually handle the auto-escaping issues and return a safe string, the is_safe flag won't change anything either way.
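
To see the escaping rules in action without a Django project, here is a standard-library-only model of the idea. It is a sketch, not Django's implementation: SafeText, mark_safe and conditional_escape below are simplified stand-ins written for this post, built on html.escape.

```python
import html


class SafeText(str):
    """Stand-in for Django's SafeData: a str that is already safe to output."""


def mark_safe(s):
    return SafeText(s)


def conditional_escape(s):
    # Escape only strings that have not been marked safe already.
    if isinstance(s, SafeText):
        return s
    return SafeText(html.escape(str(s)))


def initial_letter_filter(text, autoescape=True):
    first, other = text[:1], text[1:]
    esc = conditional_escape if autoescape else (lambda x: x)
    # We add our own markup, so we escape the input ourselves and then
    # mark the whole result as safe.
    return mark_safe('<strong>%s</strong>%s' % (esc(first), esc(other)))


print(initial_letter_filter('<b>x'))                    # input gets escaped
print(initial_letter_filter('<b>x', autoescape=False))  # input left alone
print(conditional_escape(initial_letter_filter('a&b'))) # not escaped twice
```

The last line shows why the result is marked safe: passing it through conditional_escape again leaves it unchanged, so the markup added by the filter survives.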

ps: https://docs.djangoproject.com/en/1.4/howto/custom-template-tags/

Monday, August 6, 2012

python virtualenv

There are already lots of articles explaining why we need virtualenv to provide a separate environment (sandbox): different projects may depend on different libraries, or you may not have write permission to the /path/to/python/site-packages directory.

I use pip to install virtualenv; it's pretty easy:

pip install virtualenv

Then create a sandbox for my project, with the --no-site-packages option to isolate it from the main site-packages:

virtualenv --no-site-packages social-map

Finally, I cloned my project directly into social-map. To activate this environment:

source social-map/bin/activate

To deactivate the environment, just type deactivate. I didn't know how to deactivate the environment before; what I did was close the terminal and open a new one, very silly...

We can check the settings with `which python` etc. If you look at the activate script in ~/your/env/bin/, you can see that virtualenv changes your PATH, putting the virtualenv's bin directory at the beginning, and also changes your terminal's PS1.

VIRTUAL_ENV="/Users/pengphy/Codes/virtualenvs/social-map"
PATH="$VIRTUAL_ENV/bin:$PATH"
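
What those two lines do to the environment can be modelled in a few lines of Python. This is a simplified sketch of the effect, not the activate script itself; the /tmp/envs/social-map path is a made-up example, and the PS1 change is left out.

```python
import os


def activate(environ, venv_dir):
    """Return a copy of environ changed roughly the way activate changes it."""
    env = dict(environ)
    env["_OLD_VIRTUAL_PATH"] = env.get("PATH", "")  # remembered for deactivate
    env["VIRTUAL_ENV"] = venv_dir
    env["PATH"] = os.path.join(venv_dir, "bin") + os.pathsep + env["_OLD_VIRTUAL_PATH"]
    return env


def deactivate(environ):
    """Undo activate: restore the old PATH and drop VIRTUAL_ENV."""
    env = dict(environ)
    env["PATH"] = env.pop("_OLD_VIRTUAL_PATH")
    env.pop("VIRTUAL_ENV", None)
    return env


before = {"PATH": os.pathsep.join(["/usr/local/bin", "/usr/bin"])}
active = activate(before, "/tmp/envs/social-map")  # hypothetical location
print(active["PATH"].split(os.pathsep)[0])  # the sandbox's bin directory is searched first
print(deactivate(active) == before)
```

Because the sandbox's bin directory comes first on PATH, `which python` finds the sandbox's interpreter before the system one.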

http://www.doughellmann.com/articles/pythonmagazine/completely-different/2008-02-ipython-and-virtualenv/index.html

http://mitchfournier.com/2010/06/25/getting-started-with-virtualenv-isolated-python-environments/

Wednesday, August 1, 2012

Using Redis Data Types

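To make better use of Redis, we need to understand the data types it supports. Redis is often described as an in-memory key-value store. Redis does keep all of its data in memory, and for persistence it also writes the data to disk (Redis decides when to write based on how many keys have changed; by default, every minute if 1000 or more keys have changed, ranging to every 15 minutes if 9 or fewer keys have changed). But Redis is not just a key-value store. It supports five data structures, and only one of them is a typical key-value structure. Understanding these data structures, and modelling problems in terms of them, is essential for using Redis efficiently.

Redis supports built-in data types, which lets developers structure their data in more meaningful ways; this sets it apart from most other NoSQL/key-value storage solutions. The built-in types also bring an unexpected benefit: type-specific operations are handled inside Redis, which is more efficient and faster than handling them outside Redis.

Before diving into the data structures, there are a few things to understand and keep in mind when designing keys.

Keep the key space consistent. A key can contain any characters, and we can use a separator to carve out namespaces. A common convention is to separate the parts with colons, as in cache:project:319:tasks.
When defining a key, try to pick one of reasonable length with a clear meaning.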
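
One way to keep the key space consistent is to build every key through a single helper. The following is a minimal sketch; make_key is a name invented here, not part of any Redis client.

```python
def make_key(*parts, sep=":"):
    """Join the parts into a namespaced key such as cache:project:319:tasks."""
    parts = [str(p) for p in parts]
    for p in parts:
        if not p or sep in p:
            # An empty part or an embedded separator would blur the namespaces.
            raise ValueError("invalid key part: %r" % p)
    return sep.join(parts)


print(make_key("cache", "project", 319, "tasks"))
print(make_key("users", "leto"))
```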

Strings
The simplest data type in Redis is a string. Strings are also the typical (and frequently the sole) data type in other key-value storage engines. You can store strings of any kind, including binary data. You might, for example, want to cache image data for avatars in a social network. The only thing you need to keep in mind is that a specific value inside Redis shouldn’t go beyond 512MB of data.
EG, set users:leto "{name: leto, planet: dune, likes: [spice]}"

Lists
Lists in Redis are ordered lists of binary safe strings, implemented on the idea of a linked list. Looking up a specific value by index is relatively inefficient, while operations at the head and tail are efficient. You might want to use lists in order to implement structures such as queues, a recipe for which we’ll look into later in the book.
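
The queue idea can be modelled in plain Python with collections.deque, which is also cheap at both ends. This is an in-memory sketch only, with no Redis server involved; the lpush and rpop helpers are named after the Redis commands they imitate.

```python
from collections import deque


def lpush(q, value):
    """Push onto the head and return the new length, like LPUSH."""
    q.appendleft(value)
    return len(q)


def rpop(q):
    """Pop from the tail, or return None when the list is empty, like RPOP."""
    return q.pop() if q else None


jobs = deque()
lpush(jobs, "job-1")
lpush(jobs, "job-2")
print(rpop(jobs))  # the oldest job comes out first
print(rpop(jobs))
print(rpop(jobs))  # empty queue
```

Pushing on one end and popping from the other gives first-in, first-out order, which is what a work queue needs.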

Hashes
Much like traditional hash tables, hashes in Redis store several fields and their values inside a specific key. Hashes are a perfect option to map complex objects inside Redis, by using fields for object attributes (example fields for a car object might be “color”, “brand”, “license plate”). EG.
hset users:goku powerlevel 9000
hget users:goku powerlevel

Sets and Sorted Sets
Sets in Redis are an unordered collection of binary-safe strings. Elements in a given set can have no duplicates. For instance, if you try to add an element wheel to a set twice, Redis will ignore the second operation. Sets allow you to perform typical set operations such as intersections and unions.
While these might look similar to lists, their implementation is quite different and they are suited to different needs due to the different operations they make available. In terms of memory, sets use more than lists.

Sorted sets are a particular case of the set implementation that are defined by a score in addition to the typical binary-safe string. This score allows you to retrieve an ordered list of elements by using the ZRANGE command.
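
To make the role of the score concrete, here is a small in-memory model of a sorted set. It is a sketch under simplifying assumptions: SortedSetSketch is an invented class that keeps a sorted Python list, so the search is logarithmic but the insertion still shifts list elements. It is not how Redis implements sorted sets internally.

```python
import bisect


class SortedSetSketch:
    def __init__(self):
        self._scores = {}   # member -> score
        self._ordered = []  # (score, member) pairs, kept sorted

    def zadd(self, score, member):
        """Add a member, or move it if it already exists with another score."""
        if member in self._scores:
            old = (self._scores[member], member)
            self._ordered.pop(bisect.bisect_left(self._ordered, old))
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zrange(self, start, stop):
        """Members by rank; stop is inclusive and negatives count from the end."""
        n = len(self._ordered)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        return [member for _, member in self._ordered[start:stop + 1]]


ranks = SortedSetSketch()
ranks.zadd(3, "carol")
ranks.zadd(1, "alice")
ranks.zadd(2, "bob")
print(ranks.zrange(0, -1))  # ordered by score, not by insertion order
ranks.zadd(0, "carol")      # same member, new score: it moves to the front
print(ranks.zrange(0, 1))
```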

BIG O

Redis documentation tells us the Big O notation for each of its commands. It also tells us what the factors are that influence the performance. Let's look at some examples.

The fastest anything can be is O(1) which is a constant. Whether we are dealing with 5 items or 5 million, you'll get the same performance. The sismember command, which tells us if a value belongs to a set, is O(1). sismember is a powerful command, and its performance characteristics are a big reason for that. A number of Redis commands are O(1).

Logarithmic, or O(log(N)), is the next fastest possibility because it needs to scan through smaller and smaller partitions. Using this type of divide and conquer approach, a very large number of items quickly gets broken down in a few iterations. zadd is a O(log(N)) command, where N is the number of elements already in the set.

Next we have linear commands, or O(N). Looking for a non-indexed row in a table is an O(N) operation. So is using the ltrim command. However, in the case of ltrim, N isn't the number of elements in the list, but rather the elements being removed. Using ltrim to remove 1 item from a list of millions will be faster than using ltrim to remove 10 items from a list of thousands. (Though they'll probably both be so fast that you wouldn't be able to time it.)

zremrangebyscore, which removes elements from a sorted set with a score between a minimum and a maximum value, has a complexity of O(log(N)+M). This makes it a mix. By reading the documentation we see that N is the number of total elements in the set and M is the number of elements to be removed. In other words, the number of elements that'll get removed is probably going to be more significant, in terms of performance, than the total number of elements in the set.

The sort command, which we'll discuss in greater detail in the next chapter has a complexity of O(N+M*log(M)). From its performance characteristic, you can probably tell that this is one of Redis' most complex commands.

There are a number of other complexities; the two remaining common ones are O(N^2) and O(C^N). The larger N is, the worse these perform relative to a smaller N. None of Redis' commands have this type of complexity.

Monday, July 30, 2012

implementation of Redis strings(sds.c)

Hacking Strings (copied from redis.io, with two functions added in this article)

The implementation of Redis strings is contained in sds.c (sds stands for Simple Dynamic Strings).

The C structure sdshdr declared in sds.h represents a Redis string:

struct sdshdr {
    long len;
    long free;
    char buf[];
};

The buf character array stores the actual string.

The len field stores the length of buf. This makes obtaining the length of a Redis string an O(1) operation.

The free field stores the number of additional bytes available for use.

Together the len and free field can be thought of as holding the metadata of the buf character array.

Creating Redis Strings

A new data type named sds is defined in sds.h to be a synonym for a character pointer:

typedef char *sds;

There are two inline functions in sds.h, sdslen and sdsavail, which return a Redis string's length and its available space.

static inline size_t sdslen(const sds s) {
    /* step back from buf to the start of the sdshdr header */
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->len;
}

static inline size_t sdsavail(const sds s) {
    struct sdshdr *sh = (void*)(s-(sizeof(struct sdshdr)));
    return sh->free;
}

sdsnewlen function defined in sds.c creates a new Redis String:

sds sdsnewlen(const void *init, size_t initlen) {
    struct sdshdr *sh;

    sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
#ifdef SDS_ABORT_ON_OOM
    if (sh == NULL) sdsOomAbort();
#else
    if (sh == NULL) return NULL;
#endif
    sh->len = initlen;
    sh->free = 0;
    if (initlen) {
        if (init) memcpy(sh->buf, init, initlen);
        else memset(sh->buf,0,initlen);
    }
    sh->buf[initlen] = '\0';
    return (char*)sh->buf;
}

Remember a Redis string is a variable of type struct sdshdr. But sdsnewlen returns a character pointer!!

That's a trick and needs some explanation.

Suppose I create a Redis string using sdsnewlen like below:

sdsnewlen("redis", 5);
// Clients should call sdsnew instead: it computes the length of the
// init string for us, which reduces the risk of passing the wrong size.

This creates a new variable of type struct sdshdr allocating memory for len and free fields as well as for the buf character array.

sh = zmalloc(sizeof(struct sdshdr)+initlen+1); // initlen is length of init argument.

After sdsnewlen successfully creates a Redis string the result is something like:

-----------
|5|0|redis|
-----------
^    ^
sh   sh->buf

sdsnewlen returns sh->buf to the caller.

What do you do if you need to free the Redis string pointed by sh?

You want the pointer sh but you only have the pointer sh->buf.

Can you get the pointer sh from sh->buf?

Yes. Pointer arithmetic. Notice from the above ASCII art that if you subtract the size of two longs from sh->buf you get the pointer sh.

The size of two longs happens to be the size of struct sdshdr.

Look at sdslen function and see this trick at work:


size_t sdslen(const sds s) {
    struct sdshdr *sh = (void*) (s-(sizeof(struct sdshdr)));
    return sh->len;
}

Knowing this trick you could easily go through the rest of the functions in sds.c.

The Redis string implementation is hidden behind an interface that accepts only character pointers. Users of Redis strings need not care about how it's implemented and can treat a Redis string as a character pointer.
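
The same layout can be imitated in Python with the struct module, which makes the "step back from buf to the header" trick easy to play with. This is only a model: a bytearray stands in for the allocated block, an integer offset stands in for the char pointer, and sds_new and sds_len are names made up here to mirror the C functions.

```python
import struct

# Two native longs, len and free, laid out like struct sdshdr.
HEADER = struct.Struct("@ll")


def sds_new(init):
    """Build header + data + '\\0' and return (block, offset of buf)."""
    block = bytearray(HEADER.size + len(init) + 1)  # zero-filled, so the '\0' is there
    HEADER.pack_into(block, 0, len(init), 0)        # len = initlen, free = 0
    block[HEADER.size:HEADER.size + len(init)] = init
    return block, HEADER.size  # callers only ever see the "pointer" to buf


def sds_len(block, buf_offset):
    """Recover len by stepping back one header size from buf."""
    length, _free = HEADER.unpack_from(block, buf_offset - HEADER.size)
    return length


block, buf = sds_new(b"redis")
print(sds_len(block, buf))
print(bytes(block[buf:buf + sds_len(block, buf)]))
```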

Thursday, July 26, 2012

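git branch (translation, unfinished)

Git branch

Almost every version control system supports branching, which means we can create a new branch to develop a feature without messing up the mainline code. In many version control systems creating a branch is a laborious process that usually involves creating a new directory; for a large project, just creating a branch can wear a person out.

What Is a Branch?

To better understand Git's branch management, we first need to understand how Git stores data. Git does not store only what changed each time; it stores a series of snapshots. When you commit, Git stores a commit object that contains a pointer to a snapshot of the staged content, the author of the commit, the commit message, and zero or more pointers to the parent commits of this commit.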
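
The shape of a commit object described above can be sketched as a small data class. This is a toy model for illustration only: Git stores real objects on disk and addresses them by hash, while here the snapshot is a plain dict and the parents are direct references. All file names, authors and messages are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Commit:
    snapshot: Dict[str, str]  # path -> full content of the staged tree
    author: str
    message: str
    parents: List["Commit"] = field(default_factory=list)


author = "Alice <alice@example.com>"
root = Commit({"README": "hello"}, author, "initial commit")
second = Commit({"README": "hello", "a.py": "print(1)"}, author, "add a.py", [root])
feature = Commit({"README": "hello", "b.py": "print(2)"}, author, "add b.py", [root])
merged = Commit(
    {"README": "hello", "a.py": "print(1)", "b.py": "print(2)"},
    author,
    "merge feature",
    [second, feature],
)

# A root commit has no parent, a normal commit has one, a merge has several.
for commit in (root, second, merged):
    print(commit.message, "->", len(commit.parents), "parent(s)")
```

Note that second carries the whole tree, README included, rather than only the added file: that is the snapshot idea, as opposed to storing differences.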

Pengfei Xue